
Computational Methodology for Knowledge Discovery

Research Project

Project/Area Number 10143101
Research Category

Grant-in-Aid for Scientific Research on Priority Areas (A)

Allocation Type Single-year Grants
Research Institution Tohoku University

Principal Investigator

MARUOKA Akira  Tohoku Univ., Graduate School of Information Sciences, Professor (50005427)

Co-Investigator (Kenkyū-buntansha)

SHINOHARA Ayumi  Kyushu Univ., Dept. of Informatics, Graduate School of Information Science and Electrical Engineering, Associate Professor (00226151)
IMAI Hiroshi  Univ. of Tokyo, Dept. of Information Science, Graduate School of Science, Associate Professor (80183010)
ABE Naoki  IBM Thomas J. Watson Research Center, Researcher
WATANABE Osamu  Tokyo Institute of Technology, Dept. of Math. and Comp. Science, Graduate School of Information Science and Engineering, Professor (80158617)
TAKASU Atsuhiro  National Institute of Informatics, Data Engineering Research, Software Research Division, Associate Professor (90216648)
Project Period (FY) 1998 – 2000
Project Status Completed (Fiscal Year 2001)
Budget Amount
¥79,700,000 (Direct Cost: ¥79,700,000)
Fiscal Year 2000: ¥21,800,000 (Direct Cost: ¥21,800,000)
Fiscal Year 1999: ¥21,600,000 (Direct Cost: ¥21,600,000)
Fiscal Year 1998: ¥36,300,000 (Direct Cost: ¥36,300,000)
Keywords learning / sampling / boosting / linear classifier / search for subsequence patterns / text categorization / MDL-based compression / semi-structured data / geometric structure of feature space / learnability / online expert model / decision list / adaptive sampling / query learning / active learning / clustering / pruning / direction selectivity / reinforcement learning
Research Abstract

The amount of data collected from various fields is growing exponentially, and the task of analyzing data to extract the useful information behind it is becoming correspondingly more difficult. To extract useful information from data, there must be appropriate interaction between the extraction process and the data. Through this interaction, various processes are performed, such as memorizing information, learning, evolution, and possibly discovering knowledge. The major hurdle to automatically extracting knowledge from huge amounts of data is the limitation on computational resources. Group A03 aims to propose and develop computational models and methodologies for knowledge discovery. To this end, we explore various topics, including algorithms for heterogeneous data that may be strongly or poorly structured.
Among the results of this project, those concerning computational mechanisms for efficiently finding effective rules in very large databases are as follows: Efficient mining from large databases by query learning; A modification of AdaBoost for adaptive sampling methods; Tree-based boosting using linear classifier; The minimax strategy for Gaussian density estimation. Furthermore, algorithms for solving certain concrete problems were developed: A practical algorithm to find the best subsequence patterns; Biological sequence compression algorithms - Learning via compression schemes; Effect of sample size in text categorization; Knowledge discovery by using both experimental and theoretical methods; Discovery of commonality among definition sentences by MDL-based compression.

Report

(4 results)
  • 2001 Final Research Report Summary
  • 2000 Annual Research Report
  • 1999 Annual Research Report
  • 1998 Annual Research Report
  • Research Products

    (31 results)


All Publications (31 results)

  • [Publications] A.Maruoka: "Predicting nearly as well as the best pruning of a decision tree through dynamic programming scheme"Theoretical Computer Science. 261(1). 179-209 (2001)

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] N.Abe: "Efficient mining from large databases by query learning"The 17th International Conference on Machine Learning. 17. 575-582 (2000)

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] H.Imai: "Variance-Based k-Clustering Algorithms by Voronoi Diagrams and Randomization"IEICE Trans.Information and Systems. E83-D. 1199-1206 (2000)

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] A.Shinohara: "A Practical Algorithm to Find the Best Subsequence Patterns"Proc.3rd International Conference on Discovery Science(DS2000). LNAI 1967. 141-154 (2000)

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] A.Takasu: "An Approximate Matching Method for Bibliographic Data in Academic Article Images"IPSJ Transactions on Databases. 42,SIG-1. 148-158 (2001)

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] O.Watanabe: "Adaptive Sampling Methods for Scaling Up Knowledge Discovery Algorithms"Data Mining and Knowledge Discovery. 6(2)(to appear). (2002)

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] A.Maruoka, E.Takimoto: "Encyclopedia of Computer Science and Technology Vol.45"Marcel Dekker, Inc. 448 (2002)

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] A. Maruoka: "Predicting nearly as well as the best pruning of a decision tree through dynamic programming scheme"Theoretical Computer Science. 261(1). 179-209 (2001)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] N. Abe: "Efficient mining from large databases by query learning"The 17th International Conference on Machine Learning. 17. 575-582 (2000)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] H. Imai: "Variance-Based k-Clustering Algorithms by Voronoi Diagrams and Randomization"IEICE Trans. Information and Systems. E83-D. 1199-1206 (2000)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] A. Shinohara: "A Practical Algorithm to Find the Best Subsequence Patterns"Proc. 3rd International Conference on Discovery Science (DS2000), LNAI 1967. 141-154 (2000)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] A. Takasu: "An Approximate Matching Method for Bibliographic Data in Academic Article Images"IPSJ Transactions on Databases. Vol.42, No.SIG01. 148-158 (2001)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] O. Watanabe: "Adaptive Sampling Methods for Scaling Up Knowledge Discovery Algorithms"Data Mining and Knowledge Discovery. (to appear), Vol.6, No.2. (2002)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] Maruoka Akira: "On-line Estimation of Hidden Markov Model Parameters"Lecture Notes in Artificial Intelligence. 1967. 155-169 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] Abe Naoki: "Efficient mining from large databases by query learning"The 17th International Conference on Machine Learning. Vol.17. 575-582 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] Imai Hiroshi: "Variance-Based k-Clustering Algorithms by Voronoi Diagrams and Randomization"IEICE Trans. Information and Systems. Vol.E83-D. 1199-1206 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] Shinohara Ayumi: "A practical algorithm to find the best subsequence patterns"Proc. 3rd International Conference on Discovery Science. LNAI1967. 141-154 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] Takasu Atsuhiro: "An Approximate Matching Method for Bibliographic Data in Academic Article Images"IPSJ Transactions on Databases. Vol.42. 148-158 (2001)

    • Related Report
      2000 Annual Research Report
  • [Publications] Watanabe Osamu: "MadaBoost: A modification of AdaBoost"Proc. of the 13th Conference on Computational Learning Theory. Vol.13. 180-189 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] Maruoka Akira: "Proper Learning Algorithm for Functions of k Terms under Smooth Distributions"Information and Computation. 152. 188-204 (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] Abe Naoki: "Associative Reinforcement Learning with Linear Probabilistic Concepts"Proceedings of the 16th International Conference on Machine Learning. 3-11 (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] Imai Hiroshi: "Finding Meaningful Regions Containing Given Keywords from Large Text Collections"Lecture Notes in Artificial Intelligence. 1721. 353-354 (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] Shinohara Ayumi: "Shift-And approach to pattern matching in LZW compressed text"Lecture Notes in Computer Science. 1645. 1-13 (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] Takasu Atsuhiro: "Music Structure Analysis and Its Application to Theme Phrase Extraction"Proceedings of the Third European Conference on Research and Advanced Technology for Digital Libraries. 92-105 (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] Watanabe Osamu: "From computational learning theory to discovery science"Lecture Notes in Computer Science. 1644. 134-148 (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] Maruoka Akira: "Structured Weight-Based Prediction Algorithms" Lecture Notes in Artificial Intelligence. 1501. 127-142 (1998)

    • Related Report
      1998 Annual Research Report
  • [Publications] Abe Naoki: "Empirical Comparison of Competing Query Learning Strategies" Lecture Notes in Artificial Intelligence. 1532. 387-388 (1998)

    • Related Report
      1998 Annual Research Report
  • [Publications] Imai Hiroshi: "Geometric Clustering Models in Feature Space" Lecture Notes in Artificial Intelligence. 1532. 421-422 (1998)

    • Related Report
      1998 Annual Research Report
  • [Publications] Shinohara Ayumi: "Uniform Characterizations of Polynomial-query Learnabilities" Lecture Notes in Artificial Intelligence. 1532. 84-92 (1998)

    • Related Report
      1998 Annual Research Report
  • [Publications] Takasu Atsuhiro: "On the number of clusters in cluster analysis" Lecture Notes in Artificial Intelligence. 1532. 419-420 (1998)

    • Related Report
      1998 Annual Research Report
  • [Publications] Watanabe Osamu: "A Role of Constraint in Self-Organization" Proceedings of the 2nd International Workshop. 307-318 (1998)

    • Related Report
      1998 Annual Research Report


Published: 1998-04-01   Modified: 2016-04-21  
