2011 Fiscal Year Final Research Report
Research on statistical discovery of a wide variety of patterns with low frequencies and its applications
Project/Area Number |
21650031
|
Research Category |
Grant-in-Aid for Challenging Exploratory Research
|
Allocation Type | Single-year Grants |
Research Field |
Intelligent informatics
|
Research Institution | Kyushu University |
Principal Investigator |
IKEDA Daisuke Kyushu University, Faculty of Information Science and Electrical Engineering, Associate Professor (00294992)
|
Co-Investigator(Kenkyū-buntansha) |
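NAKATOH Tetsuya Kyushu University, Research Institute for Information Technology, Assistant Professor (20253502)
YAMADA Yasuhiro Shimane University, Interdisciplinary Faculty of Science and Engineering, Assistant Professor (50529609)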
|
Co-Investigator(Renkei-kenkyūsha) |
BABA Kensuke Kyushu University, University Library, Associate Professor (70380681)
|
Project Period (FY) |
2009 – 2011
|
Keywords | Knowledge discovery and data mining / Text mining / Pattern discovery |
Research Abstract |
The goal of this research is to develop a framework that, given large text data, discovers patterns that do not appear frequently. To achieve this goal, we reviewed our existing research from the following two viewpoints. Mapping of letters for pattern discovery: Using approximate pattern matching, we proposed a pattern discovery method and evaluated it experimentally. With this method, we found that mapping several letters to one digit plays an important role. Metric space for pattern discovery: The goal of this topic is to distinguish patterns from non-patterns. Instead of using a rigid metric space, we first find usual substructures and then find a pattern as a combination of these substructures. We evaluated this method by experiments on genome sequences and Web documents.
|