2011 Fiscal Year Final Research Report

Information Navigation using Statistical Rhymes

Research Project

Project/Area Number	22700150
Research Category	Grant-in-Aid for Young Scientists (B)
Allocation Type	Single-year Grants
Research Field	Intelligent informatics
Research Institution	Nagasaki University
Principal Investigator	MASADA Tomonari Nagasaki University, Graduate School of Engineering, Associate Professor (60413928)
Project Period (FY)	2010 – 2011
Keywords	Data mining / Probabilistic models / Bayesian theory / Topic models / Parallelization
Research Abstract	This project is based on the following assumption: words that co-occur with statistically significant frequency can serve as a guide in a useful information navigation system even when those co-occurrences are not based on semantic similarity or relatedness. We call such co-occurrences statistical rhymes. We have been trying to extract statistical rhymes with Bayesian probabilistic models. As a result, we proposed a new LDA-like (latent Dirichlet allocation) topic extraction method that segments the word token sequences appearing in bibliographic data, such as those found in the references sections of academic papers or in the publications sections of researchers' Web sites. Our method splits each bibliographic entry into segments, each corresponding to a different data field, e.g., authors, paper title, journal, pages, or publication year. Further, we improved segmentation accuracy by making the inference semi-supervised.
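
The notion of words co-occurring with statistically significant frequency can be illustrated with a small sketch. The code below is not the Bayesian, LDA-like model developed in this project; it swaps in pointwise mutual information (PMI) over document-level co-occurrence as a simple stand-in. The function name `cooccurrence_pmi`, the `min_count` threshold, and the toy token lists are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_pmi(documents, min_count=2):
    """Score word pairs by pointwise mutual information (PMI).

    Assumption: each document is a list of tokens, and two words
    "co-occur" when they appear in the same document. This is a simple
    stand-in, not the project's Bayesian topic model.
    """
    n_docs = len(documents)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing each word pair
    for tokens in documents:
        vocab = sorted(set(tokens))  # sorted so each pair has one canonical order
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))

    scores = {}
    for (a, b), n_ab in pair_df.items():
        if n_ab < min_count:
            continue  # skip pairs seen too rarely to be informative
        # PMI = log( P(a, b) / (P(a) * P(b)) ), with probabilities
        # estimated as document frequencies divided by n_docs.
        scores[(a, b)] = math.log(n_ab * n_docs / (word_df[a] * word_df[b]))
    return scores


# Hypothetical toy data: token lists loosely resembling bibliographic entries.
documents = [
    ["masada", "topic", "model", "lncs"],
    ["masada", "lncs", "segmentation"],
    ["topic", "model", "inference"],
    ["bayesian", "inference", "model"],
]

for pair, score in sorted(cooccurrence_pmi(documents).items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

In this toy data the highest-scoring pair is an author name and a venue abbreviation, two words with no semantic relatedness that nevertheless co-occur consistently, which is the kind of association the abstract calls a statistical rhyme.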

[Journal Article] Unsupervised Segmentation of Bibliographic Elements with Latent Permutations (2011)
- Author(s)
  MASADA Tomonari
- Journal Title
  IJOCI
  Volume: Vol. 2, No. 2 Pages: 49-62
- Peer Reviewed
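[Presentation] Semi-supervised Segmentation of Bibliographic Elements with Latent Permutations (2011)
- Author(s)
  MASADA Tomonari, TAKASU Atsuhiro, SHIBATA Yuichiro, OGURI Kiyoshi
- Organizer
  Springer Lecture Notes in Computer Science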
- Year and Date
  2011-10-25
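[Presentation] Unsupervised Segmentation of Bibliographic Elements with Latent Permutations (2010)
- Author(s)
  MASADA Tomonari, SHIBATA Yuichiro, OGURI Kiyoshi
- Organizer
  Springer Lecture Notes in Computer Science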
- Year and Date
  2010-12-12
[Remarks] The following is the principal investigator's Web site, which presents content that includes the results of this research.
- URL
  http://diversity-mining-lab.wikispaces.com/