
Hierarchical Discovery of Sub-structures and Rare Patterns of Them in Large Text Data

Research Project

Project/Area Number 24300059
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type Partial Multi-year Fund
Section General
Research Field Intelligent informatics
Research Institution Kyushu University

Principal Investigator

IKEDA Daisuke  Kyushu University, Graduate School (Faculty) of Information Science and Electrical Engineering, Associate Professor (00294992)

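Co-Investigator (Kenkyū-buntansha) NAKATOH Tetsuya  Kyushu University, Research Institute for Information Technology, Assistant Professor (20253502)
YAMADA Yasuhiro  Shimane University, Interdisciplinary Graduate School of Science and Engineering, Assistant Professor (50529609)
Co-Investigator (Renkei-kenkyūsha) BABA Kensuke  Kyushu University, University Library, Associate Professor (70380681)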
Project Period (FY) 2012-04-01 – 2015-03-31
Project Status Completed (Fiscal Year 2014)
Budget Amount
¥9,230,000 (Direct Cost: ¥7,100,000, Indirect Cost: ¥2,130,000)
Fiscal Year 2014: ¥2,860,000 (Direct Cost: ¥2,200,000, Indirect Cost: ¥660,000)
Fiscal Year 2013: ¥3,640,000 (Direct Cost: ¥2,800,000, Indirect Cost: ¥840,000)
Fiscal Year 2012: ¥2,730,000 (Direct Cost: ¥2,100,000, Indirect Cost: ¥630,000)
Keywords exceptional string patterns / high-purity patterns / purity measure / text mining / rare pattern discovery / exceptional patterns / approximate string matching / purity
Outline of Final Research Achievements

This research aims to find infrequent patterns composed of frequent sub-patterns in large text data. Because text data follows Zipf's law, infrequent patterns are extremely numerous, which makes the goal challenging. Among these many candidates, we look for composite patterns of frequent patterns that are few in absolute terms but relatively numerous.
To do so, we took two basic approaches: extending the framework of peculiar patterns that we had already developed, and creating a new framework based on pure patterns. We evaluated the effectiveness of both approaches using bacterial genome sequences. In addition, we developed mining methods for data in various fields, such as clustering geotagged blogs, context-aware information retrieval, and query expansion for academic theses.
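
The idea of an infrequent composition of frequent sub-patterns can be illustrated with a small sketch. This is not the project's peculiar-pattern or purity algorithm; it is a naive illustration that assumes fixed-length k-grams as sub-patterns and counts a composition once per text containing both parts. The function names `frequent_kgrams` and `rare_compositions` and the thresholds `min_count` and `max_pair_count` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_kgrams(texts, k, min_count):
    """Return k-grams occurring at least min_count times across all texts."""
    counts = Counter(
        text[i:i + k] for text in texts for i in range(len(text) - k + 1)
    )
    return {gram: n for gram, n in counts.items() if n >= min_count}


def rare_compositions(texts, k, min_count, max_pair_count):
    """Find pairs of frequent k-grams that co-occur in only a few texts.

    Assumption for illustration: a pair is counted once per text that
    contains both k-grams, regardless of their order or distance.
    """
    frequent = set(frequent_kgrams(texts, k, min_count))
    pair_counts = Counter()
    for text in texts:
        grams = {text[i:i + k] for i in range(len(text) - k + 1)}
        present = sorted(grams & frequent)
        pair_counts.update(combinations(present, 2))
    # Keep pairs that do occur, but no more than max_pair_count times.
    return {
        pair: n for pair, n in pair_counts.items() if 1 <= n <= max_pair_count
    }


if __name__ == "__main__":
    sample = ["abcd", "abxx", "cdyy", "abzz", "cdww"]
    print(rare_compositions(sample, k=2, min_count=3, max_pair_count=1))
```

In the sample, "ab" and "cd" each occur three times, yet they appear together in only one text, so the pair is reported. On real text the number of candidate pairs grows quickly, which is the difficulty the summary above attributes to Zipf's law.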

Report

(4 results)
  • 2014 Annual Research Report
  • Final Research Report (PDF)
  • 2013 Annual Research Report
  • 2012 Annual Research Report

Research Products

(16 results)


Journal Article (8 results) (of which Peer Reviewed: 8 results) / Presentation (8 results)

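  • [Journal Article] Application and Improvement of the Purity Measure for Texts (2014)

    • Author(s)
      Yuta Taniguchi, Daisuke Ikeda
    • Journal Title

      Research Reports on Information Science and Electrical Engineering of Kyushu University (システム情報科学紀要)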

      Volume: 19 Pages: 1-6

    • NAID

      120005475449

    • Related Report
      2013 Annual Research Report
    • Peer Reviewed
  • [Journal Article] The Purity Measure for Genomic Regions Leads to Horizontally Transferred Genes (2013)

    • Author(s)
      Yuta Taniguchi, Yasuhiro Yamada, Osamu Maruyama, Satoru Kuhara, and Daisuke Ikeda
    • Journal Title

      Journal of Bioinformatics and Computational Biology

      Volume: 11 Issue: 06 Pages: 1343002

    • DOI

      10.1142/s0219720013430026

    • Related Report
      2013 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Finding Peculiar Compositions of Two Frequent Strings with Background Texts (2013)

    • Author(s)
      Daisuke Ikeda and Einoshin Suzuki
    • Journal Title

      Journal of Knowledge and Information Systems

      Volume: Online First Issue: 2 Pages: 499-530

    • DOI

      10.1007/s10115-013-0688-9

    • Related Report
      2013 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Preliminary Results for Discovering Related Words from Logs of Scholarly Repositories (2013)

    • Author(s)
      Takehiro Shiraishi, Toshihiro Aoyama, Kazutsuna Yamaji, Takao Namiki, and Daisuke Ikeda
    • Journal Title

      Proceedings of IIAI International Conference on Advanced Information Technologies

      Volume: CDROM

    • Related Report
      2013 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Speed Improvement of the Plagiarism Detection Method (2013)

    • Author(s)
      Tetsuya Nakatoh, Kensuke Baba, Yasuhiro Yamada, and Daisuke Ikeda
    • Journal Title

      Proceedings of IIAI International Conference on Advanced Information Technologies

      Volume: CDROM

    • Related Report
      2013 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Mining Infrequent Patterns of Two Frequent Substrings from a Single Set of Biological Sequences (2013)

    • Author(s)
      Daisuke Ikeda
    • Journal Title

      Proceedings of the 2013 International Conference on Parallel and Distributed Processing Techniques and Applications

      Volume: I Pages: 136-142

    • Related Report
      2013 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Infrequent, Unexpected, and Contrast Pattern Discovery from Bacterial Genomes by Genome-wide Comparative Analysis (2013)

    • Author(s)
      Daisuke Ikeda, Osamu Maruyama and Satoru Kuhara
    • Journal Title

      Proceedings of the 4th International Conference on Bioinformatics Models, Methods and Algorithms

      Pages: 308-311

    • Related Report
      2012 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Mining Pure Patterns in Texts (2012)

    • Author(s)
      Yasuhiro Yamada, Tetsuya Nakatoh, Kensuke Baba and Daisuke Ikeda
    • Journal Title

      Proceedings of the 2012 IIAI International Conference on Advanced Applied Informatics

      Pages: 285-290

    • DOI

      10.1109/iiai-aai.2012.75

    • Related Report
      2012 Annual Research Report
    • Peer Reviewed
  • [Presentation] Unique Links as Weak Ties (2015)

    • Author(s)
      Yasuhiro Yamada, Daisuke Ikeda and Sachio Hirokawa
    • Organizer
      4th International Congress on Advanced Applied Informatics
    • Place of Presentation
      Okayama
    • Year and Date
      2015-07-12 – 2015-07-16
    • Related Report
      2014 Annual Research Report
  • [Presentation] Discover Overlapping Topical Regions by Geo-semantic Clustering of Tweets, Proceedings of the Eighth International Symposium on Mining and Web (2015)

    • Author(s)
      Yuta Taniguchi, Daiki Monzen, Sari Ariestien Lutfiana, Daisuke Ikeda
    • Organizer
      Workshop of 29th International Conference on Advanced Information Networking and Applications
    • Place of Presentation
      Gwangju, Korea
    • Year and Date
      2015-03-25 – 2015-03-27
    • Related Report
      2014 Annual Research Report
  • [Presentation] Probabilistic Model for Purity Values of Bacterial Genome Sequences (2015)

    • Author(s)
      Y. Taniguchi, R. Masui, T. Aoyama and D. Ikeda
    • Organizer
      3rd International Conference on Bioinformatics and Computational Biology
    • Place of Presentation
      Hong Kong
    • Year and Date
      2015-03-12 – 2015-03-13
    • Related Report
      2014 Annual Research Report
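  • [Presentation] Evaluation of a Plagiarism Detection Method Using Approximate String Matching (2014)

    • Author(s)
      Tetsuya Nakatoh, Yasuhiro Yamada, Kensuke Baba, Daisuke Ikeda, Sachio Hirokawa
    • Organizer
      FY2013 (Heisei 25) Joint Conference of Electrical, Electronics and Information Engineers in Kyushu (66th Joint Conference)
    • Place of Presentation
      Kagoshima University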
    • Year and Date
      2014-09-18
    • Related Report
      2014 Annual Research Report
  • [Presentation] Infrequent, Unexpected, and Contrast Pattern Discovery from Bacterial Genomes by Genome-wide Comparative Analysis (2013)

    • Author(s)
      Daisuke Ikeda
    • Organizer
      International Conference on Bioinformatics Models, Methods, Algorithms
    • Place of Presentation
      Barcelona (Spain)
    • Year and Date
      2013-02-12
    • Related Report
      2012 Annual Research Report
  • [Presentation] The Purity Measure for Genomic Regions Leads to Horizontally Transferred Genes (2013)

    • Author(s)
      Yuta Taniguchi, Yasuhiro Yamada, Osamu Maruyama, Satoru Kuhara, and Daisuke Ikeda
    • Organizer
      International Conference on Genome Informatics
    • Place of Presentation
      Singapore
    • Related Report
      2013 Annual Research Report
  • [Presentation] Mining Infrequent Patterns of Two Frequent Substrings from a Single Set of Biological Sequences (2013)

    • Author(s)
      Daisuke Ikeda
    • Organizer
      the 2013 International Conference on Parallel and Distributed Processing Techniques and Applications
    • Place of Presentation
      Las Vegas
    • Related Report
      2013 Annual Research Report
  • [Presentation] Mining Pure Patterns in Texts (2012)

    • Author(s)
      Yasuhiro Yamada
    • Organizer
      2012 IIAI International Conference on Advanced Applied Informatics
    • Place of Presentation
      Fukuoka
    • Year and Date
      2012-09-21
    • Related Report
      2012 Annual Research Report

Published: 2012-04-24   Modified: 2019-07-29  
