Knowledge Discovery from Numbers in Text

Research Project

Project/Area Number 22700137
Research Category

Grant-in-Aid for Young Scientists (B)

Allocation Type Single-year Grants
Research Field Intelligent informatics
Research Institution The University of Tokyo

Principal Investigator

YOSHIDA Minoru  The University of Tokyo, Information Technology Center, Assistant Professor (40361688)

Project Period (FY) 2010 – 2011
Project Status Completed (Fiscal Year 2011)
Budget Amount
¥2,860,000 (Direct Cost: ¥2,200,000, Indirect Cost: ¥660,000)
Fiscal Year 2011: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000)
Fiscal Year 2010: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
Keywords Natural language processing / Numerical information / Text mining / Suffix array / Clustering / Numeric search / Dirichlet process mixture model
Research Abstract

We studied methods for processing numbers written in text in order to discover relations between words and numbers. We indexed texts with suffix arrays augmented with functions that treat digit strings as numbers, so that queries can include numeric ranges. The search runs in reasonable time on large texts, which enabled us to obtain relations between words and numbers from such texts interactively. We also studied methods for mining texts that contain many numbers.
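
The indexing idea in the abstract can be illustrated with a minimal Python sketch. This is not the project's implementation: the function names (`build_suffix_array`, `prefix_range`, `search_with_number_range`) are hypothetical, the suffix array is built naively, and the numeric range is applied as a filter over the matching suffixes rather than inside the index itself, which is an assumption made to keep the example short.

```python
import re


def build_suffix_array(text):
    """Return the start positions of all suffixes of `text` in sorted order."""
    # Naive construction by sorting suffix strings; only suitable for small texts.
    return sorted(range(len(text)), key=lambda i: text[i:])


def prefix_range(text, sa, pattern):
    """Return [start, end) of suffix-array slots whose suffix starts with `pattern`."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first slot whose m-character prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first slot whose m-character prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return start, lo


_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def search_with_number_range(text, sa, pattern, low, high):
    """Find `pattern` immediately followed by a number n with low <= n <= high."""
    start, end = prefix_range(text, sa, pattern)
    hits = []
    for pos in sa[start:end]:
        # Read the digit string after the pattern as a number, not as characters.
        match = _NUMBER.match(text, pos + len(pattern))
        if match and low <= float(match.group()) <= high:
            hits.append((pos, float(match.group())))
    return sorted(hits)


if __name__ == "__main__":
    sample = "weight 70 kg, weight 5 kg, weight 120 kg, height 170 cm"
    index = build_suffix_array(sample)
    print(search_with_number_range(sample, index, "weight ", 50, 150))
```

The binary search relies on the fact that fixed-length prefixes of sorted suffixes are themselves in sorted order, so all occurrences of the query string occupy one contiguous block of the array. Comparing the digits as numbers is what lets "120" fall inside the range 50 to 150 even though it sorts before "70" as a string.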

Report

(3 results)
  • 2011 Annual Research Report
  • Final Research Report (PDF)
  • 2010 Annual Research Report
  • Research Products

    (20 results)

Journal Article (6 results, of which Peer Reviewed: 4 results) / Presentation (12 results) / Book (2 results)

  • [Journal Article] Person Name Disambiguation Applying Two-Stage Clustering to Word Weighting (2010)

    • Author(s)
      吉田稔, 池田雅紀, 小野真吾, 佐藤一誠, 中川裕志
    • Journal Title

      日本データベース学会論文誌 (Journal of the Database Society of Japan)

      Volume: 9(2) Pages: 19-24

    • NAID

      40017420150

    • Related Report
      2011 Final Research Report
    • Peer Reviewed
  • [Journal Article] Applications of Text Mining (2010)

    • Author(s)
      吉田稔, 中川裕志
    • Journal Title

      情報の科学と技術 (The Journal of Information Science and Technology Association)

      Volume: 60(6) Pages: 230-235

    • Related Report
      2011 Final Research Report
  • [Journal Article] Person Name Disambiguation by Bootstrapping (2010)

    • Author(s)
      Minoru Yoshida, Masaki Ikeda, Shingo Ono, Issei Sato, and Hiroshi Nakagawa
    • Journal Title

      Proceedings of SIGIR-2010

      Pages: 10-17

    • Related Report
      2011 Final Research Report
    • Peer Reviewed
  • [Journal Article] Mining Numbers in Text Using Suffix Arrays and Clustering Based on Dirichlet Process Mixture Models (2010)

    • Author(s)
      Minoru Yoshida, Issei Sato, Hiroshi Nakagawa, Akira Terada
    • Journal Title

      Proceedings of PAKDD-2010

      Pages: 230-237

    • NAID

      120007131162

    • Related Report
      2011 Final Research Report
    • Peer Reviewed
  • [Journal Article] Person Name Disambiguation Applying Two-Stage Clustering to Word Weighting (2010)

    • Author(s)
      吉田稔, 池田雅紀, 小野真吾, 佐藤一誠, 中川裕志
    • Journal Title

      日本データベース学会論文誌 (Journal of the Database Society of Japan)

      Volume: 9(2) Pages: 19-24

    • NAID

      40017420150

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Applications of Text Mining (Explanatory Article) (2010)

    • Author(s)
      吉田稔, 中川裕志
    • Journal Title

      情報の科学と技術 (The Journal of Information Science and Technology Association)

      Volume: 60(6) Pages: 230-235

    • Related Report
      2010 Annual Research Report
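  • [Presentation] An Attempt at Trading Volume Prediction by News Article Clustering (2011)

    • Author(s)
      吉田稔, 中川裕志, 石田智也, 中嶋啓浩, 松井藤五郎, 和泉潔, 池田翔, 本多隆虎
    • Organizer
      The 25th Annual Conference of the Japanese Society for Artificial Intelligence
    • Place of Presentation
      Morioka
    • Year and Date
      2011-06-02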
    • Related Report
      2011 Annual Research Report
  • [Presentation] Predicting Cold Epidemics Using Social Media (2012)

    • Author(s)
      谷田和章, 荒牧英治, 佐藤一誠, 吉田稔, 中川裕志
    • Organizer
      The 18th Annual Meeting of the Association for Natural Language Processing
    • Place of Presentation
      Hiroshima
    • Year and Date
      2012-03-15
    • Related Report
      2011 Final Research Report
  • [Presentation] Predicting Cold Epidemics Using Social Media (2012)

    • Author(s)
      谷田和章, 荒牧英治, 佐藤一誠, 吉田稔, 中川裕志
    • Organizer
      The 18th Annual Meeting of the Association for Natural Language Processing
    • Place of Presentation
      Hiroshima
    • Year and Date
      2012-03-15
    • Related Report
      2011 Annual Research Report
  • [Presentation] An Attempt to Support Equipment Anomaly Diagnosis with Text Mining (2012)

    • Author(s)
      吉田稔, 中川裕志, 渋谷久恵, 前田俊二
    • Organizer
      The 4th Forum on Data Engineering and Information Management
    • Place of Presentation
      Kobe
    • Year and Date
      2012-03-04
    • Related Report
      2011 Final Research Report
  • [Presentation] An Attempt to Support Equipment Anomaly Diagnosis with Text Mining (2012)

    • Author(s)
      吉田稔, 中川裕志, 渋谷久恵, 前田俊二
    • Organizer
      The 4th Forum on Data Engineering and Information Management (DEIM 2012)
    • Place of Presentation
      Kobe
    • Year and Date
      2012-03-04
    • Related Report
      2011 Annual Research Report
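  • [Presentation] An Attempt at Trading Volume Prediction by News Article Clustering (2011)

    • Author(s)
      吉田稔, 中川裕志, 石田智也, 中嶋啓浩, 松井藤五郎, 和泉潔, 池田翔, 本多隆虎
    • Organizer
      The 25th Annual Conference of the Japanese Society for Artificial Intelligence
    • Place of Presentation
      Morioka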
    • Year and Date
      2011-06-02
    • Related Report
      2011 Final Research Report
  • [Presentation] Web People Search: Person Name Disambiguation and Other Problems (Tutorial) (2010)

    • Author(s)
      Minoru Yoshida, Hiroshi Nakagawa
    • Organizer
      The 2nd Asian Conference on Machine Learning (ACML 2010)
    • Year and Date
      2010-11-08
    • Related Report
      2011 Final Research Report
  • [Presentation] Web People Search: Person Name Disambiguation and Other Problems (Tutorial) (2010)

    • Author(s)
      Minoru Yoshida, Hiroshi Nakagawa
    • Organizer
      The 2nd Asian Conference on Machine Learning (ACML 2010)
    • Place of Presentation
      Tokyo Tech Front, Tokyo
    • Year and Date
      2010-11-08
    • Related Report
      2010 Annual Research Report
  • [Presentation] ITC-UT: Tweet Categorization by Query Categorization for On-line Reputation Management (2010)

    • Author(s)
      Minoru Yoshida, Shin Matsushima, Shingo Ono, Hiroshi Nakagawa
    • Organizer
      WePS-3, CLEF 2010 Labs
    • Year and Date
      2010-09-23
    • Related Report
      2011 Final Research Report
  • [Presentation] ITC-UT: Tweet Categorization by Query Categorization for On-line Reputation Management (2010)

    • Author(s)
      Minoru Yoshida, Shin Matsushima, Shingo Ono, Issei Sato, Hiroshi Nakagawa
    • Organizer
      WePS-3, CLEF 2010 Labs
    • Place of Presentation
      Padua, Italy
    • Year and Date
      2010-09-23
    • Related Report
      2010 Annual Research Report
  • [Presentation] Person Name Disambiguation by Bootstrapping (2010)

    • Author(s)
      Minoru Yoshida, Masaki Ikeda, Shingo Ono, Issei Sato, Hiroshi Nakagawa
    • Organizer
      SIGIR-2010 (the 33rd Annual ACM SIGIR Conference)
    • Place of Presentation
      Geneva, Switzerland
    • Year and Date
      2010-07-20
    • Related Report
      2010 Annual Research Report
  • [Presentation] Mining Numbers in Text Using Suffix Arrays and Clustering Based on Dirichlet Process Mixture Models (2010)

    • Author(s)
      Minoru Yoshida, Issei Sato, Hiroshi Nakagawa, Akira Terada
    • Organizer
      PAKDD-2010 (The 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining)
    • Place of Presentation
      Hyderabad, India
    • Year and Date
      2010-06-23
    • Related Report
      2010 Annual Research Report
  • [Book] Information Extraction from the Internet (Chapter: On-demand Synonym Extraction Using Suffix Arrays) (2011)

    • Author(s)
      Minoru Yoshida, Hiroshi Nakagawa, Akira Terada
    • Publisher
      iConcept Press
    • Related Report
      2011 Final Research Report
  • [Book] Information Extraction from the Internet (Chapter 5: On-demand Synonym Extraction Using Suffix Arrays) (2011)

    • Author(s)
      Minoru Yoshida, Hiroshi Nakagawa, Akira Terada
    • Total Pages
      256
    • Publisher
      iConcept Press
    • Related Report
      2011 Annual Research Report

Published: 2010-08-23   Modified: 2016-04-21  
