Research on Disambiguation of Entities on the Web

Research Project

Project/Area Number 07J01864
Research Category

Grant-in-Aid for JSPS Fellows

Allocation Type Single-year Grants
Section Domestic
Research Field Intelligent informatics
Research Institution The University of Tokyo

Principal Investigator

Danushka Bollegala  The University of Tokyo, Graduate School of Information Science and Technology, Research Fellow (PD)

Project Period (FY) 2007 – 2010
Project Status Completed (Fiscal Year 2009)
Budget Amount
¥2,800,000 (Direct Cost: ¥2,800,000)
Fiscal Year 2009: ¥900,000 (Direct Cost: ¥900,000)
Fiscal Year 2008: ¥900,000 (Direct Cost: ¥900,000)
Fiscal Year 2007: ¥1,000,000 (Direct Cost: ¥1,000,000)
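Keywords relation extraction / web mining / clustering / co-clustering / entity / extensional definition / intensional definition / algorithm / similarity / similarity measure / relational similarity / analogy / disambiguation / machine learning / Web Mining / similarity computation / alias problem / referential ambiguity / polysemy / information extraction / Web search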
Research Abstract

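There are two ways to define a relation R between two entities. One is to enumerate the entity pairs that stand in the relation (extensional definition). The other is to express R through lexical patterns (intensional definition). This research proposes a clustering method based on these dual definitions of a relation and uses it for relation extraction. A distinguishing feature of the proposed method is that it clusters lexical patterns and entity pairs simultaneously. Algorithms that simultaneously cluster two quantities that mutually satisfy some constraint are collectively called co-clustering algorithms. The proposed method is one such co-clustering algorithm, and it is characterized by being built on the constraint of duality between the two definitions of a relation. Because clustering is unsupervised learning, no training data is required. Co-clustering also clusters the lexical patterns that serve as features for grouping entity pairs by relation type, so it compresses the feature dimensionality and enables stable clustering. When extracting relations between entities from a massive text corpus such as the Web, an enormous number of entity pairs and lexical patterns must be co-clustered at the same time, so an algorithm with low computational cost is important. This research proposed and evaluated a sequential co-clustering algorithm that performs co-clustering in O(n log n) time.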
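
The duality described in the abstract can be illustrated with a small sketch. This is not the project's algorithm: it is a simplified two-stage approximation (patterns first, then entity pairs over pattern clusters) rather than a truly simultaneous co-clustering, and it makes no attempt to reproduce the stated O(n log n) bound. The function names, the cosine similarity measure, the thresholds, and the toy counts are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sequential_cluster(vectors, threshold):
    """Greedy single-pass clustering over sparse vectors.

    Items are visited in descending order of total weight. Each item joins the
    existing cluster whose centroid is most similar, provided the similarity
    is strictly above `threshold`; otherwise it starts a new cluster.
    """
    order = sorted(vectors, key=lambda k: (-sum(vectors[k].values()), str(k)))
    members, centroids = [], []
    for item in order:
        best_idx, best_sim = None, threshold
        for idx, centroid in enumerate(centroids):
            sim = cosine(vectors[item], centroid)
            if sim > best_sim:
                best_idx, best_sim = idx, sim
        if best_idx is None:
            members.append([item])
            centroids.append(dict(vectors[item]))
        else:
            members[best_idx].append(item)
            for key, weight in vectors[item].items():
                centroids[best_idx][key] = centroids[best_idx].get(key, 0) + weight
    return members


def co_cluster(counts, pattern_threshold=0.5, pair_threshold=0.5):
    """Cluster lexical patterns, then entity pairs over the pattern clusters."""
    # Intensional view: a pattern is described by the entity pairs it occurs with.
    by_pattern = defaultdict(dict)
    for (pair, pattern), n in counts.items():
        by_pattern[pattern][pair] = n
    pattern_clusters = sequential_cluster(by_pattern, pattern_threshold)

    # Extensional view: an entity pair is described by pattern *clusters*,
    # which compresses the feature space.
    cluster_of = {p: i for i, group in enumerate(pattern_clusters) for p in group}
    by_pair = defaultdict(dict)
    for (pair, pattern), n in counts.items():
        c = cluster_of[pattern]
        by_pair[pair][c] = by_pair[pair].get(c, 0) + n
    pair_clusters = sequential_cluster(by_pair, pair_threshold)
    return pattern_clusters, pair_clusters


# Invented toy counts: (entity pair, lexical pattern) -> co-occurrence frequency.
toy_counts = {
    (("A1", "B1"), "X acquired Y"): 5,
    (("A1", "B1"), "X bought Y"): 3,
    (("A2", "B2"), "X acquired Y"): 4,
    (("A2", "B2"), "X bought Y"): 2,
    (("P1", "C1"), "X, CEO of Y"): 6,
    (("P2", "C2"), "X, CEO of Y"): 4,
    (("P2", "C2"), "X leads Y"): 1,
}
pattern_clusters, pair_clusters = co_cluster(toy_counts)
```

On the toy data, the two acquisition patterns end up in one pattern cluster and the two leadership patterns in another, and the entity pairs split the same way. Describing entity pairs by pattern clusters instead of raw patterns is what shrinks the feature space in this sketch.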

Report

(3 results)
  • 2009 Annual Research Report
  • 2008 Annual Research Report
  • 2007 Annual Research Report

Research Products

(17 results)

Journal Article (1 result, of which Peer Reviewed: 1) / Presentation (14 results) / Remarks (2 results)

  • [Journal Article] A Bottom-up Approach to Sentence Ordering for Multi-document Summarization (2010)

    • Author(s)
      Danushka Bollegala, Naoaki Okazaki, Mitsuru Ishizuka
    • Journal Title

      Information Processing and Management 46

      Pages: 89-109

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Presentation] A Sequential Model for Discourse Segmentation (2010)

    • Author(s)
      Hugo Hernault, Danushka Bollegala, Mitsuru Ishizuka
    • Organizer
      International Conference on Intelligent Text Processing and Computational Linguistics (CICLing)
    • Place of Presentation
      Romania, Iasi
    • Year and Date
      2010-03-21
    • Related Report
      2009 Annual Research Report
  • [Presentation] A Relational Model of Semantic Similarity between Words using Automatically Extracted Lexical Pattern Clusters from the Web (2009)

    • Author(s)
      Danushka Bollegala, Yutaka Matsuo, Mitsuru Ishizuka
    • Organizer
      Empirical Methods in Natural Language Processing
    • Place of Presentation
      Singapore, Singapore
    • Year and Date
      2009-08-06
    • Related Report
      2009 Annual Research Report
  • [Presentation] Measuring the Similarity between Implicit Semantic Relations from the Web (2009)

    • Author(s)
      Danushka Bollegala, Yutaka Matsuo, Mitsuru Ishizuka
    • Organizer
      International World Wide Web Conference
    • Place of Presentation
      Spain, Madrid
    • Year and Date
      2009-04-21
    • Related Report
      2009 Annual Research Report
  • [Presentation] Measuring the Similarity between Implicit Semantic Relations using Web Search Engines (2009)

    • Author(s)
      D. Bollegala, Y. Matsuo, M. Ishizuka
    • Organizer
      2nd ACM Int'l Conf. on Web Search and Data Mining (WSDM)
    • Place of Presentation
      Barcelona, Spain
    • Year and Date
      2009-02-11
    • Related Report
      2008 Annual Research Report
  • [Presentation] Social Network Mining from the Web (2008)

    • Author(s)
      Y. Matsuo, D. Bollegala, H. Tomob
    • Organizer
      NSF Sponsored Symposium on Semantic Knowledge Discovery (poster)
    • Place of Presentation
      New York, U.S.A.
    • Year and Date
      2008-11-14
    • Related Report
      2008 Annual Research Report
  • [Presentation] Automatically Extracting Personal Name Aliases from the Web (2008)

    • Author(s)
      D. Bollegala, T. Honma, Y. Matsuo, M. Ishizuka
    • Organizer
      6th International Conference on Natural Language Processing (GoTAL)
    • Place of Presentation
      Gothenburg, Sweden
    • Year and Date
      2008-08-25
    • Related Report
      2008 Annual Research Report
  • [Presentation] WWW sits the SAT: Measuring Relational Similarity from the Web (2008)

    • Author(s)
      D. Bollegala, Y. Matsuo, M. Ishizuka
    • Organizer
      18th European Conference on Artificial Intelligence (ECAI)
    • Place of Presentation
      Patras, Greece
    • Year and Date
      2008-07-21
    • Related Report
      2008 Annual Research Report
  • [Presentation] Mining for Personal Name Aliases on the Web (2008)

    • Author(s)
      D. Bollegala, T. Honma, Y. Matsuo, M. Ishizuka
    • Organizer
      International World Wide Web Conference
    • Place of Presentation
      Beijing, China
    • Year and Date
      2008-04-23
    • Related Report
      2007 Annual Research Report
  • [Presentation] Identification of Personal Name Aliases on the Web (2008)

    • Author(s)
      D. Bollegala, T. Honma, Y. Matsuo, M. Ishizuka
    • Organizer
      Workshop on Social Web Search and Mining, Intl. World Wide Web Conference
    • Place of Presentation
      Beijing, China
    • Year and Date
      2008-04-22
    • Related Report
      2007 Annual Research Report
  • [Presentation] Mining for Personal Name Aliases on the Web (2008)

    • Author(s)
      D. Bollegala, T. Honma, Y. Matsuo, M. Ishizuka
    • Organizer
      17th Int'l World Wide Web Conference (WWW) (poster)
    • Place of Presentation
      Beijing, China
    • Year and Date
      2008-04-21
    • Related Report
      2008 Annual Research Report
  • [Presentation] A Co-occurrence Graph-based Approach for Personal Name Alias Extraction from Anchor Texts (2008)

    • Author(s)
      D. Bollegala, Y. Matsuo, M. Ishizuka
    • Organizer
      International Joint Conferences on Natural Language Processing (IJCNLP)
    • Place of Presentation
      Hyderabad, India
    • Year and Date
      2008-01-07
    • Related Report
      2007 Annual Research Report
  • [Presentation] WebSim: A Web-based Semantic Similarity Measure (2007)

    • Author(s)
      D. Bollegala, Y. Matsuo, M. Ishizuka
    • Organizer
      Annual Conference of the Japanese Society for Artificial Intelligence
    • Place of Presentation
      Miyazaki Prefecture, Japan
    • Year and Date
      2007-06-20
    • Related Report
      2007 Annual Research Report
  • [Presentation] Measuring Semantic Similarity between Words Using Web Search Engines (2007)

    • Author(s)
      D. Bollegala, Y. Matsuo, M. Ishizuka
    • Organizer
      International World Wide Web Conference
    • Place of Presentation
      Banff, Canada
    • Year and Date
      2007-05-11
    • Related Report
      2007 Annual Research Report
  • [Presentation] An Integrated Approach to Measuring Semantic Similarity between Words Using Information Available on the Web (2007)

    • Author(s)
      D. Bollegala, Y. Matsuo, M. Ishizuka
    • Organizer
      Human Language Technologies: Annual Conference of the North American Chapter of the Association for Computational Linguistics
    • Place of Presentation
      Rochester, NY, U.S.A.
    • Year and Date
      2007-04-24
    • Related Report
      2007 Annual Research Report
  • [Remarks]

    • URL

      http://www.iba.t.u-tokyo.ac.jp/~danushka/publications.html

    • Related Report
      2009 Annual Research Report
  • [Remarks]

    • URL

      http://www.miv.t.u-tokyo.ac.jp/danushka/publications.html

    • Related Report
      2008 Annual Research Report

Published: 2007-04-01   Modified: 2024-03-26  