A Text Organization Method Based on Maximal Analogy

Research Project

Project/Area Number 16300039
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type Single-year Grants
Section General
Research Field Intelligent informatics
Research Institution Hokkaido University

Principal Investigator

HARAGUCHI Makoto  Hokkaido Univ., Graduate School of Inf. Sci. and Tech., Prof., Graduate School of Information Science, Professor (40128450)

Co-Investigator (Kenkyū-buntansha) TANAKA Yuzuru  Hokkaido Univ., Graduate School of Inf. Sci. and Tech., Prof., Graduate School of Information Science, Professor (60002309)
KAKUTA Tokuyasu  Nagoya Univ., Graduate School of Law, Assoc. Prof., Graduate School of Information Science, Associate Professor (80292001)
YOSHIOKA Masaharu  Hokkaido Univ., Graduate School of Inf. Sci. and Tech., Assoc. Prof., Graduate School of Information Science, Associate Professor (40290879)
OKUBO Yoshiaki  Hokkaido Univ., Graduate School of Inf. Sci. and Tech., Inst., Graduate School of Information Science, Research Associate (40271639)
Project Period (FY) 2004 – 2005
Project Status Completed (Fiscal Year 2005)
Budget Amount
¥8,600,000 (Direct Cost: ¥8,600,000)
Fiscal Year 2005: ¥5,800,000 (Direct Cost: ¥5,800,000)
Fiscal Year 2004: ¥2,800,000 (Direct Cost: ¥2,800,000)
Keywords Maximal Analogy / Text Summarization / Similarity / Text Segmentation / Event Sequence / Document Structure / Structural Analysis of Stories / Corpus / Singular Value Decomposition / Topic and Context Analysis / Story
Research Abstract

This research project presents an algorithm that, given two or more documents, extracts the abstract event sequences they have in common. To avoid the combinatorial explosion that arises when matching more than two documents, the algorithm is divided into two phases.
The first phase is essentially a text summarization system that balances two kinds of importance: the importance of a sentence within its segment of the given document, and the importance of a sentence in connecting several segments. The latter is used to extract contextual sentences, that is, sentences containing contextual words. To separate the two notions, we first compute a chunk of core sentences in each segment with a clique-finding algorithm, and then calculate each sentence's connecting importance from those chunks in the same way as KeyGraph. Finally, the overall importance is determined by a scheme very similar to topic-sensitive PageRank. We ran experiments on newspaper articles and verified the effectiveness of the method.
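The first phase can be illustrated with a minimal Python sketch. It is not the project's implementation: the word-overlap graph, the threshold, and the function names `build_graph`, `max_clique`, and `biased_pagerank` are all assumptions made for illustration, and the KeyGraph-style scoring step is omitted. The sketch only shows the general idea of picking core sentences as a clique and then ranking every sentence with a PageRank whose teleport step is biased toward that core.

```python
from itertools import combinations

def jaccard(a, b):
    """Word-overlap similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def build_graph(sentences, threshold=0.2):
    """Link two sentences when their word overlap reaches the threshold."""
    tokens = [set(s.lower().split()) for s in sentences]
    adj = {i: set() for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        if jaccard(tokens[i], tokens[j]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj

def max_clique(adj):
    """Largest clique via basic Bron-Kerbosch (adequate for small graphs)."""
    best = set()

    def expand(r, p, x):
        nonlocal best
        if not p and not x:          # r is a maximal clique
            if len(r) > len(best):
                best = set(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return best

def biased_pagerank(adj, seeds, damping=0.85, iters=50):
    """PageRank whose teleport step jumps only to the seed (core) nodes."""
    n = len(adj)
    teleport = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in adj}
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iters):
        new = {v: (1 - damping) * teleport[v] for v in adj}
        for v in adj:
            if adj[v]:
                share = damping * rank[v] / len(adj[v])
                for w in adj[v]:
                    new[w] += share
            else:
                # Isolated sentence: hand its mass back to the seeds.
                for w in adj:
                    new[w] += damping * rank[v] * teleport[w]
        rank = new
    return rank

# Hypothetical four-sentence segment; the last sentence is off-topic.
sentences = [
    "the court ruled on the contract dispute",
    "the contract dispute reached the court",
    "the court ruled the contract void",
    "weather was mild that day",
]
adj = build_graph(sentences)
core = max_clique(adj)                # core sentences of the segment
scores = biased_pagerank(adj, core)   # overall importance per sentence
```

In this toy segment the three related sentences form the core clique, and the off-topic sentence ends up with the lowest score because nothing links to it and the teleport step never lands on it.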
In the second phase, the summarized document obtained in the first phase serves as a kind of source document. The summary preserves the structure of the original document at a more abstract level. Therefore, for each event in the source, it suffices to find a similar event in a given target document. This drastically reduces the computational cost needed to build a correspondence between the two documents. As a result, a set of several documents can be summarized from the viewpoint of the source document by means of analogy.
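The matching step of the second phase can be sketched as follows. This is a hedged illustration only: representing events as short strings, measuring similarity by word overlap, and the names `token_overlap` and `align_events` are assumptions, since the abstract does not say how events or their similarity are defined. What the sketch does show is why the cost stays low: each source event is compared against the target events once, so one target document costs on the order of |source| x |target| comparisons instead of a search over all possible correspondences.

```python
def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of the word sets of two event descriptions."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

def align_events(source_events, target_events, threshold=0.3):
    """For each source event, return the index of the most similar
    target event, or None when nothing reaches the threshold."""
    alignment = []
    for s_idx, source in enumerate(source_events):
        scored = [(token_overlap(source, target), t_idx)
                  for t_idx, target in enumerate(target_events)]
        best_score, best_idx = max(scored, default=(0.0, None))
        matched = best_idx if best_score >= threshold else None
        alignment.append((s_idx, matched))
    return alignment

# Hypothetical summarized source and one target document.
source = ["company announces merger", "regulator approves merger"]
target = [
    "weather stays mild",
    "firm announces merger plan",
    "regulator approves merger deal",
]
pairs = align_events(source, target)
print(pairs)  # [(0, 1), (1, 2)]
```

Repeating `align_events` with the same source against each further target document keeps the total cost linear in the number of documents.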

Report

(3 results)
  • 2005 Annual Research Report   Final Research Report Summary
  • 2004 Annual Research Report
  • Research Products

    (27 results)

All Journal Article (25 results) Book (2 results)

  • [Journal Article] An Extended Branch-and-Bound Search Algorithm for Finding Top-$N$ Formal Concepts of Documents (2006)

    • Author(s)
      M.Haraguchi
    • Journal Title

      Proceedings of the 4th Workshop on Learning with Logics and Logics for Learning - LLLL'06 (in press)

    • Description
      From "Summary of the Research Results Report (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] A Method for Pinpoint Clustering of Web Pages with Pseudo-Clique Search (2006)

    • Author(s)
      M.Haraguchi
    • Journal Title

      Springer LNAI, Federation over the Web, International Workshop 3847

      Pages: 59-78

    • Description
      From "Summary of the Research Results Report (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] A Method for Pinpoint Clustering of Web Pages with Pseudo-Clique Search (2006)

    • Author(s)
      Makoto Haraguchi, Yoshiaki Okubo
    • Journal Title

      Federation over the Web, Int'l Workshop, Dagstuhl Castle, Germany, May 1 - 6, 2005, Revised Selected Papers, Lecture Notes in Artificial Intelligence(Springer) 3847

      Pages: 59-78

    • Description
      From "Summary of the Research Results Report (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] An Extended Branch-and-Bound Search Algorithm for Finding Top-$N$ Formal Concepts of Documents (2006)

    • Author(s)
      Makoto Haraguchi, Yoshiaki Okubo
    • Journal Title

      Proceedings of the 4th Workshop on Learning with Logics and Logics for Learning - LLLL'06 (to appear)

    • Description
      From "Summary of the Research Results Report (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] A Method for Pinpoint Clustering of Web Pages with Pseudo-Clique Search (2006)

    • Author(s)
      M.Haraguchi, Y.Okubo
    • Journal Title

      Federation over the Web, International Workshop(Springer-LNAI) 3847

      Pages: 59-78

    • Related Report
      2005 Annual Research Report
  • [Journal Article] An Algorithm for Mining Implicit Itemset Pairs Based on Differences of Correlations (2005)

    • Author(s)
      T.Taniguchi
    • Journal Title

      Proceedings of the 8th International Conference on Discovery Science - DS'05, Springer-LNAI 3735

      Pages: 227-240

    • Description
      From "Summary of the Research Results Report (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Towards Constructing Story Databases Using Maximal Analogies Between Stories (2005)

    • Author(s)
      M.Yoshioka
    • Journal Title

      Springer LNAI, In Intuitive Human Interfaces for Organizing and Accessing Intellectual Assets 3359

      Pages: 243-255

    • Description
      From "Summary of the Research Results Report (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] On a Combination of Probabilistic and Boolean IR Models for WWW Document Retrieval (2005)

    • Author(s)
      M.Yoshioka
    • Journal Title

      ACM Transactions on Asian Language Information Processing 4

      Pages: 340-356

    • Description
      From "Summary of the Research Results Report (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Finding Significant Web Pages with Lower Ranks by Pseudo-Clique Search (2005)

    • Author(s)
      Y.Okubo
    • Journal Title

      Proceedings of the 8th International Conference on Discovery Science - DS'05, Springer-LNAI 3735

      Pages: 346-353

    • NAID

      120000954272

    • Description
      From "Summary of the Research Results Report (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Discovery of Hidden Correlations in a Local Transaction Database Based on Differences of Correlations (2005)

    • Author(s)
      Tsuyoshi Taniguchi, Makoto Haraguchi, Yoshiaki Okubo
    • Journal Title

      Proceedings of the 4th International Conference on Machine Learning and Data Mining in Pattern Recognition - MLDM'05(Springer-LNAI) 3587

      Pages: 537-548

    • Description
      From "Summary of the Research Results Report (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] An Algorithm for Mining Implicit Itemset Pairs Based on Differences of Correlations (2005)

    • Author(s)
      Tsuyoshi Taniguchi, Makoto Haraguchi
    • Journal Title

      Proceedings of the 8th International Conference on Discovery Science -DS'05(Springer-LNAI) 3735

      Pages: 227-240

    • NAID

      120000956717

    • Description
      From "Summary of the Research Results Report (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Towards Constructing Story Databases Using Maximal Analogies Between Stories (2005)

    • Author(s)
      Masaharu Yoshioka, Makoto Haraguchi, Akihito Mizoe
    • Journal Title

      Intuitive Human Interfaces for Organizing and Accessing Intellectual Assets: International Workshop, Dagstuhl Castle, Germany, March 1-5, 2004, Revised Selected Papers, Gunter Grieser, Yuzuru Tanaka (eds.) (Springer-Verlag GmbH, LNAI) 3359

      Pages: 243-255

    • Description
      From "Summary of the Research Results Report (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] On a Combination of Probabilistic and Boolean IR Models for WWW Document Retrieval (2005)

    • Author(s)
      Masaharu Yoshioka, Makoto Haraguchi
    • Journal Title

      ACM Transactions on Asian Language Information Processing(TALIP) Vol.4

      Pages: 340-356

    • Description
      From "Summary of the Research Results Report (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Finding Significant Web Pages with Lower Ranks by Pseudo-Clique Search (2005)

    • Author(s)
      Yoshiaki Okubo, Makoto Haraguchi
    • Journal Title

      Proceedings of the 8th International Conference on Discovery Science - DS'05(Springer-LNAI) 3735

      Pages: 346-353

    • NAID

      120000954272

    • Description
      From "Summary of the Research Results Report (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Finding Significant Web Pages with Lower Ranks by Pseudo-Clique Search (2005)

    • Author(s)
      Y.Okubo, M.Haraguchi
    • Journal Title

      Proceedings of the 8th International Conference on Discovery Science(Springer-LNAI) 3735

      Pages: 346-353

    • NAID

      120000954272

    • Related Report
      2005 Annual Research Report
  • [Journal Article] An Algorithm for Mining Implicit Itemset Pairs Based on Differences of Correlations (2005)

    • Author(s)
      T.Taniguchi, M.Haraguchi
    • Journal Title

      Proceedings of the 8th International Conference on Discovery Science(Springer-LNAI) 3735

      Pages: 227-240

    • NAID

      120000956717

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Discovery of Hidden Correlations in a Local Transaction Database Based on Differences of Correlations (2005)

    • Author(s)
      T.Taniguchi, M.Haraguchi, Y.Okubo
    • Journal Title

      4th International Conference on Machine Learning and Data Mining in Pattern Recognition (Springer-LNAI) 3587

      Pages: 537-548

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Towards Constructing Story Databases Using Maximal Analogies Between Stories (2005)

    • Author(s)
      M.Yoshioka, M.Haraguchi, A.Mizoe
    • Journal Title

      In Intuitive Human Interfaces for Organizing and Accessing Intellectual Assets(Springer-LNAI) 3359

      Pages: 243-255

    • Related Report
      2005 Annual Research Report
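  • [Journal Article] 検索語の網羅性に注目した汎化概念により検索語選択支援を行う情報検索システムの研究 [A Study of an Information Retrieval System That Supports Query Term Selection Through Generalized Concepts Focusing on the Coverage of Query Terms] (2005)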

    • Author(s)
      Masaharu Yoshioka, Makoto Haraguchi
    • Journal Title

      Transactions of the Japanese Society for Artificial Intelligence 20(4)

      Pages: 270-280

    • NAID

      10022005347

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Towards Constructing Story Databases Using Maximal Analogies Between Stories (2005)

    • Author(s)
      M.Yoshioka, M.Haraguchi, A.Mizoe
    • Journal Title

      Intuitive Human Interfaces for Organizing and Accessing Intellectual Assets (Springer-LNCS) LNCS3359

      Pages: 243-255

    • Related Report
      2004 Annual Research Report
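  • [Journal Article] 検索語の網羅性に注目した汎化概念により検索語選択支援を行う情報検索システムの研究 [A Study of an Information Retrieval System That Supports Query Term Selection Through Generalized Concepts Focusing on the Coverage of Query Terms] (2005)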

    • Author(s)
      Masaharu Yoshioka, Makoto Haraguchi
    • Journal Title

      Journal of the Japanese Society for Artificial Intelligence Vol. 20, No. 4 (in press)

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Appropriate Boolean Query Reformulation Interface for Information Retrieval based on Adaptive Generalization (2005)

    • Author(s)
      M.Yoshioka, M.Haraguchi
    • Journal Title

      Proc. of the International Workshop on Challenges in Web Information Retrieval and Integration (to be presented)

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Multiple News Articles Summarization based on Event Reference Information (2004)

    • Author(s)
      M.Yoshioka, M.Haraguchi
    • Journal Title

      In Working Notes of the Fourth NTCIR Workshop Meeting

      Pages: 467-473

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Study on the Combination of Probabilistic and Boolean IR Models for WWW Documents Retrieval (2004)

    • Author(s)
      M.Yoshioka, M.Haraguchi
    • Journal Title

      In Working Notes of the Fourth NTCIR Workshop Meeting Supplement Vol.

      Pages: 9-16

    • Related Report
      2004 Annual Research Report
  • [Journal Article] 様々な特徴のキャラクタに対する同化動作生成手法 [A Method for Generating Assimilated Motions for Characters with Various Features] (2004)

    • Author(s)
      本林 正裕, Makoto Haraguchi
    • Journal Title

      Transactions of the Institute of Electronics, Information and Communication Engineers (IEICE) Vol. J87-D-II, No. 7

      Pages: 1473-1486

    • NAID

      110003171136

    • Related Report
      2004 Annual Research Report
  • [Book] 人工知能学辞典 [Dictionary of Artificial Intelligence] (wrote the entry "Learning by Analogy") (2005)

    • Author(s)
      Makoto Haraguchi (contributing author)
    • Total Pages
      976
    • Publisher
      Kyoritsu Shuppan
    • Description
      From "Summary of the Research Results Report (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Book] 人工知能学辞典 [Dictionary of Artificial Intelligence] (wrote the entry "Learning by Analogy") (2 pages contributed) (2005)

    • Author(s)
      Makoto Haraguchi (contributing author)
    • Total Pages
      972
    • Publisher
      Kyoritsu Shuppan
    • Related Report
      2005 Annual Research Report

Published: 2004-04-01   Modified: 2016-04-21  
