
2005 Fiscal Year Final Research Report Summary

A Text Organization Method Based on Maximal Analogy

Research Project

Project/Area Number 16300039
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type Single-year Grants
Section General
Research Field Intelligent informatics
Research Institution Hokkaido University

Principal Investigator

HARAGUCHI Makoto  Hokkaido Univ., Graduate School of Information Science and Technology, Professor (40128450)

Co-Investigator (Kenkyū-buntansha) TANAKA Yuzuru  Hokkaido Univ., Graduate School of Information Science and Technology, Professor (60002309)
KAKUTA Tokuyasu  Nagoya Univ., Graduate School of Law, Associate Professor; the Japanese record lists Graduate School of Information Science, Associate Professor (80292001)
YOSHIOKA Masaharu  Hokkaido Univ., Graduate School of Information Science and Technology, Associate Professor (40290879)
OKUBO Yoshiaki  Hokkaido Univ., Graduate School of Information Science and Technology, Instructor (40271639)
Project Period (FY) 2004 – 2005
Keywords Maximal Analogy / Text Summarization / Similarity / Text Segmentation / Event Sequence
Research Abstract

This research project presents an algorithm that, given two or more documents, extracts the abstract event sequences they have in common. To avoid the combinatorial explosion that arises when more than two documents are matched, the algorithm proceeds in two phases.
The first phase is essentially a text summarization system that balances two kinds of importance: the importance of a sentence within its own segment of the document, and the importance of a sentence in connecting several segments. The latter is used to extract contextual sentences, that is, sentences containing contextual words. To separate the two notions, we first compute a chunk of core sentences in each segment with a clique-finding algorithm, and then calculate each sentence's connecting importance relative to those chunks in the same way as KeyGraph. Finally, the overall importance is determined by a scheme very similar to topic-sensitive PageRank. We ran experiments on newspaper articles and verified the effectiveness of the method.
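The first phase can be illustrated with a minimal sketch. It is not the project's implementation: sentence similarity is assumed to be plain word overlap, the KeyGraph-style connecting score is not reproduced (the rank propagation stands in for it), and all names (`summarize`, `max_clique`, `biased_pagerank`, `threshold`, the 0.1 background bias) are hypothetical. It shows only the overall shape: find a core clique per segment, then rank sentences with a PageRank whose teleport distribution is biased toward the core sentences.

```python
from itertools import combinations

def jaccard(a, b):
    """Word-overlap similarity between two token sets (an assumed measure)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def max_clique(nodes, adj):
    """Largest clique among `nodes` via basic Bron-Kerbosch (small graphs only)."""
    best = []
    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = sorted(r)
            return
        for v in sorted(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    expand(set(), set(nodes), set())
    return best

def biased_pagerank(weights, bias, damping=0.85, iters=100):
    """Power iteration with a teleport vector proportional to `bias`."""
    n = len(weights)
    total = sum(bias)
    teleport = [b / total for b in bias]
    rank = teleport[:]
    for _ in range(iters):
        new = [(1 - damping) * t for t in teleport]
        for i in range(n):
            out = sum(weights[i])
            for j in range(n):
                if out > 0:
                    new[j] += damping * rank[i] * weights[i][j] / out
                else:  # isolated sentence: redistribute by the teleport vector
                    new[j] += damping * rank[i] * teleport[j]
        rank = new
    return rank

def summarize(segments, threshold=0.2, top_k=3):
    """Return (summary sentences in document order, core sentence ids, ranks)."""
    sents = [s for seg in segments for s in seg]
    tokens = [set(s.lower().split()) for s in sents]
    n = len(sents)
    w = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        w[i][j] = w[j][i] = jaccard(tokens[i], tokens[j])
    # Thresholded similarity graph used only for clique finding.
    adj = [{j for j in range(n) if j != i and w[i][j] >= threshold}
           for i in range(n)]
    core, start = set(), 0
    for seg in segments:
        core.update(max_clique(range(start, start + len(seg)), adj))
        start += len(seg)
    # Core sentences get most of the teleport mass; others a small background.
    bias = [1.0 if i in core else 0.1 for i in range(n)]
    rank = biased_pagerank(w, bias)
    chosen = sorted(sorted(range(n), key=lambda i: -rank[i])[:top_k])
    return [sents[i] for i in chosen], core, rank
```

Calling `summarize(segments, top_k=2)` on a list of segments, each a list of sentence strings, returns the two highest-ranked sentences in their original order. Because the weighted similarity graph spans segment boundaries, rank can flow from one segment's core to sentences in another, which is a rough stand-in for the connecting importance described above.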
In the second phase, the summarized document obtained in the first phase serves as a kind of source document. The summary preserves the structure of the original document at a more abstract level. Therefore, for each event in the source, it suffices to find a similar event in a given target document. This drastically reduces the computational cost of building a correspondence between the two documents. As a result, a set of several documents can be summarized by analogy from the viewpoint of the source document.
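The source-driven matching of the second phase might look like the sketch below. It rests on several assumptions that do not come from the report: events are represented as tuples of words, similarity is word overlap, matches must preserve the order of events, and the names `event_similarity`, `align_to_source`, and `threshold` are hypothetical. The point it illustrates is the cost argument: each source event is compared only against the remaining target events, so one target needs at most |source| × |target| comparisons instead of a search over all possible correspondences.

```python
def event_similarity(a, b):
    """Word overlap between two events, each a tuple of words (assumed measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def align_to_source(source_events, target_events, threshold=0.3):
    """For each source event, greedily pick the most similar later target event.

    Returns a list of (source_index, target_index, score) triples. Source
    events with no sufficiently similar target event are skipped.
    """
    matches = []
    next_pos = 0  # matches must keep the order of the event sequence
    for s_idx, s_event in enumerate(source_events):
        best_pos, best_score = None, 0.0
        for t_idx in range(next_pos, len(target_events)):
            score = event_similarity(s_event, target_events[t_idx])
            if score > best_score:
                best_pos, best_score = t_idx, score
        if best_pos is not None and best_score >= threshold:
            matches.append((s_idx, best_pos, best_score))
            next_pos = best_pos + 1
    return matches
```

Running `align_to_source` once per target document gives, for each target, the events that correspond to the source summary. The greedy choice is a simplification: it can commit to a match that blocks a better overall alignment, which a more careful procedure would avoid.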

  • Research Products

    (14 results)

Journal Article (13 results) Book (1 result)

  • [Journal Article] An Extended Branch-and-Bound Search Algorithm for Finding Top-$N$ Formal Concepts of Documents (2006)

    • Author(s)
      M.Haraguchi
    • Journal Title

      Proceedings of the 4th Workshop on Learning with Logics and Logics for Learning - LLLL'06 (in press)

    • Description
      From the Summary of Research Results Report (Japanese version)
  • [Journal Article] A Method for Pinpoint Clustering of Web Pages with Pseudo-Clique Search (2006)

    • Author(s)
      M.Haraguchi
    • Journal Title

      Springer LNAI, Federation over the Web, International Workshop 3847

      Pages: 59-78

    • Description
      From the Summary of Research Results Report (Japanese version)
  • [Journal Article] A Method for Pinpoint Clustering of Web Pages with Pseudo-Clique Search (2006)

    • Author(s)
      Makoto Haraguchi, Yoshiaki Okubo
    • Journal Title

      Federation over the Web, Int'l Workshop, Dagstuhl Castle, Germany, May 1 - 6, 2005, Revised Selected Papers, Lecture Notes in Artificial Intelligence(Springer) 3847

      Pages: 59-78

    • Description
      From the Summary of Research Results Report (English version)
  • [Journal Article] An Extended Branch-and-Bound Search Algorithm for Finding Top-$N$ Formal Concepts of Documents (2006)

    • Author(s)
      Makoto Haraguchi, Yoshiaki Okubo
    • Journal Title

      Proceedings of the 4th Workshop on Learning with Logics and Logics for Learning - LLLL'06 (to appear)

    • Description
      From the Summary of Research Results Report (English version)
  • [Journal Article] An Algorithm for Mining Implicit Itemset Pairs Based on Differences of Correlations (2005)

    • Author(s)
      T.Taniguchi
    • Journal Title

      Proceedings of the 8th International Conference on Discovery Science - DS'05, Springer-LNAI 3735

      Pages: 227-240

    • Description
      From the Summary of Research Results Report (Japanese version)
  • [Journal Article] Towards Constructing Story Databases Using Maximal Analogies Between Stories (2005)

    • Author(s)
      M.Yoshioka
    • Journal Title

      Springer LNAI, In Intuitive Human Interfaces for Organizing and Accessing Intellectual Assets 3359

      Pages: 243-255

    • Description
      From the Summary of Research Results Report (Japanese version)
  • [Journal Article] On a Combination of Probabilistic and Boolean IR Models for WWW Document Retrieval (2005)

    • Author(s)
      M.Yoshioka
    • Journal Title

      ACM Transactions on Asian Language Information Processing 4

      Pages: 340-356

    • Description
      From the Summary of Research Results Report (Japanese version)
  • [Journal Article] Finding Significant Web Pages with Lower Ranks by Pseudo-Clique Search (2005)

    • Author(s)
      Y.Okubo
    • Journal Title

      Proceedings of the 8th International Conference on Discovery Science - DS'05, Springer-LNAI 3735

      Pages: 346-353

    • Description
      From the Summary of Research Results Report (Japanese version)
  • [Journal Article] Discovery of Hidden Correlations in a Local Transaction Database Based on Differences of Correlations (2005)

    • Author(s)
      Tsuyoshi Taniguchi, Makoto Haraguchi, Yoshiaki Okubo
    • Journal Title

      Proceedings of the 4th International Conference on Machine Learning and Data Mining in Pattern Recognition - MLDM'05(Springer-LNAI) 3587

      Pages: 537-548

    • Description
      From the Summary of Research Results Report (English version)
  • [Journal Article] An Algorithm for Mining Implicit Itemset Pairs Based on Differences of Correlations (2005)

    • Author(s)
      Tsuyoshi Taniguchi, Makoto Haraguchi
    • Journal Title

      Proceedings of the 8th International Conference on Discovery Science -DS'05(Springer-LNAI) 3735

      Pages: 227-240

    • Description
      From the Summary of Research Results Report (English version)
  • [Journal Article] Towards Constructing Story Databases Using Maximal Analogies Between Stories (2005)

    • Author(s)
      Masaharu Yoshioka, Makoto Haraguchi, Akihito Mizoe
    • Journal Title

      Intuitive Human Interfaces for Organizing and Accessing Intellectual Assets: International Workshop, Dagstuhl Castle, Germany, March 1-5, 2004, Revised Selected Papers, Gunter Grieser, Yuzuru Tanaka (eds.) (Springer-Verlag GmbH, LNAI) 3359

      Pages: 243-255

    • Description
      From the Summary of Research Results Report (English version)
  • [Journal Article] On a Combination of Probabilistic and Boolean IR Models for WWW Document Retrieval (2005)

    • Author(s)
      Masaharu Yoshioka, Makoto Haraguchi
    • Journal Title

      ACM Transactions on Asian Language Information Processing(TALIP) Vol.4

      Pages: 340-356

    • Description
      From the Summary of Research Results Report (English version)
  • [Journal Article] Finding Significant Web Pages with Lower Ranks by Pseudo-Clique Search (2005)

    • Author(s)
      Yoshiaki Okubo, Makoto Haraguchi
    • Journal Title

      Proceedings of the 8th International Conference on Discovery Science - DS'05(Springer-LNAI) 3735

      Pages: 346-353

    • Description
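      From the Summary of Research Results Report (English version)
  • [Book] 人工知能学辞典 (Encyclopedia of Artificial Intelligence; wrote the entry on "Learning by Analogy") (2005)

    • Author(s)
      Makoto Haraguchi (contributing author)
    • Total Pages
      976
    • Publisher
      Kyoritsu Shuppan
    • Description
      From the Summary of Research Results Report (Japanese version)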

Published: 2007-12-13  
