
A Study on Integration of Bibliographic Information from Multiple Information Sources

Research Project

Project/Area Number 15300084
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type: Single-year Grants
Section: General
Research Field: Library and information science / Humanistic social informatics
Research Institution: National Institute of Informatics

Principal Investigator

TAKASU Atsuhiro  National Institute of Informatics, Research Center for Testbeds and Prototyping, Professor, President (90216648)

Co-Investigator (Kenkyū-buntansha) ADACHI Jun  National Institute of Informatics, Software Research Division, Professor (80143551)
OYAMA Keizou  National Institute of Informatics, Human and Social Information Research Division, Professor (90177022)
AIZAWA Akiko  National Institute of Informatics, Research Center for Information Resources, Professor (90222447)
Project Period (FY) 2003 – 2005
Project Status Completed (Fiscal Year 2005)
Budget Amount
¥13,300,000 (Direct Cost: ¥13,300,000)
Fiscal Year 2005: ¥3,100,000 (Direct Cost: ¥3,100,000)
Fiscal Year 2004: ¥5,300,000 (Direct Cost: ¥5,300,000)
Fiscal Year 2003: ¥4,900,000 (Direct Cost: ¥4,900,000)
Keywords: Digital Library / Bibliographic Matching / Record Linkage / Document Image Analysis / Approximate String Matching / Information Extraction
Research Abstract

This study aims to develop a bibliographic information integration system that provides a method for analyzing bibliographic information obtained from multiple information sources, a robust bibliographic matching function, and efficient information access. The study produced the following results.
(1) We developed a statistical model for analyzing various kinds of bibliographic strings. The model is based on the hidden Markov model and extracts bibliographic components from reference strings. Because it can describe error patterns in strings, it can be applied to reference strings obtained by OCR. Experiments showed that the model matches reference strings with an accuracy of about 95%.
(2) We developed an indexing method for searching records in large bibliographic databases. The method uses frequent string patterns appearing in the database and adaptively extracts variable-length n-grams. With this index, multiple bibliographic databases can be merged efficiently.
(3) We developed a method for gathering bibliographic data held in a distributed and autonomous information network. In the proposed method, autonomous systems exchange metadata about bibliographic information to discover the site that holds the desired bibliographic information. By using this metadata to change the query processing route adaptively, the method achieves efficient query processing in an autonomous and distributed environment.
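
As a rough illustration of result (1), the sketch below labels the tokens of a reference string with a small hidden Markov model and Viterbi decoding. It is not the project's model: the state set, the token classes, and every probability are hypothetical hand-set values, the model works on whole tokens rather than characters, and it does not model OCR error patterns.

```python
import math

# Hypothetical label set; the abstract does not list the actual components.
STATES = ["AUTHOR", "TITLE", "YEAR"]

# Hand-set toy probabilities (assumptions, not trained values).
START = {"AUTHOR": 0.8, "TITLE": 0.15, "YEAR": 0.05}
TRANS = {
    "AUTHOR": {"AUTHOR": 0.6, "TITLE": 0.35, "YEAR": 0.05},
    "TITLE": {"AUTHOR": 0.05, "TITLE": 0.7, "YEAR": 0.25},
    "YEAR": {"AUTHOR": 0.05, "TITLE": 0.05, "YEAR": 0.9},
}
EMIT = {
    "AUTHOR": {"year": 0.01, "initial": 0.5, "word": 0.49},
    "TITLE": {"year": 0.05, "initial": 0.05, "word": 0.9},
    "YEAR": {"year": 0.98, "initial": 0.01, "word": 0.01},
}


def token_class(token):
    """Map a token to a coarse observation class (illustrative only)."""
    core = token.strip(".,()")
    if len(core) == 4 and core.isdigit():
        return "year"
    if len(core) == 1 and core.isupper():
        return "initial"
    return "word"


def label_reference(tokens):
    """Return the most likely state sequence for the tokens (Viterbi)."""
    if not tokens:
        return []
    classes = [token_class(t) for t in tokens]
    # best[i][s]: log-probability of the best path ending in state s at token i
    best = [{s: math.log(START[s]) + math.log(EMIT[s][classes[0]])
             for s in STATES}]
    back = []  # back[i][s]: best predecessor of state s at token i + 1
    for cls in classes[1:]:
        scores, pointers = {}, {}
        for s in STATES:
            prev = max(STATES,
                       key=lambda p: best[-1][p] + math.log(TRANS[p][s]))
            scores[s] = (best[-1][prev] + math.log(TRANS[prev][s])
                         + math.log(EMIT[s][cls]))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final state.
    state = max(STATES, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

With these toy parameters, `label_reference(["Takasu", "A.", "Bibliographic", "matching", "2003"])` assigns the first two tokens to AUTHOR, the next two to TITLE, and the last to YEAR. A model meant for OCR output would additionally need emission probabilities that account for character substitutions, insertions, and deletions.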

Report

(4 results)
  • 2005 Annual Research Report   Final Research Report Summary
  • 2004 Annual Research Report
  • 2003 Annual Research Report
Research Products

(26 results)

Journal Article (20 results), Publications (6 results)

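  • [Journal Article] Techniques and Research Trends in Record Linkage Studies (in Japanese) (2005)

    • Author(s)
      Akiko Aizawa, Keizou Oyama, Atsuhiro Takasu, Jun Adachi
    • Journal Title

      The Transactions of the IEICE, J88-D-I・2

      Pages: 576-589

    • NAID

      110003207354

    • Description
      From the Final Research Report Summary (Japanese)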
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] A Sequential Labeling Method Using Syntactical and Textual Patterns for Record Linkage (2005)

    • Author(s)
      Atsuhiro Takasu
    • Journal Title

      Lecture Notes in Computer Science LNCS 3686

      Pages: 801-83

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2005 Annual Research Report 2005 Final Research Report Summary
  • [Journal Article] Techniques and Research Trends in Record Linkage Studies (2005)

    • Author(s)
      Aizawa, Takasu, Oyama, Adachi
    • Journal Title

      The Transactions of the IEICE D-I (in Japanese), Vol. J88-D-I, No. 2

      Pages: 576-589

    • NAID

      110003207354

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] A Sequential Labeling Method Using Syntactical and Textual Patterns for Record Linkage (2005)

    • Author(s)
      Atsuhiro Takasu
    • Journal Title

      Lecture Notes in Computer Science, Vol. LNCS 3686

      Pages: 199-208

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Bibliographic Component Extraction from References Based on a Text Recognition Error Model (2005)

    • Author(s)
      Atsuhiro Takasu, Kenro Aihara
    • Journal Title

      Systems and Computers in Japan 36・7

      Pages: 1-12

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Adaptive Replication Method Based on Peer Behavior Pattern in Unstructured Peer-to-Peer Systems (2005)

    • Author(s)
      Yamada, Aihara, Takasu, Adachi
    • Journal Title

      International Special Workshop on Databases for Next Generation Researchers

      Pages: 80-83

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Link-Based Clustering for Finding Subrelevant Web Pages (2005)

    • Author(s)
      Masada, Takasu, Adachi
    • Journal Title

      International Workshop on Web Document Analysis

    • Related Report
      2005 Annual Research Report
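  • [Journal Article] Techniques and Research Trends in Record Linkage Studies (in Japanese) (2005)

    • Author(s)
      Akiko Aizawa, Keizou Oyama, Atsuhiro Takasu, Jun Adachi
    • Journal Title

      The Transactions of the IEICE, J88-D-I・2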

      Pages: 576-589

    • NAID

      110003207354

    • Related Report
      2004 Annual Research Report
  • [Journal Article] A Topic-Based Index mechanism using Usefulness of Peers in Unstructured Peer-to-Peer Networks (2005)

    • Author(s)
      T.Yamada, K.Aihara, A.Takasu, J.Adachi
    • Journal Title

      Proc. 23rd International Multi-Conference on Database and Applications

      Pages: 134-139

    • Related Report
      2004 Annual Research Report
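  • [Journal Article] Bibliographic Attribute Extraction from References Based on Text Recognition Error Model (in Japanese) (2004)

    • Author(s)
      Atsuhiro Takasu, Kenro Aihara
    • Journal Title

      The Transactions of the IEICE, J87-D-II・6

      Pages: 1298-1308

    • NAID

      110003171120

    • Description
      From the Final Research Report Summary (Japanese)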
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Bibliographic Component Extraction Using Support Vector Machines and Hidden Markov Models (2004)

    • Author(s)
      T. Okada, A. Takasu, J. Adachi
    • Journal Title

      Proc. European Conf. Research and Advanced Technology for Digital Libraries

      Pages: 501-512

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2005 Final Research Report Summary
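  • [Journal Article] An Efficient Query Processing and Load-Balancing by Efficient Data Placement on Peer-to-peer Systems (in Japanese) (2004)

    • Author(s)
      Taizo Yamada, Kenro Aihara, Atsuhiro Takasu, Jun Adachi
    • Journal Title

      IPSJ Transactions on Databases 45・SIG7

      Pages: 93-101

    • Description
      From the Final Research Report Summary (Japanese)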
    • Related Report
      2005 Final Research Report Summary 2004 Annual Research Report
  • [Journal Article] Bibliographic Attribute Extraction from References Based on Text Recognition Error Model (2004)

    • Author(s)
      Atsuhiro Takasu, Kenro Aihara
    • Journal Title

      The Transactions of the IEICE (in Japanese), Vol. J87-D-II, No. 6

      Pages: 1298-1308

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] An Efficient Query Processing and Load-Balancing by Efficient Data Placement on Peer-to-peer Systems (2004)

    • Author(s)
      Yamada, Aihara, Takasu, Adachi
    • Journal Title

      IPSJ Transactions on Databases (in Japanese), Vol. 45, No. SIG 7

      Pages: 93-101

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
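  • [Journal Article] Bibliographic Attribute Extraction from References Based on Text Recognition Error Model (in Japanese) (2004)

    • Author(s)
      Atsuhiro Takasu, Kenro Aihara
    • Journal Title

      The Transactions of the IEICE, J87-D-II・6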

      Pages: 1298-1308

    • NAID

      110003171120

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Replica Placement for Effective Document Sharing Mechanisms in Peer-to-Peer Networks (2004)

    • Author(s)
      T. Yamada, K. Aihara, A. Takasu, J. Adachi
    • Journal Title

      Proc. Intl. Conf. Internet and Multimedia Systems and Applications

      Pages: 144-149

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Bibliographic Component Extraction Using Support Vector Machines and Hidden Markov Models (2004)

    • Author(s)
      T.Okada, A.Takasu, J.Adachi
    • Journal Title

      Proc. European Conf. Research and Advanced Technology for Digital Libraries

      Pages: 501-512

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Bibliographic Attribute Extraction from Erroneous References Based on a Statistical Model (2003)

    • Author(s)
      Atsuhiro Takasu
    • Journal Title

      Proc. 3rd ACM & IEEE Joint Conference on Digital Libraries

      Pages: 49-60

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Bibliographic Attribute Extraction from Erroneous References Based on a Statistical Model (2003)

    • Author(s)
      Atsuhiro Takasu
    • Journal Title

      Proc. 3rd ACM & IEEE Joint Conference on Digital Libraries

      Pages: 49-60

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Bibliographic Component Extraction Using Support Vector Machines and Hidden Markov Models

    • Author(s)
      Okada, Takasu, Adachi
    • Journal Title

      Proc. European Conf. on Research and Advanced Technology for Digital Libraries

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
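  • [Publications] Atsuhiro Takasu, Kenro Aihara: "Bibliographic Attribute Extraction from References Based on Text Recognition Error Model" (in Japanese). The Transactions of the IEICE. J87-D-II,6. (2004)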

    • Related Report
      2003 Annual Research Report
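  • [Publications] Taizo Yamada, Kenro Aihara, Atsuhiro Takasu, Jun Adachi: "An Efficient Query Processing and Load-Balancing by Efficient Data Placement on Peer-to-peer Systems" (in Japanese). IPSJ Transactions on Databases. TOD23. (2004)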

    • Related Report
      2003 Annual Research Report
  • [Publications] Akiko Aizawa, Atsuhiro Takasu, Keizou Oyama, Jun Adachi: "Research Trends in Record Matching across Heterogeneous Databases" (in Japanese). NII Journal. No.8. 43-51 (2004)

    • Related Report
      2003 Annual Research Report
  • [Publications] Tomonari Masada, Atsuhiro Takasu, Jun Adachi: "Decomposing the Web Graph into Parameterized Connected Components". IEICE Transactions on Information and Systems. E87-D,2. 380-388 (2004)

    • Related Report
      2003 Annual Research Report
  • [Publications] Atsuhiro Takasu: "Bibliographic Attribute Extraction from Erroneous References Based on a Statistical Model". Proc. 3rd ACM & IEEE Joint Conference on Digital Libraries. 49-60 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] Atsuhiro Takasu: "A Statistical Model for Flexible String Similarity". Proc. 18th International Joint Conference on Artificial Intelligence. 1420-1421 (2003)

    • Related Report
      2003 Annual Research Report

Published: 2003-04-01   Modified: 2021-12-10  
