2005 Fiscal Year Final Research Report Summary
A Study on Integration of Bibliographic Information from Multiple Information Sources
Project/Area Number |
15300084
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Library and Information Science / Humanities and Social Informatics
|
Research Institution | National Institute of Informatics |
Principal Investigator |
TAKASU Atsuhiro National Institute of Informatics, Research Center for Testbeds and Prototyping, Professor, President (90216648)
|
Co-Investigator (Kenkyū-buntansha) |
ADACHI Jun National Institute of Informatics, Software Research Division, Professor (80143551)
OYAMA Keizou National Institute of Informatics, Human and Social Information Research Division, Professor (90177022)
AIZAWA Akiko National Institute of Informatics, Research Center for Information Resources, Professor (90222447)
|
Project Period (FY) |
2003 – 2005
|
Keywords | Digital Library / Bibliographic Matching / Record Linkage / Document Image Analysis / Approximate String Matching / Information Extraction |
Research Abstract |
This study aimed to develop a bibliographic information integration system that provides a method for analyzing bibliographic information obtained from multiple information sources, a robust bibliographic matching function, and efficient information access. We achieved the following results. (1) We developed a statistical model for analyzing various kinds of bibliographic strings. The model is based on a hidden Markov model and extracts bibliographic components from reference strings. Because it can also describe error patterns in strings, it can be applied to reference strings obtained by OCR. Experiments showed that the model matches reference strings with an accuracy of about 95%. (2) We developed an indexing method for searching records in large bibliographic databases. The method uses frequent string patterns appearing in the database to extract variable-length n-grams adaptively. With this index, multiple bibliographic databases can be merged efficiently. (3) We developed a method for gathering bibliographic data held in a distributed, autonomous information network. In this method, autonomous systems exchange metadata about bibliographic information to discover the site that holds the desired bibliographic information. By using this metadata to adapt the query processing route, we realized an efficient query processing mechanism for autonomous, distributed environments.
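As a rough illustration of the idea behind result (2), the sketch below builds an inverted index over character n-grams and uses it to retrieve candidate records for a noisy reference string. It is a simplified stand-in, not the project's method: it uses fixed-length trigrams rather than adaptively chosen variable n-grams, and it ranks candidates with a Dice coefficient, which is an assumption made here for illustration. The class name `NGramIndex`, its methods, and the sample records are all hypothetical.

```python
from collections import defaultdict


def ngrams(text, n=3):
    """Return the set of character n-grams of a whitespace-normalized, lowercased string."""
    s = " ".join(text.lower().split())
    return {s[i:i + n] for i in range(len(s) - n + 1)}


class NGramIndex:
    """Hypothetical inverted index from character n-grams to record ids."""

    def __init__(self, n=3):
        self.n = n
        self.postings = defaultdict(set)  # n-gram -> ids of records containing it
        self.records = []                 # record id -> original string

    def add(self, record):
        """Store a record and index every n-gram it contains."""
        rid = len(self.records)
        self.records.append(record)
        for gram in ngrams(record, self.n):
            self.postings[gram].add(rid)
        return rid

    def search(self, query, top_k=3):
        """Return up to top_k (record, score) pairs ranked by Dice similarity."""
        query_grams = ngrams(query, self.n)
        shared = defaultdict(int)
        # Only records sharing at least one n-gram with the query are touched.
        for gram in query_grams:
            for rid in self.postings.get(gram, ()):
                shared[rid] += 1
        scored = []
        for rid, count in shared.items():
            record_grams = ngrams(self.records[rid], self.n)
            dice = 2 * count / (len(query_grams) + len(record_grams))
            scored.append((dice, rid))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [(self.records[rid], score) for score, rid in scored[:top_k]]


# Made-up sample records, not taken from any real database.
sample_records = [
    "Smith J. Approximate string matching for digital libraries. 2001",
    "Lee K. Document image analysis of scanned journals. 1999",
    "Garcia M. Record linkage in large databases. 2003",
]
index = NGramIndex()
for rec in sample_records:
    index.add(rec)

# A query with OCR-style character errors ("rn" for "m", "1" for "l").
noisy_query = "Srnith J. Approxirnate strlng matching for digita1 libraries 2001"
for record, score in index.search(noisy_query):
    print(f"{score:.2f}  {record}")
```

Because lookup goes through the posting lists, only records that share at least one n-gram with the query are scored, which is what makes this kind of index useful as a candidate filter before a more expensive matching step.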
|
Research Products
(12 results)