2007 Fiscal Year Final Research Report Summary
Producing and Evaluating Encyclopedic Content by Reorganizing Heterogeneous Information
Project/Area Number |
17300028
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Media informatics/Database
|
Research Institution | University of Tsukuba |
Principal Investigator |
FUJII Atsushi University of Tsukuba, Graduate School of Library, Information and Media Studies, Associate Professor (30302433)
|
Co-Investigator(Kenkyū-buntansha) |
ISHIKAWA Tetsuya University of Tokyo, Historiographical Institute, Professor (20041808)
ITOU Katunobu Hosei University, Faculty of Computer and Information Sciences, Professor (30356472)
AKIBA Tomoyosi Toyohashi University of Technology, Department of Information and Computer Sciences, Associate Professor (00356346)
|
Project Period (FY) |
2005 – 2007
|
Keywords | World Wide Web / Encyclopedias / Multimedia / Natural language processing / Information retrieval / Speech recognition / User interfaces / Content production |
Research Abstract |
We proposed an automatic method to extract term descriptions from the World Wide Web and have built a Web search site called "Cyclone" (http://cycbne.slis.tsukuba.ac.jp), where users can efficiently obtain encyclopedic term descriptions for specific word senses. Approximately 750,000 Japanese terms have been indexed as headwords. However, to explain certain headwords, specifically those related to entities such as devices and animals, it is useful to present a picture of the entity in addition to a textual description. Hand-crafted multimedia encyclopedias, such as Encarta, integrate text, sound, image, and video data to describe a single headword from different perspectives. However, due to the limitations of manual compilation, existing encyclopedias often lack new terms and new definitions for existing terms. To address this problem, the objective of this research was to produce encyclopedic content by reorganizing heterogeneous information from the World Wide Web and TV broadcasting. We proposed a method for integrating images on the Web with the textual descriptions in Cyclone. Our method resolves ambiguity in the meaning of an image by analyzing its associated text, so that images for a polysemous word, such as "hub" (a network device or the center of a wheel), are classified by word sense. We also proposed a method to associate text and video information, for which we integrated information retrieval and speech recognition technologies. In addition, to associate information across languages, we proposed lemmatization and transliteration methods. Our research is a step toward the automatic compilation of multimedia encyclopedias.
|
Research Products
(8 results)