Budget Amount |
¥10,150,000 (Direct Cost: ¥9,100,000, Indirect Cost: ¥1,050,000)
Fiscal Year 2007: ¥4,550,000 (Direct Cost: ¥3,500,000, Indirect Cost: ¥1,050,000)
Fiscal Year 2006: ¥3,100,000 (Direct Cost: ¥3,100,000)
Fiscal Year 2005: ¥2,500,000 (Direct Cost: ¥2,500,000)
|
Research Abstract |
We proposed an automatic method to extract term descriptions from the World Wide Web and have built a Web search site called "Cyclone" (http://cyclone.slis.tsukuba.ac.jp), where users can efficiently obtain encyclopedic term descriptions for specific word senses. Approximately 750,000 Japanese terms have been indexed as headwords. However, to explain certain headwords, specifically those related to entities such as devices and animals, it is useful to present a picture of the entity in addition to a textual description. Hand-crafted multimedia encyclopedias, such as Encarta, integrate text, sound, usage, and video data to describe a single headword from different perspectives. However, due to the limitations of manual compilation, existing encyclopedias often lack new terms and new definitions for existing terms. In view of this problem, the objective of this research was to produce encyclopedic content by reorganizing heterogeneous information from the World Wide Web and TV broadcasting. We proposed a method for integrating images on the Web with textual descriptions in Cyclone. Our method resolves ambiguity in the meaning of an image through text analysis, so that images for a polysemous word, such as "hub" (a network device or the center of a wheel), are classified by word sense. We also proposed a method to associate text and video information, for which we integrated information retrieval and speech recognition technologies. In addition, to associate information across languages, we proposed lemmatization and transliteration methods. Our research is a step toward the automatic compilation of multimedia encyclopedias.
|