Project/Area Number |
13224087
|
Research Category |
Grant-in-Aid for Scientific Research on Priority Areas
|
Allocation Type | Single-year Grants |
Review Section |
Science and Engineering
|
Research Institution | National Institute of Informatics |
Principal Investigator |
ADACHI Jun National Institute of Informatics, Software Research Division, Professor (80143551)
|
Co-Investigator (Kenkyū-buntansha) |
AIZAWA Akiko National Institute of Informatics, Research Center for Information Resources, Professor (90222447)
KANDO Noriko National Institute of Informatics, Software Research Division, Professor (80270445)
KAGEURA Kyo Tokyo University, Graduate School of Education, Associate Professor (00211152)
TAKASU Atsuhiro National Institute of Informatics, Research Center for Testbeds and Prototyping, Professor (90216648)
AIHARA Kenro National Institute of Informatics, Software Research Division, Associate Professor (90300706)
KATAYAMA Norio National Institute of Informatics, Multimedia Information Research Division, Associate Professor (60280559)
INOUE Masashi National Institute of Informatics, Research Center for Testbeds and Prototyping, Research Associate (50390597)
|
Project Period (FY) |
2001 – 2005
|
Project Status |
Completed (Fiscal Year 2005)
|
Budget Amount |
¥111,100,000 (Direct Cost: ¥111,100,000)
Fiscal Year 2005: ¥28,000,000 (Direct Cost: ¥28,000,000)
Fiscal Year 2004: ¥26,400,000 (Direct Cost: ¥26,400,000)
Fiscal Year 2003: ¥27,500,000 (Direct Cost: ¥27,500,000)
Fiscal Year 2002: ¥29,200,000 (Direct Cost: ¥29,200,000)
|
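Keywords | Informatics / Information Retrieval / Text Processing / Text Mining / Multimedia Processing / Data Engineering / Heterogeneous Content / Web Information Retrieval / Multilingual Topic Utilization / Information Structure Analysis / Multilingual Web Topic Utilization / Test Collection / Link Analysis / Clustering / Web Information Resources / Genre Analysis / Dynamic Clustering / Kernel Methods / Multilingual Topic Detection / Video Processing |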
Research Abstract |
This project aimed to develop technology for utilizing heterogeneous content. We studied link and structural analysis of the Web, cross-media processing technology, and the epistemological framework of the Web, and developed corpora for evaluating methods of utilizing Web information.
1) We developed information extraction and organization methods that use the textual and graphical structure of the Web:
- Web page clustering methods based on link structure
- topic tracking using non-linear time-content analysis
2) We proposed several advanced methods for processing and utilizing multimedia, focusing on media heterogeneity:
- topic detection from multilingual text collections
- user-adaptive text summarization based on content types
- cross-media search that enhances an annotation-based image retrieval model with content-based features
- JuNii+, a user interface for image retrieval
- use of interview video archives for learning
3) We organized the "NTCIR" series of evaluation workshops, in which many researchers participated in developing new testbeds, each consisting of common test data for research on heterogeneous digital content. As a result, for instance, we built a terabyte-scale dataset by crawling the .jp domain and established evaluation methodologies that reflect practical situations. These results contributed to the progress of research in this area.
4) We analyzed the epistemological framework within which engineers process and model Web information sources, contrasting it with the modern system of printed books. On the basis of this analysis, we concluded that it is hard to apply directly the model defined by the quintessentially modern concept of information accumulation, as represented in the ideal of libraries, and showed that "information editing" would be necessary to explore fully the potential of Web information sources.
|