Study on Information Utilization System for Heterogeneous Contents

Project Number: 13224087

Fiscal Year 2005 Research Results Report Summary

Principal Investigator

    • ADACHI Jun
    • Researcher Number: 80143551
    • National Institute of Informatics, Software Research Division, Professor

Basic Project Information

  • Project Period

    Fiscal Year 2001 to Fiscal Year 2005

  • Research Field

  • Review Section

  • Research Category

    Grant-in-Aid for Scientific Research on Priority Areas

  • Research Institution

    National Institute of Informatics

  • Budget Amount

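    • Fiscal Year 2002: 29,200 thousand yen (Direct Cost: 29,200 thousand yen)
    • Fiscal Year 2003: 27,500 thousand yen (Direct Cost: 27,500 thousand yen)
    • Fiscal Year 2004: 26,400 thousand yen (Direct Cost: 26,400 thousand yen)
    • Fiscal Year 2005: 28,000 thousand yen (Direct Cost: 28,000 thousand yen)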

Co-Investigators

    • AIZAWA Akiko
    • Researcher Number: 90222447
    • National Institute of Informatics, Research Center for Information Resources, Professor
    • KANDO Noriko
    • Researcher Number: 80270445
    • National Institute of Informatics, Software Research Division, Professor
    • KAGEURA Kyo
    • Researcher Number: 00211152
    • Tokyo University, Graduate School of Education, Associate Professor

    • TAKASU Atsuhiro
    • Researcher Number: 90216648
    • National Institute of Informatics, Research Center for Testbeds and Prototyping, Professor
    • AIHARA Kenro
    • Researcher Number: 90300706
    • National Institute of Informatics, Software Research Division, Associate Professor

Research Summary

This project aimed to develop technology for utilizing heterogeneous content. We studied link and structural analysis of the Web, cross-media processing technology, and the epistemological framework of the Web, and we developed corpora for evaluating information utilization methods for the Web.

1) We developed information extraction and organization methods that use the textual and graphical structure of the Web:

-Web page clustering methods based on the link structure

-Topic tracking using non-linear time-content analysis

2) We proposed the following advanced methods for processing and utilizing multimedia, focusing on media heterogeneity:

-topic detection from multilingual text collections

-user adaptive text summarization based on content types

-cross-media search by enhancing an annotation-based image retrieval model with content-based features

-JuNii+: user interface for image retrieval

-utilizing interview video archives for learning

3) We organized a series of evaluation workshops, "NTCIR", in which a number of researchers participated to develop new testbeds, each consisting of common test data for research on heterogeneous digital content. As a result, for instance, we built a terabyte-scale dataset by crawling the .jp domain and established evaluation methodologies that reflect practical situations. These contributed to the progress of research in this area.

4) We analyzed the epistemological framework within which engineers process and model Web information sources, contrasting it with the modern system of printed books. On the basis of this analysis, we concluded that it is hard to directly apply the model defined by the quintessentially modern concept of information accumulation, as represented in the ideal of libraries, and showed that "information editing" would be necessary to fully explore the potential of Web information sources.

Publications

Journal Articles

  • Teruhito Kanazawa, Akiko Aizawa, Atsuhiro Takasu, Jun Adachi: "Effectiveness of the Relevance-based Superimposition Model for Cross-language Information Retrieval" IPSJ Transactions on Databases Vol.43-SIG(TOD 13). (2002)

  • Akiko Aizawa: "An Information-Theoretic Perspective of Tf-idf Measures" Information Processing and Management Vol.39-No.1. (2003)

  • Koji Eguchi, Keizo Oyama, Emi Ishida, Noriko Kando, Kazuko Kuriyama: "Evaluation Methods for Web Retrieval Tasks Considering Hyperlink Structure" IEICE Transactions on Information and Systems E86-D-No.9. (2003)

  • Akiko Aizawa: "Improving the Performance of Text Categorization Using Low Frequency Terms" Journal of Information Processing Society of Japan Vol.44-No.7. (2003)

  • Kyung-Soon Lee, Kyo Kageura, Key-Sun Choi: "Implicit Ambiguity Resolution Based on Cluster Analysis in Cross-Language Information Retrieval" Information Processing and Management Vol.40-No.1. (2004)

  • Atsuhiro Takasu, Kenro Aihara: "Bibliographic Attribute Extraction from References Based on Text Recognition Error Model" Journal of IEICE, D-II J87-D-II-No.6. (2004)

  • Kageura, K., Daille, B., Nakagawa, H., Chien, L-F.: "Recent trends in computational terminology" Terminology Vol.10-No.1. (2004)

  • Tomonari Masada, Atsuhiro Takasu, Jun Adachi: "Decomposing the Web Graph into Parameterized Connected Components" IEICE Transactions on Information and Systems E87-D-No.2. (2004)

  • Akiko Aizawa, Atsuhiro Takasu, Keizo Oyama, Jun Adachi: "Techniques and Research Trends in Record Linkage Studies" Journal of IEICE D1, Vol.J88-D1-No.3. (2005)

  • Tomonari Masada, Atsuhiro Takasu, Jun Adachi: "Improving Web Search Performance with Hyperlink Information" IPSJ Transactions on Databases Vol.46-SIG8 (TOD 26). (2005)

  • Frederic C. Gey, Noriko Kando, Carol Peters: "Cross-language Information Retrieval : the Road Ahead" Information Processing and Management Vol.41-No.3. (2005)

  • Tsuneaki Kato, Jun'ichi Fukumoto, Fumito Masui, Noriko Kando: "Are Open-domain Question Answering Technologies Useful for Information Access Dialogues? -An Empirical Study and a Proposal of a Novel Challenge" ACM Transactions on Asian Language Information Processing Vol.4-No.3.

  • Yohei Seki, Koji Eguchi, Noriko Kando: "Multi-Document Viewpoint Summarization Based on Users' Information Needs and its Evaluation" IPSJ Transactions on Databases Vol.43, SIG8 (TOD26). (2005)

  • Lee, K-S., Kageura, K.: "Korean-Japanese Story Link Detection based on Event Term Weighting on Timelines and Multilingual Spaces" Information Processing and Management Vol.42-No.2. (2006)

  • Makoto Iwayama, Atsushi Fujii, Noriko Kando, Yuzo Marukawa: "An empirical study on retrieval models for different document genres : Patents and newspaper articles" Information Processing and Management Vol.42, No.1. (2006)

  • Jun Adachi: "Chapter 3 : Digital Library - Its Extension and Intention Observed in System Implementations" Digital Libraries---Flow of digital information and the future of libraries---(Series : Frontiers in Library and Information Science) Scientific Committee of the Japan Society of Library and Information Science ed. Bensei Publisher. (2001)

  • Kyo Kageura: "Chapter 2 : Information media in the electronic age as seen from the point of view of information management" Digital Libraries---Flow of digital information and the future of libraries---(Series : Frontiers in Library and Information Science) Scientific Committee of the Japan Society of Library and Information Science ed. Bensei Publisher. (2001)

URI of this page

http://kaken.nii.ac.jp/ja/p/13224087/2005/6/en