A Study on Cross-Lingual Knowledge Discovery Systems

Research Project

Project/Area Number 11480088
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type Single-year Grants
Section General
Research Field Information Systems (including Information and Library Science)
Research Institution Nara Institute of Science and Technology

Principal Investigator

UEMURA Shunsuke  Nara Institute of Science and Technology, Graduate School of Information Science, Professor (00203480)

Co-Investigator (Kenkyū-buntansha)

HATANO Kenji  Nara Institute of Science and Technology, Graduate School of Information Science, Assistant Professor (80314532)
AMAGASA Toshiyuki  Nara Institute of Science and Technology, Graduate School of Information Science, Assistant Professor (70314531)
YOSHIKAWA Masatoshi  Nara Institute of Science and Technology, Graduate School of Information Science, Associate Professor (30182736)
WATANABE Masahiro  The National Institute of Special Education Center for Policy Research, International Collaboration and Special Education Information Services, Researcher (80321595)
MAEDA Akira  Ritsumeikan University, Department of Computer Science, Associate Professor (20351322)
ISHIKAWA Masatoshi (石川 正敏)  The University of Shimane, Faculty of Policy Studies, Research Associate (90332973)
Project Period (FY) 1999 – 2002
Project Status Completed (Fiscal Year 2002)
Budget Amount
¥14,800,000 (Direct Cost: ¥14,800,000)
Fiscal Year 2002: ¥1,800,000 (Direct Cost: ¥1,800,000)
Fiscal Year 2001: ¥4,500,000 (Direct Cost: ¥4,500,000)
Fiscal Year 2000: ¥5,000,000 (Direct Cost: ¥5,000,000)
Fiscal Year 1999: ¥3,500,000 (Direct Cost: ¥3,500,000)
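Keywords cross-lingual information retrieval / query term disambiguation / parallel corpus / WWW / CLIR / XML database / cross-language retrieval / multilingual browser / relevance feedback / query expansion / multilingual / knowledge / discovery / database / multilingual processing / information retrieval / monolingual corpus / character encoding / mutual information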
Research Abstract

With the growth of the Internet and the WWW in recent years, documents written in a wide variety of languages are now available online. Although 80% of current Web pages are written in English, it is estimated that more than half of Web documents will be non-English in 2003. The WWW can therefore be regarded as a huge document database containing a mixture of documents written in various languages. However, many problems remain to be solved before a retrieval system can handle such multilingual documents in a unified way, e.g., the diversity of document coding systems used in Web pages, the language barrier a non-native user faces when formulating a query, and the limitations on inputting query strings and displaying search results.
In this research project we have studied key technologies for realizing cross-language information retrieval (CLIR) that supports conversion of cultural factors. Existing CLIR approaches require a parallel corpus or a comparable corpus for the disambiguation of translated query terms, but such corpora are not readily available. Furthermore, bilingual dictionaries may not be readily available for a particular language pair (i.e., those involving minor languages). Our approach therefore focuses on a method that depends as little as possible on available language resources. For the disambiguation of translated query terms, we use co-occurrence statistics of pairs of words in the target-language corpus. The advantage of our approach is that it does not require rarely available language resources such as a parallel corpus or a comparable corpus.
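
The disambiguation step described above can be illustrated with a minimal sketch. It is not the project's implementation: the smoothed mutual-information-style score, the toy corpus, and the names `cooccurrence_scorer` and `disambiguate` are all assumptions made for illustration. Each source query term is assumed to already have a list of candidate translations (e.g. from a bilingual dictionary), and the combination whose words co-occur most strongly in a target-language corpus is selected.

```python
import math
from itertools import combinations, product


def cooccurrence_scorer(docs):
    """Build a pairwise score from document-level co-occurrence counts.

    `docs` is a list of whitespace-tokenised target-language documents.
    The score is a smoothed, mutual-information-like quantity; the
    smoothing constants (0.5 and 1) are illustrative assumptions.
    """
    n = len(docs)
    df = {}  # number of documents containing each word
    co = {}  # number of documents containing both words of a pair
    for doc in docs:
        words = set(doc.split())
        for w in words:
            df[w] = df.get(w, 0) + 1
        for a, b in combinations(sorted(words), 2):
            co[(a, b)] = co.get((a, b), 0) + 1

    def score(a, b):
        pair = co.get(tuple(sorted((a, b))), 0)
        return math.log((pair + 0.5) * n / ((df.get(a, 0) + 1) * (df.get(b, 0) + 1)))

    return score


def disambiguate(candidates, score):
    """Pick one translation per query term, maximising summed pairwise scores.

    `candidates` holds one list of candidate translations per source term.
    """
    def total(combo):
        return sum(score(a, b) for a, b in combinations(combo, 2))

    return max(product(*candidates), key=total)


# Toy target-language corpus (hypothetical data).
docs = [
    "bank money loan",
    "bank money interest",
    "river shore water",
    "shore river fishing",
]
score = cooccurrence_scorer(docs)
# Two source terms, each with two candidate translations.
best = disambiguate([["bank", "shore"], ["money", "currency"]], score)
print(best)
```

Exhaustive search over all combinations is only practical for short queries; for longer ones a greedy or pairwise selection would be a natural substitute. In a Web setting, the counts could in principle come from search-engine hit counts rather than a local corpus, as some of the publication titles below suggest, though the details there are not given in this record.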

Report

(5 results)
  • 2002 Annual Research Report   Final Research Report Summary
  • 2001 Annual Research Report
  • 2000 Annual Research Report
  • 1999 Annual Research Report
  • Research Products

    (52 results)

All Publications (52 results)

  • [Publications] 前田 亮, 吉川 正俊, 植村 俊亮: "言語横断情報検索におけるWeb文書群による訳語曖昧性解消"情報処理学会論文誌:データベース. 41 SIG6(TOD7). 12-21 (2000)

    • Description
      From the "Final Research Report Summary (Japanese)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] 前田 亮, 関 慶妍, 吉川 正俊, 植村 俊亮: "Web文書の符号系および使用言語の自動識別"電子情報通信学会論文誌D-II. J84-D-II No.1. 150-158 (2001)

    • Description
      From the "Final Research Report Summary (Japanese)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat, Akira Maeda, Masatoshi Yoshikawa, Shunsuke Uemura: "Exploiting and Combining Multiple Resources for Query Expansion in Cross-Language Information Retrieval"情報処理学会論文誌:データベース. 43 SIG9(TOD15). 39-54 (2002)

    • Description
      From the "Final Research Report Summary (Japanese)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Akira Maeda, Shunsuke Uemura: "Key Technologies for Multilingual Information Processing on WWW"Fourth International Symposium on Standardization of Multilingual Information Technology (MLIT-4). (CD-ROM). (1999)

    • Description
      From the "Final Research Report Summary (Japanese)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Akira Maeda, Fatiha Sadat, Masatoshi Yoshikawa, Shunsuke Uemura: "Query Term Disambiguation for Web Cross-Language Information Retrieval using a Search Engine"The Fifth International Workshop on Information Retrieval with Asian Languages (IRAL2000). 25-32 (2000)

    • Description
      From the "Final Research Report Summary (Japanese)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat, Akira Maeda, Masatoshi Yoshikawa, Shunsuke Uemura: "Cross-Language Information Retrieval Via Dictionary-based and Statistics-based Methods"2001 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM'01). II. 595-598 (2001)

    • Description
      From the "Final Research Report Summary (Japanese)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat, Akira Maeda, Masatoshi Yoshikawa, Shunsuke Uemura: "Query Expansion Technique for the CLEF Bilingual Track"Working Notes for the CLEF 2001 Workshop. 99-104 (2001)

    • Description
      From the "Final Research Report Summary (Japanese)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat: "Cross-Language Information Retrieval via Hybrid Combination of Query Expansion Techniques"The LREC 2002 Workshop on Using Semantics for Information Retrieval and Filtering : State of the Art and Future Research. (CD-ROM). (2002)

    • Description
      From the "Final Research Report Summary (Japanese)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat, Akira Maeda, Masatoshi Yoshikawa, Shunsuke Uemura: "A Combined Statistical Query Term Disambiguation in Cross-Language Information Retrieval"The ACL-02 Student Research Workshop. (2002)

    • Description
      From the "Final Research Report Summary (Japanese)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat, Akira Maeda, Masatoshi Yoshikawa, Shunsuke Uemura: "A Combined Statistical Query Term Disambiguation in Cross-Language Information Retrieval"The Third International Workshop on Natural Language and Information Systems (NLIS2002). 251-255 (2002)

    • Description
      From the "Final Research Report Summary (Japanese)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat, Masatoshi Yoshikawa, Shunsuke Uemura: "Exploiting Thesauri and Hierarchical Categories in Cross-Language Information Retrieval"5th International Conference on Text, Speech and Dialogue (TSD2002). 139-146 (2002)

    • Description
      From the "Final Research Report Summary (Japanese)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat, Masatoshi Yoshikawa, Shunsuke Uemura: "Cross-Language Information Retrieval Using Multiple Resources and Combinations for Query Expansion"Second International Conference on Advances in Information Systems (ADVIS2002). 114-122 (2002)

    • Description
      From the "Final Research Report Summary (Japanese)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat, Masatoshi Yoshikawa, Shunsuke Uemura: "The Role of Query Expansion Techniques in French-English Information Retrieval"Journees Science and Technology Workshop 2002 (JST2002). (CD-ROM). (2002)

    • Description
      From the "Final Research Report Summary (Japanese)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat, Akira Maeda, Masatoshi Yoshikawa, and Shunsuke Uemura: "Exploiting and Combining Multiple Resources for Query Expansion in Cross-Language Information Retrieval"Information Processing Society of Japan Transactions : Database. Vol.43, No.SIG9 (TOD15). 39-54 (2002)

    • Description
      From the "Final Research Report Summary (English)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Akira Maeda and Shunsuke Uemura: "Key Technologies for Multilingual Information Processing on WWW"Fourth International Symposium on Standardization of Multilingual Information Technology (MLIT-4). (1999)

    • Description
      From the "Final Research Report Summary (English)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Akira Maeda, Fatiha Sadat, Masatoshi Yoshikawa, and Shunsuke Uemura: "Query Term Disambiguation for Web Cross-Language Information Retrieval using a Search Engine"The Fifth International Workshop on Information Retrieval with Asian Languages (IRAL 2000). September 30-October 1. 25-32 (2000)

    • Description
      From the "Final Research Report Summary (English)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat, Akira Maeda, Masatoshi Yoshikawa, and Shunsuke Uemura: "Cross-Language Information Retrieval Via Dictionary-based and Statistics-based Methods"Proc.of 2001 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM'01). Vol.II, Aug.26-28. 595-598 (2001)

    • Description
      From the "Final Research Report Summary (English)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat, Akira Maeda, Masatoshi Yoshikawa and Shunsuke Uemura: "Query Expansion Technique for the CLEF Bilingual Track"Working Notes for the CLEF 2001 Workshop. Sep.3-4. 99-104 (2001)

    • Description
      From the "Final Research Report Summary (English)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat, Akira Maeda, Masatoshi Yoshikawa, and Shunsuke Uemura: "Statistical Query Disambiguation, Translation and Expansion in Cross-Language Information Retrieval"The LREC 2002 Workshop on Using Semantics for Information Retrieval and Filtering : State of the Art and Future Research. June 2. (2002)

    • Description
      From the "Final Research Report Summary (English)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat: "Cross-Language Information Retrieval via Hybrid Combination of Query Expansion Techniques"The ACL-02 Student Research Workshop. July 7-12. (2002)

    • Description
      From the "Final Research Report Summary (English)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat, Akira Maeda, Masatoshi Yoshikawa, and Shunsuke Uemura: "A Combined Statistical Query Term Disambiguation in Cross-Language Information Retrieval"The Third International Workshop on Natural Language and Information Systems (NLIS2002). September 2-3. 251-255 (2002)

    • Description
      From the "Final Research Report Summary (English)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat, Masatoshi Yoshikawa, and Shunsuke Uemura: "Exploiting Thesauri and Hierarchical Categories in Cross-Language Information Retrieval"5th International Conference on Text, Speech and Dialogue (TSD 2002), Lecture Notes in Computer Science (LNCS), Springer-Verlag. Vol.2448, September 9-10. 139-146 (2002)

    • Description
      From the "Final Research Report Summary (English)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat, Masatoshi Yoshikawa, Shunsuke Uemura: "Cross-Language Information Retrieval Using Multiple Resources and Combinations for Query Expansion"Second International Conference on Advances in Information Systems (ADVIS2002), Lecture Notes in Computer Science (LNCS), Springer-Verlag. Vol.2457, October 23-25. 114-122 (2002)

    • Description
      From the "Final Research Report Summary (English)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fatiha Sadat, Masatoshi Yoshikawa, and Shunsuke Uemura: "The Role of Query Expansion Techniques in French-English Information Retrieval"Journees Science and Technology Workshop 2002 (JST2002). November 17-19. (2002)

    • Description
      From the "Final Research Report Summary (English)"
    • Related Report
      2002 Final Research Report Summary
  • [Publications] F.Sadat, A.Maeda, M.Yoshikawa, S.Uemura: "Statistical Query Disambiguation, Translation and Expansion in Cross-Language Information Retrieval"The LREC 2002 Workshop on Using Semantics for Information Retrieval and Filtering : State of the Art and Future Research. (CD-ROM). (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] F.Sadat, A.Maeda, M.Yoshikawa, S.Uemura: "A Combined Statistical Query Term Disambiguation in Cross-Language Information Retrieval"The Third International Workshop on Natural Language and Information Systems (NLIS2002). 251-255 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] F.Sadat, M.Yoshikawa, S.Uemura: "Exploiting Thesauri and Hierarchical Categories in Cross-Language Information Retrieval"5th International Conference on Text, Speech and Dialogue (TSD 2002). 139-146 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] 木村文則, 前田亮, 吉川正俊, 植村俊亮: "ディレクトリ型検索エンジンを用いた言語横断情報検索"Forum on Information Technology 2002 (FIT). 第2分冊. 69-70 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] F.Sadat, A.Maeda, M.Yoshikawa, S.Uemura: "Exploiting and Combining Multiple Resources for Query Expansion in Cross-Language Information Retrieval"情報処理学会論文誌:データベース. SIG9(TOD15). 39-54 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] K.Hatano, H.Kinutani, M.Yoshikawa, S.Uemura: "Extraction of Partial XML Documents Using IR-based Structure and Contents Analysis"Conceptual Modeling for New Information Systems Technologies. 334-347 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] 杉山一成, 波多野賢治, 吉川正俊, 植村俊亮: "On Some Methods for Improving Feature Vectors for Web Pages and their Retrieval Accuracy"電子情報通信学会第14回データ工学ワークショップ(DEWS2003). (Web上公開予定). (2003)

    • Related Report
      2002 Annual Research Report
  • [Publications] 木村文則, 前田亮, 吉川正俊, 植村俊亮: "Webディレクトリの階層構造を利用した言語横断検索"電子情報通信学会第14回データ工学ワークショップ(DEWS2003). (Web上公開予定). (2003)

    • Related Report
      2002 Annual Research Report
  • [Publications] F.Sadat, M.Yoshikawa, S.Uemura: "Combining Multiple Knowledge Sources for an Efficient Query Expansion in Cross-Language Information Retrieval"Forum on Information Technology 2002 (FIT). 第2分冊. 67-68 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] F.Sadat, M.Yoshikawa, S.Uemura: "The Role of Query Expansion Techniques in French-English Information Retrieval"Journées Science and Technology Workshop 2002 (JST2002). (CD-ROM). (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] F.Sadat, M.Yoshikawa, S.Uemura: "Cross-Language Information Retrieval Using Multiple Resources and Combinations for Query Expansion"Second International Conference on Advances in Information Systems (ADVIS2002). 114-122 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] M.Yoshikawa, T.Amagasa, T.Shimura, S.Uemura: "XRel: A Path-Based Approach to Storage and Retrieval of XML Documents using Relational Databases"ACM Transactions on Internet Technology. 1・1. 110-141 (2001)

    • Related Report
      2001 Annual Research Report
  • [Publications] 波多野 賢治, 渡邉 正裕, 吉川 正俊, 植村 俊亮: "情報検索技術を用いた部分文書構造の自動抽出"情報処理学会論文誌:データベース. 42・SIG8(TOD10). 38-46 (2001)

    • Related Report
      2001 Annual Research Report
  • [Publications] F.Sadat, A.Maeda, M.Yoshikawa, S.Uemura: "Cross-Language Information Retrieval Via Dictionary-based and Statistics-based Methods"Proc. of 2001 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing(PACRIM'01). II. 595-598 (2001)

    • Related Report
      2001 Annual Research Report
  • [Publications] F.Sadat, A.Maeda, M.Yoshikawa, S.Uemura: "Query Expansion Technique for the CLEF Bilingual Track"Working Notes for the CLEF 2001 Workshop. 99-104 (2001)

    • Related Report
      2001 Annual Research Report
  • [Publications] D.D.Kha, M.Yoshikawa, S.Uemura: "An XML Indexing Structure with Relative Region Coordinate"Proc. of the 17th IEEE International Conference on Data Engineering(ICDE2001). 313-320 (2001)

    • Related Report
      2001 Annual Research Report
  • [Publications] T.Amagasa, M.Yoshikawa, S.Uemura: "Realizing Temporal XML Repositories using Temporal Relational Databases"The Third International Symposium on Cooperative Database Systems for Advanced Applications (CODAS'2001). 63-67 (2001)

    • Related Report
      2001 Annual Research Report
  • [Publications] Masatoshi Yoshikawa: ""XML Databases", In Nontraditional Database Systems-Results from the Japanese Project on Advanced Database"The Information Processing Society of Japan and Taylor & Francis Books Ltd. (2002)

    • Related Report
      2001 Annual Research Report
  • [Publications] 前田亮,関慶妍,吉川正俊,植村俊亮: "Web文書の符号系および使用言語の自動識別"電子情報通信学会論文誌D-II. J84・D-II. 115-130 (2001)

    • Related Report
      2000 Annual Research Report
  • [Publications] 前田亮,吉川正俊,植村俊亮: "言語横断情報検索におけるWeb文書群による訳語曖昧性解消"情報処理学会論文誌:データベース. 41・SIG6. 12-21 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] Akira Maeda,Fatiha Sadat,Masatoshi Yoshikawa,and Shunsuke Uemura: "Query Term Disambiguation for Web Cross-Language Information Retrieval using a Search Engine"Proc.of the 5th International Workshop on Information Retrieval with Asian Languages(IRAL2000). 25-32 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] Sadat Fatiha,Akira Maeda,Masatoshi Yoshikawa,and Shunsuke Uemura: "Integrating Dictionary-based and Statistical-based Approaches in Cross-Language Information Retrieval"情報処理学会データベースシステム研究会報告. DBS-121-1-10. (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] 吉川正俊、志村壮是、植村俊亮: "オブジェクト関係データベースを用いたXML文書の格納と検索"情報処理学会論文誌:データベース. 40. 115-131 (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] 前田 亮、関 慶妍、植村俊亮: "多言語知識発掘システムの構築"情報処理学会研究報告. 99-DBS-118/99-FI-54. 1-8 (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] 阪口哲男、中尾茂岳、前田 亮、杉本重雄、田畑孝一: "タグ付き文書を対象とした多言語全文検索システム"情報知識学会第7回研究報告会講演論文集. 49-52 (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] Hachim Haddouti,Akira Maeda,Tetsuo Sakaguchi,Shigeo Sugimoto,and Koichi Tabata: "Towards Arabic Rendering Issues-MHTML Approach"Proceedings of the Arabic Translation and Localisation Symposium (ATLA'99). (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] Akira Maeda and Shunsuke Uemura: "Key Technologies for Multilingual Information Processing on WWW"Proceedings of the Fourth International Symposium on Standardization of Multilingual Information Technology (MLIT-4). (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] M.Yoshikawa,H.Kinutani,Y.Yamamoto,H.Kato and S.Uemura: "Advances in Databases and Multimedia for the New Century-A Swiss/Japanese Perspective-"World Scientific Publishing. 140 (2000)

    • Related Report
      1999 Annual Research Report

Published: 1999-04-01   Modified: 2016-04-21  
