The Use of Internet Corpus in Natural Language Processing

Research Project

Project/Area Number 14580411
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Single-year Grants
Section General
Research Field Intelligent informatics
Research Institution The University of Electro-Communications

Principal Investigator

FURUGORI Teiji  Computer Science, The University of Electro-Communications, Faculty of Electro-Communications, Professor (80114932)

Project Period (FY) 2002 – 2003
Project Status Completed (Fiscal Year 2003)
Budget Amount
¥2,100,000 (Direct Cost: ¥2,100,000)
Fiscal Year 2003: ¥700,000 (Direct Cost: ¥700,000)
Fiscal Year 2002: ¥1,400,000 (Direct Cost: ¥1,400,000)
Keywords Natural Language Processing / Information Extraction / Internet Corpus / Automatic Summarization / Machine Translation / Structural Analysis / Compound Word Analysis / Compound Word Processing / Statistics
Research Abstract

Textual material on the Internet, or the Internet corpus, is an important and valuable language resource for natural language processing. In this research, we tried to use it in devising a method for analyzing compound words in Japanese, a writer's aid program for translating Japanese into English, and an automatic summarization system for newspaper articles on sassho-jiken.
Our approach to natural language processing is statistical rather than based on linguistic theory. This approach faces the sparse data problem: whatever result we obtain, it will not be reliable if it is derived from an insufficient amount of data. The data on the Internet are practically unlimited, and our research has shown that the Internet corpus can be used effectively in the areas we dealt with. At the same time, however, it has revealed a problem: data on the Internet are not well formed, so a mechanism for eliminating "junk" data will be necessary in many language processing systems.

Report

(3 results)
  • 2003 Annual Research Report   Final Research Report Summary
  • 2002 Annual Research Report
  • Research Products

    (24 results)

All Publications (24 results)

  • [Publications] Dongli Han, Takeshi Ito, Teiji Furugori: "Structural Analysis of Compound Words in Japanese Using Semantic Dependency Relations" J. of Quantitative Linguistics. 9. 1-17 (2002)

    • Description
      From "Final Research Report Summary (Japanese version)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Teiji Furugori, Lin Rihua, Takeshi Ito, Dongli Han: "Information Extraction and Summarization for Newspaper Articles on Sassho-Jiken" IEICE Transactions. E86-D. 1728-1735 (2003)

    • Description
      From "Final Research Report Summary (Japanese version)"
    • Related Report
      2003 Final Research Report Summary
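  • [Publications] Dongli Han, Takeshi Ito, Teiji Furugori: "Structural Analysis of Compound Words Based on Dependency Relations between Constituents" (in Japanese) Transactions of the IEICE (Japanese Edition). J86-D-II. 706-714 (2003)

    • Description
      From "Final Research Report Summary (Japanese version)"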
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Dongli Han, Takeshi Ito, Teiji Furugori: "A deterministic method for structural analysis of compound words in Japanese" Proc. of 16th Pacific Asia Conf. on Language, Information and Computation. 79-91 (2002)

    • Description
      From "Final Research Report Summary (Japanese version)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Dongli Han, Takeshi Ito, Teiji Furugori: "Rewriting Japanese compound nouns into expressions usable effectively in machine translation system" Proc. of 2002 IEEE Int'l Conf. on System, Man and Cybernetics. (2002)

    • Description
      From "Final Research Report Summary (Japanese version)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Sawa Takakura, Takeshi Ito, Teiji Furugori: "TransAid: a writer's aid system for translating Japanese into English" Proc. of 2002 IEEE Int'l Conf. on System, Man and Cybernetics. (2002)

    • Description
      From "Final Research Report Summary (Japanese version)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Sawa Takakura, Dongli Han, Teiji Furugori: "An experiment for determining semantic relations between main and subordinate clauses in complex sentences" Proc. of Winter Int'l Symp. on Info and Comm Technology. (2004)

    • Description
      From "Final Research Report Summary (Japanese version)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Dongli Han, Takeshi Ito, Teiji Furugori: "A deterministic method for structural analysis of compound words in Japanese" Proc. of the 16th Pacific Asia Conference on Language, Information and Computation. 79-91 (2002)

    • Description
      From "Final Research Report Summary (English version)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Dongli Han, Takeshi Ito, Teiji Furugori: "Rewriting Japanese compound nouns into expressions usable effectively in machine translation system" Proc. of 2002 IEEE International Conference on System, Man and Cybernetics. (2002)

    • Description
      From "Final Research Report Summary (English version)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Sawa Takakura, Takeshi Ito, Teiji Furugori: "TransAid: a writer's aid system for translating Japanese into English" Proc. of 2002 IEEE International Conference on System, Man and Cybernetics. (2002)

    • Description
      From "Final Research Report Summary (English version)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Sawa Takakura, Dongli Han, Teiji Furugori: "An experiment for determining semantic relations between main and subordinate clauses in complex sentences" Proc. of Winter International Symposium on Information and Communication Technology. (2004)

    • Description
      From "Final Research Report Summary (English version)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Dongli Han, Takeshi Ito, Teiji Furugori: "Structural Analysis of Compound Words in Japanese Using Semantic Dependency Relations" Journal of Quantitative Linguistics. Vol.9, No.1. 1-17 (2002)

    • Description
      From "Final Research Report Summary (English version)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Teiji Furugori, Lin Rihua, Takeshi Ito, Dongli Han: "Information Extraction and Summarization for Newspaper Articles on Sassho-Jiken" IEICE Trans. Vol.E86-D, No.9. 1728-1735 (2003)

    • Description
      From "Final Research Report Summary (English version)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Dongli Han, Takeshi Ito, Teiji Furugori: "Structural Analysis of Compound Words in Japanese Using Semantic Dependency Relations" Journal of Quantitative Linguistics. Vol.9, No.1. 1-17 (2002)

    • Related Report
      2003 Annual Research Report
  • [Publications] Teiji Furugori, Lin Rihua, Takeshi Ito, Dongli Han: "Information Extraction and Summarization for Newspaper Articles on Sassho-Jiken" IEICE Trans. Vol.E86-D, No.9. 1728-1735 (2003)

    • Related Report
      2003 Annual Research Report
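  • [Publications] Dongli Han, Takeshi Ito, Teiji Furugori: "Structural Analysis of Compound Words Based on Dependency Relations between Constituents" (in Japanese) Transactions of the IEICE (Japanese Edition). Vol.J86-D-II, No.5. 706-714 (2003)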

    • Related Report
      2003 Annual Research Report
  • [Publications] Dongli Han, Takeshi Ito, Teiji Furugori: "A deterministic method for structural analysis of compound words in Japanese" Proc. of the 16th Pacific Asia Conference on Language, Information and Computation. 79-91 (2002)

    • Related Report
      2003 Annual Research Report
  • [Publications] Dongli Han, Takeshi Ito, Teiji Furugori: "Rewriting Japanese compound nouns into expressions usable effectively in machine translation system" Proc. of 2002 IEEE International Conference on System, Man and Cybernetics. (CD-ROM) WA2E4. 6 (2002)

    • Related Report
      2003 Annual Research Report
  • [Publications] Sawa Takakura, Takeshi Ito, Teiji Furugori: "TransAid: a writer's aid system for translating Japanese into English" Proc. of 2002 IEEE International Conference on System, Man and Cybernetics. (CD-ROM) WA2E3. 6 (2002)

    • Related Report
      2003 Annual Research Report
  • [Publications] Sawa Takakura, Dongli Han, Teiji Furugori: "An experiment for determining semantic relations between main and subordinate clauses in complex sentences" Proc. of Winter International Symposium on Information and Communication Technology. (2004)

    • Related Report
      2003 Annual Research Report
  • [Publications] Han, D., Ito, T., Furugori, T.: "Structural Analysis of Compound Words in Japanese Using Semantic Dependency Relations" Journal of Quantitative Linguistics. Vol.9, No.1. 1-17 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] Han, D., Ito, T., Furugori, T.: "A Deterministic Method for Structural Analysis of Compound Words in Japanese" The 16th Pacific-Asia Conference on Language, Information, and Computation. 79-91 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] Peng, Q., Wu, H., Furugori, T.: "A Method for Similarity-based Lexical Disambiguation" Journal of Natural Language Processing. Vol.9, No.2. (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] Han, D., Ito, T., Furugori, T.: "Rewriting Japanese Compound Nouns into Expressions Usable Effectively in Machine Translation Systems" IEEE SMC '02. (2002)

    • Related Report
      2002 Annual Research Report

Published: 2002-04-01   Modified: 2016-04-21  
