
Toward Optimal Feature Selection for Word Sense Disambiguation and its Application to Information Retrieval

Research Project

Project/Area Number: 13680441
Research Category: Grant-in-Aid for Scientific Research (C)
Allocation Type: Single-year Grants
Section: General
Research Field: Intelligent informatics
Research Institution: University of Yamanashi

Principal Investigator

FUKUMOTO Fumiyo, Univ. of Yamanashi, Faculty of Engineering, Associate Professor (60262648)

Project Period (FY): 2001 – 2002
Project Status: Completed (Fiscal Year 2002)
Budget Amount:
¥3,400,000 (Direct Cost: ¥3,400,000)
Fiscal Year 2002: ¥1,400,000 (Direct Cost: ¥1,400,000)
Fiscal Year 2001: ¥2,000,000 (Direct Cost: ¥2,000,000)
Keywords: Word Sense Disambiguation / Feature Selection / Information Retrieval / Resolution of Polysemous Word Ambiguity / Machine Learning / Automatic Document Classification
Research Abstract

This study describes three main methods.

The first is a feature selection method for word sense disambiguation: sets of features corresponding to each sense of an ambiguous word are selected by applying a machine learning technique. Empirical results on two datasets, the 'line' and 'interest' data and the SENSEVAL-1 data, show that the method performs comparably to existing sense disambiguation techniques.

The second is a method for learning a text representation for the categorization task. Words in a text are represented by a variation on WordNet synsets, and a machine learning technique is applied to induce a representative model. The results show that incorporating WordNet into the text representation can lead to improvements, especially for rare categories.

The third is a text classification method that manipulates a large collection of data using two well-known machine learning techniques, Naive Bayes (NB) and Support Vector Machines (SVMs). NB assumes word independence within a text, which makes its computation far more efficient; SVMs, on the other hand, can handle large feature spaces, which makes better performance possible. The training data for the SVMs are extracted using NB classifiers according to the category hierarchies, which reduces the amount of computation needed for classification without sacrificing accuracy.
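
The abstract does not reproduce the selection procedure itself, so the following is a minimal sketch of the underlying idea only: scoring candidate context features against the sense labels and keeping the most discriminative ones per sense. The toy contexts, the chi-squared scorer, and the Naive Bayes classifier are all illustrative assumptions, not the project's actual method.

```python
# Hypothetical sketch: per-sense feature selection for word sense
# disambiguation. Chi-squared scoring stands in for the paper's method;
# the mini-corpus below is invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented contexts for the ambiguous word "line", with sense labels.
contexts = [
    "please hold the line while I transfer your call",
    "the telephone line was busy all morning",
    "draw a straight line between the two points",
    "the artist sketched a thin line across the page",
]
senses = ["phone", "phone", "mark", "mark"]

# Keep only the k context words most associated with the sense labels,
# then train a classifier on the reduced feature set.
wsd = make_pipeline(
    CountVectorizer(stop_words="english"),
    SelectKBest(chi2, k=4),       # k is an arbitrary illustrative value
    MultinomialNB(),
)
wsd.fit(contexts, senses)
print(wsd.predict(["she drew a faint line across the points"]))
```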
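For the second method, the representation is described only as a variation on WordNet synsets (with hypernym relations, per the publication list). A minimal sketch of the core idea, collapsing synonyms into shared synset features with NLTK's WordNet interface, is given below; the first-noun-sense heuristic is an assumption made for illustration.

```python
# Requires: pip install nltk, then nltk.download("wordnet")
from nltk.corpus import wordnet as wn

def synset_features(text):
    """Map each word to the name of its first noun synset, so that
    synonyms (e.g. "car" / "automobile") share a single feature."""
    feats = []
    for word in text.lower().split():
        synsets = wn.synsets(word, pos=wn.NOUN)
        # Fall back to the surface form when WordNet has no entry.
        feats.append(synsets[0].name() if synsets else word)
    return feats

print(synset_features("car exports rose"))         # starts with 'car.n.01'
print(synset_features("automobile exports rose"))  # also starts with 'car.n.01'
```

Because "car" and "automobile" map to the same feature, documents in rare categories pool their evidence rather than scattering it over surface forms, which is consistent with the reported gains on rare categories.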
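The abstract gives the NB-then-SVM pipeline only in outline, so the sketch below is a flat, simplified stand-in: a synthetic dataset, an invented confidence threshold, and GaussianNB/LinearSVC as placeholders. It shows the stated division of labor, a cheap NB pass filtering the training pool so the more expensive SVM trains on fewer examples, but it omits the category hierarchies the actual method exploits.

```python
# Hypothetical two-stage NB -> SVM training sketch (not the project's
# hierarchy-aware procedure).
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import LinearSVC

# Synthetic stand-in for a large document collection.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)

# Stage 1: a cheap Naive Bayes pass scores every example; keep only the
# ones it labels confidently, shrinking the SVM's training pool.
nb = GaussianNB().fit(X, y)
keep = nb.predict_proba(X).max(axis=1) > 0.8   # invented threshold

# Stage 2: the more expensive SVM trains on the reduced pool only.
svm = LinearSVC().fit(X[keep], y[keep])
print(f"SVM trained on {keep.sum()} of {len(y)} examples")
```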

Report (3 results)

  • 2002 Annual Research Report
  • 2002 Final Research Report Summary
  • 2001 Annual Research Report

Research Products: All Publications (14 results)

  • [Publications] 福本 文代: "語義の曖昧性解消のための最適な属性選択" 情報処理学会論文誌 43(1), 20-33 (2002)

    • Description
      From the Summary of the Final Research Report (Japanese)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] 福本 文代, 鈴木 良弥: "WordNetの同義語クラスとその上位関係を利用した文書の自動分類" 情報処理学会論文誌 43(6), 1852-1865 (2002)

    • Description
      From the Summary of the Final Research Report (Japanese)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] F. Fukumoto, Y. Suzuki: "Manipulating Large Corpora for Text Classification" Conference on Empirical Methods in Natural Language Processing (EMNLP'02), 196-203 (2002)

    • Description
      From the Summary of the Final Research Report (Japanese)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] F. Fukumoto, Y. Suzuki: "Detecting Shifts in News Stories for Paragraph Extraction" 19th International Conference on Computational Linguistics (COLING'02), 280-286 (2002)

    • Description
      From the Summary of the Final Research Report (Japanese)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fumiyo Fukumoto: "Toward Optimal Feature Selection for Word Sense Disambiguation" Transactions of Information Processing Society of Japan 43(1), 20-33 (2002)

    • Description
      From the Summary of the Final Research Report (English)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fumiyo Fukumoto, Yoshimi Suzuki: "Using Synonyms and Their Hypernymy Relations of WordNet for Text Classification" Transactions of Information Processing Society of Japan 43(6), 1852-1865 (2002)

    • Description
      From the Summary of the Final Research Report (English)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fumiyo Fukumoto, Yoshimi Suzuki: "Manipulating Large Corpora for Text Classification" Conference on Empirical Methods in Natural Language Processing (EMNLP'02), 196-203 (2002)

    • Description
      From the Summary of the Final Research Report (English)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Fumiyo Fukumoto, Yoshimi Suzuki: "Detecting Shifts in News Stories for Paragraph Extraction" 19th International Conference on Computational Linguistics (COLING'02), 280-286 (2002)

    • Description
      From the Summary of the Final Research Report (English)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] 福本 文代: "語義の曖昧性解消のための最適な属性選択" 情報処理学会論文誌 43(1), 20-33 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] 福本 文代, 鈴木 良弥: "WordNetの同義語クラスとその上位関係を利用した文書の自動分類" 情報処理学会論文誌 43(6), 1852-1865 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] F. Fukumoto, Y. Suzuki: "Manipulating Large Corpora for Text Classification" Conference on Empirical Methods in Natural Language Processing (EMNLP'02), 196-203 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] F. Fukumoto, Y. Suzuki: "Detecting Shifts in News Stories for Paragraph Extraction" 19th International Conference on Computational Linguistics (COLING'02), 280-286 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] F. Fukumoto, Y. Suzuki: "Learning Lexical Representation for Text Categorization" Proc. of the NAACL 2001 Workshop on WordNet and Other Lexical Resources: Applications, Extensions and Customizations, 156-161 (2001)

    • Related Report
      2001 Annual Research Report
  • [Publications] 福本 文代: "語義の曖昧性解消のための最適な属性選択" 情報処理学会論文誌 43(1), 20-33 (2002)

    • Related Report
      2001 Annual Research Report

Published: 2001-04-01   Modified: 2016-04-21  
