2002 Fiscal Year Final Research Report Summary

Toward Optimal Feature Selection for Word Sense Disambiguation and its Application to Information Retrieval

Research Project

Project/Area Number	13680441
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	University of Yamanashi
Principal Investigator	FUKUMOTO Fumiyo, Univ. of Yamanashi, Faculty of Engineering, Assoc. Prof. (60262648)
Project Period (FY)	2001 – 2002
Keywords	Word Sense Disambiguation / Feature Selection / Information Retrieval
Research Abstract	This study describes three main methods. The first is a feature selection method for word sense disambiguation. Sets of features corresponding to each sense of an ambiguous word are selected by applying a machine learning technique. Empirical results on two data sets, the 'line' and 'interest' data and the SENSEVAL1 data, show that the performance of the method is comparable to that of existing sense disambiguation techniques. The second is a method for learning a text representation for the categorization task. Words in a text are represented by a variation on WordNet synsets, and a machine learning technique is applied to induce a representative model. The results show that incorporating WordNet into the text representation can lead to improvements, especially for rare categories. The third is a text classification method that handles a large collection of data using two well-known machine learning techniques, Naive Bayes (NB) and Support Vector Machines (SVMs). NB is based on the assumption that the words in a text are independent, which makes it far more efficient to compute. SVMs, on the other hand, can handle large feature spaces, which makes better performance possible. The training data for the SVMs are extracted using NB classifiers according to the category hierarchy, which reduces the amount of computation needed for classification without sacrificing accuracy.
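
The word-independence assumption mentioned in the abstract can be illustrated with a minimal sketch. This is not the project's implementation: it is a generic multinomial Naive Bayes classifier, and the function names (`train_nb`, `log_scores`, `classify`), the add-one smoothing, and the toy documents are all assumptions made for illustration. Because words are treated as independent given the category, the score of a category is just its log prior plus a sum of per-word log probabilities, which is why training and classification reduce to counting.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Collect the counts needed by multinomial Naive Bayes from (tokens, label) pairs."""
    class_docs = Counter()              # number of documents per category
    word_counts = defaultdict(Counter)  # per-category word frequencies
    vocab = set()
    for tokens, label in docs:
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def log_scores(tokens, model):
    """Return log P(c) + sum over words of log P(w | c) for every category c."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    scores = {}
    for label, n_docs in class_docs.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for w in tokens:
            if w not in vocab:
                continue  # skip words never seen in training
            # Add-one (Laplace) smoothing: an assumption of this sketch
            p = (word_counts[label][w] + 1) / (total_words + len(vocab))
            score += math.log(p)
        scores[label] = score
    return scores


def classify(tokens, model):
    """Pick the category with the highest log score."""
    scores = log_scores(tokens, model)
    return max(scores, key=scores.get)


# Toy data, invented for illustration only
toy_docs = [
    ("bank raises interest rate".split(), "finance"),
    ("interest rate cut by bank".split(), "finance"),
    ("team wins the final match".split(), "sports"),
    ("coach praises team after match".split(), "sports"),
]
model = train_nb(toy_docs)
prediction = classify("bank interest".split(), model)
print(prediction)
```

In a two-stage setup like the one the abstract describes, a cheap classifier of this kind could be used to narrow down the documents passed to a more expensive learner; how the project actually selects SVM training data along the category hierarchy is not specified here.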

Research Products

(8 results)

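[Publications] Fumiyo Fukumoto: "語義の曖昧性解消のための最適な属性選択" [Toward Optimal Features Selection for Word Sense Disambiguation] Trans. of Information Processing Society of Japan. 43(1). 20-33 (2002)
- Description
  From the Research Report Summary (Japanese)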
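[Publications] Fumiyo Fukumoto, Yoshimi Suzuki: "WordNetの同義語クラスとその上位関係を利用した文書の自動分類" [Using Synonyms and Their Hypernymy Relations of WordNet for Text Classification] Trans. of Information Processing Society of Japan. 43(6). 1852-1865 (2002)
- Description
  From the Research Report Summary (Japanese)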
[Publications] F. Fukumoto, Y. Suzuki: "Manipulating Large Corpora for Text Classification" Conference on Empirical Methods in Natural Language Processing (EMNLP'02). 196-203 (2002)
- Description
  From the Research Report Summary (Japanese)
[Publications] F. Fukumoto, Y. Suzuki: "Detecting Shifts in News Stories for Paragraph Extraction" 19th International Conference on Computational Linguistics (COLING'02). 280-286 (2002)
- Description
  From the Research Report Summary (Japanese)
[Publications] Fumiyo Fukumoto: "Toward Optimal Features Selection for Word Sense Disambiguation" Trans. of Information Processing Society of Japan. 43(1). 20-33 (2002)
- Description
  From the Research Report Summary (English)
[Publications] Fumiyo Fukumoto and Yoshimi Suzuki: "Using Synonyms and Their Hypernymy Relations of WordNet for Text Classification" Trans. of Information Processing Society of Japan. 43(6). 1852-1865 (2002)
- Description
  From the Research Report Summary (English)
[Publications] Fumiyo Fukumoto and Yoshimi Suzuki: "Manipulating Large Corpora for Text Classification" Conference on Empirical Methods in Natural Language Processing (EMNLP'02). 196-203 (2002)
- Description
  From the Research Report Summary (English)
[Publications] Fumiyo Fukumoto and Yoshimi Suzuki: "Detecting Shifts in News Stories for Paragraph Extraction" 19th International Conference on Computational Linguistics (COLING'02). 280-286 (2002)
- Description
  From the Research Report Summary (English)
