2004 Fiscal Year Final Research Report Summary
A study on unsupervised learning for word sense disambiguation
Project/Area Number | 15500083 |
Research Category | Grant-in-Aid for Scientific Research (C) |
Allocation Type | Single-year Grants |
Section | General |
Research Field | Intelligent informatics |
Research Institution | IBARAKI UNIVERSITY |
Principal Investigator | SHINNOU Hiroyuki, IBARAKI University, College of Engineering, Associate Professor (10250987) |
Project Period (FY) | 2003 – 2004 |
Keywords | unsupervised learning / Fuzzy clustering / EM algorithm / Belief Network / word sense disambiguation / SENSEVAL-2 |
Research Abstract |
First, we tried the EM algorithm as an unsupervised learning method. The method computes a membership degree for each class using unlabeled data. By combining it with Naive Bayes, which is an inductive learning method, a word sense disambiguation classifier is learned automatically (see the sketch after this abstract). However, this method does not always improve on the performance of supervised learning. To avoid this unsatisfactory situation, we proposed a method to estimate the optimal number of EM iterations. For this research, we published papers in international conference proceedings and a journal.

Second, we applied the Belief Network to unsupervised learning. The Belief Network can be regarded as an extension of the Naive Bayes method: in Naive Bayes, the probabilistic variables corresponding to the features are assumed to be independent, and in the Belief Network this assumption is somewhat relaxed. We showed that unlabeled data can be used in computing the weights on the links of the Belief Network. For this research, we published a paper in international conference proceedings.

Unsupervised learning and clustering are closely related; in fact, the EM algorithm can be regarded as a kind of clustering method. To perform clustering, we have to transform each word into a feature vector, and the choice of base vectors for this transformation is a major issue. We studied a method for selecting base vectors. Moreover, we applied fuzzy clustering to unsupervised learning and compared it with the EM algorithm. For this research, we published two papers in international conference proceedings.

Active learning is closely related to unsupervised methods, as both attack the same problem, and we investigated it as well. In active learning, QBC (Query By Committee) is the standard method, but methods based on expected loss have recently come into use. We applied such a method to the word sense disambiguation problem and compared it with the QBC method. For this research, we wrote a research note.
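The following is a minimal, illustrative sketch of the semi-supervised combination of the EM algorithm and Naive Bayes described above: the E-step computes soft sense-membership degrees for the unlabeled examples, and the M-step re-estimates the Naive Bayes parameters from the labeled data plus those soft counts. It is not the project's implementation; the count-vector features, the Laplace smoothing constant alpha, and the fixed iteration count n_iter are assumptions for illustration (the project itself proposes estimating the optimal number of EM iterations rather than fixing it).

```python
import numpy as np

def train_em_naive_bayes(X_lab, y_lab, X_unl, n_senses, n_iter=10, alpha=1.0):
    """X_lab, X_unl: (examples x features) count matrices; y_lab: integer sense ids."""
    # Hard (one-hot) memberships for labeled data, uniform soft memberships for unlabeled data.
    resp_lab = np.zeros((X_lab.shape[0], n_senses))
    resp_lab[np.arange(X_lab.shape[0]), y_lab] = 1.0
    resp_unl = np.full((X_unl.shape[0], n_senses), 1.0 / n_senses)
    X_all = np.vstack([X_lab, X_unl])
    for _ in range(n_iter):
        resp = np.vstack([resp_lab, resp_unl])
        # M-step: sense priors and per-sense feature probabilities (Laplace smoothing).
        prior = (resp.sum(axis=0) + alpha) / (resp.sum() + alpha * n_senses)
        feat_counts = resp.T @ X_all + alpha                  # senses x features
        feat_prob = feat_counts / feat_counts.sum(axis=1, keepdims=True)
        # E-step: recompute soft sense memberships for the unlabeled examples only.
        log_post = np.log(prior) + X_unl @ np.log(feat_prob).T
        log_post -= log_post.max(axis=1, keepdims=True)       # numerical stability
        post = np.exp(log_post)
        resp_unl = post / post.sum(axis=1, keepdims=True)
    return prior, feat_prob

def predict_sense(prior, feat_prob, X):
    # Naive Bayes decision rule on the learned parameters.
    return np.argmax(np.log(prior) + X @ np.log(feat_prob).T, axis=1)
```

Stopping EM after an estimated number of iterations, rather than running it to convergence, is the project's response to the observation that unlabeled data does not always help; the fixed n_iter above merely stands in for that estimate.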
Research Products (12 results)