
A study on unsupervised learning for word sense disambiguations

Research Project

Project/Area Number 15500083
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Single-year Grants
Section General
Research Field Intelligent informatics
Research Institution IBARAKI UNIVERSITY

Principal Investigator

SHINNOU Hiroyuki  IBARAKI University, the College of Engineering, Associate Professor (10250987)

Project Period (FY) 2003 – 2004
Project Status Completed (Fiscal Year 2004)
Budget Amount
¥3,200,000 (Direct Cost: ¥3,200,000)
Fiscal Year 2004: ¥1,500,000 (Direct Cost: ¥1,500,000)
Fiscal Year 2003: ¥1,700,000 (Direct Cost: ¥1,700,000)
Keywords unsupervised learning / Fuzzy clustering / EM algorithm / Belief Network / word sense disambiguation / SENSEVAL-2 / word clustering
Research Abstract

First, we tried the EM algorithm as an unsupervised learning method. The method computes each example's degree of membership in a class by using unlabeled data. By combining it with Naive Bayes, which is an inductive learning method, a word sense disambiguation classifier is learned automatically. However, this method does not always improve on the performance of supervised learning. To avoid this unsatisfactory situation, we proposed a method for estimating the optimal number of EM iterations. On this research, we published papers in international conference proceedings and in a journal.
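The combination described above can be illustrated with a minimal sketch: a multinomial Naive Bayes model is fitted on labeled examples, and EM then alternates between computing class membership degrees for unlabeled examples and refitting the model on those soft counts. All names here (`train_nb`, `posterior`, `em_naive_bayes`) are hypothetical. The iteration count is chosen by accuracy on a small held-out labeled set, which is only an assumed stand-in; the project's own estimation method is not described in this abstract.

```python
import math
from collections import defaultdict


def train_nb(weighted_examples, classes, vocab, alpha=1.0):
    """Fit multinomial Naive Bayes from (features, {class: weight}) pairs."""
    class_w = {c: alpha for c in classes}          # smoothed class mass
    feat_w = {c: defaultdict(float) for c in classes}
    for feats, weights in weighted_examples:
        for c, w in weights.items():
            class_w[c] += w
            for f in feats:
                feat_w[c][f] += w
    total = sum(class_w.values())
    log_prior = {c: math.log(class_w[c] / total) for c in classes}
    log_like = {}
    for c in classes:
        denom = sum(feat_w[c].values()) + alpha * len(vocab)
        log_like[c] = {f: math.log((feat_w[c][f] + alpha) / denom) for f in vocab}
    return log_prior, log_like


def posterior(model, feats, classes):
    """Return P(class | features); features outside the vocabulary are ignored."""
    log_prior, log_like = model
    scores = {c: log_prior[c] + sum(log_like[c][f] for f in feats if f in log_like[c])
              for c in classes}
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}


def predict(model, feats, classes):
    post = posterior(model, feats, classes)
    return max(classes, key=lambda c: post[c])


def em_naive_bayes(labeled, unlabeled, heldout, max_iter=10):
    """Run EM and return (held-out accuracy, iteration index, model) of the best step.

    Assumption: the best iteration is picked by held-out accuracy.
    """
    classes = sorted({y for _, y in labeled})
    vocab = {f for x, _ in labeled for f in x} | {f for x in unlabeled for f in x}
    hard = [(x, {y: 1.0}) for x, y in labeled]
    model = train_nb(hard, classes, vocab)         # iteration 0: supervised only

    def accuracy(m):
        return sum(predict(m, x, classes) == y for x, y in heldout) / len(heldout)

    best = (accuracy(model), 0, model)
    for it in range(1, max_iter + 1):
        soft = [(x, posterior(model, x, classes)) for x in unlabeled]  # E-step
        model = train_nb(hard + soft, classes, vocab)                  # M-step
        acc = accuracy(model)
        if acc > best[0]:
            best = (acc, it, model)
    return best
```

Iteration 0 corresponds to plain supervised learning, so the returned iteration index also indicates whether the unlabeled data helped at all on the held-out set.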
Second, we applied Belief Networks to unsupervised learning. A Belief Network can be regarded as an extension of the Naive Bayes method. In Naive Bayes, the probabilistic variables corresponding to features are assumed to be independent of one another given the class. In a Belief Network, this assumption is slightly relaxed. We showed that unlabeled data can be used in computing the weights on the links of a Belief Network. On this research, we published a paper in the proceedings of an international conference.
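The difference between the two models can be written out as follows. This is the general textbook form, with $c$ a word sense and $f_1, \dots, f_n$ the features; the parent sets $\mathrm{pa}(f_i)$ are an illustrative notation, and the actual network structure used in the project is not given in this abstract.

```latex
% Naive Bayes: features are conditionally independent given the sense c
P(c \mid f_1, \dots, f_n) \;\propto\; P(c) \prod_{i=1}^{n} P(f_i \mid c)

% Belief Network with relaxed independence: each feature may also
% depend on a set of other features, pa(f_i)
P(c \mid f_1, \dots, f_n) \;\propto\; P(c) \prod_{i=1}^{n} P\bigl(f_i \mid c, \mathrm{pa}(f_i)\bigr)
```

When every $\mathrm{pa}(f_i)$ is empty, the second expression reduces to the first.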
Unsupervised learning and clustering are closely related. In fact, the EM algorithm can be regarded as a kind of clustering method. To perform clustering, each word has to be transformed into a feature vector. A major problem in this transformation is which base vectors to use, so we studied methods for selecting base vectors. Moreover, we applied fuzzy clustering to unsupervised learning and compared it with the EM algorithm. On this research, we published two papers in international conference proceedings.
Active learning is similar to unsupervised learning in that both attack the same problem, so we investigated active learning as well. QBC (Query By Committee) is the standard method in active learning, but a method using expected loss has come into use recently. We applied that method to the word sense disambiguation problem and compared it with QBC. On this research, we wrote a research note.
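For readers unfamiliar with QBC, the selection step can be sketched as below. A committee of classifiers votes on each unlabeled example, and the example with the most disagreement is sent for labeling. Vote entropy is used here as the disagreement measure, which is one common choice and an assumption on our part; the function names are hypothetical, and the expected-loss method studied in the project is not shown.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy of the committee's label votes; 0 when all members agree."""
    n = len(votes)
    return -sum((k / n) * math.log(k / n) for k in Counter(votes).values())


def select_query(committee, pool):
    """Return the pool item on which the committee disagrees most.

    `committee` is a list of callables mapping an example to a label.
    """
    return max(pool, key=lambda x: vote_entropy([member(x) for member in committee]))
```

In a full active learning loop, the selected example would be labeled by a human, added to the training data, and the committee retrained before the next query.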

Report

(3 results)
  • 2004 Annual Research Report   Final Research Report Summary
  • 2003 Annual Research Report
  • Research Products

    (17 results)


All Journal Article (11 results) Book (1 results) Publications (5 results)

  • [Journal Article] Semi-supervised learning by Fuzzy clustering and Ensemble learning (2004)

    • Author(s)
      H.Shinnou, M.Sasaki
    • Journal Title

      LREC-2004

      Pages: 399-402

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2004 Annual Research Report 2004 Final Research Report Summary
  • [Journal Article] Automatic extraction of the target portion of Web pages (2004)

    • Author(s)
      新納浩幸, 佐々木稔
    • Journal Title

      Information Processing Society of Japan, Special Interest Group on Natural Language Processing 163-6

      Pages: 30-40

    • NAID

      110002911725

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2004 Annual Research Report 2004 Final Research Report Summary
  • [Journal Article] Active learning of homophone disambiguation rules using decision lists and expected loss (2004)

    • Author(s)
      紺野憲一, 新納浩幸, 佐々木稔
    • Journal Title

      10th Annual Meeting of the Association for Natural Language Processing

      Pages: 757-760

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2004 Annual Research Report 2004 Final Research Report Summary
  • [Journal Article] Investigation of error causes in word sense disambiguation and on-the-fly similar-word detection (2004)

    • Author(s)
      藤井丈明, 新納浩幸, 佐々木稔
    • Journal Title

      10th Annual Meeting of the Association for Natural Language Processing

      Pages: 753-756

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2004 Annual Research Report 2004 Final Research Report Summary
  • [Journal Article] Word clustering using a search engine (2004)

    • Author(s)
      大城亜里沙, 新納浩幸, 佐々木稔
    • Journal Title

      10th Annual Meeting of the Association for Natural Language Processing

      Pages: 17-20

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2004 Annual Research Report 2004 Final Research Report Summary
  • [Journal Article] Semi-supervised learning by Fuzzy clustering and Ensemble learning (2004)

    • Author(s)
      SHINNOU Hiroyuki, SASAKI Minoru
    • Journal Title

      LREC-2004

      Pages: 399-402

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2004 Final Research Report Summary
  • [Journal Article] Information Retrieval System using Latent Contextual Relevance (2004)

    • Author(s)
      SASAKI Minoru, SHINNOU Hiroyuki
    • Journal Title

      LREC-2004

      Pages: 457-460

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2004 Final Research Report Summary
  • [Journal Article] Unsupervised learning of word sense disambiguation rules by estimating an optimum iteration number in the EM algorithm (2003)

    • Author(s)
      SHINNOU Hiroyuki, SASAKI Minoru
    • Journal Title

      The Journal of IPSJ Vol.44, No.12

      Pages: 3211-3220

    • NAID

      110002934366

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2004 Final Research Report Summary
  • [Journal Article] Learning of word sense disambiguation rules by Belief Networks (2003)

    • Author(s)
      SHINNOU Hiroyuki, ABE Shuya, SASAKI Minoru
    • Journal Title

      PACLING-03

      Pages: 245-248

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2004 Final Research Report Summary
  • [Journal Article] Automatic thesaurus construction using word clustering (2003)

    • Author(s)
      SASAKI Minoru, SHINNOU Hiroyuki
    • Journal Title

      PACLING-03

      Pages: 55-62

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2004 Final Research Report Summary
  • [Journal Article] Unsupervised learning of word sense disambiguation rules by estimating an optimum iteration number in the EM algorithm (2003)

    • Author(s)
      SHINNOU Hiroyuki, SASAKI Minoru
    • Journal Title

      CoNLL-2003

      Pages: 41-48

    • NAID

      110002934366

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2004 Final Research Report Summary
  • [Book] Fundamentals of Mathematical Statistics (2004)

    • Author(s)
      新納浩幸
    • Total Pages
      175
    • Publisher
      Morikita Publishing
    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2004 Annual Research Report 2004 Final Research Report Summary
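  • [Publications] 新納浩幸, 佐々木稔: "Unsupervised learning of word sense disambiguation rules using prediction of the optimal number of EM iterations" Information Processing Society of Japan. 44-12. 3211-3220 (2003)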

    • Related Report
      2003 Annual Research Report
  • [Publications] Hiroyuki Shinnou, Shuya Abe, Minoru Sasaki: "Learning of word sense disambiguation rules by Belief Networks" PACLING-03. 245-248 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] Minoru Sasaki, Hiroyuki Shinnou: "Automatic thesaurus construction using word clustering" PACLING-03. 55-62 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] Hiroyuki Shinnou, Minoru Sasaki: "Unsupervised learning of word sense disambiguation rules by estimating an optimum iteration number in the EM algorithm" CoNLL-2003. 41-48 (2003)

    • Related Report
      2003 Annual Research Report
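  • [Publications] 新納浩幸, 佐々木稔: "Estimation of the prior distribution of word senses by a mixture of a multinomial distribution and a uniform distribution" IEICE Technical Committee on Natural Language Understanding and Communication. NLC2003-43. 53-58 (2003)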

    • Related Report
      2003 Annual Research Report


Published: 2003-04-01   Modified: 2016-04-21  
