
A study of natural language learning by complementary use of tagged and untagged corpus

Research Project

Project/Area Number 13680429
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Single-year Grants
Section General
Research Field Intelligent informatics
Research Institution Ibaraki University

Principal Investigator

SHINNOU Hiroyuki, Ibaraki University, College of Engineering, Associate Professor (10250987)

Project Period (FY) 2001 – 2002
Project Status Completed (Fiscal Year 2002)
Budget Amount
¥2,500,000 (Direct Cost: ¥2,500,000)
Fiscal Year 2002: ¥1,200,000 (Direct Cost: ¥1,200,000)
Fiscal Year 2001: ¥1,300,000 (Direct Cost: ¥1,300,000)
Keywords Unsupervised learning / Co-training / EM algorithm / Machine Learning / WSD / Senseval-2 / Fuzzy Clustering
Research Abstract

The inductive learning approach has been very successful in natural language processing. However, it has a serious problem: inductive learning methods need tagged training data, which is expensive to produce. The aim of this study is to use untagged corpora to overcome this issue, that is, to take an unsupervised learning approach. Most unsupervised learning methods use multiple views of the data; Co-training, proposed by Blum et al., and a method using the EM algorithm, proposed by Nigam et al., are representative. These methods were originally applied to document classification, and it was unknown whether they could be applied to word sense disambiguation (WSD), a central problem in natural language processing. Last year I studied Co-training and proposed a method that relaxes the independence condition on the two feature sets; this year I presented the method at an international conference. This year I also studied the method based on the EM algorithm and applied it to the Japanese translation task of SENSEVAL-2, showing that the method proposed by Nigam et al. can be applied to word sense disambiguation. This research was published in a journal. In that paper I also showed that the method often fails to improve the performance of the learned rules. To overcome this problem, I proposed a method that uses cross validation together with ad hoc judgments. This method achieved significant performance on the Japanese dictionary task of SENSEVAL-2; in particular, the score for nouns matched the best published score. These results were shared at a workshop, and a paper on the method has been accepted at an international conference. Finally, I studied fuzzy clustering, which is essentially similar to the EM algorithm, and used it as another kind of unsupervised learning method. This study was presented at a workshop held in March.
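
To make the EM-based approach concrete, the sketch below shows, in Python, a Nigam-style semi-supervised loop as it might be set up for a word sense disambiguation problem: a classifier is first estimated from the small tagged data set, sense probabilities are then inferred for the untagged examples, and the classifier is re-estimated from the combined, probabilistically weighted data. This is only a minimal sketch under assumed conditions (a multinomial Naive Bayes classifier over dense bag-of-words count features and a fixed iteration count); it is not the actual SENSEVAL-2 system described in the publications below.

    import numpy as np
    from sklearn.naive_bayes import MultinomialNB

    def em_semi_supervised(X_tagged, y_tagged, X_untagged, n_iter=10):
        # Sense inventory taken from the tagged data; np.unique sorts the labels,
        # which matches the column order of predict_proba (clf.classes_).
        classes = np.unique(y_tagged)

        # Initial model estimated from the small tagged corpus only.
        clf = MultinomialNB()
        clf.fit(X_tagged, y_tagged)

        for _ in range(n_iter):
            # E-step: posterior sense probabilities for every untagged example.
            post = clf.predict_proba(X_untagged)   # shape (n_untagged, n_classes)

            # M-step: retrain on the tagged data plus one copy of the untagged
            # data per sense, each copy weighted by its posterior probability.
            X_all = np.vstack([X_tagged] + [X_untagged] * len(classes))
            y_all = np.concatenate(
                [y_tagged] + [np.full(X_untagged.shape[0], c) for c in classes])
            w_all = np.concatenate(
                [np.ones(len(y_tagged))] + [post[:, i] for i in range(len(classes))])
            clf = MultinomialNB()
            clf.fit(X_all, y_all, sample_weight=w_all)

        return clf

As the abstract notes, letting this loop run for a fixed or converged number of iterations often fails to improve the learned rules, which is why the project's refinement selects the stopping iteration by cross validation on the tagged data rather than fixing n_iter in advance.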

Report

(3 results)
  • 2002 Annual Research Report
  • Final Research Report Summary
  • 2001 Annual Research Report
  • Research Products

    (23 results)


All Publications (23 results)

  • [Publications] H.Shinnou: "Learning of word sense disambiguation rules by Co-training, checking co-occurrence of features"LREC-02. 4. 1380-1384 (2002)

    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Hiroyuki Shinnou, Minoru Sasaki: "Unsupervised learning of word sense disambiguation rules by estimating an optimum iteration number in the EM algorithm"NL SIG notes of IPSJ. 152-8. 51-58 (2002)

    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Hiroyuki Shinnou, Minoru Sasaki: "Fast method of word sense disambiguation using information retrieval technique"NL SIG notes of IPSJ. 152-9. 57-62 (2002)

    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Atsushi Takahashi, Hiroyuki Shinnou: "Unsupervised learning of word sense disambiguation rules by Fuzzy clustering"Proc. of 9th Annual Meeting of the Association for NLP. 306-309 (2003)

    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Hiroyuki Shinnou: "Application of unsupervised learning using EM algorithm to Japanese Translation Task"Journal of Natural Language Processing. 10. 61-73 (2003)

    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] H.Shinnou, M.Sasaki: "Unsupervised learning of word sense disambiguation rules by estimating an optimum iteration number in the EM algorithm"Seventh Conference on Natural Language Learning. 41-48 (2003)

    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Hiroyuki Shinnou: "Learning of word sense disambiguation rules by Co-training, checking co-occurrence of features"Proc. LREC-02. 1380-1384 (2002)

    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Hiroyuki Shinnou and Minoru Sasaki: "Unsupervised learning of word sense disambiguation rules by estimating an optimum iteration number in the EM algorithm"NL SIG notes of IPSJ. NL-152-8. 51-58 (2002)

    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Hiroyuki Shinnou and Minoru Sasaki: "Fast method of word sense disambiguation using information retrieval technique"NL SIG notes of IPSJ. NL-152-9. 57-62 (2002)

    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Atsushi Takahashi and Hiroyuki Shinnou: "Unsupervised learning of word sense disambiguation rules by Fuzzy clustering"Proc. of 9th Annual Meeting of the Association for NLP. 306-309 (2003)

    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Hiroyuki Shinnou: "Application of unsupervised learning using EM alogorthm to Japanese Translation Task"Journal of Natural Language Processing. 10, No. 3. 61-73 (2003)

    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Hiroyuki Shinnou and Minoru Sasaki: "Unsupervised learning of word sense disambiguation rules by estimating an optimum iteration number in the EM algorithm"7th Conference on Natural Language Learning. 41-48 (2003)

    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      2002 Final Research Report Summary
  • [Publications] Hiroyuki Shinnou: "Learning of word sense disambiguation rules by Co-training, checking co-occurrence of features"LREC-02. 4. 1380-1384 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] Hiroyuki Shinnou, Minoru Sasaki: "Unsupervised learning of word sense disambiguation rules by estimating an optimum iteration number in the EM algorithm"NL SIG notes of IPSJ. 152-8. 51-58 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] Hiroyuki Shinnou, Minoru Sasaki: "Fast method of word sense disambiguation using information retrieval technique"NL SIG notes of IPSJ. 152-9. 57-62 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] Atsushi Takahashi, Hiroyuki Shinnou: "Unsupervised learning of word sense disambiguation rules by Fuzzy clustering"Proc. of 9th Annual Meeting of the Association for NLP. 306-309 (2003)

    • Related Report
      2002 Annual Research Report
  • [Publications] Hiroyuki Shinnou: "Application of unsupervised learning using EM algorithm to Japanese Translation Task"Journal of Natural Language Processing. 10 (to appear). (2003)

    • Related Report
      2002 Annual Research Report
  • [Publications] Hiroyuki Shinnou, Minoru Sasaki: "Unsupervised learning of word sense disambiguation rules by estimating an optimum iteration number in the EM algorithm"Seventh Conference on Natural Language Learning. (to appear). (2003)

    • Related Report
      2002 Annual Research Report
  • [Publications] Hiroyuki Shinnou: "Transforming Japanese morphological analysis into a classification problem and its solution"Transactions of IPSJ. 42-9. 2221-2228 (2001)

    • Related Report
      2001 Annual Research Report
  • [Publications] Hiroyuki Shinnou: "Japanese word segmentation by AdaBoost using decision lists as weak learners"Journal of Natural Language Processing. 8-2. 3-18 (2001)

    • Related Report
      2001 Annual Research Report
  • [Publications] Hiroyuki Shinnou: "Ibaraki: a word sense disambiguation learning system developed for the SENSEVAL-2 Japanese translation task"IEICE SIG on Natural Language Understanding and Models of Communication. NLC-. 25-30 (2001)

    • Related Report
      2001 Annual Research Report
  • [Publications] Hiroyuki Shinnou: "Learning of word sense disambiguation rules by Co-training, checking co-occurrence of features"NL SIG notes of IPSJ. 145-5. 29-36 (2001)

    • Related Report
      2001 Annual Research Report
  • [Publications] Hiroyuki Shinnou: "Detection of errors in training data by using a decision list and Adaboost"IJCAI-2001 workshop"Text Learning:Beyond Supervision". 61-65 (2001)

    • Related Report
      2001 Annual Research Report


Published: 2001-04-01   Modified: 2016-04-21  
