Automatic Acquisition of Linguistically Valid Grammars Based on Lexicalized Grammar Theory

Research Project

Project/Area Number	15700120
Research Category	Grant-in-Aid for Young Scientists (B)
Allocation Type	Single-year Grants
Research Field	Intelligent informatics
Research Institution	The University of Tokyo
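Principal Investigator	Yusuke Miyao, The University of Tokyo, Graduate School, Interfaculty Initiative in Information Studies, Research Associate (00343096)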
Project Period (FY)	2003 – 2005
Project Status	Completed (Fiscal Year 2005)
Budget Amount	¥3,300,000 (Direct Cost: ¥3,300,000); Fiscal Year 2005: ¥700,000 (Direct Cost: ¥700,000); Fiscal Year 2004: ¥1,200,000 (Direct Cost: ¥1,200,000); Fiscal Year 2003: ¥1,400,000 (Direct Cost: ¥1,400,000)
Keywords	grammar development / lexicalized grammar / HPSG / parsing / predicate-argument structure / Proposition Bank / grammar acquisition / corpus annotation
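Research Abstract	We improved Enju, the English parser developed in this project, analyzed it in detail, and presented the results. In particular, we improved and analyzed the probabilistic disambiguation model and presented the findings at international conferences. We showed experimentally that the presence or absence of commas, phrase length, and part-of-speech information are effective for disambiguation; that high accuracy can be achieved with a relatively small amount of training data for the probabilistic model; and that parsing accuracy changes little even for long sentences. These results should serve as guidelines for further accuracy improvements. We also showed that Enju's grammar and probabilistic model can be applied not only to parsing but also to sentence generation, where they likewise achieve high accuracy. We further continued research on applications of Enju. For information extraction from biology papers, last year we showed that high accuracy can be achieved by automatically acquiring pattern rules over Enju's output (predicate-argument structures); this year we additionally studied combining these rules with the machine learning algorithm SVM to improve accuracy further. By using predicate-argument structure patterns as machine learning features, Enju's output serves as the input to machine learning, and we showed that this achieves higher accuracy than machine learning alone or pattern rules alone. We also parsed all of the roughly 15 million abstracts in MEDLINE, a large-scale database of biology papers, with Enju and developed a literature search system that uses the parsing results. This is the first attempt to parse text on such a large scale. Furthermore, we showed that this literature search system returns results with far higher precision than conventional keyword search, demonstrating the usefulness of parsing in a practical application.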

Report

(3 results)

Research Products

(12 results)

[Journal Article] Probabilistic disambiguation models for wide-coverage HPSG parsing (2005)
- Author(s)
  Yusuke Miyao
- Journal Title
  Proceedings of ACL 2005
  Pages: 83-90
- Related Report
  2005 Annual Research Report
[Journal Article] Probabilistic CFG with Latent Annotations (2005)
- Author(s)
  Takuya Matsuzaki
- Journal Title
  Proceedings of ACL 2005
  Pages: 75-82
- Related Report
  2005 Annual Research Report
[Journal Article] Adapting a probabilistic disambiguation model of an HPSG parser to a new domain (2005)
- Author(s)
  Tadayoshi Hara
- Journal Title
  Proceedings of IJCNLP 2005
  Pages: 199-210
- Related Report
  2005 Annual Research Report
[Journal Article] Probabilistic models for disambiguation of an HPSG-based chart generator (2005)
- Author(s)
  Hiroko Nakanishi
- Journal Title
  Proceedings of IWPT 2005
  Pages: 93-102
- Related Report
  2005 Annual Research Report
[Journal Article] Efficacy of Beam Thresholding, Unification Filtering and Hybrid Parsing in Probabilistic HPSG Parsing (2005)
- Author(s)
  Takashi Ninomiya
- Journal Title
  Proceedings of IWPT 2005
  Pages: 103-114
- Related Report
  2005 Annual Research Report
[Journal Article] Biomedical Information Extraction with Predicate-Argument Structure Patterns (2005)
- Author(s)
  Akane Yakushiji
- Journal Title
  Proceedings of the First International Symposium on Semantic Mining in Biomedicine
  Pages: 60-69
- Related Report
  2005 Annual Research Report
[Journal Article] Deep Linguistic Analysis for the Accurate Identification of Predicate-Argument Relations (2004)
- Author(s)
  Yusuke Miyao, Jun'ichi Tsujii
- Journal Title
  Proceedings of COLING 2004
  Pages: 1392-1397
- Related Report
  2004 Annual Research Report
[Journal Article] An Empirical Investigation of the Effect of Lexical Rules on Parsing with a Treebank Grammar (2004)
- Author(s)
  Hiroko Nakanishi, Yusuke Miyao, Jun'ichi Tsujii
- Journal Title
  Proceedings of the Third Workshop on Treebanks and Linguistic Theories (TLT 2004)
  Pages: 103-114
- Related Report
  2004 Annual Research Report
[Journal Article] Finding Anchor Verbs for Biomedical IE Using Predicate-Argument Structures (2004)
- Author(s)
  Akane Yakushiji, Yuka Tateisi, Yusuke Miyao, Jun'ichi Tsujii
- Journal Title
  Companion Volume to the Proceedings of the 42nd ACL
  Pages: 157-160
- Related Report
  2004 Annual Research Report
[Publications] Yusuke Miyao, Jun'ichi Tsujii: "A model of syntactic disambiguation based on lexicalized grammars" Proceedings of the 7th Conference on Natural Language Learning. 1-8 (2003)
- Related Report
  2003 Annual Research Report
[Publications] Yusuke Miyao, Takashi Ninomiya, Jun'ichi Tsujii: "Probabilistic modeling of argument structures including non-local dependencies" Proceedings of the Conference on Recent Advances in Natural Language Processing. 285-291 (2003)
- Related Report
  2003 Annual Research Report
[Publications] Yusuke Miyao, Takashi Ninomiya, Jun'ichi Tsujii: "Corpus-oriented grammar development for acquiring a Head-driven Phrase Structure Grammar from the Penn Treebank" Proceedings of the International Joint Conference on Natural Language Processing. (To appear). (2004)
- Related Report
  2003 Annual Research Report
