2007 Fiscal Year Annual Research Report

Research on Fundamental Technologies for Semantic and Knowledge Processing for Advanced Language Understanding

Research Project

Project/Area Number	18002007
Research Institution	The University of Tokyo
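Principal Investigator	Jun'ichi Tsujii (辻井潤一) The University of Tokyo, Graduate School of Information Science and Technology, Professor (20026313)
Co-Investigator(Kenkyū-buntansha)	Kenjiro Taura (田浦健次朗) The University of Tokyo, Graduate School of Information Science and Technology, Associate Professor (90282714); Yusuke Miyao (宮尾祐介) The University of Tokyo, Graduate School of Information Science and Technology, Assistant Professor (00343096); Takuya Matsuzaki (松崎拓也) The University of Tokyo, Graduate School of Information Science and Technology, Assistant Professor (40463872)
Keywords	Language understanding / Semantic processing / Text mining / Context processing / Intelligent retrieval
Research Abstract	This research aims to build advanced language processing technology by applying, to semantic, contextual, and knowledge processing, methods that fuse the machine learning techniques and symbolic processing algorithms that proved effective in sentence parsing research. To this end, we study semantic and contextual annotation of text, automatic construction of domain ontologies, sentence analysis methods based on semantics and knowledge, and the construction of a resource-sharing distributed computing environment. In fiscal year 2007 (Heisei 19), we put the distributed computing infrastructure built in the first year to full use in language processing research and obtained the following results.
1. Annotation of semantic and contextual information: The life science event annotations, whose first version was built in the first year, were linked to standard ontologies such as GO, UMLS, and MeSH and released worldwide. In addition, as base data for context processing research, we extended the target from paper abstracts to full papers and annotated coreference relations that cross sentence boundaries.
2. Knowledge extraction from text: As a knowledge extraction task on life science papers, we extracted protein interactions and, by combining the output of the English parser under development with a machine learning method (ME), obtained a world-class extraction result (59%). This showed that deep sentence structure analysis is effective in mapping text to knowledge.
3. Analysis of global structure: Using the results of item 1, we built a coreference recognition program and a model that exploits term semantic classes and the results of deep structural analysis. This model is the basis for the context processing model that will be refined further in research from fiscal year 2008 (Heisei 20) onward.
4. Computing environment for large-scale text processing: We built a system that can complete all processing (term semantic class recognition, syntactic parsing, and relation extraction) over the MEDLINE abstract database (16 million abstracts) in a few hours. Replacing a week of processing with manual job management by a few hours of automatic processing is a major achievement of this project.
5. Preliminary machine translation experiments: We built a prototype Japanese-Chinese machine translation system that integrates the results of the semantic and knowledge processing research. In particular, experiments on automatically constructing a semantic dictionary of technical terms gave good results.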

Research Products
(37 results)

All 2008 2007 Other

All Journal Article (35 results) (of which Peer Reviewed: 34 results) Presentation (1 results) Remarks (1 results)

[Journal Article] Corpus annotation for mining biomedical events from literature.2008
- Author(s)
  Kim, J-D, T Ohta and J Tsujii
- Journal Title
  
  BMC Bioinformatics 9(1)
  
  Pages: 10
- Peer Reviewed
[Journal Article] New challenges for text mining : Mapping between text and manually curated pathways.2008
- Author(s)
  Oda, K, J-D Kim, T Ohta, D Okanohara, T Matsuzaki, Y Tateisi and J Tsujii
- Journal Title
  
  BMC Bioinformatics 9(Suppl 3)
  
  Pages: S5
- Peer Reviewed
[Journal Article] Feature Forest Models for Probabilistic HPSG Parsing.2008
- Author(s)
  Miyao, Y and J Tsujii
- Journal Title
  
  Computational Linguistics 34(1)
  
  Pages: 35-80
- Peer Reviewed
[Journal Article] Syntactic features for protein-protein interaction extraction.2008
- Author(s)
  Saetre, R, K Sagae and J Tsujii
- Journal Title
  
  Short Paper Proceedings of the 2nd International Symposium on Languages in Biology and Medicine (LBM 2007)
  
  Pages: 6.1-6.14
- Peer Reviewed
[Journal Article] BI-018 Path Text : Text Mining Tools Integrated with Biological Pathway.2008
- Author(s)
  Saetre, R, B Kemper, K Oda, N Okazaki, Y Matsuoka, N Kikuchi, H Kitano, Y Tsuruoka, S Ananiadou and J Tsujii
- Journal Title
  
  Genomes to Systems Conference 2008 Handbook
  
  Pages: 65
- Peer Reviewed
[Journal Article] Task-Oriented Evaluation of Syntactic Parsers and Their Representations.2008
- Author(s)
  Miyao Y, R Saetre, K Sagae, T Matsuzaki and J Tsujii
- Journal Title
  
  Proceedings of ACL-08 : HLT
  
  Pages: 46-54
- Peer Reviewed
[Journal Article] A Comparison of Knowledge Resource Designs : Supporting Term-level Text Annotation.2008
- Author(s)
  Tribble A, J-D Kim, T Ohta and J Tsujii
- Journal Title
  
  Proceedings of LREC-2008 Workshop : Building and evaluating resources for biotext mining.
- Peer Reviewed
[Journal Article] Building a Bilingual Lexicon Using Phrase-based Statistical Machine Translation via a Pivot Language.2008
- Author(s)
  Tsunakawa, T, N Okazaki and J Tsujii
- Journal Title
  
  Proceedings of The 22nd International Conference on Computational Linguistics Companion volume Posters and Demonstrations.
  
  Pages: 127-130
- Peer Reviewed
[Journal Article] Filling the Gaps Between Tools and Users : A Tool Comparator, Using Protein-Protein Interactions as an Example.2008
- Author(s)
  Kano, Y, N Nguyen, R Saetre, K Yoshida, Y Miyao, Y Tsuruoka, Y Matsubayashi, S Ananiadou and J Tsujii
- Journal Title
  
  Proceedings of The Pacific Symposium on Biocomputing (PSB)
  
  Pages: 616-627
- Peer Reviewed
[Journal Article] Exact Inference for Multi-label Classification using Sparse Graphical Models.2008
- Author(s)
  Miyao, Y and J Tsujii
- Journal Title
  
  Proceedings of the 22nd International Conference on Computational Linguistics Poster Session.
  
  Pages: 63-66
- Peer Reviewed
[Journal Article] Word Sense Disambiguation for All Words using Tree-Structured Conditional Random Fields.2008
- Author(s)
  Hatori, J, Y Miyao and J Tsujii
- Journal Title
  
  Proceedings of the 22nd International Conference on Computational Linguistics Poster Session.
  
  Pages: 43-46
- Peer Reviewed
[Journal Article] Towards Data And Goal Oriented Analysis : Tool Inter-Operability And Combinatorial Comparison.2008
- Author(s)
  Kano, Y, N Nguyen, R Saetre, K Yoshida, K Fukamachi, Y Miyao, Y Tsuruoka, S Ananiadou and J Tsujii
- Journal Title
  
  Proceedings of the 3rd International Joint Conference on Natural Language Processing. Hyderabad, India
  
  Pages: 859-864
- Peer Reviewed
[Journal Article] Bilingual Synonym Identification with Spelling Variations.2008
- Author(s)
  Tsunakawa, T and J Tsujii
- Journal Title
  
  Proceedings of the 3rd International Joint Conference on Natural Language Processing. Hyderabad, India
  
  Pages: 457-464
- Peer Reviewed
[Journal Article] GENIA-GR : a Grammatical Relation Corpus for Parser Evaluation in the Biomedical Domain.2008
- Author(s)
  Tateisi, Y, Y Miyao, K Sagae and J Tsujii
- Journal Title
  
  Proceedings of the 5th International Conference on Language Resources and Evaluation.
  
  Pages: 496
- Peer Reviewed
[Journal Article] Building Bilingual Lexicons Using Lexical Translation Probabilities via Pivot Languages.2008
- Author(s)
  Tsunakawa, T, N Okazaki and J Tsujii
- Journal Title
  
  Proceedings of the 5th International Conference on Language Resources and Evaluation.
- Peer Reviewed
[Journal Article] From Text to Pathway : Corpus Annotation for Knowledge Acquisition from Biomedical Literature.2008
- Author(s)
  Kim, J-D, T Ohta, K Oda, and J Tsujii
- Journal Title
  
  Proceedings of the 6th Asia Pacific Bioinformatics Conference. Series on Advances in Bioinformatics and Computational Biology 6.
  
  Pages: 165-176
- Peer Reviewed
[Journal Article] Challenges in Pronoun Resolution System for Biomedical Text.2008
- Author(s)
  Nguyen, N, J-D Kim and J Tsujii
- Journal Title
  
  Proceedings of the 6th edition of the Language Resources and Evaluation.
- Peer Reviewed
[Journal Article] Raising the Compatibility of Heterogeneous Annotations : A Case Study on Protein Mention Recognition.2008
- Author(s)
  Wang, Y, K Yoshida, J-D Kim, R Saetre and J Tsujii
- Journal Title
  
  Proceedings of the BioNLP workshop of the 46th Annual Meeting of the Association for Computational Linguistics (ACL 2008). BioNLP 2008. Current Trends in Biomedical Natural Language.
  
  Pages: 118-119
- Peer Reviewed
[Journal Article] Comparative Parser Performance Analysis across Grammar Frameworks through Automatic Tree Conversion using Synchronous Grammars.2008
- Author(s)
  Matsuzaki, T and J Tsujii
- Journal Title
  
  Proceedings of the 22nd International Conference on Computational Linguistics (COLING-2008)
  
  Pages: 545-552
- Peer Reviewed
[Journal Article] SB-012 'PAYAO' : Web community tagging system to SBML models.2008
- Author(s)
  Matsuoka, Y, N Kikuchi, R Saetre, B Kemper, N Okazaki, H Sugimura, A Hayama, S Ananiadou, J Tsujii and H Kitano
- Journal Title
  
  Genomes to Systems Conference 2008 Handbook
  
  Pages: 91
- Peer Reviewed
[Journal Article] Sharable type system design for tool inter-operability and combinatorial comparison.2008
- Author(s)
  Kano, Y, N Nguyen, R Saetre, K Fukamachi, K Yoshida, Y Miyao, Y Tsuruoka, S Ananiadou and J Tsujii
- Journal Title
  
  Proceedings of the First International Conference on Global Interoperability for Language Resources (ICGL)
  
  Pages: 122-129
- Peer Reviewed
[Journal Article] Challenges in Mapping of Syntactic Representations for Framework-Independent Parser Evaluation.2008
- Author(s)
  Sagae, K, Y Miyao, T Matsuzaki and J Tsujii
- Journal Title
  
  Proceedings of the Workshop on Automated Syntactic Annotations for Interoperable Language Resources at the First International Conference on Global Interoperability for Language Resources (ICGL'08)
- Peer Reviewed
[Journal Article] Machine Learning-Based Pronoun Resolution for Biomedical Text.2008
- Author(s)
  Nguyen, N, Miyao, J-D Kim and J Tsujii
- Journal Title
  
  Proceedings of the 14th Annual Meeting of the Association for Natural Language Processing (NLP2008)
[Journal Article] A Framework for Flexible Programming in Complex Grid Environments.2008
- Author(s)
  Ken Hironaka, Hideo Saito, Kei Takahashi, Kenjiro Taura
- Journal Title
  
  IPSJ Transactions on Advanced Computing Systems (ACS) 1(2)
  
  Pages: 157-168
- Peer Reviewed
[Journal Article] gluepy : A Simple Distributed Python Framework for Complex Grid Environments.2008
- Author(s)
  Hironaka, K, H Saito, K Takahashi and K Taura
- Journal Title
  
  Proceedings of the 21st Annual International Workshop on Languages and Compilers for Parallel Computing (LCPC 2008)
  
  Pages: 249-263
- Peer Reviewed
[Journal Article] Learning string similarity measures for gene/protein name dictionary look-up using logistic regression.2007
- Author(s)
  Tsuruoka, Y, J McNaught, J Tsujii and S Ananiadou
- Journal Title
  
  Bioinformatics 23(20)
  
  Pages: 2768-2774
- Peer Reviewed
[Journal Article] Move Prediction in Go with the Maximum Entropy Method.2007
- Author(s)
  Araki, N, K Yoshida, Y Tsuruoka and J Tsujii
- Journal Title
  
  IEEE Symposium Series on Computational Intelligence
  
  Pages: 189-195
- Peer Reviewed
[Journal Article] Framework independent summarized parser output and its documentation.2007
- Author(s)
  Tam, WL, Y Miyao and J Tsujii
- Journal Title
  
  Proceedings of Grammar Engineering across Frameworks 2007
  
  Pages: 319-331
- Peer Reviewed
[Journal Article] Towards Framework-Independent Evaluation of Deep Linguistic Parsers.2007
- Author(s)
  Miyao, Y, K Sagae and J Tsujii
- Journal Title
  
  Proceedings of Grammar Engineering across Frameworks 2007
  
  Pages: 238-258
- Peer Reviewed
[Journal Article] A log-linear model with an n-gram reference distribution for accurate HPSG parsing.2007
- Author(s)
  Ninomiya, T, T Matsuzaki, Y Miyao and J Tsujii
- Journal Title
  
  Proceedings of IWPT 2007
  
  Pages: 60-68
- Peer Reviewed
[Journal Article] Evaluating Impact of Re-training a Lexical Disambiguation Model on Domain Adaptation of an HPSG Parser.2007
- Author(s)
  Hara, T, Y Miyao and J Tsujii
- Journal Title
  
  Proceedings of IWPT 2007
  
  Pages: 11-22
- Peer Reviewed
[Journal Article] HPSG parsing with shallow dependency constraints.2007
- Author(s)
  Sagae, K, Y Miyao and J Tsujii
- Journal Title
  
  Proceedings of the 44th Meeting of the Association for Computational Linguistics
  
  Pages: 624-631
- Peer Reviewed
[Journal Article] A discriminative language model with pseudo-negative samples.2007
- Author(s)
  Okanohara, D and J Tsujii
- Journal Title
  
  Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics
  
  Pages: 73-80
- Peer Reviewed
[Journal Article] AKANE System : Protein-Protein Interaction Pairs in BioCreAtIvE2 Challenge, PPI-IPS subtask.2007
- Author(s)
  Saetre, R, K Yoshida, A Yakushiji, Y Miyao, Y Matsubayashi and T Ohta
- Journal Title
  
  Proceedings of the Second BioCreative Challenge Evaluation Workshop
  
  Pages: 209-212
- Peer Reviewed
[Journal Article] Reranking for Biomedical Named-Entity Recognition.2007
- Author(s)
  Yoshida, K and J Tsujii
- Journal Title
  
  Proceedings of the Workshop on BioNLP 2007
  
  Pages: 215-222
- Peer Reviewed
[Presentation] Deadlock-Free Routing in Wide-Area TCP Overlays.2008
- Author(s)
  Ken Hironaka, Hideo Saito, Kenjiro Taura
- Organizer
  IPSJ SIG Technical Report OS-109 (SWoPP2008)
- Place of Presentation
  Saga City
- Year and Date
  2008-08-05
[Remarks]
- URL
  http://www-tsujii.is.s.u-tokyo.ac.jp/indes-j.html
