2010 Fiscal Year Annual Research Report

Japanese Semantic Analysis Using a Representative Corpus (代表性のあるコーパスを利用した日本語意味解析)

Planned Research

Project Area	Compilation of a balanced corpus of written Japanese: Infrastructure for the coming Japanese linguistics
Project/Area Number	18061003
Research Institution	Tokyo Institute of Technology
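Principal Investigator	奥村学, Tokyo Institute of Technology, Precision and Intelligence Laboratory, Professor (60214079)
Co-Investigator(Kenkyū-buntansha)	白井清昭, Japan Advanced Institute of Science and Technology, School of Information Science, Associate Professor (30302970); 新納浩幸, Ibaraki University, College of Engineering, Associate Professor (10250987); 高村大也, Tokyo Institute of Technology, Precision and Intelligence Laboratory, Associate Professor (80361773); 竹内孔一, Okayama University, Graduate School of Natural Science and Technology, Lecturer (80311174); 佐々木稔, Ibaraki University, College of Engineering, Lecturer (60344834)
Keywords	Sense-tagged corpus / New word sense discovery / Machine learning / Lexical conceptual structure / Clustering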
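Research Abstract	For domain adaptation in word sense disambiguation (WSD), we showed that the most effective domain adaptation method differs for each combination of source and target data, depending on the properties of the two datasets, and we developed a method that automatically selects an effective domain adaptation method for each combination. The selection is made by decision tree learning over information about the properties of the source and target data. Using the automatically selected method was confirmed to improve WSD performance significantly. For discovering new word senses from a corpus, we pursued two lines of work. First, we devised a method that uses several feature vectors simultaneously when clustering usage examples, so that sense similarity is measured from multiple viewpoints. Second, we revised the algorithm that decides whether a cluster of examples represents a new sense so that it takes inter-cluster similarity into account. For new-sense detection based on outlier detection, outlier detection is normally an unsupervised framework; we instead assumed that a small number of sense-tagged examples is available and extended existing methods accordingly. Experiments on SemEval-2 data demonstrated the effectiveness of the proposed method. Because new-sense detection depends on the distance between a sense cluster and an example rather than on the distance to a decision hyperplane, we also studied distance metric learning and proposed using the large margin nearest neighbor method. For word sense identification, this achieved accuracy comparable to an SVM. Regarding the design of the verb argument-structure dictionary, we compared it with other language resources (such as FrameNet) and presented its value at an international conference. For acquiring verb synonyms from large-scale text data by clustering, we showed that the proposed method is more effective than weighted kernel k-means.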
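
The abstract notes that new-sense detection depends on the distance between an example and the known sense clusters. The following is only a minimal sketch of that general idea, not the method used in the project: it assumes plain Euclidean distance to cluster centroids (no learned metric), and the function names, toy vectors, and threshold are all hypothetical.

```python
import math

def centroid(vectors):
    """Mean vector of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_sense(example, sense_examples):
    """Return (sense, distance) for the closest known-sense centroid."""
    best = None
    for sense, vectors in sense_examples.items():
        d = euclidean(example, centroid(vectors))
        if best is None or d < best[1]:
            best = (sense, d)
    return best

def is_new_sense_candidate(example, sense_examples, threshold):
    """Flag an example whose nearest known-sense centroid is farther
    away than `threshold` (an assumed, hand-picked value)."""
    _, d = nearest_sense(example, sense_examples)
    return d > threshold

# Toy sense-tagged examples (hypothetical 2-D feature vectors).
labeled = {
    "sense_a": [[0.0, 0.0], [0.0, 2.0]],
    "sense_b": [[10.0, 10.0], [10.0, 12.0]],
}

if __name__ == "__main__":
    print(nearest_sense([1.0, 1.0], labeled))
    print(is_new_sense_candidate([4.0, 4.0], labeled, threshold=3.0))
```

In this sketch the small set of sense-tagged examples plays the role of the supervision mentioned in the abstract; a learned metric would replace `euclidean` rather than change the overall structure.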

Research Products
(9 results)

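[Presentation] Discovering New Word Senses by Supervised Outlier Detection (教師付き外れ値検出による新語義の発見) (2011)
- Author(s)
  新納浩幸, 佐々木稔
- Organizer
  The 17th Annual Meeting of the Association for Natural Language Processing
- Place of Presentation
  Toyohashi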
- Year and Date
  2011-03-10
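[Presentation] Performance Analysis of Word Sense Identification Based on Distance Metric Learning (距離学習に基づく語義識別の性能分析) (2011)
- Author(s)
  佐々木稔, 新納浩幸
- Organizer
  The 17th Annual Meeting of the Association for Natural Language Processing
- Place of Presentation
  Toyohashi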
- Year and Date
  2011-03-09
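[Presentation] Word Sense Identification Based on Inter-Example Similarity Defined from Multiple Viewpoints (複数の観点から定義された用例間類似度に基づく語義識別) (2011)
- Author(s)
  中西隆一郎, 白井清昭, 中村誠
- Organizer
  The 17th Annual Meeting of the Association for Natural Language Processing
- Place of Presentation
  Toyohashi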
- Year and Date
  2011-03-09
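[Presentation] Domain Adaptation for Word Sense Disambiguation by a Committee Approach Using Classifier Confidence (分類器の確信度を用いた合議制による語義曖昧性解消の領域適応) (2011)
- Author(s)
  古宮嘉那子, 奥村学
- Organizer
  The 17th Annual Meeting of the Association for Natural Language Processing
- Place of Presentation
  Toyohashi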
- Year and Date
  2011-03-09
[Presentation] Document Clustering Using Semantic Relationship Between Target Documents And Related Documents (2010)
- Author(s)
  Minoru Sasaki, Hiroyuki Shinnou
- Organizer
  The Fourth International Conference on Advances in Semantic Processing
- Place of Presentation
  Florence, Italy
- Year and Date
  2010-10-27
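[Presentation] Acquisition of Verb Synonyms by Graph-Based Clustering (グラフに基づくクラスタリングによる動詞類義語の獲得) (2010)
- Author(s)
  竹内孔一, 高橋秀幸, 小林大介
- Organizer
  Technical Committee on Natural Language Understanding and Models of Communication (言語理解とコミュニケーション研究会)
- Place of Presentation
  Kikai Shinko Kaikan (機械振興会館)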
- Year and Date
  2010-10-23
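[Presentation] Automatic Selection of Domain Adaptation Methods for Word Sense Disambiguation (語義曖昧性解消のための領域適応手法の自動選択) (2010)
- Author(s)
  古宮嘉那子, 奥村学
- Organizer
  IPSJ Special Interest Group on Natural Language Processing (情報処理学会自然言語処理研究会)
- Place of Presentation
  National Institute of Informatics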
- Year and Date
  2010-09-16
[Presentation] A Thesaurus of Predicate-Argument Structure for Japanese Verbs to Deal with Granularity of Verb Meanings (2010)
- Author(s)
  Koichi Takeuchi, Kentaro Inui, Nao Takeuchi, Atsushi Fujita
- Organizer
  The 8th Workshop on Asian Language Resources
- Place of Presentation
  Beijing
- Year and Date
  2010-08-21
[Presentation] Detection of Peculiar Examples using LOF and One Class SVM (2010)
- Author(s)
  Hiroyuki Shinnou, Minoru Sasaki
- Organizer
  LREC-2010
- Place of Presentation
  Malta
- Year and Date
  2010-05-21
