Stochastic language modeling using high linguistic information

Research Project

Project/Area Number	20680008
Research Category	Grant-in-Aid for Young Scientists (A)
Allocation Type	Single-year Grants
Research Field	Intelligent informatics
Research Institution	Kyoto University
Principal Investigator	MORI Shinsuke, Kyoto University, Academic Center for Computing and Media Studies, Associate Professor (90456773)
Project Period (FY)	2008 – 2010
Project Status	Completed (Fiscal Year 2010)
Budget Amount	¥15,990,000 (Direct Cost: ¥12,300,000, Indirect Cost: ¥3,690,000); Fiscal Year 2010: ¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000); Fiscal Year 2009: ¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000); Fiscal Year 2008: ¥7,930,000 (Direct Cost: ¥6,100,000, Indirect Cost: ¥1,830,000)
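Keywords	dependency / anaphora and ellipsis / stochastic language model / cognitive science / speech recognition / pointwise prediction / partial annotation / active learning / word segmentation / dependency parsing / kana-kanji conversion / stochastic word segmentation / stochastic tagging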
Research Abstract	First, we proposed a pointwise prediction method and used it to improve word segmentation. We then built a corpus of dictionary example sentences and newspaper articles annotated with dependency information. We also proposed stochastic annotation, together with a method for building language models from a stochastically segmented or tagged corpus.

Report

(4 results)

2010 Annual Research Report, Final Research Report (PDF)
2009 Annual Research Report
2008 Annual Research Report

Research Products
(42 results)

All Journal Article (12 results) (of which Peer Reviewed: 12 results) Presentation (25 results) Book (2 results) Remarks (3 results)

[Journal Article] Language Model Construction from a Stochastically Tagged Corpus (2011)
- Author(s)
  Shinsuke Mori, Tetsuro Sasada, Graham Neubig
- Journal Title
  Journal of Natural Language Processing, Vol.18, No.2 (to appear)
- NAID
  10029062853
- Related Report
  2010 Final Research Report
- Peer Reviewed
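[Journal Article] Improving Automatic Word Segmentation Accuracy with Three Types of Dictionaries (2011)
- Author(s)
  Shinsuke Mori, Hiroki Oda
- Journal Title
  Journal of Natural Language Processing, Vol.18, No.2 (to appear)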
- NAID
  10029062928
- Related Report
  2010 Final Research Report
- Peer Reviewed
[Journal Article] Language Model Construction from a Stochastically Tagged Corpus (2011)
- Author(s)
  Shinsuke Mori, Tetsuro Sasada, Graham Neubig
- Journal Title
  Journal of Natural Language Processing
  Volume: 18 Pages: 71-87
- NAID
  10029062853
- Related Report
  2010 Annual Research Report
- Peer Reviewed
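[Journal Article] Improving Automatic Word Segmentation Accuracy with Three Types of Dictionaries (2011)
- Author(s)
  Shinsuke Mori, Hiroki Oda
- Journal Title
  Journal of Natural Language Processing
  Pages: 139-152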
- NAID
  10029062928
- Related Report
  2010 Annual Research Report
- Peer Reviewed
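[Journal Article] Kana-Kanji Conversion Using Automatically Acquired Pronunciations and Contexts of Unknown Words (2010)
- Author(s)
  Tetsuro Sasada, Shinsuke Mori, Tatsuya Kawahara
- Journal Title
  Journal of Natural Language Processing, Vol.17, No.4
  Pages: 131-154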
- NAID
  10027016521
- Related Report
  2010 Final Research Report
- Peer Reviewed
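[Journal Article] Kana-Kanji Conversion Using Automatically Acquired Pronunciations and Contexts of Unknown Words (2010)
- Author(s)
  Tetsuro Sasada, Shinsuke Mori, Tatsuya Kawahara
- Journal Title
  Journal of Natural Language Processing
  Volume: 17 Pages: 131-154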
- NAID
  10027016521
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Journal Article] Improving Language Models with a Pseudo-Stochastically Segmented Corpus (2009)
- Author(s)
  Shinsuke Mori, Hiroki Oda
- Journal Title
  Journal of Natural Language Processing, Vol.16, No.5
  Pages: 7-22
- NAID
  10025525086
- Related Report
  2010 Final Research Report
- Peer Reviewed
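[Journal Article] Training Conditional Random Fields Using Partial Annotations for Domain Adaptation of Japanese Word Segmentation (2009)
- Author(s)
  Yuta Tsuboi, Shinsuke Mori, Hisashi Kashima, Hiroki Oda, Yuji Matsumoto
- Journal Title
  IPSJ Journal, Vol.50, No.6
  Pages: 1622-1635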
- Related Report
  2010 Final Research Report
- Peer Reviewed
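[Journal Article] Training Conditional Random Fields Using Partial Annotations for Domain Adaptation of Japanese Word Segmentation (2009)
- Author(s)
  Yuta Tsuboi, Shinsuke Mori, Hisashi Kashima, Hiroki Oda, Yuji Matsumoto
- Journal Title
  IPSJ Journal, Vol.50
  Pages: 1622-1635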
- Related Report
  2009 Annual Research Report
- Peer Reviewed
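[Journal Article] Improving Language Models with a Pseudo-Stochastically Segmented Corpus (2009)
- Author(s)
  Shinsuke Mori, Hiroki Oda
- Journal Title
  Journal of Natural Language Processing, Vol.16
  Pages: 7-21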
- NAID
  10025525086
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] Training Conditional Random Fields Using Incomplete Annotations (2008)
- Author(s)
  Yuta TSUBOI, Hisashi KASHIMA, Shinsuke MORI, Hiroki ODA, Yuji MATSUMOTO
- Journal Title
  Int'l Conf. on Computational Linguistics
  Pages: 897-904
- Related Report
  2008 Annual Research Report
- Peer Reviewed
[Journal Article] Extracting Word-Pronunciation Pairs from Comparable Set of Text and Speech (2008)
- Author(s)
  Tetsuro SASADA, Shinsuke MORI, Tatsuya KAWAHARA
- Journal Title
  InterSpeech 2008
  Pages: 1821-1824
- Related Report
  2008 Annual Research Report
- Peer Reviewed
[Presentation] A Pointwise Approach to Pronunciation Estimation for a TTS Front-end (2011)
- Author(s)
  Shinsuke Mori, Graham Neubig
- Organizer
  InterSpeech, Firenze
- Place of Presentation
  Italy (to appear)
- Year and Date
  2011-08-29
- Related Report
  2010 Final Research Report
[Presentation] An Unsupervised Model for Joint Phrase Alignment and Extraction (2011)
- Author(s)
  Graham Neubig, Taro Watanabe, Eiichiro Sumita, Shinsuke Mori, Tatsuya Kawahara
- Organizer
  ACL-HLT, Portland
- Place of Presentation
  USA (to appear)
- Year and Date
  2011-06-20
- Related Report
  2010 Final Research Report
[Presentation] Pointwise Prediction for Robust, Adaptable Japanese Morphological Analysis (2011)
- Author(s)
  Graham Neubig, Yosuke Nakata, Shinsuke Mori
- Organizer
  ACL-HLT, Portland
- Place of Presentation
  USA (to appear)
- Year and Date
  2011-06-20
- Related Report
  2010 Final Research Report
[Presentation] Pointwise Prediction for Robust, Adaptable Japanese Morphological Analysis (2011)
- Author(s)
  Graham Neubig, Yosuke Nakata, Shinsuke Mori
- Organizer
  ACL-HLT2011
- Place of Presentation
  Portland Marriott Waterfront, Portland, USA (accepted)
- Year and Date
  2011-06-20
- Related Report
  2010 Annual Research Report
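[Presentation] Improving Kana-Kanji Conversion Accuracy Using Conversion Logs (2011)
- Author(s)
  Yohei Yamaguchi, Shinsuke Mori, Tatsuya Kawahara
- Organizer
  17th Annual Meeting of the Association for Natural Language Processing
- Place of Presentation
  Toyohashi University of Technology, Toyohashi, Aichi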
- Year and Date
  2011-03-10
- Related Report
  2010 Annual Research Report
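[Presentation] Improving Part-of-Speech Estimation Accuracy by Two-Stage Pointwise and Sequence Prediction (2011)
- Author(s)
  Yosuke Nakata, Graham Neubig, Shinsuke Mori, Tatsuya Kawahara
- Organizer
  IPSJ SIG Technical Report NL200
- Place of Presentation
  Tokyo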
- Year and Date
  2011-01-28
- Related Report
  2010 Final Research Report
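[Presentation] Improving Part-of-Speech Estimation Accuracy by Two-Stage Pointwise and Sequence Prediction (2011)
- Author(s)
  Yosuke Nakata, Graham Neubig, Shinsuke Mori, Tatsuya Kawahara
- Organizer
  IPSJ SIG Technical Report
- Place of Presentation
  NHK Science & Technology Research Laboratories, Tokyo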
- Year and Date
  2011-01-28
- Related Report
  2010 Annual Research Report
[Presentation] Morphological Analysis by Pointwise Prediction (2010)
- Author(s)
  Yosuke Nakata, Graham Neubig, Shinsuke Mori, Tatsuya Kawahara
- Organizer
  IPSJ SIG Technical Report NL-198
- Place of Presentation
  Tokyo
- Year and Date
  2010-09-17
- Related Report
  2010 Final Research Report
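[Presentation] Morphological Analysis by Pointwise Prediction (2010)
- Author(s)
  Yosuke Nakata, Graham Neubig, Shinsuke Mori, Tatsuya Kawahara
- Organizer
  IPSJ SIG Technical Report
- Place of Presentation
  National Institute of Informatics, Tokyo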
- Year and Date
  2010-09-17
- Related Report
  2010 Annual Research Report
[Presentation] Language Model Construction from a Stochastically Tagged Corpus (2010)
- Author(s)
  Shinsuke Mori, Tetsuro Sasada, Graham Neubig
- Organizer
  IPSJ SIG on Natural Language Processing, NL-196/SLP81
- Place of Presentation
  Tokyo
- Year and Date
  2010-05-27
- Related Report
  2010 Final Research Report
[Presentation] Word-based Partial Annotation for Efficient Corpus Construction (2010)
- Author(s)
  Graham Neubig, Shinsuke Mori
- Organizer
  LREC, Valletta
- Place of Presentation
  Malta
- Year and Date
  2010-05-20
- Related Report
  2010 Final Research Report
[Presentation] Word-based Partial Annotation for Efficient Corpus Construction (2010)
- Author(s)
  Graham Neubig, Shinsuke Mori
- Organizer
  LREC2010
- Place of Presentation
  Mediterranean Conference Centre, Valletta, Malta
- Year and Date
  2010-05-20
- Related Report
  2010 Annual Research Report
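[Presentation] Domain Adaptation of an Automatic Word Segmenter Using Pointwise Estimation and Active Learning (2010)
- Author(s)
  Graham Neubig, Yosuke Nakata, Shinsuke Mori
- Organizer
  16th Annual Meeting of the Association for Natural Language Processing
- Place of Presentation
  Tokyo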
- Year and Date
  2010-03-11
- Related Report
  2010 Final Research Report 2009 Annual Research Report
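[Presentation] Automatic Improvement of Language Processing Accuracy by Exploiting Kana-Kanji Conversion Logs (2010)
- Author(s)
  Shinsuke Mori, Graham Neubig
- Organizer
  16th Annual Meeting of the Association for Natural Language Processing
- Place of Presentation
  Tokyo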
- Year and Date
  2010-03-09
- Related Report
  2009 Annual Research Report
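[Presentation] A Spoken Language Processing System That Exploits Linguistic Information Obtained During Use (2009)
- Author(s)
  Shinsuke Mori, Hirokuni Maeda
- Organizer
  4th Symposium of the NLP Wakate no Kai (association of young NLP researchers)
- Place of Presentation
  Kyoto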
- Year and Date
  2009-10-01
- Related Report
  2010 Final Research Report
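[Presentation] A Spoken Language Processing System That Exploits Linguistic Information Obtained During Use (2009)
- Author(s)
  Shinsuke Mori, Hirokuni Maeda
- Organizer
  4th Symposium of the NLP Wakate no Kai (association of young NLP researchers)
- Place of Presentation
  Kyoto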
- Year and Date
  2009-10-01
- Related Report
  2009 Annual Research Report
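[Presentation] Improving Automatic Word Segmentation Accuracy with Three Types of Dictionaries (2009)
- Author(s)
  Shinsuke Mori, Hiroki Oda
- Organizer
  IPSJ SIG on Natural Language Processing, NL-193
- Place of Presentation
  Kyoto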
- Year and Date
  2009-09-29
- Related Report
  2010 Final Research Report
[Presentation] A WFST-based Log-linear Framework for Speaking-style Transformation (2009)
- Author(s)
  Graham Neubig, Shinsuke MORI, Tatsuya Kawahara
- Organizer
  InterSpeech
- Place of Presentation
  Brighton, UK
- Year and Date
  2009-09-06
- Related Report
  2009 Annual Research Report
[Presentation] Automatic Word Segmentation using Three Types of Dictionaries (2009)
- Author(s)
  Shinsuke MORI, Hiroki ODA
- Organizer
  PACLING, Sapporo
- Place of Presentation
  Japan
- Year and Date
  2009-09-01
- Related Report
  2010 Final Research Report
[Presentation] Automatic Word Segmentation using Three Types of Dictionaries (2009)
- Author(s)
  Shinsuke MORI, Hiroki ODA
- Organizer
  PACLING
- Place of Presentation
  Sapporo
- Year and Date
  2009-09-01
- Related Report
  2009 Annual Research Report
[Presentation] Extracting Word-Pronunciation Pairs from Comparable Set of Text and Speech (2008)
- Author(s)
  Tetsuro SASADA, Shinsuke MORI, Tatsuya KAWAHARA
- Organizer
  InterSpeech pp.1821-1824
- Place of Presentation
  Brisbane, Australia
- Year and Date
  2008-09-22
- Related Report
  2010 Final Research Report
[Presentation] Training Conditional Random Fields Using Incomplete Annotations (2008)
- Author(s)
  Yuta TSUBOI, Hisashi KASHIMA, Shinsuke MORI, Hiroki ODA, Yuji MATSUMOTO
- Organizer
  Coling
- Place of Presentation
  Manchester, UK
- Year and Date
  2008-08-18
- Related Report
  2010 Final Research Report
[Presentation] Language Processing for Speech Recognition: What Is Missing? (2008)
- Author(s)
  Shinsuke Mori
- Organizer
  IPSJ SIG on Spoken Language Processing, SLP-72
- Place of Presentation
  Morioka
- Year and Date
  2008-07-18
- Related Report
  2010 Final Research Report
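[Presentation] Automatic Acquisition of Words and Their Pronunciations from Text and Speech (2008)
- Author(s)
  Tetsuro Sasada, Shinsuke Mori, Tatsuya Kawahara
- Organizer
  IPSJ SIG on Spoken Language Processing, SLP-72
- Place of Presentation
  Morioka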
- Year and Date
  2008-07-18
- Related Report
  2010 Final Research Report
[Presentation] Language Processing for Speech Recognition: What Is Missing? (2008)
- Author(s)
  Shinsuke Mori
- Organizer
  IPSJ SIG on Spoken Language Processing
- Place of Presentation
  Morioka, Iwate
- Year and Date
  2008-07-18
- Related Report
  2008 Annual Research Report
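[Book] Encyclopedia of Natural Language Processing (2.1.1 N-gram Models, 2.1.3 Evaluation of Language Models, 2.9.5 Hidden Markov Models) (edited by the Association for Natural Language Processing) (2009)
- Author(s)
  Shinsuke Mori and many others
- Publisher
  Kyoritsu Shuppan Co., Ltd.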
- Related Report
  2010 Final Research Report
[Book] Encyclopedia of Natural Language Processing (2009)
- Author(s)
  Shinsuke Mori and many others
- Total Pages
  915
- Publisher
  Kyoritsu Shuppan Co., Ltd.
- Related Report
  2009 Annual Research Report
[Remarks] Website, etc.
- URL
  http://plata.ar.media.kyoto-u.ac.jp/mori/research/
- Related Report
  2010 Final Research Report
[Remarks]
- URL
  http://plata.ar.media.kyoto-u.ac.jp/mori/research/
- Related Report
  2010 Annual Research Report
[Remarks]
- URL
  http://plata.ar.media.kyoto-u.ac.jp/mori/research/
- Related Report
  2009 Annual Research Report
