Flexible Spoken Dialogue Understanding Based on Key-Phrase Recognition and Its Confidence Computation

Research Project

Project/Area Number	09780328
Research Category	Grant-in-Aid for Encouragement of Young Scientists (A)
Allocation Type	Single-year Grants
Research Field	Intelligent informatics
Research Institution	Kyoto University
Principal Investigator	Tatsuya Kawahara (河原達也), Kyoto University, Graduate School of Informatics, Associate Professor (00234104)
Project Period (FY)	1997 – 1998
Project Status	Completed (Fiscal Year 1998)
Budget Amount	¥2,000,000 (Direct Cost: ¥2,000,000); Fiscal Year 1998: ¥700,000 (Direct Cost: ¥700,000); Fiscal Year 1997: ¥1,300,000 (Direct Cost: ¥1,300,000)
Keywords	Speech recognition / Speech understanding / Word spotting / Utterance verification / Key-phrase
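Research Abstract	To improve the accuracy of key-phrase detection and verification, we studied how to construct a domain-independent vocabulary and language model that serves as a filler model. The filler model approximates the segments of an utterance other than keywords and key-phrases, and thereby normalizes the scores used for detection and verification. For this purpose it should be as small as possible while still providing sufficient coverage, and it should be robust to changes in topic and in keyword vocabulary. Therefore, instead of assuming a domain-dependent vocabulary and corpus, we consider models that depend on the speaking style (lecture style, information-retrieval dialogue, and so on). For example, a lecture-style model captures the characteristics of spoken language that are inherent to the lecture style, regardless of the lecture's content. This makes it possible to train the model on a large corpus of the same style. As a measure of topic (domain) independence, we define the mutual information I(T; w) between a word w and a topic set T = {t_1, ..., t_n}, and extract the set of words for which this value is small. Compared with the conventional method based on a syllable-concatenation model, this model achieved far higher utterance verification performance, and we were able to design and implement a slide projector that can be operated by voice while giving a lecture.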

Report

(2 results)

1998 Annual Research Report
1997 Annual Research Report

Research Products
(13 results)

All Publications (13 results)

[Publications] T.Kawahara: "Flexible speech understanding based on combined key-phrase detection and verification" IEEE Trans.Speech & Audio Processing. 6,6. 558-568 (1998)
- Related Report
  1998 Annual Research Report
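[Publications] T.Kawahara: "A voice-operated projector based on utterance verification and its use for automatic hypertext generation from lectures" (in Japanese) Transactions of the Information Processing Society of Japan. 40,4. (1999)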
- Related Report
  1998 Annual Research Report
[Publications] T.Kawahara: "Japanese dictation toolkit (FY1997 version)" (in Japanese) Journal of the Acoustical Society of Japan. 55,3. 175-180 (1999)
- Related Report
  1998 Annual Research Report
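[Publications] A.Lee: "A large-vocabulary continuous speech recognition parser based on A* search using grammar category-pair constraints" (in Japanese) Transactions of the Information Processing Society of Japan. 40,4. (1999)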
- Related Report
  1998 Annual Research Report
[Publications] A.Lee: "Large-vocabulary continuous speech recognition by progressive search using a word trellis index" (in Japanese) Transactions of the IEICE. J82-DII,1. 1-9 (1999)
- Related Report
  1998 Annual Research Report
[Publications] H.Masataki: "Task adaptation of N-gram language models by maximum a posteriori estimation" (in Japanese) Transactions of the IEICE. J81-DII,11. 2519-2525 (1998)
- Related Report
  1998 Annual Research Report
[Publications] S.Doshita: "Spoken dialogue between humans and machines" (in Japanese) Ohmsha, 386 (1998)
- Related Report
  1998 Annual Research Report
[Publications] T.Kawahara: "Flexible speech understanding based on combined key-phrase detection and verification" IEEE Trans.Speech & Audio Processing. Accepted for publication. (1998)
- Related Report
  1997 Annual Research Report
[Publications] T.Kawahara: "Phrase language models for detection and verification-based speech understanding" Proc.IEEE Workshop on Automatic Speech Recognition. 49-56 (1997)
- Related Report
  1997 Annual Research Report
[Publications] T.Kawahara: "Combining key-phrase detection and subword-based verification for flexible speech understanding" Proc.IEEE Int'l Conf.Acoust.,Speech & Signal Processing. 1. 1159-1162 (1997)
- Related Report
  1997 Annual Research Report
[Publications] H.Masataki: "Task adaptation using MAP estimation in n-gram language modeling" Proc.IEEE Int'l Conf.Acoust.,Speech & Signal Processing. 1. 783-786 (1997)
- Related Report
  1997 Annual Research Report
[Publications] C-H.Jo: "Japanese pronunciation training system with HMM segmentation and distinctive feature classification" Proc.Int'l Conf.on Speech Processing. 341-346 (1997)
- Related Report
  1997 Annual Research Report
[Publications] T.Kawahara: "Speaking-style dependent lexicalized filler model for key-phrase detection and verification" IEICE Technical Report. SP97-78. (1997)
- Related Report
  1997 Annual Research Report
