Understanding Intentions and Situations in Multi-Speaker Speech Communication

Research Project

Project/Area Number	16016250
Research Category	Grant-in-Aid for Scientific Research on Priority Areas
Allocation Type	Single-year Grants
Review Section	Science and Engineering
Research Institution	Kyoto University
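Principal Investigator	河原達也 (Tatsuya Kawahara), Kyoto University, Academic Center for Computing and Media Studies, Professor (00234104)
Co-Investigator(Kenkyū-buntansha)	岡田美智男 (Michio Okada), ATR, Network Informatics Laboratories, Department Head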
Project Period (FY)	2004 – 2005
Project Status	Completed (Fiscal Year 2005)
Budget Amount	¥9,300,000 (Direct Cost: ¥9,300,000); Fiscal Year 2005: ¥4,700,000 (Direct Cost: ¥4,700,000); Fiscal Year 2004: ¥4,600,000 (Direct Cost: ¥4,600,000)
Keywords	Speech information processing / Speech recognition / Intention understanding / Spoken dialogue system / Cooperative response / User model
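Research Abstract	To realize machines that coexist with humans, it is considered important for a system to perceive a model of its user and to act adaptively according to that model. In this research, we first studied user models for generating cooperative responses in a spoken dialogue interface. Specifically, we introduced three user models (skill level with the system, knowledge level of the task domain, and degree of hastiness) and proposed a strategy that controls the dialogue according to them. We implemented and evaluated this strategy in the Kyoto City Bus Information System (currently in trial operation: 075-326-3116). The results showed that cooperative responses adapted to each user provide appropriate guidance for novices without increasing the dialogue duration for skilled users. The user models are determined automatically. As features for this classification, in addition to the semantic information contained in the speech recognition results, we use features specific to spoken dialogue, such as the interval between utterances and the presence or absence of barge-in. In particular, the features used to train and classify skill level and hastiness do not depend on domain knowledge, so these user models are general-purpose and can be applied to other domains. Dialogue control, however, required hand-written rules, which made the approach difficult to apply to large-scale domains and models. We therefore next considered a framework that performs dialogue control and response generation by planning, based on models of the user and the situation, and studied how to carry out this planning (a mechanism that dynamically selects plans) by machine learning. Plans consist of two layers, domain plans and utterance plans: choosing a domain plan determines the information content to provide next, and choosing an utterance plan generates the concrete response. These plans are specified by linear evaluation functions whose parameters are the user and situation models described above. By training these functions on simulated dialogue samples collected in a role-play format, we were able to realize the selection of dialogue plans adapted to the user.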
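The two-layer plan selection described in the abstract can be illustrated with a minimal sketch. It assumes that each candidate plan is scored by a linear function of the three user-model values and that the highest-scoring plan is chosen in each layer. The plan names, the weights, and the 0-to-1 scaling of the user-model values are hypothetical placeholders, not values from the project; in the project the parameters were learned from role-play dialogue samples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserState:
    """User-model values, assumed here to be scaled to the range 0..1."""
    skill: float      # skill level with the system
    knowledge: float  # knowledge level of the task domain
    urgency: float    # degree of hastiness

    def features(self):
        # The trailing 1.0 acts as a bias term for the linear function.
        return (self.skill, self.knowledge, self.urgency, 1.0)


def score(weights, state):
    """Linear evaluation function: dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, state.features()))


def select_plan(plans, state):
    """Return the name of the plan with the highest linear score."""
    return max(plans, key=lambda name: score(plans[name], state))


# Hypothetical weights (skill, knowledge, urgency, bias); illustration only.
DOMAIN_PLANS = {
    "explain_bus_stops": (0.0, -1.0, 0.0, 0.5),
    "give_arrival_info": (0.0, 1.0, 0.0, 0.0),
}
UTTERANCE_PLANS = {
    "with_guidance": (-1.0, 0.0, -0.5, 1.0),
    "concise": (1.0, 0.0, 0.5, 0.0),
}


def plan_response(state):
    """Choose the domain plan first, then the utterance plan."""
    domain = select_plan(DOMAIN_PLANS, state)
    utterance = select_plan(UTTERANCE_PLANS, state)
    return domain, utterance
```

With these placeholder weights, `plan_response(UserState(0.1, 0.2, 0.1))` selects the explanatory domain plan with guidance for a novice, while a skilled, hurried user such as `UserState(0.9, 0.9, 0.8)` gets the concise arrival information.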

Report

(2 results)

2005 Annual Research Report
2004 Annual Research Report

Research Products
(14 results)

Journal Article (12 results) / Book (1 result) / Patent (Industrial Property Rights) (1 result)

[Journal Article] Verification of speech recognition results incorporating in-domain confidence and discourse coherence measures. 2006
- Author(s)
  I.R.Lane, T.Kawahara.
- Journal Title
  IEICE Trans. Vol.E89-D, No.3
  Pages: 931-938
- NAID
  110004719366
- Related Report
  2005 Annual Research Report
[Journal Article] Trigger-based language model adaptation for automatic transcription of panel discussions. 2006
- Author(s)
  C.Troncoso, T.Kawahara.
- Journal Title
  IEICE Trans. Vol.E89-D, No.3
  Pages: 1024-1031
- NAID
  110004719377
- Related Report
  2005 Annual Research Report
[Journal Article] Speaker model selection based on Bayesian information criterion applied to unsupervised speaker indexing. 2005
- Author(s)
  M.Nishida, T.Kawahara.
- Journal Title
  IEEE Trans. Speech & Audio Process. Vol.13, No.4
  Pages: 583-592
- NAID
  120002511373
- Related Report
  2005 Annual Research Report
[Journal Article] User modeling in spoken dialogue systems to generate flexible guidance. 2005
- Author(s)
  K.Komatani, S.Ueno, T.Kawahara, H.G.Okuno.
- Journal Title
  User Modeling and User-Adapted Interaction, Vol.15, No.1
  Pages: 169-183
- Related Report
  2005 Annual Research Report
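[Journal Article] A general-purpose statistical pronunciation variation model for spontaneous speech recognition. 2005
- Author(s)
  秋田祐哉, 河原達也.
- Journal Title
  電子情報通信学会論文誌 (IEICE Transactions, Japanese edition) Vol.J88-DII, No.9
  Pages: 1780-1789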
- NAID
  110003224132
- Related Report
  2005 Annual Research Report
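[Journal Article] Improving accuracy through the interaction of dependency analysis and sentence boundary estimation for spoken Japanese. 2005
- Author(s)
  下岡和也, 内元清貴, 河原達也, 井佐原均
- Journal Title
  自然言語処理 (Journal of Natural Language Processing) Vol.12, No.3
  Pages: 3-17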
- NAID
  10016629478
- Related Report
  2005 Annual Research Report
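[Journal Article] An efficient confirmation strategy for a software support task using spoken dialogue. 2005
- Author(s)
  翠輝久, 駒谷和範, 清田陽司, 河原達也
- Journal Title
  電子情報通信学会論文誌 (IEICE Transactions, Japanese edition) Vol.J88-DII, No.3
  Pages: 499-508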
- NAID
  110003203199
- Related Report
  2004 Annual Research Report
[Journal Article] Dialogue speech recognition by combining hierarchical topic classification and language model switching. 2005
- Author(s)
  I.R.Lane, T.Kawahara, T.Matsui, S.Nakamura
- Journal Title
  IEICE Trans. Vol.E88-D, No.3
  Pages: 446-454
- NAID
  110003214205
- Related Report
  2004 Annual Research Report
[Journal Article] A user model for adaptive response generation in spoken dialogue systems. 2004
- Author(s)
  駒谷和範, 上野晋一, 河原達也, 奥乃博
- Journal Title
  電子情報通信学会論文誌 (IEICE Transactions, Japanese edition) Vol.J87-DII, No.10
  Pages: 1921-1928
- NAID
  110003171015
- Related Report
  2004 Annual Research Report
[Journal Article] Automatic indexing of lecture presentations using unsupervised learning of presumed discourse markers. 2004
- Author(s)
  T.Kawahara, M.Hasegawa, K.Shitaoka, T.Kitade, H.Nanjo
- Journal Title
  IEEE Trans. Speech & Audio Processing Vol.12, No.4
  Pages: 409-419
- NAID
  120002511374
- Related Report
  2004 Annual Research Report
[Journal Article] Spoken dialogue systems using spontaneous spoken language. 2004
- Author(s)
  河原達也
- Journal Title
  情報処理 (IPSJ Magazine) Vol.45, No.10
  Pages: 1027-1031
- Related Report
  2004 Annual Research Report
[Journal Article] Example-based training of dialogue planning incorporating user and situation models. 2004
- Author(s)
  S.Ueno, I.R.Lane, T.Kawahara
- Journal Title
  Proc. ICSLP
  Pages: 2837-2840
- Related Report
  2004 Annual Research Report
[Book] Spoken Language Systems. 2005
- Author(s)
  Seiichi Nakagawa, Michio Okada, Tatsuya Kawahara, editors.
- Total Pages
  347
- Publisher
  Ohmsha/IOS Press
- Related Report
  2005 Annual Research Report
[Patent (Industrial Property Rights)] Speech segment detection device, computer program therefor, and recording medium. 2005
- Inventor(s)
  河原達也, 木田祐介
- Industrial Property Rights Holder
  Kyoto University
- Industrial Property Number
  2005-197804
- Filing Date
  2005-07-06
- Related Report
  2005 Annual Research Report
