Study on a speech understanding system with an ability to estimate the effect of various environments and users on recognition accuracy

Research Project

Project/Area Number	21500165
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Perception information processing/Intelligent robotics
Research Institution	Shizuoka University
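Principal Investigator	KAI Atsuhiko Shizuoka University, Faculty of Engineering, Associate Professor (60283496)
Co-Investigator (Kenkyū-buntansha)	KOGURE Satoru Shizuoka University, Faculty of Informatics, Lecturer (40359758); WANG Longbiao Shizuoka University, Faculty of Engineering, Assistant Professor (30510458)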
Project Period (FY)	2009 – 2011
Project Status	Completed (Fiscal Year 2011)
Budget Amount	¥4,420,000 (Direct Cost: ¥3,400,000, Indirect Cost: ¥1,020,000); Fiscal Year 2011: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000); Fiscal Year 2010: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000); Fiscal Year 2009: ¥1,690,000 (Direct Cost: ¥1,300,000, Indirect Cost: ¥390,000)
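Keywords	speech information processing / speech recognition / recognition performance prediction / speaker characteristics / intelligibility / SNR / recognition confidence / dialogue control / speaking manner and speaking style / real environments / user interface / usability / performance prediction / speech recognition performance / inter-distribution distance / speech understanding performance / speech recognition in noise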
Research Abstract	Designing a spoken dialogue interface requires appropriate handling of recognition errors, which are often caused by background noise or by users' indistinct speech and which strongly affect the usability of such an interface. This study developed methods for estimating the recognition accuracy for a user from his or her utterances, and investigated how to apply this accuracy estimate, which depends on the dialogue state, to the optimal selection of system responses to the user. Evaluation experiments showed the effectiveness of the proposed methods.

Report

(4 results)

2011 Annual Research Report, Final Research Report (PDF)
2010 Annual Research Report
2009 Annual Research Report

Research Products
(22 results)

Journal Article (8 results, of which Peer Reviewed: 2 results) / Presentation (14 results)

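[Journal Article] Factor analysis of individual differences in recognition performance focusing on differences in speaker- and utterance-specific characteristics (2012)
- Author(s)
  赤尾佳彦, 甲斐充彦, 王龍標
- Journal Title
  Proceedings of the Acoustical Society of Japan 2012 Spring Meeting
  Pages: 2-3
- Related Report
  2011 Final Research Report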
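[Journal Article] A study on constructing a POMDP model using estimated speech recognition error rates (2012)
- Author(s)
  西島祥悟, 甲斐充彦, 小暮悟, 王龍標
- Journal Title
  Technical Report of the JSAI Special Interest Group on Spoken Language Understanding and Dialogue Processing
  Pages: 13-19
- Related Report
  2011 Final Research Report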
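[Journal Article] Development of a prototype tool for HIS-POMDP training and evaluation for spoken dialogue control (2012)
- Author(s)
  野末隆史, 小暮悟, 甲斐充彦, 小西達裕, 伊東幸宏
- Journal Title
  Technical Report of the JSAI Special Interest Group on Spoken Language Understanding and Dialogue Processing
  Pages: 21-26
- Related Report
  2011 Final Research Report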
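[Journal Article] Distant-talking speaker recognition using reverberation models based on multiple artificial room impulse responses (2012)
- Author(s)
  王龍標, 岸良樹, 張兆峰, 甲斐充彦
- Journal Title
  Proceedings of the Acoustical Society of Japan 2011 Autumn Meeting
- Related Report
  2011 Final Research Report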
[Journal Article] Implementation and evaluation of a speech input interface that allows selection of word-fragment candidates (2011)
- Author(s)
  張用起, 甲斐充彦, 王龍標
- Journal Title
  IPSJ SIG Technical Report
  Volume: Vol.2011-SLP-89, No.25 Pages: 1-8
- NAID
  10031110802
- Related Report
  2011 Final Research Report
[Journal Article] Multimodal Interface with N-best Display Including Candidates of Spoken Word Fragments (2010)
- Author(s)
  Yonggee Jang, Atsuhiko Kai and Longbiao Wang
- Journal Title
  Proceedings of 2nd. APSIPA Annual Summit and Conference
  Pages: 478-481
- Related Report
  2011 Final Research Report
- Peer Reviewed
[Journal Article] Improving the usability of a multitask spoken dialogue system that handles operation of multiple in-car devices and chat (2010)
- Author(s)
  尾崎, 小暮, 甲斐, 小西, 伊東
- Journal Title
  IPSJ SIG Technical Report
  Volume: Vol.2010-SLP-80, No.6 Pages: 1-6
- NAID
  110007990680
- Related Report
  2011 Final Research Report
[Journal Article] Speech Interface for Isolated Words Based on Combination of Search Candidates from the Common Word Parts (2009)
- Author(s)
  Yonggee Jang, Atsuhiko Kai and Longbiao Wang
- Journal Title
  Proceedings of Western Pacific Acoustics Conference (WESPAC X 2009)
  Pages: 261-261
- Related Report
  2011 Final Research Report
- Peer Reviewed
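[Presentation] A study on constructing a POMDP model using estimated speech recognition error rates (2012)
- Author(s)
  西島祥悟, 甲斐充彦, 小暮悟, 王龍標
- Organizer
  JSAI Special Interest Group on Spoken Language Understanding and Dialogue Processing
- Place of Presentation
  University of Tokyo
- Year and Date
  2012-03-26
- Related Report
  2011 Final Research Report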
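[Presentation] Development of a prototype tool for HIS-POMDP training and evaluation for spoken dialogue control (2012)
- Author(s)
  野末隆史, 小暮悟, 甲斐充彦, 小西達裕, 伊東幸宏
- Organizer
  JSAI Special Interest Group on Spoken Language Understanding and Dialogue Processing
- Place of Presentation
  University of Tokyo
- Year and Date
  2012-03-26
- Related Report
  2011 Final Research Report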
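[Presentation] A study on constructing a POMDP model using estimated speech recognition error rates (2012)
- Author(s)
  西島祥悟, 甲斐充彦, 小暮悟, 王龍標
- Organizer
  JSAI Special Interest Group on Spoken Language Understanding and Dialogue Processing
- Place of Presentation
  University of Tokyo (Tokyo)
- Year and Date
  2012-03-26
- Related Report
  2011 Annual Research Report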
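[Presentation] Development of a prototype tool for HIS-POMDP training and evaluation for spoken dialogue control (2012)
- Author(s)
  野末隆史, 小暮悟, 甲斐充彦, 小西達裕, 伊東幸宏
- Organizer
  JSAI Special Interest Group on Spoken Language Understanding and Dialogue Processing
- Place of Presentation
  University of Tokyo (Tokyo)
- Year and Date
  2012-03-26
- Related Report
  2011 Annual Research Report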
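[Presentation] Factor analysis of individual differences in recognition performance focusing on differences in speaker- and utterance-specific characteristics (2012)
- Author(s)
  赤尾佳彦, 甲斐充彦, 王龍標
- Organizer
  Acoustical Society of Japan 2012 Spring Meeting
- Place of Presentation
  Kanagawa University
- Year and Date
  2012-03-15
- Related Report
  2011 Final Research Report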
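[Presentation] Distant-talking speaker recognition using reverberation models based on multiple artificial room impulse responses (2012)
- Author(s)
  王龍標, 岸良樹, 張兆峰, 甲斐充彦
- Organizer
  Acoustical Society of Japan 2011 Autumn Meeting
- Place of Presentation
  Kanagawa University
- Year and Date
  2012-03-15
- Related Report
  2011 Final Research Report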
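[Presentation] Factor analysis of individual differences in recognition performance focusing on differences in speaker- and utterance-specific characteristics (2012)
- Author(s)
  赤尾佳彦, 甲斐充彦, 王龍標
- Organizer
  Acoustical Society of Japan 2012 Spring Meeting
- Place of Presentation
  Kanagawa University (Kanagawa Prefecture)
- Year and Date
  2012-03-15
- Related Report
  2011 Annual Research Report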
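[Presentation] Distant-talking speaker recognition using reverberation models based on multiple artificial room impulse responses (2012)
- Author(s)
  王龍標, 岸良樹, 張兆峰, 甲斐充彦
- Organizer
  Acoustical Society of Japan 2011 Autumn Meeting
- Place of Presentation
  Kanagawa University (Kanagawa Prefecture)
- Year and Date
  2012-03-15
- Related Report
  2011 Annual Research Report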
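[Presentation] Implementation and evaluation of a speech input interface that allows selection of word-fragment candidates (2011)
- Author(s)
  張用起, 甲斐充彦, 王龍標
- Organizer
  IPSJ Special Interest Group on Spoken Language Processing
- Place of Presentation
  Shibaura Institute of Technology
- Year and Date
  2011-12-20
- Related Report
  2011 Final Research Report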
[Presentation] Multimodal Interface with N-best Display Including Candidates of Spoken Word Fragments (2010)
- Author(s)
  Yonggee Jang, Atsuhiko Kai and Longbiao Wang
- Organizer
  APSIPA Annual Summit and Conference
- Place of Presentation
  Biopolis, Singapore
- Year and Date
  2010-12-16
- Related Report
  2011 Final Research Report
[Presentation] Multimodal Interface with N-best Display Including Candidates of Spoken Word Fragments (2010)
- Author(s)
  Yonggee Jang, Atsuhiko Kai, Longbiao Wang
- Organizer
  2nd.APSIPA Annual Summit and Conference
- Place of Presentation
  Biopolis (Singapore)
- Year and Date
  2010-12-16
- Related Report
  2010 Annual Research Report
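[Presentation] Improving the usability of a multitask spoken dialogue system that handles operation of multiple in-car devices and chat (2010)
- Author(s)
  尾崎, 小暮, 甲斐, 小西, 伊東
- Organizer
  Special Interest Group on Spoken Language Processing
- Place of Presentation
  Suma Onsen, Hyogo Prefecture
- Year and Date
  2010-02-12
- Related Report
  2011 Final Research Report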
[Presentation] Speech Interface for Isolated Words Based on Combination of Search Candidates from the Common Word Parts (2009)
- Author(s)
  Yonggee Jang, Atsuhiko Kai and Longbiao Wang
- Organizer
  Western Pacific Acoustics Conference (WESPAC X 2009)
- Place of Presentation
  Beijing, China
- Year and Date
  2009-09-21
- Related Report
  2011 Final Research Report
[Presentation] Speech Interface for Isolated Words Based on Combination of Search Candidates from the Common Word Parts (2009)
- Author(s)
  Yonggee Jang
- Organizer
  Western Pacific Acoustics Conference (WESPAC X 2009)
- Place of Presentation
  Beijing (China)
- Year and Date
  2009-09-21
- Related Report
  2009 Annual Research Report
