Large-vocabulary continuous speech recognition on spontaneous speech task

Research Project

Project/Area Number	18500126
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Perception information processing/Intelligent robotics
Research Institution	Yamagata University
Principal Investigator	KOHDA Masaki Yamagata University, Graduate School of Science and Engineering, Professor (00205337)
Co-Investigator(Kenkyū-buntansha)	KOSAKA Tetsuo Yamagata University, Graduate School of Science and Engineering, Associate Professor (50359569); KATOH Masaharu Yamagata University, Graduate School of Science and Engineering, Research Associate (10250953)
Project Period (FY)	2006 – 2007
Project Status	Completed (Fiscal Year 2007)
Budget Amount	¥1,910,000 (Direct Cost: ¥1,700,000, Indirect Cost: ¥210,000); Fiscal Year 2007: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000); Fiscal Year 2006: ¥1,000,000 (Direct Cost: ¥1,000,000)
Keywords	Corpus of Spontaneous Japanese / Speech recognition / Acoustic model / Language model / Unsupervised adaptation / System integration / Robust speech recognition / Continuous-mixture HMM / Discrete-mixture HMM
Research Abstract	1. Large-vocabulary continuous speech recognition on spontaneous speech task
In large-vocabulary continuous speech recognition (LVCSR), we investigated several methods for unsupervised adaptation of both acoustic and language models and evaluated them on the Corpus of Spontaneous Japanese (CSJ). The LVCSR system uses full-covariance matrices in its acoustic model. Recognition experiments showed that the word error rate (WER) decreased from 19.17% without adaptation to 14.73% with unsupervised adaptation, and further to 14.47% when the adaptation data were weighted on the basis of part of speech. We also compared a continuous-mixture HMM (CHMM) system with a discrete-mixture HMM (DMHMM) system on the CSJ. The DMHMM system performed almost as well as the CHMM system, reaching a WER of 19.73% with 6000-state, 24-mixture DMHMMs, although it had generally been believed until now that the recognition error rates of DMHMMs were much higher than those of CHMMs.
2. Robust speech recognition using discrete-mixture HMMs
We introduce a new method of robust speech recognition under noisy conditions based on discrete-mixture HMMs (DMHMMs). DMHMMs were originally proposed to reduce calculation costs in the decoding process. Recently, we applied DMHMMs to noisy speech recognition and found that they were effective for modeling noisy speech. To further improve noise-robust speech recognition, we propose a novel normalization method for DMHMMs based on histogram equalization (HEQ). The HEQ method can compensate for the nonlinear effects of additive noise. It is generally used for feature space normalization in CHMM systems. In this paper, we propose both model space and feature space normalization of DMHMMs by using HEQ. In the model space normalization, the codebooks of the DMHMMs are modified by the transform function derived from the HEQ method. The proposed method was compared with both conventional CHMMs and DMHMMs. The results showed that model space normalization of DMHMMs with multiple transform functions was effective for noise-robust speech recognition.
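As a rough illustration of the histogram equalization idea mentioned in the abstract, the sketch below maps one feature dimension onto a reference distribution through its empirical CDF. It is a generic feature-space HEQ sketch, not the project's implementation: the function name `histogram_equalize`, the standard normal reference, and the toy `log1p(exp(x))` distortion are all assumptions made for this example, and the model space (codebook) variant described above is not reproduced here.

```python
from statistics import NormalDist

import numpy as np


def histogram_equalize(values, reference=NormalDist(mu=0.0, sigma=1.0)):
    """Map each value to the reference quantile that matches its empirical CDF.

    Hypothetical helper for illustration; the reference distribution
    (standard normal) is an assumption, not taken from the project.
    """
    values = np.asarray(values, dtype=float)
    n = values.size
    # argsort of argsort yields the rank (0 .. n-1) of every sample.
    ranks = np.argsort(np.argsort(values, kind="stable"), kind="stable")
    # Mid-rank estimate keeps every probability strictly inside (0, 1).
    cdf = (ranks + 0.5) / n
    return np.array([reference.inv_cdf(float(p)) for p in cdf])


# Toy example: a monotonic nonlinear distortion standing in for additive noise.
clean = np.array([-1.2, 0.3, 0.9, -0.4, 2.1])
noisy = np.log1p(np.exp(clean))

# Both versions have the same rank order, so they equalize to the same values.
print(histogram_equalize(clean))
print(histogram_equalize(noisy))
```

Because the transform depends only on the rank order of the samples, any monotonic distortion of the feature is undone, which is why HEQ is described as handling nonlinear effects.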

Report

(3 results)

2007 Annual Research Report, Final Research Report Summary
2006 Annual Research Report

Research Products
(57 results)

Journal Article (19 results, of which Peer Reviewed: 3 results), Presentation (36 results), Book (2 results)

[Journal Article] Histogram equalization for noise robust speech recognition using discrete-mixture HMMs (2008)
- Author(s)
  T. Kosaka
- Journal Title
  Acoustical Science and Technology 29
  Pages: 66-73
- Description
  From the Final Research Report Summary (Japanese version)
- Related Report
  2007 Final Research Report Summary
- Peer Reviewed
[Journal Article] Histogram equalization for noise robust speech recognition using discrete-mixture HMMs (2008)
- Author(s)
  T. Kosaka, M. Katoh, M. Kohda
- Journal Title
  Acoustical Science and Technology vol.29, no.1
  Pages: 66-73
- NAID
  110006533631
- Description
  From the Final Research Report Summary (English version)
- Related Report
  2007 Final Research Report Summary
[Journal Article] Histogram equalization for noise robust speech recognition by using discrete-mixture HMMs (2008)
- Author(s)
  T. Kosaka
- Journal Title
  Acoustical Science and Technology 29
  Pages: 66-73
- NAID
  110006533631
- Related Report
  2007 Annual Research Report
- Peer Reviewed
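[Journal Article] Speaker vector-based speaker identification with phonetic modeling (2007)
- Author(s)
  小坂哲夫
- Journal Title
  IEICE Transactions on Information and Systems (Japanese Edition) J90-D
  Pages: 3201-3209
- Description
  From the Final Research Report Summary (Japanese version)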
- Related Report
  2007 Annual Research Report 2007 Final Research Report Summary
- Peer Reviewed
[Journal Article] Speaker vector-based speaker identification with phonetic modeling (2007)
- Author(s)
  T. Kosaka, T. Akatsu, M. Katoh, M. Kohda
- Journal Title
  IEICE Trans. on Information and Systems vol.J90-D, no.12
  Pages: 3201-3209
- NAID
  110007380643
- Description
  From the Final Research Report Summary (English version)
- Related Report
  2007 Final Research Report Summary
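[Journal Article] A study on improving the accuracy of automatic English pronunciation assessment using phoneme structure distance (2007)
- Author(s)
  山口涼子
- Journal Title
  IPSJ Tohoku Branch Workshop 06-6-A1-1
  Pages: 1-8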
- Related Report
  2006 Annual Research Report
[Journal Article] Performance evaluation of discrete-mixture HMMs on the Corpus of Spontaneous Japanese (2007)
- Author(s)
  山本秋祥
- Journal Title
  IPSJ Tohoku Branch Workshop 06-6-A1-2
  Pages: 1-6
- Related Report
  2006 Annual Research Report
[Journal Article] Improvement of unsupervised adaptation in spontaneous speech recognition (2007)
- Author(s)
  草間隆
- Journal Title
  IPSJ Tohoku Branch Workshop 06-6-A1-3
  Pages: 1-9
- Related Report
  2006 Annual Research Report
[Journal Article] Speaker indexing and speaker adaptation for meeting speech (2007)
- Author(s)
  齋藤徹也
- Journal Title
  IPSJ Tohoku Branch Workshop 06-6-A1-4
  Pages: 1-7
- Related Report
  2006 Annual Research Report
[Journal Article] Building a language model from the minutes of the House of Councillors (2007)
- Author(s)
  手塚収太
- Journal Title
  IPSJ Tohoku Branch Workshop 06-6-A2-1
  Pages: 1-6
- Related Report
  2006 Annual Research Report
[Journal Article] Important sentence extraction using the Corpus of Spontaneous Japanese (2007)
- Author(s)
  宇野涼子
- Journal Title
  IPSJ Tohoku Branch Workshop 06-6-A2-2
  Pages: 1-8
- Related Report
  2006 Annual Research Report
[Journal Article] An effect of reduction of dimension on the speaker identification using a speaker vector (2007)
- Author(s)
  赤津達也
- Journal Title
  Proceedings of the ASJ Spring Meeting 1-P-18
  Pages: 159-160
- Related Report
  2006 Annual Research Report
[Journal Article] Codebook normalization of discrete-mixture HMMs by using histogram equalization (2006)
- Author(s)
  小坂哲夫
- Journal Title
  IEICE Technical Report SP2006-15
  Pages: 25-30
- NAID
  110004750981
- Related Report
  2006 Annual Research Report
[Journal Article] Lecture speech recognition by using codebook adaptation of discrete-mixture HMMs (2006)
- Author(s)
  山本明祥
- Journal Title
  IPSJ SIG Technical Report 2006-SLP-62
  Pages: 25-30
- Related Report
  2006 Annual Research Report
[Journal Article] Lecture speech recognition by using codebook adaptation of discrete-mixture HMMs (2006)
- Author(s)
  山本明祥
- Journal Title
  Proceedings of the ASJ Autumn Meeting 2-2-9
- Related Report
  2006 Annual Research Report
[Journal Article] An investigation on the acoustic model of the speaker identification using a speaker vector (2006)
- Author(s)
  赤津達也
- Journal Title
  Proceedings of the ASJ Autumn Meeting 2-P-10
  Pages: 113-114
- Related Report
  2006 Annual Research Report
[Journal Article] Language model adaptation for conference speech transcription (2006)
- Author(s)
  加藤正治
- Journal Title
  Proceedings of the ASJ Autumn Meeting 2-P-29
  Pages: 151-152
- Related Report
  2006 Annual Research Report
[Journal Article] Noisy Speech Recognition Based on Codebook Normalization of Discrete-Mixture HMMs (2006)
- Author(s)
  T. Kosaka
- Journal Title
  ASA/ASJ Fourth Joint Meeting 1pSC27
  Pages: 3041-3041
- Related Report
  2006 Annual Research Report
[Journal Article] An investigation on the speaker vector-based speaker identification with phonetic modeling (2006)
- Author(s)
  赤津達也
- Journal Title
  IEICE Technical Report SP2006-101
  Pages: 95-99
- Related Report
  2006 Annual Research Report
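[Presentation] Effectiveness of speaker-class models for the corpus of spontaneous Japanese (2008)
- Author(s)
  武田優依
- Organizer
  ASJ 2008 Spring Meeting, 1-Q-21
- Place of Presentation
  Chiba Institute of Technology
- Description
  From the Final Research Report Summary (Japanese version)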
- Related Report
  2007 Annual Research Report 2007 Final Research Report Summary
[Presentation] Effectiveness of speaker-class models for the corpus of spontaneous Japanese (2008)
- Author(s)
  Y. Takeda, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  ASJ 2008 Spring Meeting, 1-Q-21
- Place of Presentation
  Chiba Institute of Technology
- Description
  From the Final Research Report Summary (English version)
- Related Report
  2007 Final Research Report Summary
[Presentation] A study on speech recognition in music environments using multi-condition models (2008)
- Author(s)
  大貫芳久
- Organizer
  IPSJ Tohoku Branch Workshop, 07-6-C-1-1
- Place of Presentation
  Yamagata University
- Related Report
  2007 Annual Research Report
[Presentation] A study on speaker verification using speaker vectors (2008)
- Author(s)
  田所直樹
- Organizer
  IPSJ Tohoku Branch Workshop, 07-6-C-1-2
- Place of Presentation
  Yamagata University
- Related Report
  2007 Annual Research Report
[Presentation] A study on speaker adaptation using histogram equalization (2008)
- Author(s)
  熊倉拓哉
- Organizer
  IPSJ Tohoku Branch Workshop, 07-6-C-1-3
- Place of Presentation
  Yamagata University
- Related Report
  2007 Annual Research Report
[Presentation] Performance evaluation of full-covariance acoustic models (2008)
- Author(s)
  伊藤貴
- Organizer
  IPSJ Tohoku Branch Workshop, 07-6-C-1-4
- Place of Presentation
  Yamagata University
- Related Report
  2007 Annual Research Report
[Presentation] A study on quinphone acoustic models (2008)
- Author(s)
  東海林拓
- Organizer
  IPSJ Tohoku Branch Workshop, 07-6-C-2-1
- Place of Presentation
  Yamagata University
- Related Report
  2007 Annual Research Report
[Presentation] PLSA-based language model adaptation for spontaneous speech recognition (2008)
- Author(s)
  加藤正治
- Organizer
  IPSJ Tohoku Branch Workshop, 07-6-C-2-2
- Place of Presentation
  Yamagata University
- Related Report
  2007 Annual Research Report
[Presentation] Adaptation of class N-gram language models based on PLSA (2008)
- Author(s)
  梅本真模
- Organizer
  IPSJ Tohoku Branch Workshop, 07-6-C-2-3
- Place of Presentation
  Yamagata University
- Related Report
  2007 Annual Research Report
[Presentation] Spontaneous speech recognition using discrete-mixture HMMs (2007)
- Author(s)
  T. Kosaka
- Organizer
  International Congress on Acoustics 2007
- Place of Presentation
  Madrid, Spain
- Description
  From the Final Research Report Summary (Japanese version)
- Related Report
  2007 Annual Research Report 2007 Final Research Report Summary
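[Presentation] An investigation on speaker vector-based speaker identification under noisy conditions (2007)
- Author(s)
  後藤佑樹
- Organizer
  IEICE Technical Report, SP2007-18
- Place of Presentation
  University of Aizu
- Description
  From the Final Research Report Summary (Japanese version)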
- Related Report
  2007 Annual Research Report 2007 Final Research Report Summary
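[Presentation] Improvement of unsupervised adaptation in lecture speech recognition (2007)
- Author(s)
  草間隆
- Organizer
  IEICE Technical Report, SP2007-20
- Place of Presentation
  University of Aizu
- Description
  From the Final Research Report Summary (Japanese version)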
- Related Report
  2007 Annual Research Report 2007 Final Research Report Summary
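[Presentation] An investigation on the speaker vector-based speaker identification method with phonetic-class HMMs (2007)
- Author(s)
  赤津達也
- Organizer
  IEICE Technical Report, SP2007-135
- Place of Presentation
  NTT Keihanna
- Description
  From the Final Research Report Summary (Japanese version)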
- Related Report
  2007 Annual Research Report 2007 Final Research Report Summary
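[Presentation] An effect of reduction of dimension on the speaker identification using a speaker vector (2007)
- Author(s)
  赤津達也
- Organizer
  ASJ 2007 Spring Meeting, 1-P-18
- Place of Presentation
  Shibaura Institute of Technology
- Description
  From the Final Research Report Summary (Japanese version)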
- Related Report
  2007 Final Research Report Summary
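[Presentation] Lecture speech recognition by iterations of unsupervised adaptation (2007)
- Author(s)
  草間隆
- Organizer
  ASJ 2007 Autumn Meeting, 2-3-14
- Place of Presentation
  University of Yamanashi
- Description
  From the Final Research Report Summary (Japanese version)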
- Related Report
  2007 Annual Research Report 2007 Final Research Report Summary
[Presentation] Spontaneous speech recognition using discrete-mixture HMMs (2007)
- Author(s)
  T. Kosaka, M. Katoh, M. Kohda
- Organizer
  The 19th International Congress on Acoustics
- Place of Presentation
  Madrid, Spain
- Description
  From the Final Research Report Summary (English version)
- Related Report
  2007 Final Research Report Summary
[Presentation] An investigation on speaker vector-based speaker identification under noisy conditions (2007)
- Author(s)
  Y. Goto, T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  IEICE Technical Report SP2007-18
- Place of Presentation
  University of Aizu
- Description
  From the Final Research Report Summary (English version)
- Related Report
  2007 Final Research Report Summary
[Presentation] Improvement of unsupervised adaptation in lecture speech recognition (2007)
- Author(s)
  T. Kusama, Y. Okuyama, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  IEICE Technical Report SP2007-20
- Place of Presentation
  University of Aizu
- Description
  From the Final Research Report Summary (English version)
- Related Report
  2007 Final Research Report Summary
[Presentation] An investigation on the speaker vector-based speaker identification method with phonetic-class HMMs (2007)
- Author(s)
  T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  IEICE Technical Report SP2007-135
- Place of Presentation
  NTT CS Laboratories
- Description
  From the Final Research Report Summary (English version)
- Related Report
  2007 Final Research Report Summary
[Presentation] An effect of reduction of dimension on the speaker identification using a speaker vector (2007)
- Author(s)
  T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  ASJ 2007 Spring Meeting, 1-P-18
- Place of Presentation
  Shibaura Institute of Technology
- Description
  From the Final Research Report Summary (English version)
- Related Report
  2007 Final Research Report Summary
[Presentation] Lecture speech recognition by iterations of unsupervised adaptation (2007)
- Author(s)
  T. Kusama, Y. Okuyama, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  ASJ 2007 Autumn Meeting, 2-3-14
- Place of Presentation
  University of Yamanashi
- Description
  From the Final Research Report Summary (English version)
- Related Report
  2007 Final Research Report Summary
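[Presentation] Performance improvement of lecture speech recognition by discriminative training (2007)
- Author(s)
  関東純平
- Organizer
  Acoustical Engineering Research Meeting, Research Institute of Electrical Communication, Tohoku University, 348-2
- Place of Presentation
  Tohoku University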
- Related Report
  2007 Annual Research Report
[Presentation] Noisy speech recognition based on codebook normalization of discrete-mixture HMMs (2006)
- Author(s)
  T. Kosaka
- Organizer
  ASA/ASJ 4th Joint Meeting
- Place of Presentation
  Hawaii, USA
- Description
  From the Final Research Report Summary (Japanese version)
- Related Report
  2007 Final Research Report Summary
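[Presentation] Codebook normalization of discrete-mixture HMMs by using histogram equalization (2006)
- Author(s)
  小坂哲夫
- Organizer
  IEICE Technical Report, SP2006-15
- Place of Presentation
  Tohoku University
- Description
  From the Final Research Report Summary (Japanese version)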
- Related Report
  2007 Final Research Report Summary
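[Presentation] Lecture speech recognition by using codebook adaptation of discrete-mixture HMMs (2006)
- Author(s)
  山本明祥
- Organizer
  IPSJ SIG Technical Report, 2006-SLP-62
- Place of Presentation
  Naruto Spa
- Description
  From the Final Research Report Summary (Japanese version)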
- Related Report
  2007 Final Research Report Summary
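[Presentation] An investigation on the speaker vector-based speaker identification with phonetic modeling (2006)
- Author(s)
  赤津達也
- Organizer
  IEICE Technical Report, SP2006-101
- Place of Presentation
  Nagoya University
- Description
  From the Final Research Report Summary (Japanese version)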
- Related Report
  2007 Final Research Report Summary
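[Presentation] Lecture speech recognition by using codebook adaptation of discrete-mixture HMMs (2006)
- Author(s)
  山本明祥
- Organizer
  ASJ 2006 Autumn Meeting, 2-2-9
- Place of Presentation
  Kanazawa University
- Description
  From the Final Research Report Summary (Japanese version)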
- Related Report
  2007 Final Research Report Summary
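[Presentation] An investigation on the acoustic model of the speaker identification using a speaker vector (2006)
- Author(s)
  赤津達也
- Organizer
  ASJ 2006 Autumn Meeting, 2-P-10
- Place of Presentation
  Kanazawa University
- Description
  From the Final Research Report Summary (Japanese version)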
- Related Report
  2007 Final Research Report Summary
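[Presentation] Language model adaptation for conference speech transcription (2006)
- Author(s)
  加藤正治
- Organizer
  ASJ 2006 Autumn Meeting, 2-P-29
- Place of Presentation
  Kanazawa University
- Description
  From the Final Research Report Summary (Japanese version)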
- Related Report
  2007 Final Research Report Summary
[Presentation] Noisy speech recognition based on codebook normalization of discrete-mixture HMMs (2006)
- Author(s)
  T. Kosaka, M. Katoh, M. Kohda
- Organizer
  ASA/ASJ 4th Joint Meeting
- Place of Presentation
  Hawaii, USA
- Description
  From the Final Research Report Summary (English version)
- Related Report
  2007 Final Research Report Summary
[Presentation] Codebook normalization of discrete-mixture HMMs by using histogram equalization (2006)
- Author(s)
  T. Kosaka, M. Katoh, M. Kohda
- Organizer
  IEICE Technical Report SP2006-15
- Place of Presentation
  Tohoku University
- Description
  From the Final Research Report Summary (English version)
- Related Report
  2007 Final Research Report Summary
[Presentation] Lecture speech recognition by using codebook adaptation of discrete-mixture HMMs (2006)
- Author(s)
  A. Yamamoto, T. Kumakura, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  IPSJ SIG Technical Report 2006-SLP-62
- Place of Presentation
  Naruto Spa
- Description
  From the Final Research Report Summary (English version)
- Related Report
  2007 Final Research Report Summary
[Presentation] An investigation on the speaker vector-based speaker identification with phonetic modeling (2006)
- Author(s)
  T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  IEICE Technical Report SP2006-101
- Place of Presentation
  Nagoya University
- Description
  From the Final Research Report Summary (English version)
- Related Report
  2007 Final Research Report Summary
[Presentation] Lecture speech recognition by using codebook adaptation of discrete-mixture HMMs (2006)
- Author(s)
  A. Yamamoto, T. Kumakura, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  ASJ 2006 Autumn Meeting, 2-2-9
- Place of Presentation
  Kanazawa University
- Description
  From the Final Research Report Summary (English version)
- Related Report
  2007 Final Research Report Summary
[Presentation] An investigation on the acoustic model of the speaker identification using a speaker vector (2006)
- Author(s)
  T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  ASJ 2006 Autumn Meeting, 2-P-10
- Place of Presentation
  Kanazawa University
- Description
  From the Final Research Report Summary (English version)
- Related Report
  2007 Final Research Report Summary
[Presentation] Language model adaptation for conference speech transcription (2006)
- Author(s)
  M. Katoh, T. Kosaka, M. Kohda
- Organizer
  ASJ 2006 Autumn Meeting, 2-P-29
- Place of Presentation
  Kanazawa University
- Description
  From the Final Research Report Summary (English version)
- Related Report
  2007 Final Research Report Summary
[Book] Robust Speech Recognition and Understanding (2007)
- Author(s)
  T. Kosaka (contributing author)
- Publisher
  I-Tech
- Description
  From the Final Research Report Summary (Japanese version)
- Related Report
  2007 Final Research Report Summary
[Book] Robust Speech Recognition and Understanding (2007)
- Author(s)
  T. Kosaka (contributing author)
- Total Pages
  18
- Publisher
  I-Tech
- Related Report
  2007 Annual Research Report
