2007 Fiscal Year Final Research Report Summary

Large-vocabulary continuous speech recognition on spontaneous speech task

Research Project

Project/Area Number	18500126
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Perception information processing/Intelligent robotics
Research Institution	Yamagata University
Principal Investigator	KOHDA Masaki Yamagata University, Graduate School of Science and Engineering, Professor (00205337)
Co-Investigator(Kenkyū-buntansha)	KOSAKA Tetsuo Yamagata University, Graduate School of Science and Engineering, Associate Professor (50359569) KATOH Masaharu Yamagata University, Graduate School of Science and Engineering, Research Associate (10250953)
Project Period (FY)	2006 – 2007
Keywords	Corpus of Spontaneous Japanese / Speech recognition / Acoustic model / Language model / Unsupervised adaptation / System integration / Robust speech recognition
Research Abstract	1. Large-vocabulary continuous speech recognition on spontaneous speech task
In large-vocabulary continuous speech recognition (LVCSR), we investigate several methods of unsupervised adaptation of both acoustic and language models and evaluate the methods on the Corpus of Spontaneous Japanese (CSJ). The acoustic model of the LVCSR system uses full-covariance matrices. Recognition experiments showed a decrease in word error rate (WER) from 19.17% without adaptation to 14.73% with unsupervised adaptation, and further to 14.47% when the adaptation data were weighted on the basis of part of speech. We also compared the performance of a continuous-mixture HMM (CHMM) system and a discrete-mixture HMM (DMHMM) system on the CSJ. The DMHMM system provided almost the same performance as the CHMM system, and a WER of 19.73% was obtained with 6000-state, 24-mixture DMHMMs, although it has generally been believed until now that the recognition error rates of DMHMMs are much higher than those of CHMMs.
2. Robust speech recognition using discrete-mixture HMMs
We introduce a new method of robust speech recognition under noisy conditions based on discrete-mixture HMMs (DMHMMs). DMHMMs were originally proposed to reduce calculation costs in the decoding process. Recently, we have applied DMHMMs to noisy speech recognition and found that they are effective for modeling noisy speech. Towards further improvement of noise-robust speech recognition, we propose a novel normalization method for DMHMMs based on histogram equalization (HEQ). The HEQ method can compensate for the nonlinear effects of additive noise. It is generally used for the feature space normalization of continuous-mixture HMM (CHMM) systems. In this paper, we propose both model space and feature space normalization of DMHMMs by using HEQ.
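As a rough illustration of feature-space HEQ in general (not the authors' implementation), the Python sketch below maps a one-dimensional feature value through the empirical CDF of the observed (e.g. noisy) data and then through the inverse empirical CDF of a reference (e.g. clean training) distribution. The function name `make_heq_transform`, the per-dimension scalar treatment, and the toy data are assumptions made for this example only.

```python
from bisect import bisect_right


def make_heq_transform(source_samples, reference_samples):
    """Build f(x) = C_ref^{-1}(C_src(x)) from two empirical sample sets.

    Hypothetical helper: source_samples are values of one feature dimension
    observed under the mismatched condition, reference_samples are values of
    the same dimension under the reference condition.
    """
    src = sorted(source_samples)
    ref = sorted(reference_samples)
    if not src or not ref:
        raise ValueError("both sample sets must be non-empty")

    def transform(x):
        # Number of source samples <= x, i.e. the empirical CDF times len(src).
        rank = bisect_right(src, x)
        # Matching quantile index in the reference samples (integer arithmetic
        # avoids floating-point rounding), clamped to the valid range.
        idx = rank * len(ref) // len(src) - 1
        idx = min(len(ref) - 1, max(0, idx))
        return ref[idx]

    return transform


if __name__ == "__main__":
    # Toy data: a monotone nonlinear distortion of the reference values.
    clean = [float(i) for i in range(10)]
    noisy = [c * c + 3.0 for c in clean]
    heq = make_heq_transform(noisy, clean)
    print([heq(x) for x in noisy])
```

Because the mapping is built from order statistics, it can undo any monotone distortion of the feature distribution, which is why HEQ is able to handle nonlinear effects that a simple mean or variance normalization would miss.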
In the model space normalization, codebooks of DMHMMs are modified by the transform function derived from the HEQ method. The proposed method was compared with both conventional CHMMs and DMHMMs. The results showed that the model space normalization of DMHMMs by multiple transform functions was effective for noise-robust speech recognition.
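The codebook modification described above can be pictured with a toy Python sketch. It assumes a scalar codebook for a single feature dimension and a single transform function that maps the training-data distribution onto the test-data distribution by quantile matching. The names `quantile_map` and `normalize_codebook` are hypothetical, and the report's actual procedure (including its use of multiple transform functions) is not reproduced here.

```python
from bisect import bisect_right


def quantile_map(x, src_sorted, dst_sorted):
    """Map x from the source distribution to the same quantile of the target."""
    rank = bisect_right(src_sorted, x)  # count of source samples <= x
    idx = rank * len(dst_sorted) // len(src_sorted) - 1
    return dst_sorted[min(len(dst_sorted) - 1, max(0, idx))]


def normalize_codebook(codebook, training_samples, test_samples):
    """Shift every centroid from the training domain to the test domain.

    Assumption for this sketch: the model side is adapted, so the features
    themselves are left untouched and only the centroids move.
    """
    src = sorted(training_samples)
    dst = sorted(test_samples)
    if not src or not dst:
        raise ValueError("both sample sets must be non-empty")
    return [quantile_map(c, src, dst) for c in codebook]


if __name__ == "__main__":
    training = [float(i) for i in range(10)]
    test = [0.5 * t + 2.0 for t in training]  # toy mismatch between domains
    print(normalize_codebook([1.0, 4.0, 8.0], training, test))
```

Only the codebook entries change in this sketch; the discrete output probabilities attached to each entry would stay as trained, which keeps the cost of the normalization proportional to the codebook size rather than to the amount of test speech.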

Research Products
(33 results)

Journal Article (4 results, of which Peer Reviewed: 2 results) / Presentation (28 results) / Book (1 result)

[Journal Article] Histogram equalization for noise robust speech recognition using discrete-mixture HMMs2008
- Author(s)
  T. Kosaka
- Journal Title
  
  Acoustical Science and Technology 29
  
  Pages: 66-73
- Description
  「研究成果報告書概要(和文)」より
- Peer Reviewed
[Journal Article] Histogram equalization for noise robust speech recognition using discrete-mixture HMMs2008
- Author(s)
  T. Kosaka, M. Katoh, M. Kohda
- Journal Title
  
  Acoustical Science and Technology vol.29, no.1
  
  Pages: 66-73
- Description
  「研究成果報告書概要(欧文)」より
[Journal Article] 音素モデルを用いた話者ベクトルに基づく話者識別2007
- Author(s)
  小坂哲夫
- Journal Title
  
  電子情報通信学会論文誌D J90-D
  
  Pages: 3201-3209
- Description
  「研究成果報告書概要(和文)」より
- Peer Reviewed
[Journal Article] Speaker vector-based speaker identification with phonetic modeling2007
- Author(s)
  T. Kosaka, T. Akatsu, M. Katoh, M. Kohda
- Journal Title
  
  IEICE Trans. on Information and Systems vol.J90-D, no.12
  
  Pages: 3201-3209
- Description
  「研究成果報告書概要(欧文)」より
[Presentation] 日本語話し言葉コーパスにおける話者クラス音響モデルの効果2008
- Author(s)
  武田優依
- Organizer
  音響学会、1-Q-21
- Place of Presentation
  千葉工業大学
- Year and Date
  20080300
- Description
  「研究成果報告書概要(和文)」より
[Presentation] Effectiveness of speaker-class models for the corpus of spontaneous Japanese2008
- Author(s)
  Y. Takeda, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  ASJ 2008 Spring Meeting, 1-Q-21
- Place of Presentation
  Chiba Institute of Technology
- Year and Date
  20080300
- Description
  「研究成果報告書概要(欧文)」より
[Presentation] 音素クラスHMMを使用した話者ベクトルに基づく話者識別法の検討2007
- Author(s)
  赤津達也
- Organizer
  電子情報通信学会技術研究報告、SP2007-135
- Place of Presentation
  NTTけいはんな
- Year and Date
  20071200
- Description
  「研究成果報告書概要(和文)」より
[Presentation] An investigation on the speaker vector-based speaker identification method with phonetic-class HMMs2007
- Author(s)
  T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  IEICE Technical Report SP2007-135
- Place of Presentation
  NTT CS Laboratories
- Year and Date
  20071200
- Description
  「研究成果報告書概要(欧文)」より
[Presentation] Spontaneous speech recognition using discrete-mixture HMMs2007
- Author(s)
  T. Kosaka
- Organizer
  International Congress on Acoustics 2007
- Place of Presentation
  Madrid, Spain
- Year and Date
  20070900
- Description
  「研究成果報告書概要(和文)」より
[Presentation] 繰り返し教師なし適応による講演音声認識2007
- Author(s)
  草間隆
- Organizer
  音響学会、2-3-14
- Place of Presentation
  山梨大学
- Year and Date
  20070900
- Description
  「研究成果報告書概要(和文)」より
[Presentation] Spontaneous speech recognition using discrete-mixture HMMs2007
- Author(s)
  T. Kosaka, M. Katoh, M. Kohda
- Organizer
  The 19th International Congress on Acoustics
- Place of Presentation
  Madrid, Spain
- Year and Date
  20070900
- Description
  「研究成果報告書概要(欧文)」より
[Presentation] Lecture speech recognition by iterations of unsupervised adaptation2007
- Author(s)
  T. Kusama, Y. Okuyama, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  ASJ 2007 Autumn Meeting, 2-3-14
- Place of Presentation
  University of Yamanashi
- Year and Date
  20070900
- Description
  「研究成果報告書概要(欧文)」より
[Presentation] 話者ベクトルによる雑音下話者識別の検討2007
- Author(s)
  後藤佑樹
- Organizer
  電子情報通信学会技術研究報告、SP2007-18
- Place of Presentation
  会津大学
- Year and Date
  20070600
- Description
  「研究成果報告書概要(和文)」より
[Presentation] 講演音声認識における教師なし適応の改善2007
- Author(s)
  草間隆
- Organizer
  電子情報通信学会技術研究報告、SP2007-20
- Place of Presentation
  会津大学
- Year and Date
  20070600
- Description
  「研究成果報告書概要(和文)」より
[Presentation] An investigation on speaker vector-based speaker identification under noisy conditions2007
- Author(s)
  Y. Goto, T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  IEICE Technical Report SP2007-18
- Place of Presentation
  University of Aizu
- Year and Date
  20070600
- Description
  「研究成果報告書概要(欧文)」より
[Presentation] Improvement of unsupervised adaptation in lecture speech recognition2007
- Author(s)
  T. Kusama, Y. Okuyama, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  IEICE Technical Report SP2007-20
- Place of Presentation
  University of Aizu
- Year and Date
  20070600
- Description
  「研究成果報告書概要(欧文)」より
[Presentation] 話者ベクトルを用いた話者識別における次元圧縮の効果2007
- Author(s)
  赤津達也
- Organizer
  音響学会、1-P-18
- Place of Presentation
  芝浦工業大学
- Year and Date
  20070300
- Description
  「研究成果報告書概要(和文)」より
[Presentation] An effect of reduction of dimension on the speaker identification using a speaker vector2007
- Author(s)
  T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  ASJ 2007 Spring Meeting, 1-P-18
- Place of Presentation
  Shibaura Institute of Technology
- Year and Date
  20070300
- Description
  「研究成果報告書概要(欧文)」より
[Presentation] 音素モデルを用いた話者ベクトルに基づく話者識別の検討2006
- Author(s)
  赤津達也
- Organizer
  電子情報通信学会技術研究報告、SP2006-101
- Place of Presentation
  名古屋大学
- Year and Date
  20061200
- Description
  「研究成果報告書概要(和文)」より
[Presentation] An investigation on the speaker vector-based speaker identification with phonetic modeling2006
- Author(s)
  T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  IEICE Technical Report SP2006-101
- Place of Presentation
  Nagoya University
- Year and Date
  20061200
- Description
  「研究成果報告書概要(欧文)」より
[Presentation] Noisy speech recognition based on codebook normalization of discrete-mixture HMMs2006
- Author(s)
  T. Kosaka
- Organizer
  ASA/ASJ 4th Joint Meeting
- Place of Presentation
  Hawaii, USA
- Year and Date
  20061100
- Description
  「研究成果報告書概要(和文)」より
[Presentation] Noisy speech recognition based on codebook normalization of discrete-mixture HMMs2006
- Author(s)
  T. Kosaka, M. Katoh, M. Kohda
- Organizer
  ASA/ASJ 4th Joint Meeting
- Place of Presentation
  Hawaii, USA
- Year and Date
  20061100
- Description
  「研究成果報告書概要(欧文)」より
[Presentation] コードブック適応を用いた離散混合分布型HMMによる講演音声認識2006
- Author(s)
  山本明祥
- Organizer
  音響学会、2-2-9
- Place of Presentation
  金沢大学
- Year and Date
  20060900
- Description
  「研究成果報告書概要(和文)」より
[Presentation] 話者ベクトルを用いた話者識別法における音響モデルの検討2006
- Author(s)
  赤津達也
- Organizer
  音響学会、2-P-10
- Place of Presentation
  金沢大学
- Year and Date
  20060900
- Description
  「研究成果報告書概要(和文)」より
[Presentation] 参議院会議音声の言語モデル適応2006
- Author(s)
  加藤正治
- Organizer
  音響学会、2-P-29
- Place of Presentation
  金沢大学
- Year and Date
  20060900
- Description
  「研究成果報告書概要(和文)」より
[Presentation] Lecture speech recognition by using codebook adaptation of discrete-mixture HMMs2006
- Author(s)
  A. Yamamoto, T. Kumakura, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  ASJ 2006 Autumn Meeting, 2-2-9
- Place of Presentation
  Kanazawa University
- Year and Date
  20060900
- Description
  「研究成果報告書概要(欧文)」より
[Presentation] An investigation on the acoustic model of the speaker identification using a speaker vector2006
- Author(s)
  T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  ASJ 2006 Autumn Meeting, 2-P-10
- Place of Presentation
  Kanazawa University
- Year and Date
  20060900
- Description
  「研究成果報告書概要(欧文)」より
[Presentation] Language model adaptation for conference speech transcription2006
- Author(s)
  M. Katoh, T. Kosaka, M. Kohda
- Organizer
  ASJ 2006 Autumn Meeting, 2-P-29
- Place of Presentation
  Kanazawa University
- Year and Date
  20060900
- Description
  「研究成果報告書概要(欧文)」より
[Presentation] コードブック適応を用いた離散混合分布型HMMによる講演音声認識2006
- Author(s)
  山本明祥
- Organizer
  情報処理学会研究報告、2006-SLP-62
- Place of Presentation
  鳴門温泉
- Year and Date
  20060700
- Description
  「研究成果報告書概要(和文)」より
[Presentation] Lecture speech recognition by using codebook adaptation of discrete-mixture HMMs2006
- Author(s)
  A. Yamamoto, T. Kumakura, M. Katoh, T. Kosaka, M. Kohda
- Organizer
  IPSJ SIG Technical Report 2006-SLP-62
- Place of Presentation
  Naruto Spa
- Year and Date
  20060700
- Description
  「研究成果報告書概要(欧文)」より
[Presentation] 離散混合分布HMMのヒストグラム同等化を用いたコードブック正規化2006
- Author(s)
  小坂哲夫
- Organizer
  電子情報通信学会技術研究報告、SP2006-15
- Place of Presentation
  東北大学
- Year and Date
  20060600
- Description
  「研究成果報告書概要(和文)」より
[Presentation] Codebook normalization of discrete-mixture HMMs by using histogram equalization2006
- Author(s)
  T. Kosaka, M. Katoh, M. Kohda
- Organizer
  IEICE Technical Report SP2006-15
- Place of Presentation
  Tohoku University
- Year and Date
  20060600
- Description
  「研究成果報告書概要(欧文)」より
[Book] Robust Speech Recognition and Understanding2007
- Author(s)
  T. Kosaka (contributing author)
- Total Pages
  157-174
- Publisher
  I-Tech
- Description
  「研究成果報告書概要(和文)」より
