2007 Fiscal Year Final Research Report Summary

Large-vocabulary continuous speech recognition on spontaneous speech task

Research Project

Project/Area Number 18500126
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Single-year Grants
Section General
Research Field Perception information processing/Intelligent robotics
Research Institution Yamagata University

Principal Investigator

KOHDA Masaki  Yamagata University, Graduate School of Science and Engineering, Professor (00205337)

Co-Investigator (Kenkyū-buntansha) KOSAKA Tetsuo  Yamagata University, Graduate School of Science and Engineering, Associate Professor (50359569)
KATOH Masaharu  Yamagata University, Graduate School of Science and Engineering, Research Associate (10250953)
Project Period (FY) 2006 – 2007
Keywords Corpus of Spontaneous Japanese / Speech recognition / Acoustic model / Language model / Unsupervised adaptation / System integration / Robust speech recognition
Research Abstract

1. Large-vocabulary continuous speech recognition on spontaneous speech task
For large-vocabulary continuous speech recognition (LVCSR), we investigated several methods for unsupervised adaptation of both the acoustic and language models and evaluated them on the Corpus of Spontaneous Japanese (CSJ). The acoustic model of the LVCSR system uses full-covariance matrices. In recognition experiments, the word error rate (WER) decreased from 19.17% without adaptation to 14.73% with unsupervised adaptation, and further to 14.47% when the adaptation data were weighted according to part of speech. We also compared a continuous-mixture HMM (CHMM) system with a discrete-mixture HMM (DMHMM) system on the CSJ. The DMHMM system performed almost as well as the CHMM system, reaching a WER of 19.73% with 6000-state, 24-mixture DMHMMs, although it had generally been believed that DMHMMs gave much higher recognition error rates than CHMMs.
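As a rough check on the figures in this section, the relative WER reductions implied by the reported numbers can be worked out as in the sketch below. Only the three WER values come from the report; the helper name `relative_reduction` and the constant names are hypothetical and introduced purely for illustration.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Relative error-rate reduction, expressed in percent of the baseline."""
    return 100.0 * (baseline - improved) / baseline


# WER values quoted in the abstract (percent).
WER_NO_ADAPT = 19.17    # without adaptation
WER_ADAPT = 14.73       # with unsupervised adaptation
WER_ADAPT_POS = 14.47   # adaptation data weighted by part of speech

if __name__ == "__main__":
    print(f"{relative_reduction(WER_NO_ADAPT, WER_ADAPT):.1f}%")
    print(f"{relative_reduction(WER_NO_ADAPT, WER_ADAPT_POS):.1f}%")
```

Under this definition, unsupervised adaptation corresponds to a relative WER reduction of about 23.2%, and the part-of-speech weighting to about 24.5%, both measured against the unadapted system.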
2. Robust speech recognition using discrete-mixture HMMs
We introduce a new method for robust speech recognition under noisy conditions based on discrete-mixture HMMs (DMHMMs). DMHMMs were originally proposed to reduce computational cost in the decoding process. Recently, we applied DMHMMs to noisy speech recognition and found them effective for modeling noisy speech. To further improve noise-robust speech recognition, we propose a novel normalization method for DMHMMs based on histogram equalization (HEQ). HEQ can compensate for the nonlinear effects of additive noise and is generally used for feature-space normalization in continuous-mixture HMM (CHMM) systems. In this work, we propose both model-space and feature-space normalization of DMHMMs using HEQ. In model-space normalization, the codebooks of the DMHMMs are modified by the transform function derived from the HEQ method. The proposed method was compared with both conventional CHMMs and DMHMMs. The results showed that model-space normalization of DMHMMs with multiple transform functions was effective for noise-robust speech recognition.
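The general idea of histogram equalization can be sketched as a mapping between empirical distributions: a value is sent to the point of the target distribution that has the same cumulative probability. The code below is a minimal one-dimensional illustration under stated assumptions, not the project's implementation. The function name `heq_transform`, the toy data, and the choice to map clean codebook centroids toward the noisy domain for the model-space case are all assumptions made for this example; the report does not give these details.

```python
from bisect import bisect_right
from typing import Callable, Sequence


def heq_transform(source: Sequence[float],
                  target: Sequence[float]) -> Callable[[float], float]:
    """Build a function mapping values distributed like `source`
    onto the empirical distribution of `target` (1-D quantile matching)."""
    src = sorted(source)
    tgt = sorted(target)

    def transform(x: float) -> float:
        # Number of source samples <= x, i.e. the empirical CDF times len(src).
        rank = bisect_right(src, x)
        # Index of the target sample at the same cumulative probability
        # (integer ceiling, then shifted to a 0-based index).
        idx = (rank * len(tgt) + len(src) - 1) // len(src) - 1
        # Clamp so values outside the source range map to the target extremes.
        idx = max(0, min(len(tgt) - 1, idx))
        return tgt[idx]

    return transform


if __name__ == "__main__":
    # Toy one-dimensional feature values (assumed data, for illustration only).
    clean = [0.0, 1.0, 2.0, 3.0]
    noisy = [10.0, 11.0, 12.0, 13.0]

    # Feature-space normalization: map noisy observations onto the clean distribution.
    to_clean = heq_transform(noisy, clean)
    print([to_clean(v) for v in noisy])

    # Model-space normalization (assumed direction): move codebook centroids
    # trained on clean speech toward the noisy domain instead of changing the features.
    to_noisy = heq_transform(clean, noisy)
    codebook = [0.0, 3.0]
    print([to_noisy(c) for c in codebook])
```

In a real system the transform would be estimated per feature dimension from much larger samples, and the abstract's mention of multiple transform functions suggests more than one such mapping is used; this sketch covers only a single scalar mapping.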

  • Research Products

    (33 results)

Journal Article (4 results, of which Peer Reviewed: 2 results), Presentation (28 results), Book (1 result)

  • [Journal Article] Histogram equalization for noise robust speech recognition using discrete-mixture HMMs (2008)

    • Author(s)
      T. Kosaka
    • Journal Title

      Acoustical Science and Technology 29

      Pages: 66-73

    • Description
      From the Research Report Summary (Japanese)
    • Peer Reviewed
  • [Journal Article] Histogram equalization for noise robust speech recognition using discrete-mixture HMMs (2008)

    • Author(s)
      T. Kosaka, M. Katoh, M. Kohda
    • Journal Title

      Acoustical Science and Technology vol.29, no.1

      Pages: 66-73

    • Description
      From the Research Report Summary (English)
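  • [Journal Article] Speaker identification based on speaker vectors using phoneme models (2007)

    • Author(s)
      T. Kosaka (小坂哲夫)
    • Journal Title

      IEICE Transactions on Information and Systems (電子情報通信学会論文誌D) J90-D

      Pages: 3201-3209

    • Description
      From the Research Report Summary (Japanese)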
    • Peer Reviewed
  • [Journal Article] Speaker vector-based speaker identification with phonetic modeling (2007)

    • Author(s)
      T. Kosaka, T. Akatsu, M. Katoh, M. Kohda
    • Journal Title

      IEICE Trans. on Information and Systems vol.J90-D, no.12

      Pages: 3201-3209

    • Description
      From the Research Report Summary (English)
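  • [Presentation] Effectiveness of speaker-class acoustic models for the Corpus of Spontaneous Japanese (2008)

    • Author(s)
      Y. Takeda (武田優依)
    • Organizer
      Acoustical Society of Japan (ASJ), 1-Q-21
    • Place of Presentation
      Chiba Institute of Technology
    • Year and Date
      20080300
    • Description
      From the Research Report Summary (Japanese)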
  • [Presentation] Effectiveness of speaker-class models for the corpus of spontaneous Japanese (2008)

    • Author(s)
      Y. Takeda, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      ASJ 2008 Spring Meeting, 1-Q-21
    • Place of Presentation
      Chiba Institute of Technology
    • Year and Date
      20080300
    • Description
      From the Research Report Summary (English)
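  • [Presentation] An investigation of a speaker vector-based speaker identification method using phonetic-class HMMs (2007)

    • Author(s)
      T. Akatsu (赤津達也)
    • Organizer
      IEICE Technical Report, SP2007-135
    • Place of Presentation
      NTT Keihanna
    • Year and Date
      20071200
    • Description
      From the Research Report Summary (Japanese)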
  • [Presentation] An investigation on the speaker vector-based speaker identification method with phonetic-class HMMs (2007)

    • Author(s)
      T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      IEICE Technical Report SP2007-135
    • Place of Presentation
      NTT CS Laboratories
    • Year and Date
      20071200
    • Description
      From the Research Report Summary (English)
  • [Presentation] Spontaneous speech recognition using discrete-mixture HMMs (2007)

    • Author(s)
      T. Kosaka
    • Organizer
      International Congress on Acoustics 2007
    • Place of Presentation
      Madrid, Spain
    • Year and Date
      20070900
    • Description
      From the Research Report Summary (Japanese)
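  • [Presentation] Lecture speech recognition by iterative unsupervised adaptation (2007)

    • Author(s)
      T. Kusama (草間 隆)
    • Organizer
      Acoustical Society of Japan (ASJ), 2-3-14
    • Place of Presentation
      University of Yamanashi
    • Year and Date
      20070900
    • Description
      From the Research Report Summary (Japanese)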
  • [Presentation] Spontaneous speech recognition using discrete-mixture HMMs (2007)

    • Author(s)
      T. Kosaka, M. Katoh, M. Kohda
    • Organizer
      The 19th International Congress on Acoustics
    • Place of Presentation
      Madrid, Spain
    • Year and Date
      20070900
    • Description
      From the Research Report Summary (English)
  • [Presentation] Lecture speech recognition by iterations of unsupervised adaptation (2007)

    • Author(s)
      T. Kusama, Y. Okuyama, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      ASJ 2007 Autumn Meeting, 2-3-14
    • Place of Presentation
      University of Yamanashi
    • Year and Date
      20070900
    • Description
      From the Research Report Summary (English)
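  • [Presentation] An investigation of speaker identification under noisy conditions using speaker vectors (2007)

    • Author(s)
      Y. Goto (後藤佑樹)
    • Organizer
      IEICE Technical Report, SP2007-18
    • Place of Presentation
      University of Aizu
    • Year and Date
      20070600
    • Description
      From the Research Report Summary (Japanese)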
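  • [Presentation] Improvement of unsupervised adaptation in lecture speech recognition (2007)

    • Author(s)
      T. Kusama (草間 隆)
    • Organizer
      IEICE Technical Report, SP2007-20
    • Place of Presentation
      University of Aizu
    • Year and Date
      20070600
    • Description
      From the Research Report Summary (Japanese)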
  • [Presentation] An investigation on speaker vector-based speaker identification under noisy conditions (2007)

    • Author(s)
      Y. Goto, T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      IEICE Technical Report SP2007-18
    • Place of Presentation
      University of Aizu
    • Year and Date
      20070600
    • Description
      From the Research Report Summary (English)
  • [Presentation] Improvement of unsupervised adaptation in lecture speech recognition (2007)

    • Author(s)
      T. Kusama, Y. Okuyama, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      IEICE Technical Report SP2007-20
    • Place of Presentation
      University of Aizu
    • Year and Date
      20070600
    • Description
      From the Research Report Summary (English)
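  • [Presentation] Effect of dimensionality reduction on speaker identification using speaker vectors (2007)

    • Author(s)
      T. Akatsu (赤津達也)
    • Organizer
      Acoustical Society of Japan (ASJ), 1-P-18
    • Place of Presentation
      Shibaura Institute of Technology
    • Year and Date
      20070300
    • Description
      From the Research Report Summary (Japanese)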
  • [Presentation] An effect of reduction of dimension on the speaker identification using a speaker vector (2007)

    • Author(s)
      T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      ASJ 2007 Spring Meeting, 1-P-18
    • Place of Presentation
      Shibaura Institute of Technology
    • Year and Date
      20070300
    • Description
      From the Research Report Summary (English)
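  • [Presentation] An investigation of speaker vector-based speaker identification using phoneme models (2006)

    • Author(s)
      T. Akatsu (赤津達也)
    • Organizer
      IEICE Technical Report, SP2006-101
    • Place of Presentation
      Nagoya University
    • Year and Date
      20061200
    • Description
      From the Research Report Summary (Japanese)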
  • [Presentation] An investigation on the speaker vector-based speaker identification with phonetic modeling (2006)

    • Author(s)
      T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      IEICE Technical Report SP2006-101
    • Place of Presentation
      Nagoya University
    • Year and Date
      20061200
    • Description
      From the Research Report Summary (English)
  • [Presentation] Noisy speech recognition based on codebook normalization of discrete-mixture HMMs (2006)

    • Author(s)
      T. Kosaka
    • Organizer
      ASA/ASJ 4th Joint Meeting
    • Place of Presentation
      Hawaii, USA
    • Year and Date
      20061100
    • Description
      From the Research Report Summary (Japanese)
  • [Presentation] Noisy speech recognition based on codebook normalization of discrete-mixture HMMs (2006)

    • Author(s)
      T. Kosaka, M. Katoh, M. Kohda
    • Organizer
      ASA/ASJ 4th Joint Meeting
    • Place of Presentation
      Hawaii, USA
    • Year and Date
      20061100
    • Description
      From the Research Report Summary (English)
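  • [Presentation] Lecture speech recognition with discrete-mixture HMMs using codebook adaptation (2006)

    • Author(s)
      A. Yamamoto (山本明祥)
    • Organizer
      Acoustical Society of Japan (ASJ), 2-2-9
    • Place of Presentation
      Kanazawa University
    • Year and Date
      20060900
    • Description
      From the Research Report Summary (Japanese)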
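  • [Presentation] An investigation of acoustic models for speaker identification using speaker vectors (2006)

    • Author(s)
      T. Akatsu (赤津達也)
    • Organizer
      Acoustical Society of Japan (ASJ), 2-P-10
    • Place of Presentation
      Kanazawa University
    • Year and Date
      20060900
    • Description
      From the Research Report Summary (Japanese)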
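  • [Presentation] Language model adaptation for House of Councillors meeting speech (2006)

    • Author(s)
      M. Katoh (加藤正治)
    • Organizer
      Acoustical Society of Japan (ASJ), 2-P-29
    • Place of Presentation
      Kanazawa University
    • Year and Date
      20060900
    • Description
      From the Research Report Summary (Japanese)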
  • [Presentation] Lecture speech recognition by using codebook adaptation of discrete-mixture HMMs (2006)

    • Author(s)
      A. Yamamoto, T. Kumakura, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      ASJ 2006 Autumn Meeting, 2-2-9
    • Place of Presentation
      Kanazawa University
    • Year and Date
      20060900
    • Description
      From the Research Report Summary (English)
  • [Presentation] An investigation on the acoustic model of the speaker identification using a speaker vector (2006)

    • Author(s)
      T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      ASJ 2006 Autumn Meeting, 2-P-10
    • Place of Presentation
      Kanazawa University
    • Year and Date
      20060900
    • Description
      From the Research Report Summary (English)
  • [Presentation] Language model adaptation for conference speech transcription (2006)

    • Author(s)
      M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      ASJ 2006 Autumn Meeting, 2-P-29
    • Place of Presentation
      Kanazawa University
    • Year and Date
      20060900
    • Description
      From the Research Report Summary (English)
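  • [Presentation] Lecture speech recognition with discrete-mixture HMMs using codebook adaptation (2006)

    • Author(s)
      A. Yamamoto (山本明祥)
    • Organizer
      IPSJ SIG Technical Report, 2006-SLP-62
    • Place of Presentation
      Naruto Spa
    • Year and Date
      20060700
    • Description
      From the Research Report Summary (Japanese)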
  • [Presentation] Lecture speech recognition by using codebook adaptation of discrete-mixture HMMs (2006)

    • Author(s)
      A. Yamamoto, T. Kumakura, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      IPSJ SIG Technical Report 2006-SLP-62
    • Place of Presentation
      Naruto Spa
    • Year and Date
      20060700
    • Description
      From the Research Report Summary (English)
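  • [Presentation] Codebook normalization of discrete-mixture HMMs using histogram equalization (2006)

    • Author(s)
      T. Kosaka (小坂哲夫)
    • Organizer
      IEICE Technical Report, SP2006-15
    • Place of Presentation
      Tohoku University
    • Year and Date
      20060600
    • Description
      From the Research Report Summary (Japanese)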
  • [Presentation] Codebook normalization of discrete-mixture HMMs by using histogram equalization (2006)

    • Author(s)
      T. Kosaka, M. Katoh, M. Kohda
    • Organizer
      IEICE Technical Report SP2006-15
    • Place of Presentation
      Tohoku University
    • Year and Date
      20060600
    • Description
      From the Research Report Summary (English)
  • [Book] Robust Speech Recognition and Understanding (2007)

    • Author(s)
      T. Kosaka (contributing author)
    • Total Pages
      157-174
    • Publisher
      I-Tech
    • Description
      From the Research Report Summary (Japanese)

Published: 2010-02-04  