
Large-vocabulary continuous speech recognition on spontaneous speech task

Research Project

Project/Area Number 18500126
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Single-year Grants
Section General
Research Field Perception information processing/Intelligent robotics
Research Institution Yamagata University

Principal Investigator

KOHDA Masaki  Yamagata University, Graduate School of Science and Engineering, Professor (00205337)

Co-Investigator (Kenkyū-buntansha) KOSAKA Tetsuo  Yamagata University, Graduate School of Science and Engineering, Associate Professor (50359569)
KATOH Masaharu  Yamagata University, Graduate School of Science and Engineering, Research Associate (10250953)
Project Period (FY) 2006 – 2007
Project Status Completed (Fiscal Year 2007)
Budget Amount
¥1,910,000 (Direct Cost: ¥1,700,000, Indirect Cost: ¥210,000)
Fiscal Year 2007: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2006: ¥1,000,000 (Direct Cost: ¥1,000,000)
Keywords Corpus of Spontaneous Japanese / Speech recognition / Acoustic model / Language model / Unsupervised adaptation / System integration / Robust speech recognition / Continuous-mixture HMM / Discrete-mixture HMM
Research Abstract

1. Large-vocabulary continuous speech recognition on spontaneous speech task
In large-vocabulary continuous speech recognition (LVCSR), we investigated several methods for unsupervised adaptation of both the acoustic and language models and evaluated them on the Corpus of Spontaneous Japanese (CSJ). The acoustic model of the LVCSR system uses full-covariance matrices. In recognition experiments, the word error rate (WER) decreased from 19.17% without adaptation to 14.73% with unsupervised adaptation, and further to 14.47% when the adaptation data were weighted on the basis of part of speech. We also compared a continuous-mixture HMM (CHMM) system with a discrete-mixture HMM (DMHMM) system on the CSJ. The DMHMM system performed almost as well as the CHMM system, achieving a WER of 19.73% with 6000-state, 24-mixture DMHMMs, although it had generally been believed that the recognition error rates of DMHMMs were much higher than those of CHMMs.
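As a rough illustration of the figures above, the relative WER reduction can be derived from the reported absolute rates. The function name `relative_reduction` is hypothetical and introduced only for this sketch; the three WER values are the ones stated in the abstract.

```python
def relative_reduction(baseline_wer, new_wer):
    """Return the relative error reduction in percent."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer

# WER values (in %) reported in the abstract.
no_adaptation = 19.17
unsupervised = 14.73
pos_weighted = 14.47

print(f"{relative_reduction(no_adaptation, unsupervised):.1f}")  # 23.2
print(f"{relative_reduction(no_adaptation, pos_weighted):.1f}")  # 24.5
print(f"{relative_reduction(unsupervised, pos_weighted):.1f}")   # 1.8
```

Under this reading, unsupervised adaptation removes roughly a quarter of the baseline errors, and part-of-speech weighting contributes a further small gain on top of it.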
2. Robust speech recognition using discrete-mixture HMMs
We introduce a new method of robust speech recognition under noisy conditions based on discrete-mixture HMMs (DMHMMs). DMHMMs were originally proposed to reduce calculation costs in the decoding process. Recently, we applied DMHMMs to noisy speech recognition and found that they were effective for modeling noisy speech. To further improve noise-robust speech recognition, we propose a novel normalization method for DMHMMs based on histogram equalization (HEQ). The HEQ method can compensate for the nonlinear effects of additive noise. It is generally used for feature space normalization in continuous-mixture HMM (CHMM) systems. In this work, we propose both model space and feature space normalization of DMHMMs by using HEQ. In the model space normalization, the codebooks of the DMHMMs are modified by the transform function derived from the HEQ method. The proposed method was compared with both conventional CHMMs and DMHMMs. The results showed that the model space normalization of DMHMMs by multiple transform functions was effective for noise-robust speech recognition.
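The idea of histogram equalization described above can be sketched as follows. This is a minimal one-dimensional illustration, not the authors' implementation: the helper `fit_heq`, the synthetic data, the rank-based inverse CDF, and the direction chosen for the model-space mapping (clean toward noisy) are all assumptions made for the example.

```python
from bisect import bisect_right
import math
import random

def fit_heq(source, target):
    """Build a 1-D mapping that sends values distributed like `source`
    to values distributed like `target` by matching empirical CDFs."""
    src = sorted(source)
    tgt = sorted(target)

    def transform(x):
        # Rank of x in the source sample (empirical CDF times len(src)).
        rank = bisect_right(src, x)
        # Matching order statistic of the target sample (inverse CDF).
        pos = rank * len(tgt) // len(src)
        return tgt[min(max(pos - 1, 0), len(tgt) - 1)]

    return transform

# Synthetic data (assumption): "noisy" values are a nonlinear, monotone
# distortion of "clean" ones, standing in for the effect of additive noise.
random.seed(0)
clean = [random.gauss(0.0, 1.0) for _ in range(5000)]
noisy = [math.log1p(math.exp(c)) + 0.5 for c in clean]

# Feature-space normalization: map noisy features toward the clean distribution.
to_clean = fit_heq(noisy, clean)
restored = [to_clean(v) for v in noisy]

# Model-space normalization (direction assumed here): move codebook
# centroids trained on clean data toward the noisy distribution.
to_noisy = fit_heq(clean, noisy)
codebook = [-1.0, 0.0, 1.0]  # hypothetical scalar centroids
adapted_codebook = [to_noisy(c) for c in codebook]
```

Because the mapping is built from ranks, it can undo any monotone distortion, linear or not, which is why HEQ suits the nonlinear effects of additive noise. The abstract also mentions multiple transform functions; that refinement is not shown here.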

Report

(3 results)
  • 2007 Annual Research Report   Final Research Report Summary
  • 2006 Annual Research Report
  • Research Products

    (57 results)


All Journal Article (19 results) (of which Peer Reviewed: 3 results) Presentation (36 results) Book (2 results)

  • [Journal Article] Histogram equalization for noise robust speech recognition using discrete-mixture HMMs (2008)

    • Author(s)
      T. Kosaka
    • Journal Title

      Acoustical Science and Technology 29

      Pages: 66-73

    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2007 Final Research Report Summary
    • Peer Reviewed
  • [Journal Article] Histogram equalization for noise robust speech recognition using discrete-mixture HMMs (2008)

    • Author(s)
      T. Kosaka, M. Katoh, M. Kohda
    • Journal Title

      Acoustical Science and Technology vol.29, no.1

      Pages: 66-73

    • NAID

      110006533631

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2007 Final Research Report Summary
  • [Journal Article] Histogram equalization for noise robust speech recognition by using discrete-mixture HMMs (2008)

    • Author(s)
      T. Kosaka
    • Journal Title

      Acoustical Science and Technology 29

      Pages: 66-73

    • NAID

      110006533631

    • Related Report
      2007 Annual Research Report
    • Peer Reviewed
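  • [Journal Article] Speaker vector-based speaker identification with phonetic modeling (2007)

    • Author(s)
      小坂哲夫
    • Journal Title

      IEICE Transactions on Information and Systems (Japanese Edition) J90-D

      Pages: 3201-3209

    • Description
      From "Final Research Report Summary (Japanese)"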
    • Related Report
      2007 Annual Research Report 2007 Final Research Report Summary
    • Peer Reviewed
  • [Journal Article] Speaker vector-based speaker identification with phonetic modeling (2007)

    • Author(s)
      T. Kosaka, T. Akatsu, M. Katoh, M. Kohda
    • Journal Title

      IEICE Trans. on Information and Systems vol.J90-D, no.12

      Pages: 3201-3209

    • NAID

      110007380643

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2007 Final Research Report Summary
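  • [Journal Article] A study on improving the accuracy of automatic English pronunciation scoring using phoneme structure distance (2007)

    • Author(s)
      山口涼子
    • Journal Title

      IPSJ Tohoku Branch Workshop 06-6-A1-1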

      Pages: 1-8

    • Related Report
      2006 Annual Research Report
  • [Journal Article] Performance evaluation of discrete-mixture HMMs using the Corpus of Spontaneous Japanese (2007)

    • Author(s)
      山本秋祥
    • Journal Title

      IPSJ Tohoku Branch Workshop 06-6-A1-2

      Pages: 1-6

    • Related Report
      2006 Annual Research Report
  • [Journal Article] Improvement of unsupervised adaptation in spontaneous speech recognition (2007)

    • Author(s)
      草間隆
    • Journal Title

      IPSJ Tohoku Branch Workshop 06-6-A1-3

      Pages: 1-9

    • Related Report
      2006 Annual Research Report
  • [Journal Article] Speaker indexing and speaker adaptation for meeting speech (2007)

    • Author(s)
      齋藤徹也
    • Journal Title

      IPSJ Tohoku Branch Workshop 06-6-A1-4

      Pages: 1-7

    • Related Report
      2006 Annual Research Report
  • [Journal Article] Building a language model using the minutes of the House of Councillors (2007)

    • Author(s)
      手塚収太
    • Journal Title

      IPSJ Tohoku Branch Workshop 06-6-A2-1

      Pages: 1-6

    • Related Report
      2006 Annual Research Report
  • [Journal Article] Important sentence extraction using the Corpus of Spontaneous Japanese (2007)

    • Author(s)
      宇野涼子
    • Journal Title

      IPSJ Tohoku Branch Workshop 06-6-A2-2

      Pages: 1-8

    • Related Report
      2006 Annual Research Report
  • [Journal Article] An effect of reduction of dimension on the speaker identification using a speaker vector (2007)

    • Author(s)
      赤津達也
    • Journal Title

      Proceedings of the ASJ Spring Meeting 1-P-18

      Pages: 159-160

    • Related Report
      2006 Annual Research Report
  • [Journal Article] Codebook normalization of discrete-mixture HMMs by using histogram equalization (2006)

    • Author(s)
      小坂哲夫
    • Journal Title

      IEICE Technical Report SP2006-15

      Pages: 25-30

    • NAID

      110004750981

    • Related Report
      2006 Annual Research Report
  • [Journal Article] Lecture speech recognition by using codebook adaptation of discrete-mixture HMMs (2006)

    • Author(s)
      山本明祥
    • Journal Title

      IPSJ SIG Technical Report 2006-SLP-62

      Pages: 25-30

    • Related Report
      2006 Annual Research Report
  • [Journal Article] Lecture speech recognition by using codebook adaptation of discrete-mixture HMMs (2006)

    • Author(s)
      山本明祥
    • Journal Title

      Proceedings of the ASJ Autumn Meeting 2-2-9

    • Related Report
      2006 Annual Research Report
  • [Journal Article] An investigation on the acoustic model of the speaker identification using a speaker vector (2006)

    • Author(s)
      赤津達也
    • Journal Title

      Proceedings of the ASJ Autumn Meeting 2-P-10

      Pages: 113-114

    • Related Report
      2006 Annual Research Report
  • [Journal Article] Language model adaptation for conference speech transcription (2006)

    • Author(s)
      加藤正治
    • Journal Title

      Proceedings of the ASJ Autumn Meeting 2-P-29

      Pages: 151-152

    • Related Report
      2006 Annual Research Report
  • [Journal Article] Noisy Speech Recognition Based on Codebook Normalization of Discrete-Mixture HMMs (2006)

    • Author(s)
      T. Kosaka
    • Journal Title

      ASA/ASJ Fourth Joint Meeting 1pSC27

      Pages: 3041-3041

    • Related Report
      2006 Annual Research Report
  • [Journal Article] An investigation on the speaker vector-based speaker identification with phonetic modeling (2006)

    • Author(s)
      赤津達也
    • Journal Title

      IEICE Technical Report SP2006-101

      Pages: 95-99

    • Related Report
      2006 Annual Research Report
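  • [Presentation] Effectiveness of speaker-class models for the corpus of spontaneous Japanese (2008)

    • Author(s)
      武田優依
    • Organizer
      Acoustical Society of Japan, 1-Q-21
    • Place of Presentation
      Chiba Institute of Technology
    • Description
      From "Final Research Report Summary (Japanese)"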
    • Related Report
      2007 Annual Research Report 2007 Final Research Report Summary
  • [Presentation] Effectiveness of speaker-class models for the corpus of spontaneous Japanese (2008)

    • Author(s)
      Y. Takeda, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      ASJ 2008 Spring Meeting, 1-Q-21
    • Place of Presentation
      Chiba Institute of Technology
    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] A study on speech recognition in music environments using multi-condition models (2008)

    • Author(s)
      大貫芳久
    • Organizer
      IPSJ Tohoku Branch Workshop, 07-6-C-1-1
    • Place of Presentation
      Yamagata University
    • Related Report
      2007 Annual Research Report
  • [Presentation] A study on speaker verification using speaker vectors (2008)

    • Author(s)
      田所直樹
    • Organizer
      IPSJ Tohoku Branch Workshop, 07-6-C-1-2
    • Place of Presentation
      Yamagata University
    • Related Report
      2007 Annual Research Report
  • [Presentation] A study on speaker adaptation using histogram equalization (2008)

    • Author(s)
      熊倉拓哉
    • Organizer
      IPSJ Tohoku Branch Workshop, 07-6-C-1-3
    • Place of Presentation
      Yamagata University
    • Related Report
      2007 Annual Research Report
  • [Presentation] Performance evaluation of full-covariance acoustic models (2008)

    • Author(s)
      伊藤貴
    • Organizer
      IPSJ Tohoku Branch Workshop, 07-6-C-1-4
    • Place of Presentation
      Yamagata University
    • Related Report
      2007 Annual Research Report
  • [Presentation] A study on quinphone acoustic models (2008)

    • Author(s)
      東海林拓
    • Organizer
      IPSJ Tohoku Branch Workshop, 07-6-C-2-1
    • Place of Presentation
      Yamagata University
    • Related Report
      2007 Annual Research Report
  • [Presentation] PLSA language model adaptation for spontaneous speech recognition (2008)

    • Author(s)
      加藤正治
    • Organizer
      IPSJ Tohoku Branch Workshop, 07-6-C-2-2
    • Place of Presentation
      Yamagata University
    • Related Report
      2007 Annual Research Report
  • [Presentation] Adaptation of class N-gram language models based on PLSA (2008)

    • Author(s)
      梅本真模
    • Organizer
      IPSJ Tohoku Branch Workshop, 07-6-C-2-3
    • Place of Presentation
      Yamagata University
    • Related Report
      2007 Annual Research Report
  • [Presentation] Spontaneous speech recognition using discrete-mixture HMMs (2007)

    • Author(s)
      T. Kosaka
    • Organizer
      International Congress on Acoustics 2007
    • Place of Presentation
      Madrid, Spain
    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2007 Annual Research Report 2007 Final Research Report Summary
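  • [Presentation] An investigation on speaker vector-based speaker identification under noisy conditions (2007)

    • Author(s)
      後藤佑樹
    • Organizer
      IEICE Technical Report, SP2007-18
    • Place of Presentation
      University of Aizu
    • Description
      From "Final Research Report Summary (Japanese)"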
    • Related Report
      2007 Annual Research Report 2007 Final Research Report Summary
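  • [Presentation] Improvement of unsupervised adaptation in lecture speech recognition (2007)

    • Author(s)
      草間 隆
    • Organizer
      IEICE Technical Report, SP2007-20
    • Place of Presentation
      University of Aizu
    • Description
      From "Final Research Report Summary (Japanese)"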
    • Related Report
      2007 Annual Research Report 2007 Final Research Report Summary
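  • [Presentation] An investigation on the speaker vector-based speaker identification method with phonetic-class HMMs (2007)

    • Author(s)
      赤津達也
    • Organizer
      IEICE Technical Report, SP2007-135
    • Place of Presentation
      NTT Keihanna
    • Description
      From "Final Research Report Summary (Japanese)"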
    • Related Report
      2007 Annual Research Report 2007 Final Research Report Summary
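  • [Presentation] An effect of reduction of dimension on the speaker identification using a speaker vector (2007)

    • Author(s)
      赤津達也
    • Organizer
      Acoustical Society of Japan, 1-P-18
    • Place of Presentation
      Shibaura Institute of Technology
    • Description
      From "Final Research Report Summary (Japanese)"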
    • Related Report
      2007 Final Research Report Summary
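  • [Presentation] Lecture speech recognition by iterations of unsupervised adaptation (2007)

    • Author(s)
      草間 隆
    • Organizer
      Acoustical Society of Japan, 2-3-14
    • Place of Presentation
      University of Yamanashi
    • Description
      From "Final Research Report Summary (Japanese)"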
    • Related Report
      2007 Annual Research Report 2007 Final Research Report Summary
  • [Presentation] Spontaneous speech recognition using discrete-mixture HMMs (2007)

    • Author(s)
      T. Kosaka, M. Katoh, M. Kohda
    • Organizer
      The 19th International Congress on Acoustics
    • Place of Presentation
      Madrid, Spain
    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] An investigation on speaker vector-based speaker identification under noisy conditions (2007)

    • Author(s)
      Y. Goto, T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      IEICE Technical Report SP2007-18
    • Place of Presentation
      University of Aizu
    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] Improvement of unsupervised adaptation in lecture speech recognition (2007)

    • Author(s)
      T. Kusama, Y. Okuyama, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      IEICE Technical Report SP2007-20
    • Place of Presentation
      University of Aizu
    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] An investigation on the speaker vector-based speaker identification method with phonetic-class HMMs (2007)

    • Author(s)
      T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      IEICE Technical Report SP2007-135
    • Place of Presentation
      NTT CS Laboratories
    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] An effect of reduction of dimension on the speaker identification using a speaker vector (2007)

    • Author(s)
      T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      ASJ 2007 Spring Meeting, 1-P-18
    • Place of Presentation
      Shibaura Institute of Technology
    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] Lecture speech recognition by iterations of unsupervised adaptation (2007)

    • Author(s)
      T. Kusama, Y. Okuyama, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      ASJ 2007 Autumn Meeting, 2-3-14
    • Place of Presentation
      University of Yamanashi
    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2007 Final Research Report Summary
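  • [Presentation] Performance improvement of lecture speech recognition by discriminative training (2007)

    • Author(s)
      関東純平
    • Organizer
      Acoustical Engineering Research Meeting, Research Institute of Electrical Communication, Tohoku University, 348-2
    • Place of Presentation
      Tohoku University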
    • Related Report
      2007 Annual Research Report
  • [Presentation] Noisy speech recognition based on codebook normalization of discrete-mixture HMMs (2006)

    • Author(s)
      T. Kosaka
    • Organizer
      ASA/ASJ 4th Joint Meeting
    • Place of Presentation
      Hawaii, USA
    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2007 Final Research Report Summary
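  • [Presentation] Codebook normalization of discrete-mixture HMMs by using histogram equalization (2006)

    • Author(s)
      小坂哲夫
    • Organizer
      IEICE Technical Report, SP2006-15
    • Place of Presentation
      Tohoku University
    • Description
      From "Final Research Report Summary (Japanese)"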
    • Related Report
      2007 Final Research Report Summary
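  • [Presentation] Lecture speech recognition by using codebook adaptation of discrete-mixture HMMs (2006)

    • Author(s)
      山本明祥
    • Organizer
      IPSJ SIG Technical Report, 2006-SLP-62
    • Place of Presentation
      Naruto Spa
    • Description
      From "Final Research Report Summary (Japanese)"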
    • Related Report
      2007 Final Research Report Summary
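  • [Presentation] An investigation on the speaker vector-based speaker identification with phonetic modeling (2006)

    • Author(s)
      赤津達也
    • Organizer
      IEICE Technical Report, SP2006-101
    • Place of Presentation
      Nagoya University
    • Description
      From "Final Research Report Summary (Japanese)"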
    • Related Report
      2007 Final Research Report Summary
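  • [Presentation] Lecture speech recognition by using codebook adaptation of discrete-mixture HMMs (2006)

    • Author(s)
      山本明祥
    • Organizer
      Acoustical Society of Japan, 2-2-9
    • Place of Presentation
      Kanazawa University
    • Description
      From "Final Research Report Summary (Japanese)"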
    • Related Report
      2007 Final Research Report Summary
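  • [Presentation] An investigation on the acoustic model of the speaker identification using a speaker vector (2006)

    • Author(s)
      赤津達也
    • Organizer
      Acoustical Society of Japan, 2-P-10
    • Place of Presentation
      Kanazawa University
    • Description
      From "Final Research Report Summary (Japanese)"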
    • Related Report
      2007 Final Research Report Summary
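  • [Presentation] Language model adaptation for conference speech transcription (2006)

    • Author(s)
      加藤正治
    • Organizer
      Acoustical Society of Japan, 2-P-29
    • Place of Presentation
      Kanazawa University
    • Description
      From "Final Research Report Summary (Japanese)"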
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] Noisy speech recognition based on codebook normalization of discrete-mixture HMMs (2006)

    • Author(s)
      T. Kosaka, M. Katoh, M. Kohda
    • Organizer
      ASA/ASJ 4th Joint Meeting
    • Place of Presentation
      Hawaii, USA
    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] Codebook normalization of discrete-mixture HMMs by using histogram equalization (2006)

    • Author(s)
      T. Kosaka, M. Katoh, M. Kohda
    • Organizer
      IEICE Technical Report SP2006-15
    • Place of Presentation
      Tohoku University
    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] Lecture speech recognition by using codebook adaptation of discrete-mixture HMMs (2006)

    • Author(s)
      A. Yamamoto, T. Kumakura, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      IPSJ SIG Technical Report 2006-SLP-62
    • Place of Presentation
      Naruto Spa
    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] An investigation on the speaker vector-based speaker identification with phonetic modeling (2006)

    • Author(s)
      T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      IEICE Technical Report SP2006-101
    • Place of Presentation
      Nagoya University
    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] Lecture speech recognition by using codebook adaptation of discrete-mixture HMMs (2006)

    • Author(s)
      A. Yamamoto, T. Kumakura, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      ASJ 2006 Autumn Meeting, 2-2-9
    • Place of Presentation
      Kanazawa University
    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] An investigation on the acoustic model of the speaker identification using a speaker vector (2006)

    • Author(s)
      T. Akatsu, M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      ASJ 2006 Autumn Meeting, 2-P-10
    • Place of Presentation
      Kanazawa University
    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] Language model adaptation for conference speech transcription (2006)

    • Author(s)
      M. Katoh, T. Kosaka, M. Kohda
    • Organizer
      ASJ 2006 Autumn Meeting, 2-P-29
    • Place of Presentation
      Kanazawa University
    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2007 Final Research Report Summary
  • [Book] Robust Speech Recognition and Understanding (2007)

    • Author(s)
      T. Kosaka (contributing author)
    • Publisher
      I-Tech
    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2007 Final Research Report Summary
  • [Book] Robust Speech Recognition and Understanding (2007)

    • Author(s)
      T. Kosaka (contributing author)
    • Total Pages
      18
    • Publisher
      I-Tech
    • Related Report
      2007 Annual Research Report


Published: 2006-04-01   Modified: 2016-04-21  
