2005 Fiscal Year Final Research Report Summary

Spontaneous speech recognition

Research Project

Project/Area Number 15500098
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Single-year Grants
Section General
Research Field Perception information processing/Intelligent robotics
Research Institution Yamagata University

Principal Investigator

KOHDA Masaki  Yamagata University, Faculty of Engineering, Professor (00205337)

Co-Investigator (Kenkyū-buntansha) KOSAKA Tetsuo  Yamagata University, Faculty of Engineering, Associate Professor (50359569)
KATOH Masaharu  Yamagata University, Faculty of Engineering, Research Associate (10250953)
Project Period (FY) 2003 – 2005
Keywords Corpus of Spontaneous Japanese / Spontaneous speech recognition / Robust speech recognition / Acoustic model / Language model / Unsupervised adaptation / Continuous-mixture HMM / Discrete-mixture HMM
Research Abstract

We investigated spontaneous speech recognition on an academic lecture task and obtained the following results.
(1) Lecture speech recognition using pronunciation variant modeling and unsupervised adaptation
We focused on the pronunciation variations observed in spontaneous speech. To capture the context dependence of pronunciation variants, we proposed a new language modeling method based on morphological analysis data designed for pronunciation variants. Evaluated on the Corpus of Spontaneous Japanese (CSJ), the proposed method reduced the word error rate (WER) by 4.74% absolute. In addition, unsupervised adaptation of both the acoustic and language models was introduced to improve recognition performance further; it reduced the WER from 19.96% without adaptation to 15.41% with unsupervised adaptation.
(2) Lecture speech recognition using discrete-mixture HMMs
We had previously investigated noisy speech recognition using discrete-mixture HMMs (DMHMMs) and found that DMHMMs outperformed continuous-mixture HMMs under environmental noise and impulsive noise conditions. However, it was not clear whether the method is also effective in clean conditions. The aim of this investigation was to evaluate the performance of the DMHMM system in clean conditions. For the evaluation we used the Corpus of Spontaneous Japanese (CSJ), because we wanted to compare our system with other recognition systems on a common speech corpus and to clarify its performance on this more difficult task. In the recognition experiments, 3000-state DMHMMs (16 mixture components per state) were used as acoustic models. The language model, which represents pronunciation variation, was trained on 6.86 million words from 2668 lectures in CSJ. As a result, the system obtained a WER of 20.30% on 10 academic lectures given by male speakers, demonstrating the effectiveness of the proposed method.
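
The WER figures above can be read in absolute or relative terms. The following is a minimal sketch, not the evaluation tool used in the project: it assumes WER is the word-level edit distance divided by the reference length, and that transcripts are already segmented into space-separated words (for Japanese this would require a morphological analyzer). The function names are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumes both strings are pre-segmented into space-separated words.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (r != h)))   # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def wer_reduction(before, after):
    """Return (absolute reduction in points, relative reduction as a fraction)."""
    absolute = before - after
    return absolute, absolute / before


# Figures reported in result (1): 19.96% without and 15.41% with adaptation.
abs_red, rel_red = wer_reduction(19.96, 15.41)
print(f"absolute: {abs_red:.2f} points, relative: {100 * rel_red:.1f}%")
```

With the reported figures, unsupervised adaptation corresponds to a reduction of 4.55 points absolute, or about 22.8% relative to the unadapted system.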

  • Research Products

    (12 results)

  • [Journal Article] Lecture speech recognition using pronunciation-variation-dependent models (2006)

    • Author(s)
      堤怜介
    • Journal Title

      IEICE Transactions on Information and Systems (Japanese Edition) J89-D,2

      Pages: 305-313

    • Description
      From the "Final Research Report Summary (Japanese)"
  • [Journal Article] Lecture speech recognition using pronunciation variant modeling (2006)

    • Author(s)
      R.Tsutsumi, M.Katoh, T.Kosaka, M.Kohda
    • Journal Title

      IEICE Transactions on Information and Systems Vol.J89-D, No.2

      Pages: 305-313

    • Description
      From the "Final Research Report Summary (English)"
  • [Journal Article] Robust speech recognition under non-stationary noise using discrete-mixture HMMs (2005)

    • Author(s)
      T.Kosaka
    • Journal Title

      Proc. of International Workshop on Nonlinear Circuit and Signal Processing 1

      Pages: 347-350

    • Description
      From the "Final Research Report Summary (Japanese)"
  • [Journal Article] Fast optimization of language model weight and insertion penalty from n-best candidates (2005)

    • Author(s)
      A.Ito
    • Journal Title

      Acoustical Science and Technology 26,4

      Pages: 384-387

    • Description
      From the "Final Research Report Summary (Japanese)"
  • [Journal Article] Robust Speech Recognition Using Discrete-Mixture HMMs (2005)

    • Author(s)
      T.Kosaka
    • Journal Title

      IEICE Transactions on Information and Systems E88-D,12

      Pages: 2811-2818

    • Description
      From the "Final Research Report Summary (Japanese)"
  • [Journal Article] Robust speech recognition under non-stationary noise using discrete-mixture HMMs (2005)

    • Author(s)
      T.Kosaka, M.Katoh, M.Kohda
    • Journal Title

      Proc. of International Workshop on Nonlinear Circuit and Signal Processing

      Pages: 347-350

    • Description
      From the "Final Research Report Summary (English)"
  • [Journal Article] Fast optimization of language model weight and insertion penalty from n-best candidates (2005)

    • Author(s)
      A.Ito, M.Kohda, S.Makino
    • Journal Title

      Acoustical Science and Technology Vol.26, No.4

      Pages: 384-387

    • Description
      From the "Final Research Report Summary (English)"
  • [Journal Article] Robust speech recognition using discrete-mixture HMMs (2005)

    • Author(s)
      T.Kosaka, M.Katoh, M.Kohda
    • Journal Title

      IEICE Transactions on Information and Systems Vol.E88-D, No.12

      Pages: 2811-2818

    • Description
      From the "Final Research Report Summary (English)"
  • [Journal Article] Noisy speech recognition with discrete-mixture HMMs based on MAP estimation (2004)

    • Author(s)
      T.Kosaka
    • Journal Title

      Proc. of The 18th International Congress on Acoustics 2

      Pages: 1691-1694

    • Description
      From the "Final Research Report Summary (Japanese)"
  • [Journal Article] Language modeling by an ergodic HMM based on an N-gram (2004)

    • Author(s)
      A.Ito
    • Journal Title

      Proc. of The 18th International Congress on Acoustics 5

      Pages: 3701-3704

    • Description
      From the "Final Research Report Summary (Japanese)"
  • [Journal Article] Noisy speech recognition with discrete-mixture HMMs based on MAP estimation (2004)

    • Author(s)
      T.Kosaka, M.Katoh, M.Kohda
    • Journal Title

      Proc. of The 18th International Congress on Acoustics Vol.II

      Pages: 1691-1694

    • Description
      From the "Final Research Report Summary (English)"
  • [Journal Article] Language modeling by an ergodic HMM based on an N-gram (2004)

    • Author(s)
      T.Nagano, M.Suzuki, A.Ito, S.Makino, M.Katoh, M.Kohda
    • Journal Title

      Proc. of The 18th International Congress on Acoustics Vol.V

      Pages: 3701-3704

    • Description
      From the "Final Research Report Summary (English)"

Published: 2007-12-13  
