
Spontaneous speech recognition

Research Project

Project/Area Number 15500098
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Single-year Grants
Section General
Research Field Perception information processing/Intelligent robotics
Research Institution Yamagata University

Principal Investigator

KOHDA Masaki  Yamagata University, Faculty of Engineering, Professor (00205337)

Co-Investigator (Kenkyū-buntansha) KOSAKA Tetsuo  Yamagata University, Faculty of Engineering, Associate Professor (50359569)
KATOH Masaharu  Yamagata University, Faculty of Engineering, Research Associate (10250953)
Project Period (FY) 2003 – 2005
Project Status Completed (Fiscal Year 2005)
Budget Amount
¥3,200,000 (Direct Cost: ¥3,200,000)
Fiscal Year 2005: ¥800,000 (Direct Cost: ¥800,000)
Fiscal Year 2004: ¥1,000,000 (Direct Cost: ¥1,000,000)
Fiscal Year 2003: ¥1,400,000 (Direct Cost: ¥1,400,000)
Keywords Corpus of Spontaneous Japanese / Spontaneous speech recognition / Robust speech recognition / Acoustic model / Language model / Unsupervised adaptation / Continuous-mixture HMM / Discrete-mixture HMM / Speech recognition / Pronunciation-variation-dependent model / MLLR / Part-of-speech N-gram
Research Abstract

We investigated spontaneous speech recognition on an academic lecture task and obtained the following results.
(1) Lecture speech recognition using pronunciation variant modeling and unsupervised adaptation
We focused on the pronunciation variations observed in spontaneous speech. To capture the context dependence of pronunciation variants, we proposed a language modeling method based on morphological analysis data designed for pronunciation variants. Evaluated on the Corpus of Spontaneous Japanese (CSJ), the proposed method reduced the word error rate (WER) by 4.74% absolute. In addition, unsupervised adaptation of both the acoustic and language models was introduced to improve recognition performance further: WER fell from 19.96% without adaptation to 15.41% with unsupervised adaptation.
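The WER figures above can be read with a small worked example. The following is a minimal sketch, not the project's scoring tool: it assumes WER is the word-level edit distance divided by the reference length, and that words are already separated by spaces (for Japanese this would mean morpheme-segmented text). The function names are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def wer_reduction(before: float, after: float) -> tuple[float, float]:
    """Return (absolute reduction in points, relative reduction as a fraction)."""
    absolute = before - after
    return absolute, absolute / before


# Figures quoted in the abstract: 19.96% without adaptation, 15.41% with it.
abs_red, rel_red = wer_reduction(19.96, 15.41)
print(f"absolute: {abs_red:.2f} points, relative: {100 * rel_red:.1f}%")
```

With the abstract's figures, the adaptation step corresponds to 4.55 points absolute, or about 22.8% relative to the unadapted system.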
(2) Lecture speech recognition using discrete-mixture HMMs
We have investigated noisy speech recognition using discrete-mixture HMMs (DMHMMs) and found that DMHMMs outperformed continuous-mixture HMMs under environmental and impulsive noise conditions. However, it was not clear whether the method is also effective in clean conditions. The aim of this investigation was to evaluate the DMHMM system in clean conditions. For the evaluation we used the Corpus of Spontaneous Japanese (CSJ), because a common speech corpus lets us compare our system with other recognition systems and clarify its performance on a more difficult task. In the recognition experiments, 3000-state DMHMMs (16 mixture components per state) were used as acoustic models. The language model, which represents pronunciation variation, was trained on 6.86 million words from 2668 lectures in CSJ. The system obtained a WER of 20.30% on 10 academic lectures given by male speakers, demonstrating the effectiveness of the proposed method.
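The abstract does not spell out the DMHMM output distribution. As a rough illustration only, the sketch below assumes one common formulation: the feature vector is split into subvectors, each subvector is quantized to a codebook index, and a state's output probability is a weighted sum over mixture components of the product of per-subvector discrete probabilities. The function name, argument layout, and toy numbers are hypothetical and are not taken from the project.

```python
import math


def dmhmm_log_emission(codes, weights, tables):
    """Log output probability of one HMM state under the assumed formulation.

    codes   -- codebook index of each quantized feature subvector
    weights -- mixture weights of the state (should sum to 1)
    tables  -- tables[m][s][k] = P(code k for subvector s | mixture m)
    """
    total = 0.0
    for weight, mixture in zip(weights, tables):
        prob = weight
        for s, k in enumerate(codes):
            prob *= mixture[s][k]  # subvectors treated as independent within a mixture
        total += prob
    return math.log(total) if total > 0.0 else float("-inf")


# Toy state: 2 mixture components, 2 subvectors, codebook size 2.
weights = [0.5, 0.5]
tables = [
    [[0.9, 0.1], [0.8, 0.2]],  # mixture 0
    [[0.2, 0.8], [0.3, 0.7]],  # mixture 1
]
log_b = dmhmm_log_emission([0, 1], weights, tables)
print(math.exp(log_b))  # 0.5*0.9*0.2 + 0.5*0.2*0.7
```

In the system described above, each of the 3000 states would carry 16 such mixture components; the codebook sizes and subvector layout are not given in the abstract.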

Report

(4 results)
  • 2005 Annual Research Report   Final Research Report Summary
  • 2004 Annual Research Report
  • 2003 Annual Research Report
  • Research Products

    (44 results)


All Journal Article (32 results) Publications (12 results)

  • [Journal Article] Lecture speech recognition using pronunciation variant modeling (2006)

    • Author(s)
      堤怜介
    • Journal Title

      電子情報通信学会論文誌 J89-D,2

      Pages: 305-313

    • Description
      From "Summary of Research Results Report (Japanese version)"
    • Related Report
      2005 Annual Research Report 2005 Final Research Report Summary
  • [Journal Article] Lecture speech recognition using pronunciation variant modeling (2006)

    • Author(s)
      R.Tsutsumi, M.Katoh, T.Kosaka, M.Kohda
    • Journal Title

      IEICE Transactions on Information and Systems Vol.J89-D, No.2

      Pages: 305-313

    • NAID

      110004669949

    • Description
      From "Summary of Research Results Report (English version)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] A study of language model construction methods using transcripts and lecture records (2006)

    • Author(s)
      加藤正治
    • Journal Title

      日本音響学会講演論文集(春季) 3-1-7

      Pages: 1203-1204

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Improving lecture speech recognition performance by unsupervised adaptation using the Corpus of Spontaneous Japanese (2006)

    • Author(s)
      阿部拓也
    • Journal Title

      日本音響学会講演論文集(春季) 3-1-8

      Pages: 1205-1206

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Noisy speech recognition by codebook normalization of discrete-mixture HMMs (2006)

    • Author(s)
      遠藤大悟
    • Journal Title

      日本音響学会講演論文集(春季) 3-1-16

      Pages: 139-140

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Codebook adaptation of discrete-mixture HMMs using histogram equalization (2006)

    • Author(s)
      熊倉拓哉
    • Journal Title

      情報処理学会東北支部研究会 05-5-A1-1

      Pages: 1-7

    • Related Report
      2005 Annual Research Report
  • [Journal Article] A study of speaker recognition under noise using speaker vectors (2006)

    • Author(s)
      赤津達也
    • Journal Title

      情報処理学会東北支部研究会 05-5-A1-2

      Pages: 1-7

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Improving lecture speech recognition performance by unsupervised adaptation (2006)

    • Author(s)
      草間隆
    • Journal Title

      情報処理学会東北支部研究会 05-5-A1-3

      Pages: 1-8

    • Related Report
      2005 Annual Research Report
  • [Journal Article] A study of language model construction methods using transcripts and lecture records (2006)

    • Author(s)
      梅本真模
    • Journal Title

      情報処理学会東北支部研究会 05-5-A1-4

      Pages: 1-8

    • Related Report
      2005 Annual Research Report
  • [Journal Article] A study of speech summarization using the Corpus of Spontaneous Japanese (2006)

    • Author(s)
      宇野涼子
    • Journal Title

      情報処理学会東北支部研究会 05-5-A1-5

      Pages: 1-8

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Robust speech recognition under non-stationary noise using discrete-mixture HMMs (2005)

    • Author(s)
      T.Kosaka
    • Journal Title

      Proc. of International Workshop on Nonlinear Circuit and Signal Processing 1

      Pages: 347-350

    • Description
      From "Summary of Research Results Report (Japanese version)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Fast optimization of language model weight and insertion penalty from n-best candidates (2005)

    • Author(s)
      A.Ito
    • Journal Title

      Acoustical Science and Technology 26,4

      Pages: 384-387

    • Description
      From "Summary of Research Results Report (Japanese version)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Robust Speech Recognition Using Discrete-Mixture HMMs (2005)

    • Author(s)
      T.Kosaka
    • Journal Title

      IEICE Transactions on Information and Systems E88-D,12

      Pages: 2811-2818

    • NAID

      110004019504

    • Description
      From "Summary of Research Results Report (Japanese version)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Robust speech recognition under non-stationary noise using discrete-mixture HMMs (2005)

    • Author(s)
      T.Kosaka, M.Katoh, M.Kohda
    • Journal Title

      Proc.of International Workshop on Nonlinear Circuit and Signal Processing

      Pages: 347-350

    • Description
      From "Summary of Research Results Report (English version)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Fast optimization of language model weight and insertion penalty from n-best candidates (2005)

    • Author(s)
      A.Ito, M.Kohda, S.Makino
    • Journal Title

      Acoustical Science and Technology Vol.26, No.4

      Pages: 384-387

    • Description
      From "Summary of Research Results Report (English version)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Robust speech recognition using discrete-mixture HMMs (2005)

    • Author(s)
      T.Kosaka, M.Katoh, M.Kohda
    • Journal Title

      IEICE Transactions on Information and Systems Vol.E88-D, No.12

      Pages: 2811-2818

    • NAID

      110004019504

    • Description
      From "Summary of Research Results Report (English version)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] A study of lecture speech recognition using discrete-mixture HMMs (2005)

    • Author(s)
      小坂哲夫
    • Journal Title

      電子情報通信学会技術研究報告 SP2005-25

      Pages: 31-36

    • Related Report
      2005 Annual Research Report
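  • [Journal Article] Performance evaluation of lecture speech recognition with pronunciation-variation-dependent models using the Corpus of Spontaneous Japanese (2005)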

    • Author(s)
      阿部拓也
    • Journal Title

      日本音響学会講演論文集(秋季) 2-1-1

      Pages: 44-44

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Evaluation of discrete-mixture HMMs on the Corpus of Spontaneous Japanese (2005)

    • Author(s)
      小坂哲夫
    • Journal Title

      日本音響学会講演論文集(秋季) 2-7-19

      Pages: 64-64

    • Related Report
      2005 Annual Research Report
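  • [Journal Article] Performance evaluation of lecture speech recognition with pronunciation-variation-dependent models using the Corpus of Spontaneous Japanese (2005)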

    • Author(s)
      阿部拓也
    • Journal Title

      電子情報通信学会技術研究報告 SP2005-94

      Pages: 25-30

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Robust Speech Recognition Using Discrete-Mixture HMMs (2005)

    • Author(s)
      小坂哲夫
    • Journal Title

      IEICE Trans. on Information and Systems E88-D,12

      Pages: 2811-2818

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Morphological analysis of the Corpus of Spontaneous Japanese (2005)

    • Author(s)
      加藤 正治
    • Journal Title

      情報処理学会 東北支部研究会 04-6-A1-3

      Pages: 1-8

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Performance evaluation of lecture speech recognition using the Corpus of Spontaneous Japanese (2005)

    • Author(s)
      阿部 拓也
    • Journal Title

      情報処理学会 東北支部研究会 04-6-A1-4

      Pages: 1-8

    • Related Report
      2004 Annual Research Report
  • [Journal Article] A study of speaker identification using speaker vectors in a distributed speech recognition system (2005)

    • Author(s)
      松本 和樹
    • Journal Title

      情報処理学会 東北支部研究会 04-6-A2-1

      Pages: 1-8

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Evaluation with MFCC of noisy speech recognition using discrete-mixture output distribution HMMs (2005)

    • Author(s)
      小坂 哲夫
    • Journal Title

      日本音響学会講演論文集(春季) 3-5-11

      Pages: 97-98

    • NAID

      10018037199

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Robust speech recognition under non-stationary noise using discrete-mixture HMMs (2005)

    • Author(s)
      小坂 哲夫
    • Journal Title

      2005 RISP International Workshop on Nonlinear Circuits and Signal Processing

      Pages: 347-350

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Noisy speech recognition with discrete-mixture HMMs based on MAP estimation (2004)

    • Author(s)
      T.Kosaka
    • Journal Title

      Proc. of The 18th International Congress on Acoustics 2

      Pages: 1691-1694

    • Description
      From "Summary of Research Results Report (Japanese version)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Language modeling by an ergodic HMM based on an N-gram (2004)

    • Author(s)
      A.Ito
    • Journal Title

      Proc. of The 18th International Congress on Acoustics 5

      Pages: 3701-3704

    • Description
      From "Summary of Research Results Report (Japanese version)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Noisy speech recognition with discrete-mixture HMMs based on MAP estimation (2004)

    • Author(s)
      T.Kosaka, M.Katoh, M.Kohda
    • Journal Title

      Proc.of The 18th International Congress on Acoustics Vol.II

      Pages: 1691-1694

    • Description
      From "Summary of Research Results Report (English version)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Language modeling by an ergodic HMM based on an N-gram (2004)

    • Author(s)
      T.Nagano, M.Suzuki, A.Ito, S.Makino, M.Katoh, M.Kohda
    • Journal Title

      Proc.of The 18th International Congress on Acoustics Vol.V

      Pages: 3701-3704

    • NAID

      110003297644

    • Description
      From "Summary of Research Results Report (English version)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] A study of noisy speech recognition using the ETSI standard front-end (2004)

    • Author(s)
      福士 なな子
    • Journal Title

      電子情報通信学会 技術研究報告 104,86(SP2004-11)

      Pages: 7-12

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Speech recognition of House of Councillors sessions (2004)

    • Author(s)
      加藤 正治
    • Journal Title

      日本音響学会講演論文集(秋季) 2-1-2

      Pages: 39-40

    • Related Report
      2004 Annual Research Report
  • [Publications] 堤 怜介: "A study of speaker adaptation of acoustic and language models in lecture speech recognition" 電子情報通信学会 技術研究報告. 103, 94(SP2003-27). 7-12 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] 小坂 哲夫: "Evaluation of discrete-mixture output distribution HMMs using MAP estimation on noise-corrupted speech" 電子情報通信学会 技術研究報告. 103, 93(SP2003-21). 7-12 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] 小坂 哲夫: "A study of speech recognition under non-stationary noise using discrete-mixture output distribution HMMs with MAP estimation" 日本音響学会講演論文集(秋季). 1-6-14. 27-28 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] 福士 なな子: "A study of noise-corrupted speech recognition by multi-condition training using the ETSI standard front-end" 日本音響学会講演論文集(秋季). 1-6-8. 15-16 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] 金野 弘明: "A study of language models using kana and kanji character strings as units" 東北大学電気通信研究所 音響工学研究会. 326-4. 1-6 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] 小坂 哲夫: "Speech recognition under non-stationary noise using discrete-mixture output distribution HMMs" 電子情報通信学会 技術研究報告. 103, 519(SP2003-132). 115-120 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] 堤 怜介: "Improving lecture speech recognition performance by pronunciation-variation dependence and unsupervised adaptation" 話し言葉の科学と工学ワークショップ. 3. 93-98 (2004)

    • Related Report
      2003 Annual Research Report
  • [Publications] 福士 なな子: "A study of noise-corrupted speech recognition using the ETSI standard front-end" 情報処理学会 東北支部研究会. 03-5-B2-1. 1-8 (2004)

    • Related Report
      2003 Annual Research Report
  • [Publications] 松本 和樹: "Removing microphone characteristic variation at the client in distributed speech recognition" 情報処理学会 東北支部研究会. 03-5-B2-2. 1-8 (2004)

    • Related Report
      2003 Annual Research Report
  • [Publications] 堤 怜介: "Improving lecture speech recognition performance by pronunciation-variation dependence and unsupervised adaptation" 日本音響学会講演論文集(春季). 2-11-3. 105-106 (2004)

    • Related Report
      2003 Annual Research Report
  • [Publications] 金野 弘明: "Character-string N-grams combining mutual information and occurrence frequency" 日本音響学会講演論文集(春季). 2-8-4. 67-68 (2004)

    • Related Report
      2003 Annual Research Report
  • [Publications] 小坂 哲夫: "Noisy speech recognition with discrete-mixture HMMs based on MAP estimation" 18th International Congress on Acoustics. Tu. P2.8. (2004)

    • Related Report
      2003 Annual Research Report


Published: 2003-04-01   Modified: 2016-04-21  
