Research on Speech Recognition for Utterances with Low Recognition Accuracy

Research Project

Project/Area Number 15700163
Research Category

Grant-in-Aid for Young Scientists (B)

Allocation Type Single-year Grants
Research Field Perception information processing/Intelligent robotics
Research Institution The University of Tokushima

Principal Investigator

Satoru Tsuge (柘植 覚)  The University of Tokushima, Faculty of Engineering, Lecturer (00325250)

Project Period (FY) 2003 – 2005
Project Status Completed (Fiscal Year 2005)
Budget Amount
¥2,700,000 (Direct Cost: ¥2,700,000)
Fiscal Year 2005: ¥500,000 (Direct Cost: ¥500,000)
Fiscal Year 2004: ¥900,000 (Direct Cost: ¥900,000)
Fiscal Year 2003: ¥1,300,000 (Direct Cost: ¥1,300,000)
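Keywords speech recognition / utterances with low speech recognition accuracy / correlation analysis / analysis of speech recognition accuracy / long- and short-term speech variability / distributed speech recognition / distributed speaker recognition / Earth Mover's Distance / vector quantization / variation in frequency characteristics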
Research Abstract

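This research had the following two objectives:
◆ Identifying the causes of utterances with low recognition accuracy
◆ Improving the recognition accuracy of such utterances
To achieve these objectives, we carried out the following.
To identify the causes, we used a specific-speaker long-term speech database, which is currently being recorded at regular intervals, and performed correlation analyses against various factors. The results show that, for a specific speaker, speaking rate has a low correlation with speech recognition accuracy. The reason is that speaking rate has a low correlation with substitution errors, but a high negative correlation with insertion errors and a high positive correlation with deletion errors. The insertion and deletion errors therefore cancel each other out, leaving a low correlation between speaking rate and recognition accuracy. We also examined the correlation between speech recognition accuracy and the correct rate of each vowel, and found that the vowels /a/ and /u/ correlate highly with speech recognition accuracy.
To improve the recognition accuracy of low-accuracy utterances, we used the same data as in the cause analysis and attempted to adapt the acoustic model to each recording day and each recording time of day. Specifically, we examined whether speech from within the same day or speech from the same time of day is more effective for improving the recognition rate. The results show that adapting the acoustic model with speech uttered within the same day is effective for improving speech recognition accuracy.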
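
The cancellation effect described in the correlation analysis above can be illustrated with a minimal sketch. All numbers below are synthetic and chosen only for illustration; they are not taken from the project's database. The helper names `pearson` and `word_accuracy` are hypothetical, and the accuracy measure (N - S - D - I) / N is the conventional word accuracy, assumed here because the abstract does not state which measure was used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def word_accuracy(n_ref, subs, dels, ins):
    """Conventional accuracy: (N - S - D - I) / N (assumed measure)."""
    return (n_ref - subs - dels - ins) / n_ref


# Synthetic per-session data (illustrative only, not measured values).
n_ref = 1000                                  # reference units per session
rate = [6.0, 6.5, 7.0, 7.5, 8.0, 8.5]         # speaking rate, arbitrary units
ins = [30, 26, 22, 18, 14, 10]                # insertions fall as rate rises
dels = [10, 14, 18, 22, 26, 30]               # deletions rise as rate rises
subs = [50, 48, 52, 49, 51, 50]               # substitutions unrelated to rate

acc = [word_accuracy(n_ref, s, d, i) for s, d, i in zip(subs, dels, ins)]

r_ins = pearson(rate, ins)    # strongly negative
r_del = pearson(rate, dels)   # strongly positive
r_acc = pearson(rate, acc)    # weak, because the two trends offset each other

print(f"rate vs insertions: {r_ins:+.2f}")
print(f"rate vs deletions:  {r_del:+.2f}")
print(f"rate vs accuracy:   {r_acc:+.2f}")
```

In this toy setting the insertion and deletion counts move in opposite directions by the same amount, so their sum is constant and only the rate-independent substitution errors affect accuracy. Real data would cancel only partially.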

Report

(3 results)
  • 2005 Annual Research Report
  • 2004 Annual Research Report
  • 2003 Annual Research Report

Research Products

(18 results)


All Journal Article (12 results) Publications (6 results)

  • [Journal Article] Nonparametric Speaker Recognition Method Using Earth Mover's Distance (2006)

    • Author(s)
      Shingo Kuroiwa
    • Journal Title

      IEICE Transactions on Information and Systems Vol.E89-D, No.3

      Pages: 1074-1081

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Acoustic Model Adaptation for Codec Speech based on Learning-by-Doing Concept (2006)

    • Author(s)
      Shingo Kuroiwa
    • Journal Title

      Advances in Natural Language Processing, Research in Computing Science, Vol.18

      Pages: 105-114

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Specific Speaker's Japanese Speech Corpus over Long and Short Time Periods (2006)

    • Author(s)
      Satoru Tsuge
    • Journal Title

      Advances in Natural Language Processing, Research in Computing Science, Vol.18

      Pages: 115-124

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Data Collection for Investigating Speech Variability in a Specific Speaker Over Long and Short Time Periods (2005)

    • Author(s)
      Satoru Tsuge
    • Journal Title

      Proc. of 2005 IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE'05)

      Pages: 152-157

    • Related Report
      2005 Annual Research Report
  • [Journal Article] A Lost Speech Reconstruction Method Using Linguistic Information (2005)

    • Author(s)
      Shingo Kuroiwa
    • Journal Title

      Proc. of 2005 IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE'05)

      Pages: 126-130

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Frequency Characteristic Normalization Method Using Blind Equalization Technique with Multiple References for DSR (2005)

    • Author(s)
      Satoru Tsuge
    • Journal Title

      Proc. of 10th International Conference SPEECH and COMPUTER (SPECOM2005) Vol.1

      Pages: 103-106

    • Related Report
      2005 Annual Research Report
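  • [Journal Article] Frequency Characteristic Normalization Method for the Input System in the ETSI Standard Distributed Speech Recognition Front-end (2005)

    • Author(s)
      Satoru Tsuge (柘植 覚)
    • Journal Title

      IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C) Vol.125, No.7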

      Pages: 120-127

    • NAID

      10014100435

    • Related Report
      2004 Annual Research Report
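  • [Journal Article] Dimensionality Reduction Method for the Vector Space Information Retrieval Model Using Non-negative Matrix Factorization (2004)

    • Author(s)
      Satoru Tsuge (柘植 覚)
    • Journal Title

      IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C) Vol.124, No.7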

      Pages: 1500-1506

    • NAID

      10013268306

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Evaluation of frequency characteristic normalization method with multiple reference cepstrum on the Japanese newspaper article sentences speech corpus (2004)

    • Author(s)
      Satoru Tsuge
    • Journal Title

      Proc. of the third International Conference on Information

      Pages: 199-202

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Speaker Recognition using a Non-parametric Speaker Model Representation and Earth Mover's Distance (2004)

    • Author(s)
      Umeda Yoshiyuki
    • Journal Title

      Proc. of International Workshop on statistical modeling approach for speech recognition, "BEYOND HMM"

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Distributed Speaker Recognition using Earth Mover's Distance (2004)

    • Author(s)
      Umeda Yoshiyuki
    • Journal Title

      Proc. of International Conference on Spoken Language Processing Vol.3

      Pages: 2389-2493

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Acoustic model adaptation for coded speech using synthetic speech (2004)

    • Author(s)
      Shingo Kuroiwa
    • Journal Title

      Proc. of International Conference on Spoken Language Processing Vol.4

      Pages: 2925-2928

    • Related Report
      2004 Annual Research Report
  • [Publications] Satoru Tsuge: "Evaluation of ETSI Advanced Front-end and Bias Removal Method on the Japanese Newspaper Article" Proceedings of EUROSPEECH2003. 2145-2148 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] Shingo Kuroiwa: "Blind Equalization Techniques for ETSI Standard DSR Front-end" Proceedings of ICASSP2003. 1. 392-395 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] Koji Tanaka: "An acoustic model adaptation using HMM-based speech synthesis" Proceedings of Natural Language Processing and Knowledge Engineering. 1. 368-373 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] Shingo Kuroiwa: "Blind equalization via minimization of VQ distortion for ETSI standard DSR front-end" Proceedings of Natural Language Processing and Knowledge Engineering. 1. 585-590 (2003)

    • Related Report
      2003 Annual Research Report
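  • [Publications] Satoru Tsuge (柘植 覚): "A Real-time Distributed Speech Recognition Method Robust to Variations in Frequency Characteristics" IPSJ SIG Technical Report (情報処理学会 研究報告). 42. 13-18 (2003)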

    • Related Report
      2003 Annual Research Report
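  • [Publications] Satoru Tsuge (柘植 覚): "A Real-time Frequency Characteristic Normalization Method for Distributed Speech Recognition" Acoustical Society of Japan Autumn Meeting (日本音響学会 秋季講演発表会). 111-112 (2003)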

    • Related Report
      2003 Annual Research Report

Published: 2003-04-01   Modified: 2016-04-21  