Improvement of speech recognition performance by using phase information with long analysis window

Research Project

Project/Area Number 24500201
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Research Field Perception information processing/Intelligent robotics
Research Institution Toyohashi University of Technology (2012, 2014)
Toyota National College of Technology (2013)

Principal Investigator

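YAMAMOTO Kazumasa  Toyohashi University of Technology, Graduate School of Engineering, Associate Professor (40324230)

Co-Investigator (Kenkyū-buntansha) NAKAGAWA Seiichi  Toyohashi University of Technology, Leading Graduate School Education Promotion Organization, Specially Appointed Professor (20115893)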
Project Period (FY) 2012-04-01 – 2015-03-31
Project Status Completed (Fiscal Year 2014)
Budget Amount
¥5,330,000 (Direct Cost: ¥4,100,000, Indirect Cost: ¥1,230,000)
Fiscal Year 2014: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2013: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2012: ¥2,730,000 (Direct Cost: ¥2,100,000, Indirect Cost: ¥630,000)
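Keywords speech recognition / acoustic model / acoustic features / phase spectrum / group delay spectrum / analysis window / noisy environment / deep neural network / long-term analysis / group delay / deep learning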
Outline of Final Research Achievements

In traditional speech recognition, features based on the amplitude spectrum (typically MFCC or PLP) are the usual acoustic features, while features based on the phase spectrum are almost entirely ignored. In this research, we showed that phase-spectrum-based features, extracted as group-delay-spectrum-based cepstrum features using a longer analysis window (100-200 ms) than the usual one (25 ms), can be used for speech recognition in the same way as amplitude-spectrum-based features, and that recognition performance improves when both kinds of features are used simultaneously. We also studied deep-learning-based acoustic models for robust speech recognition. We modified the “noise-aware training” method for the Deep Neural Network based HMM (DNN-HMM) so that the DNN can handle “enhanced” noisy speech features and noise estimates, and we showed that the proposed method improves noisy speech recognition.
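
As a rough illustration of the group delay spectrum mentioned above, the following sketch computes it for a single windowed frame using the standard identity tau(k) = Re(Y(k) / X(k)), where X is the DFT of x[n] and Y is the DFT of n * x[n]. This is not the project's actual feature extraction: the Hamming window, the 16 kHz sample rate, the naive DFT, and the function names `group_delay_spectrum` and `frame_length` are all assumptions made for illustration, and the step that turns the spectrum into cepstrum features is omitted.

```python
import cmath
import math


def frame_length(sample_rate_hz, window_ms):
    """Number of samples in an analysis window of the given duration."""
    return int(round(sample_rate_hz * window_ms / 1000))


def hamming(n):
    """Hamming window of length n (assumed window type, not from the source)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft_half(x, n_fft):
    """Naive O(N^2) DFT of a real sequence; returns bins 0..n_fft//2."""
    bins = []
    for k in range(n_fft // 2 + 1):
        w = -2j * math.pi * k / n_fft
        bins.append(sum(v * cmath.exp(w * n) for n, v in enumerate(x)))
    return bins


def group_delay_spectrum(frame, n_fft=None, floor=1e-10):
    """Group delay in samples per frequency bin for one frame.

    Uses tau(k) = (X_R * Y_R + X_I * Y_I) / |X|^2 with Y = DFT(n * x[n]).
    The denominator is floored to avoid division by zero at spectral nulls.
    """
    n_fft = n_fft or len(frame)
    x = [s * w for s, w in zip(frame, hamming(len(frame)))]
    y = [n * v for n, v in enumerate(x)]
    X = dft_half(x, n_fft)
    Y = dft_half(y, n_fft)
    return [
        (a.real * b.real + a.imag * b.imag) / max(abs(a) ** 2, floor)
        for a, b in zip(X, Y)
    ]


# Window sizes from the summary, at an assumed 16 kHz sample rate.
short_window = frame_length(16000, 25)   # usual window
long_window = frame_length(16000, 100)   # lower end of the long window range
```

A quick sanity check is an impulse delayed by d samples, whose group delay should equal d at every frequency bin. For real use, an FFT routine would replace the naive DFT, which is only practical for short frames.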

Report

(4 results)
  • 2014 Annual Research Report   Final Research Report ( PDF )
  • 2013 Research-status Report
  • 2012 Research-status Report

Research Products

    (7 results)

Presentation (7 results)

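  • [Presentation] Evaluation of noisy speech recognition with DNN-HMM acoustic models combining noise-aware training and SS (2015)

    • Author(s)
      阿部晃大, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      Acoustical Society of Japan, 2015 Spring Meeting
    • Place of Presentation
      Chuo University, Korakuen Campus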
    • Year and Date
      2015-03-16 – 2015-03-18
    • Related Report
      2014 Annual Research Report
  • [Presentation] Speech recognition based on Itakura-Saito divergence and dynamics / sparseness constraints from mixed sound of speech and music by non-negative matrix factorization (2014)

    • Author(s)
      Naoki Hashimoto, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      INTERSPEECH 2014
    • Place of Presentation
      Singapore EXPO (Singapore)
    • Year and Date
      2014-09-15 – 2014-09-18
    • Related Report
      2014 Annual Research Report
  • [Presentation] Comparison of syllable-based and phoneme-based DNN-HMM in Japanese speech recognition (2014)

    • Author(s)
      Hiroshi Seki, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      International Conference on Advanced Informatics: Concepts, Theory and Applications (ICAICTA 2014)
    • Place of Presentation
      Bandung Institute of Technology (Indonesia)
    • Year and Date
      2014-08-20 – 2014-08-21
    • Related Report
      2014 Annual Research Report
  • [Presentation] Fast NMF based approach and VQ based approach using MFCC distance measure for speech recognition from mixed sound (2013)

    • Author(s)
      Shoichi Nakano, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference
    • Place of Presentation
      Kaohsiung, Taiwan
    • Related Report
      2013 Research-status Report
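  • [Presentation] Improvement of speech recognition for music-overlapped speech using NMF (2013)

    • Author(s)
      橋本尚亮, Shoichi Nakano, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      Acoustical Society of Japan, 2013 Autumn Meeting
    • Place of Presentation
      Toyohashi University of Technology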
    • Related Report
      2013 Research-status Report
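  • [Presentation] Recognition of music-overlapped speech using a fast NMF method based on cepstral distance and a VQ method (2013)

    • Author(s)
      Shoichi Nakano, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      Acoustical Society of Japan, 2013 Spring Meeting
    • Place of Presentation
      Tokyo University of Technology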
    • Related Report
      2012 Research-status Report
  • [Presentation] Fast NMF based approach and improved VQ based approach for speech recognition from mixed sound (2012)

    • Author(s)
      Shoichi Nakano, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2012
    • Place of Presentation
      Hollywood, USA
    • Related Report
      2012 Research-status Report

Published: 2013-05-31   Modified: 2019-07-29  
