Improvement of speech recognition performance by using phase information with long analysis window
Project/Area Number |
24500201
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Research Field |
Perception information processing/Intelligent robotics
|
Research Institution | Toyohashi University of Technology (2012, 2014); Toyota National College of Technology (2013) |
Principal Investigator |
YAMAMOTO Kazumasa Toyohashi University of Technology, Graduate School of Engineering, Associate Professor (40324230)
|
Co-Investigator(Kenkyū-buntansha) |
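NAKAGAWA Seiichi Toyohashi University of Technology, Organization for the Promotion of Leading Graduate School Education, Specially Appointed Professor (20115893)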
|
Project Period (FY) |
2012-04-01 – 2015-03-31
|
Project Status |
Completed (Fiscal Year 2014)
|
Budget Amount |
¥5,330,000 (Direct Cost: ¥4,100,000, Indirect Cost: ¥1,230,000)
Fiscal Year 2014: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2013: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2012: ¥2,730,000 (Direct Cost: ¥2,100,000, Indirect Cost: ¥630,000)
|
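Keywords | speech recognition / acoustic model / acoustic features / phase spectrum / group delay spectrum / analysis window / noisy environment / deep neural network / long-term analysis / group delay / deep learning |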
Outline of Final Research Achievements |
In conventional speech recognition, amplitude-spectrum-based features (typically MFCC or PLP) are used as acoustic features, while phase-spectrum-based features are largely ignored. In this research, we showed that phase-spectrum-based features, extracted as group-delay-spectrum-based cepstral features using an analysis window longer (100-200 ms) than the usual one (25 ms), can be used for speech recognition in the same way as amplitude-spectrum-based features, and that recognition performance improves when both feature types are used together. We also studied deep-learning-based acoustic models for robust speech recognition. We modified the “noise-aware training” method for Deep Neural Network based HMMs (DNN-HMM) so that the DNN can handle “enhanced” noisy speech features together with noise estimates, and showed that the proposed method improves noisy speech recognition.
|
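The long-window group delay features described in the outline can be illustrated with a minimal sketch. It is not the project's actual feature extractor: the 16 kHz sampling rate, 10 ms hop, Hamming window, plain (unsmoothed) group delay, DCT-II cepstral transform, and 13 coefficients are all assumptions chosen for illustration, and the function names `group_delay_spectrum` and `group_delay_cepstra` are hypothetical. The group delay is computed with the standard DFT identity tau(w) = (X_R Y_R + X_I Y_I) / |X|^2, where Y is the DFT of n * x[n].

```python
import numpy as np


def group_delay_spectrum(frame, n_fft, eps=1e-8):
    """Group delay (in samples) of one frame, using the DFT identity
    tau = (X_R * Y_R + X_I * Y_I) / |X|^2, with Y = DFT{n * x[n]}."""
    n = np.arange(len(frame))
    X = np.fft.rfft(frame, n_fft)
    Y = np.fft.rfft(n * frame, n_fft)
    # eps guards against division by zero at spectral nulls (assumption).
    return (X.real * Y.real + X.imag * Y.imag) / (np.abs(X) ** 2 + eps)


def group_delay_cepstra(signal, sr=16000, win_ms=100.0, hop_ms=10.0, n_ceps=13):
    """Cepstrum-like features from the group delay spectrum of long frames.

    win_ms=100 follows the 100-200 ms range in the outline; every other
    default is an illustrative assumption.
    """
    win_len = int(sr * win_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    n_fft = 1 << (win_len - 1).bit_length()  # next power of two >= win_len
    window = np.hamming(win_len)
    n_bins = n_fft // 2 + 1
    # DCT-II basis mapping the group delay spectrum to n_ceps coefficients.
    k = np.arange(n_ceps)
    m = np.arange(n_bins)
    basis = np.cos(np.pi * np.outer(k, m + 0.5) / n_bins)
    feats = []
    for start in range(0, len(signal) - win_len + 1, hop):
        frame = signal[start:start + win_len] * window
        feats.append(basis @ group_delay_spectrum(frame, n_fft))
    return np.array(feats)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    one_second = rng.standard_normal(16000)  # stand-in for real speech
    print(group_delay_cepstra(one_second).shape)
```

In a combined system of the kind the outline mentions, such vectors would be concatenated frame by frame with amplitude-based features such as MFCC. The raw group delay is known to be spiky near spectral zeros, so a practical front end would likely add some form of smoothing, which this sketch omits.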
Report
(4 results)
Research Products
(7 results)