2017 Fiscal Year Final Research Report
Accurate speech recognition system with deep neural network introducing human auditory characteristic in real environments
Project/Area Number |
15K00233
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Research Field |
Perceptual information processing
|
Research Institution | Chubu University (2017); Toyohashi University of Technology (2015-2016) |
Principal Investigator |
|
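Co-Investigator (Kenkyū-buntansha) |
NAKAGAWA Seiichi, Toyohashi University of Technology, Leading Graduate School Education Promotion Organization, Specially Appointed Professor (20115893)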
|
Project Period (FY) |
2015-04-01 – 2018-03-31
|
Keywords | Speech recognition / Deep learning / Deep Neural Network / Auditory characteristics / Acoustic features / Filterbank |
Outline of Final Research Achievements |
Deep learning has been introduced into speech recognition, and the technology is gradually coming into practical use, but recognition performance is still insufficient in noisy environments and for distant-talking speech. The purpose of this research is to improve speech recognition accuracy by combining a DNN (Deep Neural Network) acoustic model with human auditory characteristics. We proposed a method that uses deep learning to automatically learn the feature-extraction filterbank at the bottom of the DNN acoustic model while taking human auditory characteristics into account. The method improved accuracy in speaker-independent speech recognition. It also improved speaker-adapted recognition accuracy even when only a small amount of adaptation data was available. These results demonstrate the effectiveness of the proposed method.
|
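The idea of a filterbank placed at the bottom of a DNN acoustic model can be illustrated with a minimal sketch of the forward pass of a parameterized filterbank layer. The report does not specify the filter shape or its initialization, so the Gaussian shape, the mel-scale initialization, and all names below (`init_params`, `filterbank_forward`) are assumptions made only for illustration. In a trainable version, the centers, widths, and gains would be updated by backpropagation together with the DNN weights; that step is not shown here.

```python
import math

def hz_to_mel(f):
    """Convert frequency in Hz to the mel scale (a common auditory approximation)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def init_params(num_filters, sample_rate):
    """Hypothetical initialization: mel-spaced centers, widths from neighbor spacing."""
    max_mel = hz_to_mel(sample_rate / 2.0)
    step = max_mel / (num_filters + 1)
    centers = [mel_to_hz(step * (i + 1)) for i in range(num_filters)]
    # Assumed width (Gaussian sigma): half the distance between neighboring band edges.
    widths = [0.5 * (mel_to_hz(step * (i + 2)) - mel_to_hz(step * i))
              for i in range(num_filters)]
    gains = [1.0] * num_filters
    return centers, widths, gains

def filterbank_forward(power_spectrum, sample_rate, centers, widths, gains):
    """Apply Gaussian-shaped filters to one power-spectrum frame, then log-compress."""
    num_bins = len(power_spectrum)
    bin_hz = (sample_rate / 2.0) / (num_bins - 1)
    outputs = []
    for c, w, g in zip(centers, widths, gains):
        energy = 0.0
        for k, p in enumerate(power_spectrum):
            f = k * bin_hz
            energy += g * math.exp(-0.5 * ((f - c) / w) ** 2) * p
        # Small floor avoids log(0) for silent frames.
        outputs.append(math.log(energy + 1e-10))
    return outputs

# Example: 8 filters on a flat 257-bin spectrum sampled at 16 kHz.
centers, widths, gains = init_params(8, 16000)
features = filterbank_forward([1.0] * 257, 16000, centers, widths, gains)
```

Because every filter is described by only three parameters in this sketch, adapting the layer to a new speaker would involve far fewer values than retraining full weight matrices, which is consistent with the small-adaptation-data setting the outline mentions.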
Free Research Field |
Speech information processing
|