Budget Amount *help |
¥16,900,000 (Direct Cost: ¥13,000,000、Indirect Cost: ¥3,900,000)
Fiscal Year 2015: ¥4,160,000 (Direct Cost: ¥3,200,000、Indirect Cost: ¥960,000)
Fiscal Year 2014: ¥6,500,000 (Direct Cost: ¥5,000,000、Indirect Cost: ¥1,500,000)
Fiscal Year 2013: ¥6,240,000 (Direct Cost: ¥4,800,000、Indirect Cost: ¥1,440,000)
|
Outline of Final Research Achievements |
In speech recognition, it is important to train an accurate deep neural network (DNN) acoustic model from a large amount speech data from many speakers. In this study, we developed a framework to improve accuracy of the DNN acoustic model by factorizing speech data into phoneme and speaker elements. First we developed a speaker recognition method using deep Siamese network in which two DNNs which share its part. Second, we applied a DNN with a hierarchical phonetic structure to speaker adaptation. Third, we developed a speaker-adaptive training method where we utilized a student-teacher learning framework using soft targets. We improved speaker verification and speech recognition performance. We also studied DNN implementation and DNN structure design.
|