Speech information processing using deep generative models and their factorization
Project/Area Number |
25280058
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Partial Multi-year Fund |
Section | General |
Research Field |
Perceptual information processing
|
Research Institution | Tokyo Institute of Technology |
Principal Investigator |
Shinoda Koichi Tokyo Institute of Technology, Graduate School of Information Science and Engineering, Professor (10343097)
|
Co-Investigator(Kenkyū-buntansha) |
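IWANO Koji Tokyo City University, Faculty of Media, Professor (90323823)
SHINOZAKI Takahiro Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering, Associate Professor (80447903)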
|
Project Period (FY) |
2013-04-01 – 2016-03-31
|
Project Status |
Completed (Fiscal Year 2015)
|
Budget Amount |
¥16,900,000 (Direct Cost: ¥13,000,000, Indirect Cost: ¥3,900,000)
Fiscal Year 2015: ¥4,160,000 (Direct Cost: ¥3,200,000, Indirect Cost: ¥960,000)
Fiscal Year 2014: ¥6,500,000 (Direct Cost: ¥5,000,000, Indirect Cost: ¥1,500,000)
Fiscal Year 2013: ¥6,240,000 (Direct Cost: ¥4,800,000, Indirect Cost: ¥1,440,000)
|
Keywords | Speech information processing / Deep learning / Speaker adaptation / Multimodal processing |
Outline of Final Research Achievements |
In speech recognition, it is important to train an accurate deep neural network (DNN) acoustic model from a large amount of speech data collected from many speakers. In this study, we developed a framework that improves the accuracy of the DNN acoustic model by factorizing speech data into phoneme and speaker elements. First, we developed a speaker recognition method using a deep Siamese network, in which two DNNs share part of the network. Second, we applied a DNN with a hierarchical phonetic structure to speaker adaptation. Third, we developed a speaker-adaptive training method that uses a student-teacher learning framework with soft targets. With these methods, we improved speaker verification and speech recognition performance. We also studied DNN implementation and DNN structure design.
|
Report
(4 results)
Research Products
(12 results)
[Presentation] TokyoTech-Waseda at TRECVID 2014 (2014)
Author(s)
Nakamasa Inoue, Zhuolin Liang, Mengxi Lin, Tran Hai Dang, Koichi Shinoda, Zhang Xuefeng, Kazuya Ueki
Organizer
NIST TRECVID workshop 2014
Place of Presentation
University of Central Florida (USA)
Year and Date
2014-11-10 – 2014-11-12