2015 Fiscal Year Final Research Report
Advanced indexing based on spoken document retrieval and its feedback
Project/Area Number |
25330128
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Research Field |
Multimedia database
|
Research Institution | Shizuoka University |
Principal Investigator |
KAI Atsuhiko, Shizuoka University, Faculty of Engineering, Associate Professor (60283496)
|
Co-Investigator(Kenkyū-buntansha) |
WANG Longbiao, Nagaoka University of Technology, Institute of GIGAKU, Associate Professor (30510458)
|
Co-Investigator(Renkei-kenkyūsha) |
KOGURE Satoru, Shizuoka University, Faculty of Informatics, Lecturer (40359758)
|
Project Period (FY) |
2013-04-01 – 2016-03-31
|
Keywords | spoken document retrieval / spoken term detection / STD / spoken query / DNN / speech recognition confidence / score normalization |
Outline of Final Research Achievements |
We investigated and developed core technologies for indexing and related processing, aimed at the efficient and sustainable development of spoken document retrieval systems. To cope with variation in speech features caused by recording conditions and speakers, we proposed DNN-based voice activity detection (VAD) and dereverberation models as a front end for speaker diarization and speech recognition systems, and improved the accuracy of those systems. We also proposed a DNN-based feature transformation as a rescoring step in a spoken term detection (STD) system to cope with out-of-vocabulary words, which significantly improved STD performance.
|
Free Research Field |
Speech information processing
|