2015 Fiscal Year Final Research Report

Advanced indexing based on spoken document retrieval and its feedback

Research Project

Project/Area Number	25330128
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Research Field	Multimedia database
Research Institution	Shizuoka University
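Principal Investigator	KAI ATSUHIKO Shizuoka University, Faculty of Engineering, Associate Professor (60283496)
Co-Investigator(Kenkyū-buntansha)	WANG Longbiao Nagaoka University of Technology, 技学研究院, Associate Professor (30510458)
Co-Investigator(Renkei-kenkyūsha)	KOGURE Satoru Shizuoka University, Faculty of Informatics, Lecturer (40359758)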
Project Period (FY)	2013-04-01 – 2016-03-31
Keywords	spoken document retrieval / spoken term detection / STD / spoken query / DNN / speech recognition confidence / score normalization
Outline of Final Research Achievements	We investigated and developed elemental technologies for indexing and related processes, designed to enable the efficient and sustainable development of spoken document retrieval systems. To handle changes in speech features caused by differing recording conditions and speakers, we proposed DNN-based voice activity detection (VAD) and dereverberation models as a front end for speaker diarization and speech recognition systems, and improved the accuracy of those systems. We also proposed a DNN-based feature transformation as a rescoring step in a spoken term detection (STD) system to cope with out-of-vocabulary words, which significantly improved STD performance.
Free Research Field	Speech information processing