Research on construction and application of high discriminative speech feature space using heterogeneous speech units and multiple languages
Project/Area Number |
15K00262
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Research Field |
Perceptual information processing
|
Research Institution | National Institute of Advanced Industrial Science and Technology |
Principal Investigator |
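Lee Shi-wook, National Institute of Advanced Industrial Science and Technology, Department of Information Technology and Human Factors, Senior Researcher (50415642)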
|
Co-Investigator(Kenkyū-buntansha) |
Itoh Yoshiaki, Iwate Prefectural University, Faculty of Software and Information Science, Professor (90325928)
|
Project Period (FY) |
2015-10-21 – 2018-03-31
|
Project Status |
Completed (Fiscal Year 2017)
|
Budget Amount |
¥4,420,000 (Direct Cost: ¥3,400,000, Indirect Cost: ¥1,020,000)
Fiscal Year 2017: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Fiscal Year 2016: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2015: ¥2,340,000 (Direct Cost: ¥1,800,000, Indirect Cost: ¥540,000)
|
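Keywords | speech information processing / pattern recognition / human interface / time series analysis / statistical pattern recognition / information retrieval / multivariate analysis / intelligent information processing / speech recognition / heterogeneous speech units / deep learning / system integration / spoken term detection / multilingual processing |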
Outline of Final Research Achievements |
This research aims to improve speech recognition performance by enhancing the discriminative ability of the speech feature space using heterogeneous information. Because most recent deep-learning-based speech recognition systems are built on a single speech unit, speech diversity cannot be modeled sufficiently even with enormous amounts of speech data. To address this problem, we adopt a sub-phonetic segment unit, a temporally extended speech unit that is completely different from the conventional context-dependent speech unit. We confirmed that the proposed highly discriminative speech feature space based on heterogeneous speech units is effective across a wide range of speech recognition systems, from conventional generative models to leading-edge deep learning models.
|
Report
(4 results)
Research Products
(22 results)