Spoken Language Processing Based on Non-Extensive Information Theory
Project/Area Number |
24650079
|
Research Category |
Grant-in-Aid for Challenging Exploratory Research
|
Allocation Type | Multi-year Fund |
Research Field |
Perception information processing/Intelligent robotics
|
Research Institution | Tokyo Institute of Technology |
Principal Investigator |
SHINODA KOICHI Tokyo Institute of Technology, Graduate School of Information Science and Engineering, Professor (10343097)
|
Project Period (FY) |
2012-04-01 – 2015-03-31
|
Project Status |
Completed (Fiscal Year 2014)
|
Budget Amount |
¥3,900,000 (Direct Cost: ¥3,000,000, Indirect Cost: ¥900,000)
Fiscal Year 2014: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000)
Fiscal Year 2013: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2012: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
|
Keywords | Speech information processing / Video information processing / Image information processing |
Outline of Final Research Achievements |
We have developed a methodology for spoken language processing based on non-extensive statistical theory, which is an extension of conventional extensive statistical theory. We first developed q-log spectral subtraction (q-LMSN) to achieve robustness against differences in environmental noise and in channels, and showed that it performed significantly better than conventional cepstral mean normalization (CMN). Next, we developed a recognition method that uses q-Gaussian mixtures for the output probabilities in Gaussian mixture models (GMMs) and hidden Markov models (HMMs). We applied it to speech recognition and to video semantic indexing and demonstrated its effectiveness.
|
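The outline above names q-logarithms and q-Gaussians but gives no formulas. As a minimal sketch, assuming the standard definitions from non-extensive (Tsallis) statistics, the two deformed functions and an unnormalized one-dimensional q-Gaussian can be written as follows. The function names `q_log`, `q_exp`, and `q_gaussian_unnormalized` and the parameter `beta` are hypothetical and chosen for illustration only; this is not the project's implementation of q-LMSN or of its q-Gaussian mixture models.

```python
import math


def q_log(x, q):
    """q-logarithm: (x**(1-q) - 1) / (1-q); reduces to ln(x) as q -> 1."""
    if x <= 0.0:
        raise ValueError("q_log is defined for x > 0")
    if abs(q - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def q_exp(x, q):
    """q-exponential: [1 + (1-q) x]_+ ** (1/(1-q)); reduces to exp(x) as q -> 1."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        if q < 1.0:
            return 0.0  # cutoff: the function has compact support for q < 1
        raise ValueError("q_exp is undefined for this x when q > 1")
    return base ** (1.0 / (1.0 - q))


def q_gaussian_unnormalized(x, mean, beta, q):
    """Unnormalized 1-D q-Gaussian, exp_q(-beta * (x - mean)**2).

    beta > 0 plays the role of an inverse-width parameter (an assumption
    of this sketch). q > 1 gives heavier tails than a Gaussian, q < 1
    gives compact support, and q = 1 recovers the ordinary Gaussian shape.
    """
    return q_exp(-beta * (x - mean) ** 2, q)


if __name__ == "__main__":
    # Compare tail values three units from the mean for several q.
    for q in (0.5, 1.0, 1.5):
        print(q, q_gaussian_unnormalized(3.0, 0.0, 0.5, q))
```

With q > 1 the tail value is larger than the ordinary Gaussian value at the same distance from the mean, which is the heavy-tailed behavior usually cited as the motivation for q-Gaussian models.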
Report
(4 results)
Research Products
(6 results)
[Presentation] TokyoTech-Waseda at TRECVID 2014
Author(s)
Nakamasa Inoue, Zhuolin Liang, Mengxi Lin, Tran Hai Dang, Koichi Shinoda, Zhang Xuefeng, Kazuya Ueki
Organizer
Proc. TRECVID workshop
Place of Presentation
University of Central Florida (USA)
Year and Date
2014-11-10 – 2014-11-12