2005 Fiscal Year Final Research Report Summary
A Study on Constructing Various Acoustic Models using Distributed Speech Corpora
Project/Area Number |
15200014
|
Research Category |
Grant-in-Aid for Scientific Research (A)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Perception information processing/Intelligent robotics
|
Research Institution | Nagoya University |
Principal Investigator |
TAKEDA Kazuya, Nagoya University, Graduate School of Information Science, Professor (20273295)
|
Co-Investigator(Kenkyū-buntansha) |
SHIKANO Kiyohiro, Nara Institute of Science and Technology, Graduate School of Information Science, Professor (00263426)
KAWAHARA Tatsuya, Kyoto University, Graduate School of Informatics / Academic Center for Computing and Media Studies, Professor (00234104)
|
Project Period (FY) |
2003 – 2005
|
Keywords | speech recognition / acoustic model / speech corpus / distributed database / distributed training / sufficient statistics / speaker adaptation |
Research Abstract |
To collect speech uttered under a variety of environmental conditions, field tests of spoken dialogue systems were conducted for public transportation guidance, in-car information retrieval, and guidance in a public space. Based on the three resulting corpora, a prototype data-sharing infrastructure for acoustic model training was developed. In this system, users can search for particular speech subsets by issuing queries on speaker age, utterance SNR, and the distribution of phoneme frequencies. The system trains a set of HMMs by sharing sufficient statistics, i.e., the visiting count, the branching count, the sum, and the sum of squares, for the Gaussian mixture pdfs of each state of the HMM acoustic models. In addition, to characterize each utterance, a blind SNR estimation method, i.e., one that does not require explicit voice activity detection (VAD), was developed that covers a wide range of SNRs. As for the training strategy, not only maximum likelihood (ML) training over the set of utterances but also a model adaptation method that uses only the statistics was studied. The effectiveness of the adaptation approach using pre-stored statistics for each utterance was confirmed through recognition experiments, in which the accuracy of the model trained by adaptation was almost equivalent to that of a model trained with the pooled EM algorithm.
|
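The idea of training from shared sufficient statistics rather than from pooled audio can be illustrated with a minimal sketch. It assumes a single diagonal-covariance Gaussian component and hypothetical names (`GaussianStats`, `accumulate`, `merge`, `ml_estimate`); it is not the project's actual system, and it omits the branching (transition) counts and the mixture weights for brevity. Each site accumulates the visiting count, the sum, and the sum of squares locally, and the maximum likelihood mean and variance are then recovered from the merged statistics alone.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class GaussianStats:
    """Sufficient statistics for one diagonal-covariance Gaussian (hypothetical helper)."""
    dim: int
    count: float = 0.0                                    # visiting (occupancy) count
    total: List[float] = field(default_factory=list)      # weighted sum of frames
    total_sq: List[float] = field(default_factory=list)   # weighted sum of squared frames

    def __post_init__(self) -> None:
        # Start from zero vectors when no statistics are supplied.
        if not self.total:
            self.total = [0.0] * self.dim
        if not self.total_sq:
            self.total_sq = [0.0] * self.dim

    def accumulate(self, frame: Sequence[float], weight: float = 1.0) -> None:
        """Add one feature frame, weighted by its state/component posterior."""
        self.count += weight
        for d, x in enumerate(frame):
            self.total[d] += weight * x
            self.total_sq[d] += weight * x * x

    def merge(self, other: "GaussianStats") -> "GaussianStats":
        """Combine statistics from two sites; no raw audio is exchanged."""
        return GaussianStats(
            dim=self.dim,
            count=self.count + other.count,
            total=[a + b for a, b in zip(self.total, other.total)],
            total_sq=[a + b for a, b in zip(self.total_sq, other.total_sq)],
        )

    def ml_estimate(self, var_floor: float = 1e-6) -> Tuple[List[float], List[float]]:
        """Maximum likelihood mean and (floored) variance from the statistics."""
        if self.count <= 0.0:
            raise ValueError("no data accumulated")
        mean = [s / self.count for s in self.total]
        var = [max(sq / self.count - m * m, var_floor)
               for sq, m in zip(self.total_sq, mean)]
        return mean, var


# Two sites hold different utterances (toy one-dimensional frames).
site_a = GaussianStats(dim=1)
for frame in ([1.0], [2.0]):
    site_a.accumulate(frame)

site_b = GaussianStats(dim=1)
for frame in ([3.0], [6.0]):
    site_b.accumulate(frame)

# Only the statistics are shared and merged.
merged = site_a.merge(site_b)
mean, var = merged.ml_estimate()
```

Because the three quantities are additive, the estimate from the merged statistics is the same as the one obtained by accumulating all frames at a single site, which is what makes this kind of distributed training possible.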
Research Products
(495 results)
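[Journal Article] Construction of bimodal database for evaluating in-car speech recognition (2005)
Author(s)
Daisuke Negi, Tosiki Maeno, Takayuki Kitasaka, Kensaku Mori, Yasuhito Suenaga, Chiyomi Miyajima, Katsunobu Ito, Kazuya Takeda, Fumitada Itakura, Masaki Sano, Yoshiki Ninomiya
-
Journal Title
IPSJ SIG Technical Report 2005-SLP-55(7)
Pages: 35-40
Description
From the Research Report Summary (Japanese)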
-
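[Journal Article] CENSREC-3: Construction of an in-car isolated-word speech database recorded in real driving conditions and its common evaluation framework (2005)
Author(s)
Masakiyo Fujimoto, Satoshi Nakamura, Kazuya Takeda, Shingo Kuroiwa, Takeshi Yamada, Norihide Kitaoka, Kazumasa Yamamoto, Mitsunori Mizumachi, Takanobu Nishiura, Akira Saso, Chiyomi Miyajima, Toshiki Endo
-
Journal Title
IPSJ SIG Technical Report 2005-SLP-55(8)
Pages: 41-46
Description
From the Research Report Summary (Japanese)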
-
[Journal Article] Construction of bimodal database for evaluating in-car speech recognition (2005)
Author(s)
Daisuke Negi, Tosiki Maeno, Takayuki Kitasaka, Kensaku Mori, Yasuhito Suenaga, Chiyomi Miyajima, Katsunobu Ito, Kazuya Takeda, Fumitada Itakura, Masaki Sano, Yoshiki Ninomiya
-
Journal Title
IPSJ SIG Technical Report 2005-SLP-55(7)
Pages: 35-40
Description
From the Research Report Summary (English)
-
[Journal Article] CENSREC-3: Data collection for in-car speech recognition and its common evaluation framework (2005)
Author(s)
Masakiyo Fujimoto, Satoshi Nakamura, Kazuya Takeda, Shingo Kuroiwa, Takeshi Yamada, Norihide Kitaoka, Kazumasa Yamamoto, Mitsunori Mizumachi, Takanobu Nishiura, Akira Saso, Chiyomi Miyajima, Toshiki Endo
-
Journal Title
IPSJ SIG Technical Report 2005-SLP-55(8)
Pages: 41-46
Description
From the Research Report Summary (English)
-
[Journal Article] CIAIR In-Car Speech Database (2004)
Author(s)
N.Kawaguchi, S.Matsubara, Y.Yamaguchi, K.Takeda, F.Itakura
-
Journal Title
Proc. of International Conference on Spoken Language Processing FrA2702p.18
Pages: 2789-2792
Description
From the Research Report Summary (Japanese)
-
[Journal Article] AURORA-2J: Japanese speech data collection for performance evaluation of speech recognition in noise (2004)
Author(s)
Satoshi Nakamura, Kazumasa Yamamoto, Kazuya Takeda, Shingo Kuroiwa, Norihide Kitaoka, Takeshi Yamada, Mitsunori Mizumachi, Takanobu Nishiura, Masakiyo Fujimoto, Akira Saso, Toshiki Endo
-
Journal Title
International Conference on Speech and Language Technology/Oriental-COCOSDA 2004 (in press)
Description
From the Research Report Summary (Japanese)
-
[Journal Article] CIAIR In-Car Speech Database (2004)
Author(s)
N.Kawaguchi, S.Matsubara, Y.Yamaguchi, K.Takeda, F.Itakura
-
Journal Title
Proc. of International Conference on Spoken Language Processing FrA2702p.18
Pages: 2789-2792
Description
From the Research Report Summary (English)
-
[Journal Article] AURORA-2J: Japanese speech data collection for performance evaluation of speech recognition in noise (2004)
Author(s)
Satoshi Nakamura, Kazumasa Yamamoto, Kazuya Takeda, Shingo Kuroiwa, Norihide Kitaoka, Takeshi Yamada, Mitsunori Mizumachi, Takanobu Nishiura, Masakiyo Fujimoto, Akira Saso, Toshiki Endo
-
Journal Title
International Conference on Speech and Language Technology/Oriental-COCOSDA 2004 (in press)
Description
From the Research Report Summary (English)
-
[Journal Article] CIAIR in-car speech database (2003)
Author(s)
Nobuo Kawaguchi, Shigeki Matsubara, Yukiko Yamaguchi, Kazuya Takeda, Fumitada Itakura
-
Journal Title
Information Processing Society of Japan 2003-SLP-49
Pages: 130-144
Description
From the Research Report Summary (English)
-