Grant-in-Aid for Scientific Research (A)
|Research Institution||The University of Tokyo|
HIROSE Keikichi, The University of Tokyo, Graduate School of Engineering, Department of Information and Communication Engineering, Professor (50111472)
SUZUKI Toshikatsu, Tokyo Electric Power Co., Systems Research Laboratory, Chief (Research Position)
MINEMATSU Nobuaki, Toyohashi University of Technology, Department of Information and Computer Sciences, Assistant (90273333)
OHNO Sumio, Science University of Tokyo, Department of Applied Electronics, Assistant (80256677)
KOSUGI Yasuhiro, Tokyo Electric Power Co., Systems Research Laboratory, Senior Researcher
KOSUGI Yasuhiro, Tokyo Electric Power Co., Telecommunications Engineering Department, Chief Researcher
|Project Period||Fiscal Years 1996 – 1998|
|Project Status||Completed (Fiscal Year 1998)|
|Budget Amount||¥8,000,000 (Direct Cost: ¥8,000,000)|
Fiscal Year 1998 : ¥1,700,000 (Direct Cost : ¥1,700,000)
Fiscal Year 1997 : ¥1,700,000 (Direct Cost : ¥1,700,000)
Fiscal Year 1996 : ¥4,600,000 (Direct Cost : ¥4,600,000)
|Keywords||Spoken Dialogue System / Speech Recognition / Speech Synthesis / Multi-lingual System / Viterbi Bayesian Predictive Classification / Bayesian Predictive Classification / Waveform Concatenation Synthesis / Prosodic Modeling / Tone Recognition / Viterbi Search / TD-PSOLA / Dialogue Processing / Automatic Language Identification / HMM|
Aiming at a spoken dialogue system that operates in both Japanese and Chinese, in order to examine the feasibility of practical multilingual spoken dialogue systems, we obtained the following major results.
1. After selecting literature retrieval as the system task, we prepared the necessary databases and installed a dictionary for speech synthesis. A speech corpus was also constructed for training and evaluating speech recognition.
2. Phoneme HMMs and phoneme-class HMMs were trained using the corpus. A method was developed to identify whether the input speech is Japanese or Chinese, based on the recognized phoneme/phoneme-class sequences.
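The language identification step can be illustrated with a toy sketch. Note that the project used phoneme/phoneme-class HMMs, whereas this sketch substitutes simple smoothed bigram models over recognized phoneme-class sequences; all function names and the toy data are hypothetical:

```python
from collections import defaultdict
import math

def train_bigram(sequences):
    """Train a phoneme-class bigram model with add-one smoothing over the observed vocabulary."""
    counts = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
            vocab.update((a, b))
    return counts, vocab

def log_likelihood(seq, model):
    """Smoothed bigram log-likelihood of a phoneme-class sequence."""
    counts, vocab = model
    total = 0.0
    v = len(vocab)
    for a, b in zip(seq, seq[1:]):
        c_ab = counts[a][b]
        c_a = sum(counts[a].values())
        total += math.log((c_ab + 1) / (c_a + v))
    return total

def identify(seq, models):
    """Return the language whose model assigns the highest likelihood to the input."""
    return max(models, key=lambda lang: log_likelihood(seq, models[lang]))

# Toy training data: hypothetical phoneme-class sequences (C=consonant, V=vowel, N=nasal)
ja = [["C", "V", "C", "V", "C", "V"], ["V", "C", "V", "N", "C", "V"]]
zh = [["C", "V", "N", "C", "V", "N"], ["C", "V", "N", "V", "N"]]
models = {"ja": train_bigram(ja), "zh": train_bigram(zh)}
print(identify(["C", "V", "N", "C", "V", "N"], models))
```

In the actual system the sequences would come from a phoneme/phoneme-class recognizer, and the language-dependent sequence models would be trained on the corpus described above.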
3. A robust speech recognition method was developed based on Bayesian predictive classification with the Viterbi approximation. An adaptation method was further proposed, in which an improved posterior probability density function was estimated via sequential Bayesian learning on adaptation data. Another robust method, the minimax approach, was also extended to make it applicable to continuous speech.
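For reference, Bayesian predictive classification as usually formulated in the speech recognition literature (the exact variant used in the project may differ) replaces the plug-in likelihood computed with point-estimated HMM parameters $\hat{\lambda}$ by an average over the parameter posterior:

$$\tilde{p}(X \mid W) = \int p(X \mid \lambda, W)\, p(\lambda \mid W)\, d\lambda$$

The Viterbi approximation restricts the observation likelihood to the single best state sequence $s = (s_1, \ldots, s_T)$:

$$p(X \mid \lambda, W) \approx \max_{s} \prod_{t=1}^{T} a_{s_{t-1} s_t}\, b_{s_t}(x_t)$$

which keeps the integral tractable while retaining robustness to mismatch between training and test conditions.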
4. An automatic waveform concatenation speech synthesis method was developed. The method segments the speech waveform using speech recognition techniques and automatically places pitch marks after LMA analysis. It was utilized for Chinese speech synthesis.
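The pitch-marking step can be roughly illustrated as placing marks one fundamental period apart along an F0 track. This is a simplified sketch only: the reported method derives its information from LMA analysis, which is not reproduced here, and `place_pitch_marks` and its arguments are hypothetical names:

```python
def place_pitch_marks(f0_track, frame_shift=0.005):
    """Place one pitch mark per fundamental period along an F0 track.
    f0_track: per-frame F0 values in Hz (0 for unvoiced frames);
    frame_shift: frame interval in seconds."""
    marks = []
    t = 0.0
    end = len(f0_track) * frame_shift
    while t < end:
        marks.append(t)
        frame = min(int(t / frame_shift), len(f0_track) - 1)
        f0 = f0_track[frame]
        # advance by one period in voiced regions, by one frame otherwise
        t += 1.0 / f0 if f0 > 0 else frame_shift
    return marks

marks = place_pitch_marks([100.0] * 20)  # 0.1 s of steady 100 Hz voicing
```

Marks produced this way can then anchor pitch-synchronous waveform concatenation (e.g. TD-PSOLA-style overlap-add).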
5. The waveform concatenation synthesizer was combined with a formant synthesizer to build a new hybrid speech synthesis system, which was shown to improve several low-quality phonemes.
6. A speech-synthesis-oriented model of Chinese prosody was developed, based on a newly defined function for the unified representation of Chinese fundamental frequency contours.
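Such F0 modeling is in the family of the Fujisaki command-response model, which represents a log-F0 contour as a baseline plus superposed phrase and tone (accent) components; for Mandarin, tone commands may take negative amplitudes. A minimal sketch follows, where the parameter values and helper names are illustrative assumptions, not the project's actual formulation:

```python
import math

def Gp(t, alpha=3.0):
    """Phrase-command impulse response of the Fujisaki model."""
    return alpha ** 2 * t * math.exp(-alpha * t) if t >= 0 else 0.0

def Ga(t, beta=20.0, gamma=0.9):
    """Tone/accent-command step response, ceiling-limited at gamma."""
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma) if t >= 0 else 0.0

def ln_f0(t, fb, phrases, tones):
    """ln F0(t) = ln Fb + phrase components + tone components.
    phrases: list of (onset T0, magnitude Ap);
    tones:   list of (onset T1, offset T2, amplitude Aa),
    where Aa may be negative for Mandarin low/falling tones."""
    val = math.log(fb)
    for t0, ap in phrases:
        val += ap * Gp(t - t0)
    for t1, t2, aa in tones:
        val += aa * (Ga(t - t1) - Ga(t - t2))
    return val

# Example: one phrase command, one positive and one negative tone command
f0 = math.exp(ln_f0(0.5, fb=110.0,
                    phrases=[(0.0, 0.5)],
                    tones=[(0.2, 0.4, 0.4), (0.6, 0.8, -0.3)]))
```

The negative-amplitude tone commands are what distinguish such Chinese-oriented formulations from the original accent-language model.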
7. A method was developed for precise tone recognition of Chinese continuous speech. The method uses features of the tone nucleus of each syllable only.
8. A Japanese/Chinese spoken dialogue system was constructed for literature retrieval. Chinese responses were pre-stored sentences, while Japanese responses were generated from semantic representations. The system was confirmed to operate in both Japanese and Chinese.