2009 Fiscal Year Annual Research Report

Expressive multilingual speech synthesis based on the generation process model, and automatic speech translation using it

Research Project

Project/Area Number	21300061
Research Institution	The University of Tokyo
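Principal Investigator	Keikichi Hirose (広瀬啓吉) The University of Tokyo, Graduate School of Information Science and Technology, Professor (50111472)
Co-Investigator(Kenkyū-buntansha)	Nobuaki Minematsu (峯松信明) The University of Tokyo, Graduate School of Information Science and Technology, Associate Professor (90273333)
Keywords	Generation process model / Fundamental frequency pattern / Corpus-based prosody control / Automatic speech translation / Utterance focus / HMM-based speech synthesis / Voice quality and tone / Speech morphing
Research Abstract	Building on "speech synthesis based on corpus-based prosody control within the framework of the generation process model of fundamental frequency patterns (F_0 model)", this research advances prosody control for multiple languages in an integrated manner and develops speech synthesis methods for those languages that allow flexible control of voice quality and tone. The aim is to reflect the voice quality and tone of the original utterance, as well as its intention, attitude, and emotion, in the translated speech. The following results were achieved.
1. For read-speech corpora of Japanese and Chinese, F_0 patterns were extracted and the positions and magnitudes of the F_0 model commands were estimated, yielding prosodic corpora.
2. The prosodic features of English and German were compared with those of Japanese and Chinese, showing that phrase components have much in common across languages, such as their correspondence with syntax and the length of phrases. Based on this result and on findings for Japanese, a rule-based phrase component control method was developed for Chinese and its effectiveness was confirmed.
3. For speech with and without focus, the differences in the F_0 model commands were learned, and a method was developed to synthesize focused speech from speech without focus. Speech morphing that interpolates and extrapolates these differences was carried out, and listening experiments confirmed the effectiveness of the method. Applying the same approach to emotional speech synthesis is in progress.
4. Two improvements based on the F_0 model were made to the F_0 patterns generated by HMM-based speech synthesis. One finds the best F_0 model approximation of the generated F_0 pattern; the other interpolates the F_0 patterns of the speech corpus using the F_0 model. Listening to the synthetic speech showed that the former reduces the problems of the generated F_0 patterns, and that the latter is effective in eliminating the quality degradation caused by unvoiced/voiced errors.
5. Using the relative relationships between the Japanese and English phonemes of a bilingual speaker, a method was developed to generate English speech from another speaker's Japanese speech while preserving that speaker's voice quality.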
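Result 3 above, morphing by interpolating and extrapolating F_0 model command differences, can be illustrated with a minimal sketch. It assumes the standard formulation of the generation process model (log F_0 as a base value plus phrase and accent components) with commonly used constants; the function names, the constant values, and the hand-written command lists are illustrative assumptions, not the project's implementation. In the project the command differences are learned from a corpus, whereas here they are simply given.

```python
import math

# Typical constants for the generation process model (assumed values).
ALPHA = 3.0   # natural angular frequency of the phrase control mechanism
BETA = 20.0   # natural angular frequency of the accent control mechanism
GAMMA = 0.9   # ceiling of the accent component


def phrase_response(t):
    """Impulse response of the phrase control mechanism."""
    return ALPHA ** 2 * t * math.exp(-ALPHA * t) if t >= 0 else 0.0


def accent_response(t):
    """Step response of the accent control mechanism, capped at GAMMA."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + BETA * t) * math.exp(-BETA * t), GAMMA)


def log_f0(t, base_hz, phrase_cmds, accent_cmds):
    """Log F_0 at time t (seconds).

    phrase_cmds: list of (onset, magnitude)
    accent_cmds: list of (onset, offset, amplitude)
    """
    value = math.log(base_hz)
    for t0, ap in phrase_cmds:
        value += ap * phrase_response(t - t0)
    for t1, t2, aa in accent_cmds:
        value += aa * (accent_response(t - t1) - accent_response(t - t2))
    return value


def morph_accents(neutral, focused, rate):
    """Scale the accent amplitude difference between two command lists.

    rate = 0 gives the neutral commands, rate = 1 the focused ones,
    0 < rate < 1 interpolates, and rate > 1 extrapolates.
    Timing is taken from the neutral commands (a simplification).
    """
    return [
        (t1, t2, a_n + rate * (a_f - a_n))
        for (t1, t2, a_n), (_, _, a_f) in zip(neutral, focused)
    ]


if __name__ == "__main__":
    phrase = [(0.0, 0.5)]
    neutral = [(0.2, 0.5, 0.4)]
    focused = [(0.2, 0.5, 0.8)]
    for rate in (0.0, 0.5, 1.0, 1.5):
        accents = morph_accents(neutral, focused, rate)
        f0 = math.exp(log_f0(0.4, 120.0, phrase, accents))
        print(f"rate={rate:.1f}  F0(0.4 s)={f0:.1f} Hz")
```

Because the morphing acts on a handful of command amplitudes rather than on the F_0 contour sample by sample, the degree of focus can be changed with a single `rate` value while the contour stays within what the model can generate.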

Research Products
(5 results)

All 2010 2009

All Journal Article (3 results) (of which Peer Reviewed: 3 results) Presentation (1 results) Book (1 results)

[Journal Article] HMM-Based synthesis of fundamental frequency contours using the generation process model (2010)
- Author(s)
  Tetsuya Matsuda
- Journal Title
  
  Proceedings of International Workshop on Nonlinear Circuits and Signal Processing 1
  
  Pages: 464-467
- Peer Reviewed
[Journal Article] Generation of fundamental frequency in HMM-based TTS using generation process model (2010)
- Author(s)
  Miaomiao Wang
- Journal Title
  
  Proceedings of International Conference on Speech Prosody 1 (in press, acceptance confirmed)
- Peer Reviewed
[Journal Article] Control of prosodic focus in corpus-based generation of fundamental frequency contours of Japanese based on the generation process model (2009)
- Author(s)
  Keiko Ochi
- Journal Title
  
  Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing 1
  
  Pages: 4485-4488
- Peer Reviewed
[Presentation] Control of prosodic features based on the super-positional representation of F_0 contours -toward flexible control of prosodic features in speech synthesis- (2009)
- Author(s)
  Keikichi Hirose
- Organizer
  International Workshop on Spoken Language Prosody
- Place of Presentation
  C-DAC, Kolkata, India
- Year and Date
  2009-11-25
[Book] Speech prosody corpora based on F_0 contour generation model and automatic extraction of model parameters, in Computer Processing of Asian Spoken Languages, edited by Shuichi Itahashi et al. (2010)
- Author(s)
  Keikichi Hirose
- Total Pages
  372
- Publisher
  Consideration Books, Los Angeles
