FY2009 Annual Research Report

Expressive Multilingual Speech Synthesis Based on the Generation Process Model and Its Use in Automatic Speech Translation

Research Project

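Project/Area Number	21300061
Research Institution	The University of Tokyo
Principal Investigator	Keikichi Hirose, The University of Tokyo, Graduate School of Information Science and Technology, Professor (50111472)
Co-Investigator	Nobuaki Minematsu, The University of Tokyo, Graduate School of Information Science and Technology, Associate Professor (90273333)
Keywords	generation process model / fundamental frequency pattern / corpus-based prosody control / automatic speech translation / utterance focus / HMM-based speech synthesis / voice quality and speaking style / speech morphing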
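Research Abstract	Building on "speech synthesis with corpus-based prosody control in the framework of the generation process model of fundamental frequency patterns (the F_0 model)," this research advances the study of prosody control across multiple languages in an integrated way, develops speech synthesis methods for those languages that allow flexible control of voice quality and speaking style, and uses them to carry the voice quality and style of the original utterance, or its intention, attitude, and emotion, over to the translated speech. The following results were achieved.
1. For read-speech corpora of Japanese and Chinese, F_0 patterns were extracted and the positions and magnitudes of the F_0 model commands were estimated, yielding prosodic corpora.
2. The prosodic features of English and German were compared with those of Japanese and Chinese, showing that phrase components have much in common regardless of language, such as their correspondence with syntax and with phrase length. Based on this result and on findings for Japanese, a rule-based method for controlling phrase components was developed for Chinese, and its effectiveness was confirmed.
3. The differences in F_0 model commands between speech with and without focus were learned, and a method was developed for synthesizing focused speech from speech without focus. Speech morphing that interpolates and extrapolates these differences was carried out, and listening tests confirmed the effectiveness of the method. Work is under way to apply the same approach to emotional speech synthesis.
4. Two improvements based on the F_0 model were made to the F_0 patterns generated by HMM-based speech synthesis. One finds the best F_0 model approximation to the generated F_0 pattern; the other interpolates the F_0 patterns of the speech corpus using the F_0 model. Listening to the synthesized speech showed that the former reduces the problems in the generated F_0 patterns and that the latter is effective in eliminating the quality degradation caused by voiced/unvoiced segment errors.
5. Using the relative relationships between the Japanese and English phonemes of a bilingual speaker, a method was developed for generating English speech from another speaker's Japanese speech while preserving that speaker's voice quality.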
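The abstract refers to F_0 model commands and to morphing by interpolating and extrapolating command differences without giving formulas. The following is a minimal sketch, assuming the F_0 model is the widely used command-response formulation in which ln F_0 is a baseline plus phrase components (impulse responses) and accent components (clipped step responses). The constants `ALPHA`, `BETA`, and `GAMMA`, all function names, and the linear morphing rule in `morph_accents` are illustrative assumptions, not values or procedures taken from the report.

```python
import math

# Assumed, commonly quoted constants; the report does not state them.
ALPHA = 3.0   # natural angular frequency of the phrase control mechanism (1/s)
BETA = 20.0   # natural angular frequency of the accent control mechanism (1/s)
GAMMA = 0.9   # ceiling of the accent component


def phrase_response(t, alpha=ALPHA):
    """Impulse response of the phrase control mechanism (zero before onset)."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0


def accent_response(t, beta=BETA, gamma=GAMMA):
    """Step response of the accent control mechanism, clipped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def log_f0(t, base_hz, phrases, accents):
    """Return ln F_0 at time t (seconds).

    phrases: list of (onset, magnitude)
    accents: list of (onset, offset, amplitude)
    """
    value = math.log(base_hz)
    for onset, magnitude in phrases:
        value += magnitude * phrase_response(t - onset)
    for onset, offset, amplitude in accents:
        value += amplitude * (accent_response(t - onset)
                              - accent_response(t - offset))
    return value


def morph_accents(neutral, focused, rate):
    """Hypothetical linear morphing between paired accent commands.

    rate = 0 gives the neutral commands, rate = 1 the focused ones;
    0 < rate < 1 interpolates and rate > 1 extrapolates the difference.
    """
    return [
        tuple(n + rate * (f - n) for n, f in zip(cmd_n, cmd_f))
        for cmd_n, cmd_f in zip(neutral, focused)
    ]
```

For example, `morph_accents(neutral, focused, 1.5)` exaggerates the focus-related change in each accent command by half again, and the resulting commands can be passed to `log_f0` to obtain the morphed contour. Operating on a handful of commands rather than on frame-by-frame F_0 values is what makes this kind of manipulation compact.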

Research Products
(5 results)

Journal Article (3 results) (of which peer-reviewed: 3 results) Presentation (1 result) Book (1 result)

[Journal Article] HMM-Based synthesis of fundamental frequency contours using the generation process model (2010)
- Author(s)/Presenter(s)
  Tetsuya Matsuda
- Journal
  Proceedings of International Workshop on Nonlinear Circuits and Signal Processing 1
  Pages: 464-467
- Peer-reviewed
[Journal Article] Generation of fundamental frequency in HMM-based TTS using generation process model (2010)
- Author(s)/Presenter(s)
  Miaomiao Wang
- Journal
  Proceedings of International Conference on Speech Prosody 1 (in press, accepted for publication)
- Peer-reviewed
[Journal Article] Control of prosodic focus in corpus-based generation of fundamental frequency contours of Japanese based on the generation process model (2009)
- Author(s)/Presenter(s)
  Keiko Ochi
- Journal
  Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing 1
  Pages: 4485-4488
- Peer-reviewed
[Presentation] Control of prosodic features based on the super-positional representation of F_0 contours -toward flexible control of prosodic features in speech synthesis- (2009)
- Author(s)/Presenter(s)
  Keikichi Hirose
- Conference
  International Workshop on Spoken Language Prosody
- Place of Presentation
  C-DAC, Kolkata, India
- Date
  2009-11-25
[Book] Speech prosody corpora based on F_0 contour generation model and automatic extraction of model parameters, in Computer Processing of Asian Spoken Languages, edited by Shuichi Itahashi et al. (2010)
- Author(s)/Presenter(s)
  Keikichi Hirose
- Total Pages
  372
- Publisher
  Consideration Books, Los Angeles
