2011 Fiscal Year Final Research Report

Expressive Multi-language Speech Synthesis Based on the Generation Process Model and Its Use for Automatic Speech Translation

Research Project

Project/Area Number	21300061
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	General
Research Field	Perception information processing/Intelligent robotics
Research Institution	The University of Tokyo
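Principal Investigator	HIROSE Keikichi The University of Tokyo, Graduate School of Information Science and Technology, Professor (50111472)
Co-Investigator(Kenkyū-buntansha)	MINEMATSU Nobuaki The University of Tokyo, Graduate School of Information Science and Technology, Associate Professor (90273333)
Co-Investigator(Renkei-kenkyūsha)	KAWAI Goh (河合剛) Hokkaido University, Research Faculty of Media and Communication, Associate Professor (70312981)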
Project Period (FY)	2009 – 2011
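Keywords	generation process model / fundamental frequency contour / corpus-based prosody control / automatic speech translation / discourse focus / HMM-based speech synthesis / voice quality and tone / speech morphing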
Research Abstract	A unified study of prosody control for multiple languages was conducted based on the generation process model of fundamental frequency contours (F_0 model). We developed a method of prosody adaptation in which differences in F_0 model commands were learned from a parallel speech corpus and applied to baseline speech. Focus control, style conversion, and voice conversion were realized with this method. Furthermore, by using the F_0 model to approximate the F_0 contours of the training speech corpus and/or the generated F_0 contours, we improved the quality of speech synthesized by HMM-based speech synthesis, and we also added focus control. Based on these results, experiments were conducted on conveying discourse information and intentions in speech translation.
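
The idea of adapting prosody by changing F_0 model commands can be illustrated with a minimal sketch. It assumes the standard formulation of the generation process model (commonly called the Fujisaki model), in which the log F_0 contour is a baseline plus phrase and accent components. The constants, the function names, and the `adapt_accents` helper are illustrative assumptions and are not taken from the report; in particular, the command differences are supplied by hand here rather than learned from a parallel corpus.

```python
import math

# Typical textbook constants; assumed here, not taken from the report.
ALPHA = 3.0   # natural angular frequency of the phrase control mechanism (1/s)
BETA = 20.0   # natural angular frequency of the accent control mechanism (1/s)
GAMMA = 0.9   # ceiling of the accent component


def phrase_response(t, alpha=ALPHA):
    """Impulse response of the phrase control mechanism."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0


def accent_response(t, beta=BETA, gamma=GAMMA):
    """Step response of the accent control mechanism, clipped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def f0_contour(times, fb, phrase_cmds, accent_cmds):
    """Return F_0 values (Hz) at the given times (s).

    phrase_cmds: list of (onset, magnitude)
    accent_cmds: list of (onset, offset, amplitude)
    """
    contour = []
    for t in times:
        log_f0 = math.log(fb)
        log_f0 += sum(ap * phrase_response(t - t0) for t0, ap in phrase_cmds)
        log_f0 += sum(
            aa * (accent_response(t - t1) - accent_response(t - t2))
            for t1, t2, aa in accent_cmds
        )
        contour.append(math.exp(log_f0))
    return contour


def adapt_accents(accent_cmds, amplitude_deltas):
    """Hypothetical adaptation step: add a per-command amplitude difference."""
    return [(t1, t2, aa + d) for (t1, t2, aa), d in zip(accent_cmds, amplitude_deltas)]


times = [i * 0.01 for i in range(100)]          # 0.00 s to 0.99 s
phrases = [(0.0, 0.5)]
accents = [(0.3, 0.6, 0.4)]
baseline = f0_contour(times, 120.0, phrases, accents)
focused = f0_contour(times, 120.0, phrases, adapt_accents(accents, [0.2]))
```

Because the adaptation acts on a handful of command values rather than on frame-level F_0, a change such as placing focus on one accent phrase amounts to modifying a few numbers and regenerating the contour.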

Research Products
(20 results)

All Journal Article (4 results) (of which Peer Reviewed: 4 results) Presentation (13 results) Book (2 results) Remarks (1 results)

[Journal Article] A method for generation of Mandarin F_0 contours based on tone nucleus model and superpositional model (2012)
- Author(s)
  Qinghua Sun, Keikichi Hirose, and Nobuaki Minematsu
- Journal Title
  
  Speech Communication
  
  Volume: Vol.54, Issue 8 Pages: 932-945
- URL
  http://www.sciencedirect.com/science/journal/01676393
- Peer Reviewed
[Journal Article] HMM-based F_0 contour synthesis using the generation process model (2012)
- Author(s)
  Tatsuya Matsuda, Keikichi Hirose, and Nobuaki Minematsu
- Journal Title
  
  Acoustical Science and Technology, Acoustical Society of Japan
- URL
  https://www.jstage.jst.go.jp/browse/ast/33/0/_contents
  http://journals.acoustics.jp/ast-archive/
- Peer Reviewed
[Journal Article] Improvement of prosody in HMM-based speech synthesis using generation process model (2011)
- Author(s)
  Miaomiao Wang, Keikichi Hirose, and Nobuaki Minematsu
- Journal Title
  
  Journal of Signal Processing
  
  Volume: vol.15, no.4 Pages: 279-282
- URL
  http://www.risp.jp/Product.html
- Peer Reviewed
[Journal Article] HMM-based synthesis of fundamental frequency contours using the generation process model (2010)
- Author(s)
  Tatsuya Matsuda, Keikichi Hirose, and Nobuaki Minematsu
- Journal Title
  
  Journal of Signal Processing
  
  Volume: vol.14, no.4 Pages: 277-280
- URL
  http://www.risp.jp/Product.html
- Peer Reviewed
[Presentation] Improved automatic extraction of generation process model commands and its use for generating fundamental frequency contours for training HMM-based speech synthesis (2012)
- Author(s)
  Hiroya Hashimoto, Keikichi Hirose, and Nobuaki Minematsu
- Organizer
  Proceedings INTERSPEECH
- Place of Presentation
  Portland
- Year and Date
  20120909-13
[Presentation] Emotional voice conversion for Mandarin using tone nucleus model - small corpus and high efficiency (2012)
- Author(s)
  Miaomiao Wang, Miaomiao Wen, Keikichi Hirose, and Nobuaki Minematsu
- Organizer
  Proceedings of International Conference on Speech Prosody
- Place of Presentation
  Shanghai
- Year and Date
  20120522-25
[Presentation] Fundamental frequency contour reshaping in HMM-based speech synthesis and realization of prosodic focus using generation process model (2012)
- Author(s)
  Keikichi Hirose, Hiroya Hashimoto, Jun Ikeshima, and Nobuaki Minematsu
- Organizer
  Proceedings of International Conference on Speech Prosody
- Place of Presentation
  Shanghai
- Year and Date
  20120522-25
[Presentation] Fundamental frequency contour generation process model for improved and flexible control of prosodic features in HMM-based speech synthesis (2012)
- Author(s)
  Keikichi Hirose
- Organizer
  Proceedings of International Symposium on Frontiers of Research on Speech and Music
- Place of Presentation
  Gurgaon
- Year and Date
  20120118-20
[Presentation] Representing fundamental frequency contours generated by HMM-based speech synthesis using generation process model (2011)
- Author(s)
  Keikichi Hirose, Tatsuya Matsuda, Hiroya Hashimoto, and Nobuaki Minematsu
- Organizer
  Proceedings of IEEE International Workshop on Machine Learning for Signal Processing
- Place of Presentation
  Beijing
- Year and Date
  20110918-21
[Presentation] Adaptation of prosody in speech synthesis by changing command values of the generation process model of fundamental frequency (2011)
- Author(s)
  Keikichi Hirose, Keiko Ochi, Ryusuke Mihara, Hiroya Hashimoto, Daisuke Saito, and Nobuaki Minematsu
- Organizer
  Proceedings INTERSPEECH
- Place of Presentation
  Florence
- Year and Date
  20110828-31
[Presentation] Prosody conversion for emotional Mandarin speech synthesis using the tone nucleus model (2011)
- Author(s)
  Miaomiao Wen, Miaomiao Wang, Keikichi Hirose, and Nobuaki Minematsu
- Organizer
  Proceedings INTERSPEECH
- Place of Presentation
  Florence
- Year and Date
  20110828-31
[Presentation] Analysis and Synthesis of F_0 Contours for Bangla Readout Speech (2010)
- Author(s)
  Shyamal Das Mandal, Anal Haque Warsi, Tulika Basu, Keikichi Hirose, and Hiroya Fujisaki
- Organizer
  Proceedings Oriental COCOSDA
- Place of Presentation
  Kathmandu
- Year and Date
  20101124-25
[Presentation] Control of prosodic features in corpus-based generation of fundamental frequency contours based on the generation process model (2010)
- Author(s)
  Keikichi Hirose, Keiko Ochi, and Nobuaki Minematsu
- Organizer
  Proceedings IEEE International Conference on Signal Processing
- Place of Presentation
  Beijing
- Year and Date
  20101000
[Presentation] Improved generation of fundamental frequency in HMM-based speech synthesis using generation process model (2010)
- Author(s)
  Miaomiao Wang, Miaomiao Wen, Keikichi Hirose, and Nobuaki Minematsu
- Organizer
  Proceedings INTERSPEECH
- Place of Presentation
  Makuhari
- Year and Date
  20100926-30
[Presentation] Using F_0 contour generation process model for improved and flexible control of prosodic features in HMM-based speech synthesis (2010)
- Author(s)
  Keikichi Hirose, Keiko Ochi, Miaomiao Wang, Tatsuya Matsuda, Miaomiao Wen, and Nobuaki Minematsu
- Organizer
  Proceedings of 21st Conference on Electronic Speech Signal Processing
- Place of Presentation
  Berlin
- Year and Date
  20100908-10
[Presentation] Generation of fundamental frequency in HMM-based TTS using generation process model (2010)
- Author(s)
  Miaomiao Wang, Keikichi Hirose, and Nobuaki Minematsu
- Organizer
  Proceedings of International Conference on Speech Prosody
- Place of Presentation
  Chicago
- Year and Date
  20100511-14
[Presentation] Control of prosodic focus in corpus-based generation of fundamental frequency contours of Japanese based on the generation process model (2009)
- Author(s)
  Keiko Ochi, Keikichi Hirose, and Nobuaki Minematsu
- Organizer
  Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing
- Place of Presentation
  Taipei
- Year and Date
  20090420-24
[Book] Prosodic corpora based on F_0 contour generation model and automatic extraction of model parameters (2010)
- Author(s)
  Keikichi Hirose
- Total Pages
  180-183
- Publisher
  Computer Processing of Asian Spoken Languages
[Book] On the prosodic features for emotional speech (2009)
- Author(s)
  Keikichi Hirose, Qinghua Sun
- Total Pages
  263-274
- Publisher
  Frontiers in Phonetics and Speech Science
[Remarks] (Research achievements)
- URL
  http://www.gavo.t.u-tokyo.ac.jp/~hirose/cv/curriculumvitae.pdf
