Expressive Multi-language Speech Synthesis Based on the Generation Process Model and Its Use for Automatic Speech Translation

Research Project

Project/Area Number 21300061
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type Single-year Grants
Section General
Research Field Perception information processing/Intelligent robotics
Research Institution The University of Tokyo

Principal Investigator

HIROSE Keikichi  The University of Tokyo, Graduate School of Information Science and Technology, Professor (50111472)

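Co-Investigator (Kenkyū-buntansha) MINEMATSU Nobuaki (峯松 信明)  The University of Tokyo, Graduate School of Information Science and Technology, Associate Professor (90273333)
Co-Investigator (Renkei-kenkyūsha) KAWAI Goh (河合 剛)  Hokkaido University, Research Faculty of Media and Communication, Associate Professor (70312981)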
Project Period (FY) 2009 – 2011
Project Status Completed (Fiscal Year 2011)
Budget Amount
¥17,550,000 (Direct Cost: ¥13,500,000, Indirect Cost: ¥4,050,000)
Fiscal Year 2011: ¥5,980,000 (Direct Cost: ¥4,600,000, Indirect Cost: ¥1,380,000)
Fiscal Year 2010: ¥5,330,000 (Direct Cost: ¥4,100,000, Indirect Cost: ¥1,230,000)
Fiscal Year 2009: ¥6,240,000 (Direct Cost: ¥4,800,000, Indirect Cost: ¥1,440,000)
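Keywords generation process model / fundamental frequency contour / corpus-based prosody control / automatic speech translation / discourse focus / HMM-based speech synthesis / voice quality and tone / speech morphing / speaking style / tone nucleus / multilingual / phoneme duration / utterance focus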
Research Abstract

A unified study of prosody control across multiple languages was conducted based on the generation process model of fundamental frequency contours (F_0 model). We developed a method of prosody adaptation in which differences in F_0 model commands were learned from a parallel speech corpus and applied to baseline speech. With this method, focus control, style conversion, and voice conversion were realized. Furthermore, by using the F_0 model to approximate the F_0 contours of the training speech corpus and/or the generated F_0 contours, we improved the quality of speech synthesized by HMM-based speech synthesis, and we also added focus control to it. Based on these results, experiments were conducted on conveying discourse information and intentions in speech translation.
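
As a rough illustration of the kind of model the abstract refers to, the sketch below evaluates an F_0 contour as a baseline plus phrase and accent components in the log-frequency domain, following the commonly published form of the generation process (Fujisaki) model. It is not the project's implementation: the constants `ALPHA`, `BETA`, and `GAMMA` are typical values assumed here, all function names are hypothetical, and `add_command_differences` only mimics the idea of applying command differences to baseline speech, with hand-picked differences instead of ones learned from a parallel corpus.

```python
import math

# Assumed typical constants; they are not taken from this record.
ALPHA = 3.0   # natural angular frequency of the phrase control mechanism (1/s)
BETA = 20.0   # natural angular frequency of the accent control mechanism (1/s)
GAMMA = 0.9   # ceiling of the accent component


def phrase_response(t, alpha=ALPHA):
    """Impulse response of the phrase control mechanism."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0


def accent_response(t, beta=BETA, gamma=GAMMA):
    """Step response of the accent control mechanism, clipped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def log_f0(t, fb, phrase_cmds, accent_cmds):
    """ln F0(t): baseline ln(fb) plus phrase and accent components.

    phrase_cmds: list of (onset_time, magnitude)
    accent_cmds: list of (onset_time, offset_time, amplitude)
    """
    value = math.log(fb)
    for onset, magnitude in phrase_cmds:
        value += magnitude * phrase_response(t - onset)
    for onset, offset, amplitude in accent_cmds:
        value += amplitude * (accent_response(t - onset) - accent_response(t - offset))
    return value


def f0_contour(times, fb, phrase_cmds, accent_cmds):
    """F0 in Hz at each time in `times` (seconds)."""
    return [math.exp(log_f0(t, fb, phrase_cmds, accent_cmds)) for t in times]


def add_command_differences(accent_cmds, amplitude_deltas):
    """Hypothetical adaptation step: shift each accent amplitude by a given difference."""
    return [(on, off, amp + delta)
            for (on, off, amp), delta in zip(accent_cmds, amplitude_deltas)]


if __name__ == "__main__":
    phrase_cmds = [(0.0, 0.5)]
    accent_cmds = [(0.3, 0.6, 0.4)]
    focused = add_command_differences(accent_cmds, [0.2])  # made-up difference
    times = [i * 0.1 for i in range(11)]
    for t, a, b in zip(times,
                       f0_contour(times, 120.0, phrase_cmds, accent_cmds),
                       f0_contour(times, 120.0, phrase_cmds, focused)):
        print(f"{t:.1f} s  baseline {a:.1f} Hz  adapted {b:.1f} Hz")
```

Because the contour is fully determined by a handful of commands, changing a command value (for example, raising one accent amplitude to mark focus) alters the whole contour smoothly, which is the property that makes command-level adaptation attractive.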

Report

(4 results)
  • 2011 Annual Research Report   Final Research Report ( PDF )
  • 2010 Annual Research Report
  • 2009 Annual Research Report
  • Research Products

    (33 results)

Journal Article (13 results, of which Peer Reviewed: 13 results) / Presentation (16 results) / Book (3 results) / Remarks (1 result)

  • [Journal Article] A method for generation of Mandarin F_0 contours based on tone nucleus model and superpositional model (2012)

    • Author(s)
      Qinghua Sun, Keikichi Hirose, and Nobuaki Minematsu
    • Journal Title

      Speech Communication

      Volume: Vol.54, Issue 8 Pages: 932-945

    • URL

      http://www.sciencedirect.com/science/journal/01676393

    • Related Report
      2011 Final Research Report
    • Peer Reviewed
  • [Journal Article] HMM-based F_0 contour synthesis using the generation process model (2012)

    • Author(s)
      Tatsuya Matsuda, Keikichi Hirose, and Nobuaki Minematsu
    • Journal Title

      Acoustical Science and Technology, Acoustical Society of Japan

    • NAID

      110007969995

    • URL

      https://www.jstage.jst.go.jp/browse/ast/33/0/_contents
      http://journals.acoustics.jp/ast-archive/

    • Related Report
      2011 Final Research Report
    • Peer Reviewed
  • [Journal Article] Improvement of prosody in HMM-based speech synthesis using generation process model (2011)

    • Author(s)
      Miaomiao Wang, Keikichi Hirose, and Nobuaki Minematsu
    • Journal Title

      Journal of Signal Processing

      Volume: vol.15, no.4 Pages: 279-282

    • URL

      http://www.risp.jp/Product.html

    • Related Report
      2011 Final Research Report
    • Peer Reviewed
  • [Journal Article] Improvement of prosody in HMM-based speech synthesis using generation process model (2011)

    • Author(s)
      Miaomiao Wang
    • Journal Title

      Journal of Research Institute of Signal Processing

      Volume: 15 Pages: 279-282

    • Related Report
      2011 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Adaptation of prosody in speech synthesis by changing command values of the generation process model of fundamental frequency (2011)

    • Author(s)
      Keikichi Hirose
    • Journal Title

      Proceedings of INTERSPEECH

      Volume: 1 Pages: 2793-2796

    • Related Report
      2011 Annual Research Report
    • Peer Reviewed
  • [Journal Article] HMM-based F_0 contour synthesis using the generation process model (2011)

    • Author(s)
      Tatsuya Matsuda
    • Journal Title

      Acoustical Science and Technology, Acoustical Society of Japan

      Volume: (in press, accepted for publication)

    • Related Report
      2011 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Improvement of prosody in HMM-based speech synthesis using generation process model (2011)

    • Author(s)
      Miaomiao Wang
    • Journal Title

      Journal of Signal Processing

      Volume: 15 (to appear in the July issue)

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] HMM-based synthesis of fundamental frequency contours using the generation process model (2010)

    • Author(s)
      Tatsuya Matsuda, Keikichi Hirose, and Nobuaki Minematsu
    • Journal Title

      Journal of Signal Processing

      Volume: vol.14, no.4 Pages: 277-280

    • URL

      http://www.risp.jp/Product.html

    • Related Report
      2011 Final Research Report
    • Peer Reviewed
  • [Journal Article] HMM-based synthesis of fundamental frequency contours using the generation process model (2010)

    • Author(s)
      Tatsuya Matsuda
    • Journal Title

      Journal of Signal Processing

      Volume: 14 Pages: 277-280

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Improving Mandarin segmental duration prediction with automatically extracted syntax features (2010)

    • Author(s)
      Miaomiao Wen
    • Journal Title

      Proceedings of INTERSPEECH

      Volume: 1 Pages: 2178-2181

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] HMM-based synthesis of fundamental frequency contours using the generation process model (2010)

    • Author(s)
      Tatsuya Matsuda
    • Journal Title

      Proceedings of International Workshop on Nonlinear Circuits and Signal Processing 1

      Pages: 464-467

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Generation of fundamental frequency in HMM-based TTS using generation process model (2010)

    • Author(s)
      Miaomiao Wang
    • Journal Title

      Proceedings of International Conference on Speech Prosody 1 (in press, accepted for publication)

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Control of prosodic focus in corpus-based generation of fundamental frequency contours of Japanese based on the generation process model (2009)

    • Author(s)
      Keiko Ochi
    • Journal Title

      Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing 1

      Pages: 4485-4488

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Presentation] Fundamental frequency contour generation process model for improved and flexible control of prosodic features in HMM-based speech synthesis (2012)

    • Author(s)
      Keikichi Hirose
    • Organizer
      International Symposium on Frontiers of Research on Speech and Music
    • Place of Presentation
      KIIT, Gurgaon, India (invited talk)
    • Year and Date
      2012-01-19
    • Related Report
      2011 Annual Research Report
  • [Presentation] Improved automatic extraction of generation process model commands and its use for generating fundamental frequency contours for training HMM-based speech synthesis (2012)

    • Author(s)
      Hiroya Hashimoto, Keikichi Hirose, and Nobuaki Minematsu
    • Organizer
      Proceedings INTERSPEECH
    • Place of Presentation
      Portland
    • Related Report
      2011 Final Research Report
  • [Presentation] Emotional voice conversion for Mandarin using tone nucleus model: small corpus and high efficiency (2012)

    • Author(s)
      Miaomiao Wang, Miaomiao Wen, Keikichi Hirose, and Nobuaki Minematsu
    • Organizer
      Proceedings of International Conference on Speech Prosody
    • Place of Presentation
      Shanghai
    • Related Report
      2011 Final Research Report
  • [Presentation] Fundamental frequency contour reshaping in HMM-based speech synthesis and realization of prosodic focus using generation process model (2012)

    • Author(s)
      Keikichi Hirose, Hiroya Hashimoto, Jun Ikeshima, and Nobuaki Minematsu
    • Organizer
      Proceedings of International Conference on Speech Prosody
    • Place of Presentation
      Shanghai
    • Related Report
      2011 Final Research Report
  • [Presentation] Fundamental frequency contour generation process model for improved and flexible control of prosodic features in HMM-based speech synthesis (2012)

    • Author(s)
      Keikichi Hirose
    • Organizer
      Proceedings of International Symposium on Frontiers of Research on Speech and Music
    • Place of Presentation
      Gurgaon
    • Related Report
      2011 Final Research Report
  • [Presentation] Representing fundamental frequency contours generated by HMM-based speech synthesis using generation process model (2011)

    • Author(s)
      Keikichi Hirose, Tatsuya Matsuda, Hiroya Hashimoto, and Nobuaki Minematsu
    • Organizer
      Proceedings of IEEE International Workshop on Machine Learning for Signal Processing
    • Place of Presentation
      Beijing
    • Related Report
      2011 Final Research Report
  • [Presentation] Adaptation of prosody in speech synthesis by changing command values of the generation process model of fundamental frequency (2011)

    • Author(s)
      Keikichi Hirose, Keiko Ochi, Ryusuke Mihara, Hiroya Hashimoto, Daisuke Saito, and Nobuaki Minematsu
    • Organizer
      Proceedings INTERSPEECH
    • Place of Presentation
      Florence
    • Related Report
      2011 Final Research Report
  • [Presentation] Prosody conversion for emotional Mandarin speech synthesis using the tone nucleus model (2011)

    • Author(s)
      Miaomiao Wen, Miaomiao Wang, Keikichi Hirose, and Nobuaki Minematsu
    • Organizer
      Proceedings INTERSPEECH
    • Place of Presentation
      Florence
    • Related Report
      2011 Final Research Report
  • [Presentation] Control of prosodic features in corpus-based generation of fundamental frequency contours based on the generation process model (2010)

    • Author(s)
      Keikichi Hirose
    • Organizer
      IEEE International Conference on Signal Processing
    • Place of Presentation
      Taiyangdao Hotel, Beijing, China
    • Year and Date
      2010-10-27
    • Related Report
      2010 Annual Research Report
  • [Presentation] Analysis and Synthesis of F_0 Contours for Bangla Readout Speech (2010)

    • Author(s)
      Shyamal Das Mandal, Anal Haque Warsi, Tulika Basu, Keikichi Hirose, and Hiroya Fujisaki
    • Organizer
      Proceedings Oriental COCOSDA
    • Place of Presentation
      Kathmandu
    • Related Report
      2011 Final Research Report
  • [Presentation] Control of prosodic features in corpus-based generation of fundamental frequency contours based on the generation process model (2010)

    • Author(s)
      Keikichi Hirose, Keiko Ochi, and Nobuaki Minematsu
    • Organizer
      Proceedings IEEE International Conference on Signal Processing
    • Place of Presentation
      Beijing
    • Related Report
      2011 Final Research Report
  • [Presentation] Improved generation of fundamental frequency in HMM-based speech synthesis using generation process model (2010)

    • Author(s)
      Miaomiao Wang, Miaomiao Wen, Keikichi Hirose, and Nobuaki Minematsu
    • Organizer
      Proceedings INTERSPEECH
    • Place of Presentation
      Makuhari
    • Related Report
      2011 Final Research Report
  • [Presentation] Using F_0 contour generation process model for improved and flexible control of prosodic features in HMM-based speech synthesis (2010)

    • Author(s)
      Keikichi Hirose, Keiko Ochi, Miaomiao Wang, Tatsuya Matsuda, Miaomiao Wen, and Nobuaki Minematsu
    • Organizer
      Proceedings of 21st Conference on Electronic Speech Signal Processing
    • Place of Presentation
      Berlin
    • Related Report
      2011 Final Research Report
  • [Presentation] Generation of fundamental frequency in HMM-based TTS using generation process model (2010)

    • Author(s)
      Miaomiao Wang, Keikichi Hirose, and Nobuaki Minematsu
    • Organizer
      Proceedings of International Conference on Speech Prosody
    • Place of Presentation
      Chicago
    • Related Report
      2011 Final Research Report
  • [Presentation] Control of prosodic features based on the super-positional representation of F_0 contours: toward flexible control of prosodic features in speech synthesis (2009)

    • Author(s)
      Keikichi Hirose
    • Organizer
      International Workshop on Spoken Language Prosody
    • Place of Presentation
      C-DAC, Kolkata, India
    • Year and Date
      2009-11-25
    • Related Report
      2009 Annual Research Report
  • [Presentation] Control of prosodic focus in corpus-based generation of fundamental frequency contours of Japanese based on the generation process model (2009)

    • Author(s)
      Keiko Ochi, Keikichi Hirose, and Nobuaki Minematsu
    • Organizer
      Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing
    • Place of Presentation
      Taipei
    • Related Report
      2011 Final Research Report
  • [Book] Prosodic corpora based on F_0 contour generation model and automatic extraction of model parameters (2010)

    • Author(s)
      Keikichi Hirose
    • Publisher
      Computer Processing of Asian Spoken languages
    • Related Report
      2011 Final Research Report
  • [Book] Speech prosody corpora based on F_0 contour generation model and automatic extraction of model parameters, in Computer Processing of Asian Spoken Languages, edited by Shuichi Itahashi et al. (2010)

    • Author(s)
      Keikichi Hirose
    • Total Pages
      372
    • Publisher
      Consideration Books, Los Angeles
    • Related Report
      2009 Annual Research Report
  • [Book] On the prosodic features for emotional speech (2009)

    • Author(s)
      Keikichi Hirose, Qinghua Sun
    • Publisher
      Frontiers in Phonetics and Speech Science
    • Related Report
      2011 Final Research Report
  • [Remarks] (Research achievements)

    • URL

      http://www.gavo.t.u-tokyo.ac.jp/~hirose/cv/curriculumvitae.pdf

    • Related Report
      2011 Final Research Report

Published: 2009-04-01   Modified: 2016-04-21  
