
2013 Fiscal Year Final Research Report

A study on speech diversification techniques based on corpus design for advanced humanoid speech synthesis

Research Project

Project/Area Number 23700195
Research Category

Grant-in-Aid for Young Scientists (B)

Allocation Type Multi-year Fund
Research Field Perception information processing/Intelligent robotics
Research Institution Tohoku University (2013)
Tokyo Institute of Technology (2011-2012)

Principal Investigator

NOSE Takashi  Tohoku University, Graduate School of Engineering, Lecturer (90550591)

Project Period (FY) 2011 – 2012
Keywords speech synthesis / hidden Markov model / statistical speech synthesis / emotional speech synthesis / humanoid robot / speech corpus
Research Abstract

Our goal in this research is to realize a more human-like, natural text-to-speech system with various emotional expressions and speaking styles. The achievements of our studies are as follows:
(1) We proposed a novel corpus-design technique that takes accent, style, and sentence-final expression into account.
(2) We incorporated users' subjective emotional intensities into acoustic model training to improve the performance of expressive speech synthesis.
(3) We proposed a technique for automatically labeling emphasis expressions, using a fundamental frequency (F0) parameter generation technique, to realize emphatic speech synthesis.
(4) We proposed a cross-lingual speech synthesis technique that uses only speech samples in the target speaker's native language to synthesize multilingual speech at low cost.

  • Research Products

    (34 results)


Journal Article (20 results, of which Peer Reviewed: 20 results) / Presentation (14 results)

  • [Journal Article] Prosodic variation enhancement using unsupervised context labeling for HMM-based expressive speech synthesis (2014)

    • Author(s)
      Yu Maeno, Takashi Nose, Takao Kobayashi, Tomoki Koriyama, Yusuke Ijima, Hideharu Nakajima, Hideyuki Mizuno, Osamu Yoshioka
    • Journal Title

      Speech Communication

      Volume: Vol.57 Pages: 144-154

    • DOI

      10.1016/j.specom.2013.09.014

    • Peer Reviewed
  • [Journal Article] Robust estimation of multiple-regression HMM parameters for dimension-based expressive dialogue speech synthesis (2013)

    • Author(s)
      Tomohiro Nagata, Hiroki Mori, Takashi Nose
    • Journal Title

      Proceedings of 14th Annual Conference of the International Speech Communication Association (ISCA)

      Pages: 1549-1553

    • Peer Reviewed
  • [Journal Article] Statistical nonparametric speech synthesis using sparse Gaussian processes (2013)

    • Author(s)
      Tomoki Koriyama, Takashi Nose, Takao Kobayashi
    • Journal Title

      Proceedings of 14th Annual Conference of the International Speech Communication Association (ISCA)

      Pages: 1072-1076

    • Peer Reviewed
  • [Journal Article] A style control technique for singing voice synthesis based on multiple-regression HSMM (2013)

    • Author(s)
      Takashi Nose, Misa Kanemoto, Tomoki Koriyama, Takao Kobayashi
    • Journal Title

      Proceedings of 14th Annual Conference of the International Speech Communication Association (ISCA)

      Pages: 378-382

    • Peer Reviewed
  • [Journal Article] Frame-level acoustic modeling based on Gaussian process regression for statistical nonparametric speech synthesis (2013)

    • Author(s)
      Tomoki Koriyama, Takashi Nose, Takao Kobayashi
    • Journal Title

      Proceedings of 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing

      Pages: 8007-8011

    • Peer Reviewed
  • [Journal Article] Speaker-independent style conversion for HMM-based expressive speech synthesis (2013)

    • Author(s)
      Hiroki Kanagawa, Takashi Nose, Takao Kobayashi
    • Journal Title

      Proceedings of 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing

      Pages: 7864-7868

    • Peer Reviewed
  • [Journal Article] HMM-based expressive speech synthesis based on phrase-level F0 context labeling (2013)

    • Author(s)
      Yu Maeno, Takashi Nose, Takao Kobayashi, Tomoki Koriyama, Yusuke Ijima, Hideharu Nakajima, Hideyuki Mizuno, Osamu Yoshioka
    • Journal Title

      Proceedings of 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing

      Pages: 7859-7863

    • Peer Reviewed
  • [Journal Article] An intuitive style control technique in HMM-based expressive speech synthesis using subjective style intensity and multiple-regression global variance model (2013)

    • Author(s)
      Takashi Nose, Takao Kobayashi
    • Journal Title

      Speech Communication

      Volume: Vol.55, No.2 Pages: 347-357

    • DOI

      10.1016/j.specom.2012.09.003

    • Peer Reviewed
  • [Journal Article] A speech parameter generation algorithm using local variance for HMM-based speech synthesis (2012)

    • Author(s)
      Vataya Chunwijitra, Takashi Nose, Takao Kobayashi
    • Journal Title

      Proceedings of 13th Annual Conference of the International Speech Communication Association (ISCA)

      Pages: 1151-1154

    • Peer Reviewed
  • [Journal Article] Discontinuous observation HMM for prosodic-event-based F0 generation (2012)

    • Author(s)
      Tomoki Koriyama, Takashi Nose, Takao Kobayashi
    • Journal Title

      Proceedings of 13th Annual Conference of the International Speech Communication Association (ISCA)

      Pages: 462-465

    • Peer Reviewed
  • [Journal Article] An F0 modeling technique based on prosodic events for spontaneous speech synthesis (2012)

    • Author(s)
      Tomoki Koriyama, Takashi Nose, Takao Kobayashi
    • Journal Title

      Proceedings of 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012)

      Pages: 4589-4593

    • Peer Reviewed
  • [Journal Article] Context extension for generating diverse prosody in HMM-based dialogue speech synthesis (in Japanese) (2012)

    • Author(s)
      Tomoki Koriyama, Takashi Nose, Takao Kobayashi
    • Journal Title

      IEICE Transactions, Japanese Edition (電子情報通信学会論文誌)

      Volume: Vol.J95-D, No.3 Pages: 597-607

    • Peer Reviewed
  • [Journal Article] Very low bit-rate F0 coding for phonetic vocoders using MSD-HMM with quantized F0 symbols (2012)

    • Author(s)
      Takashi Nose, Takao Kobayashi
    • Journal Title

      Speech Communication

      Pages: 384-392

    • DOI

      10.1016/j.specom.2011.10.002

    • Peer Reviewed
  • [Journal Article] A tone-modeling technique using a quantized F0 context to improve tone correctness in average-voice-based speech synthesis (2012)

    • Author(s)
      Vataya Chunwijitra, Takashi Nose, Takao Kobayashi
    • Journal Title

      Speech Communication

      Volume: Vol.54, No.2 Pages: 245-255

    • DOI

      10.1016/j.specom.2011.08.006

    • Peer Reviewed
  • [Journal Article] Recent development of HMM-based expressive speech synthesis and its applications (2011)

    • Author(s)
      Takashi Nose, Takao Kobayashi
    • Journal Title

      Proceedings of 2011 Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference

    • URL

      http://www.apsipa.org/proceedings_2011/pdf/APSIPA189.pdf

    • Peer Reviewed
  • [Journal Article] Speaker-independent HMM-based voice conversion using adaptive quantization of the fundamental frequency (2011)

    • Author(s)
      Takashi Nose, Takao Kobayashi
    • Journal Title

      Speech Communication

      Volume: Vol.53, No.7 Pages: 973-985

    • DOI

      10.1016/j.specom.2011.05.001

    • Peer Reviewed
  • [Journal Article] On the use of extended context for HMM-based spontaneous conversational speech synthesis (2011)

    • Author(s)
      Tomoki Koriyama, Takashi Nose, Takao Kobayashi
    • Journal Title

      Proceedings of 12th Annual Conference of the International Speech Communication Association (ISCA) (INTERSPEECH 2011)

      Pages: 2657-2660

    • Peer Reviewed
  • [Journal Article] Performance prediction of speech recognition using average-voice-based speech synthesis (2011)

    • Author(s)
      Tatsuhiko Saito, Takashi Nose, Takao Kobayashi, Yohei Okato, Akio Horii
    • Journal Title

      Proceedings of 12th Annual Conference of the International Speech Communication Association (ISCA) (INTERSPEECH 2011)

      Pages: 1953-1956

    • Peer Reviewed
  • [Journal Article] HMM-based emphatic speech synthesis using unsupervised context labeling (2011)

    • Author(s)
      Yu Maeno, Takashi Nose, Takao Kobayashi, Yusuke Ijima, Hideharu Nakajima, Hideyuki Mizuno, Osamu Yoshioka
    • Journal Title

      Proceedings of 12th Annual Conference of the International Speech Communication Association (ISCA) (INTERSPEECH 2011)

      Pages: 1849-185

    • Peer Reviewed
  • [Journal Article] A perceptual expressivity modeling technique for speech synthesis based on multiple-regression HSMM (2011)

    • Author(s)
      Takashi Nose, Takao Kobayashi
    • Journal Title

      Proceedings of 12th Annual Conference of the International Speech Communication Association (ISCA) (INTERSPEECH 2011)

      Pages: 109-112

    • Peer Reviewed
  • [Presentation] Robust estimation of multiple-regression HMM parameters for dimension-based expressive dialogue speech synthesis (2013)

    • Author(s)
      Tomohiro Nagata, Hiroki Mori, Takashi Nose
    • Organizer
      INTERSPEECH 2013
    • Place of Presentation
      Lyon, France
    • Year and Date
      2013-08-27
  • [Presentation] Statistical nonparametric speech synthesis using sparse Gaussian processes (2013)

    • Author(s)
      Tomoki Koriyama, Takashi Nose, Takao Kobayashi
    • Organizer
      INTERSPEECH 2013
    • Place of Presentation
      Lyon, France
    • Year and Date
      2013-08-27
  • [Presentation] A style control technique for singing voice synthesis based on multiple-regression HSMM (2013)

    • Author(s)
      Takashi Nose, Misa Kanemoto, Tomoki Koriyama, Takao Kobayashi
    • Organizer
      INTERSPEECH 2013
    • Place of Presentation
      Lyon, France
    • Year and Date
      2013-08-26
  • [Presentation] Frame-level acoustic modeling based on Gaussian process regression for statistical nonparametric speech synthesis (2013)

    • Author(s)
      Tomoki Koriyama, Takashi Nose, Takao Kobayashi
    • Organizer
      ICASSP 2013
    • Place of Presentation
      Vancouver, Canada
    • Year and Date
      2013-05-31
  • [Presentation] Speaker-independent style conversion for HMM-based expressive speech synthesis (2013)

    • Author(s)
      Hiroki Kanagawa, Takashi Nose, Takao Kobayashi
    • Organizer
      ICASSP 2013
    • Place of Presentation
      Vancouver, Canada
    • Year and Date
      2013-05-31
  • [Presentation] HMM-based expressive speech synthesis based on phrase-level F0 context labeling (2013)

    • Author(s)
      Yu Maeno, Takashi Nose, Takao Kobayashi, Tomoki Koriyama, Yusuke Ijima, Hideharu Nakajima, Hideyuki Mizuno, Osamu Yoshioka
    • Organizer
      ICASSP 2013
    • Place of Presentation
      Vancouver, Canada
    • Year and Date
      2013-05-31
  • [Presentation] A speech parameter generation algorithm using local variance for HMM-based speech synthesis (2012)

    • Author(s)
      Vataya Chunwijitra, Takashi Nose, Takao Kobayashi
    • Organizer
      INTERSPEECH 2012
    • Place of Presentation
      Portland, USA
    • Year and Date
      2012-09-11
  • [Presentation] Discontinuous observation HMM for prosodic-event-based F0 generation (2012)

    • Author(s)
      Tomoki Koriyama, Takashi Nose, Takao Kobayashi
    • Organizer
      INTERSPEECH 2012
    • Place of Presentation
      Portland, USA
    • Year and Date
      2012-09-10
  • [Presentation] An F0 modeling technique based on prosodic events for spontaneous speech synthesis (2012)

    • Author(s)
      Tomoki Koriyama, Takashi Nose, Takao Kobayashi
    • Organizer
      ICASSP 2012
    • Place of Presentation
      Kyoto, Japan
    • Year and Date
      2012-03-29
  • [Presentation] Recent development of HMM-based expressive speech synthesis and its applications (2011)

    • Author(s)
      Takashi Nose, Takao Kobayashi
    • Organizer
      APSIPA ASC 2011
    • Place of Presentation
      Xi'an, China
    • Year and Date
      2011-10-19
  • [Presentation] On the use of extended context for HMM-based spontaneous conversational speech synthesis (2011)

    • Author(s)
      Tomoki Koriyama, Takashi Nose, Takao Kobayashi
    • Organizer
      INTERSPEECH 2011
    • Place of Presentation
      Florence, Italy
    • Year and Date
      2011-08-30
  • [Presentation] Performance prediction of speech recognition using average-voice-based speech synthesis (2011)

    • Author(s)
      Tatsuhiko Saito, Takashi Nose, Takao Kobayashi, Yohei Okato, Akio Horii
    • Organizer
      INTERSPEECH 2011
    • Place of Presentation
      Florence, Italy
    • Year and Date
      2011-08-29
  • [Presentation] HMM-based emphatic speech synthesis using unsupervised context labeling (2011)

    • Author(s)
      Yu Maeno, Takashi Nose, Takao Kobayashi, Yusuke Ijima, Hideharu Nakajima, Hideyuki Mizuno, Osamu Yoshioka
    • Organizer
      INTERSPEECH 2011
    • Place of Presentation
      Florence, Italy
    • Year and Date
      2011-08-29
  • [Presentation] A perceptual expressivity modeling technique for speech synthesis based on multiple-regression HSMM (2011)

    • Author(s)
      Takashi Nose, Takao Kobayashi
    • Organizer
      INTERSPEECH 2011
    • Place of Presentation
      Florence, Italy
    • Year and Date
      2011-08-28


Published: 2015-06-25  
