
Realization of HMM-based text-to-speech Synthesis Systems

Research Project

Project/Area Number 10555125
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type Single-year Grants
Section Developmental Research
Research Field Information and Communication Engineering
Research Institution Tokyo Institute of Technology

Principal Investigator

KOBAYASHI Takao  Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering, Dept. of Information Processing, Professor (70153616)

Co-Investigator (Kenkyū-buntansha) MASUKO Takashi  Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering, Dept. of Information Processing, Research Associate (90272715)
TOKUDA Keiichi  Nagoya Institute of Technology, Faculty of Engineering, Dept. of Computer Science, Associate Professor (20217483)
Project Period (FY) 1998 – 2000
Project Status Completed (Fiscal Year 2000)
Budget Amount
¥4,500,000 (Direct Cost: ¥4,500,000)
Fiscal Year 2000: ¥1,100,000 (Direct Cost: ¥1,100,000)
Fiscal Year 1999: ¥3,400,000 (Direct Cost: ¥3,400,000)
Keywords text-to-speech synthesis (TTS) / hidden Markov model (HMM) / multi-space probability distribution HMM (MSD-HMM) / HMM-based speech synthesis / voice characteristics conversion / speaker interpolation / pitch pattern / speech parameter generation / morphological analysis / speaker adaptation / speech synthesis / hidden Markov model / text analysis / MSLR parser / speech synthesis system / pitch / voice quality conversion
Research Abstract

The main purpose of this research is to realize a text-to-speech synthesis system which can generate speech with various voice characteristics based on hidden Markov models (HMMs). We have obtained the following results.
1. Modeling of phonetic and prosodic information of speech based on HMM
We have proposed a new type of HMM, the multi-space probability distribution HMM (MSD-HMM), which can model the pitch pattern of speech without heuristic assumptions. We have also proposed a technique in which spectrum, pitch, and state duration are modeled simultaneously in a unified HMM framework.
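As an illustration of how an MSD-HMM state can score a pitch observation, here is a minimal sketch. It assumes the common two-space setup for F0: a one-dimensional Gaussian space for voiced frames and a zero-dimensional space for unvoiced frames, whose density is taken to be 1. The function names and the example parameter values are hypothetical and are not taken from the project's system.

```python
import math
from typing import Optional


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def msd_f0_likelihood(log_f0: Optional[float], w_voiced: float,
                      mean: float, var: float) -> float:
    """Output probability of one state for a pitch observation.

    log_f0 is None for an unvoiced frame (assumed convention).
    The space weights sum to one: w_voiced + w_unvoiced = 1.
    """
    w_unvoiced = 1.0 - w_voiced
    if log_f0 is None:
        # Zero-dimensional space: only the space weight contributes.
        return w_unvoiced
    # One-dimensional space: space weight times the Gaussian density.
    return w_voiced * gaussian_pdf(log_f0, mean, var)


if __name__ == "__main__":
    # Hypothetical state: 80% voiced, log F0 centred at 5.0 with variance 0.04.
    print(msd_f0_likelihood(None, 0.8, 5.0, 0.04))
    print(msd_f0_likelihood(5.1, 0.8, 5.0, 0.04))
```

Because voiced and unvoiced frames are scored by the same state, no separate rule is needed to fill in or skip the unvoiced regions of the pitch contour.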
2. Speech parameter generation from HMM
We have extended the HMM-based parameter generation algorithm to the general case in which all or part of the state sequence is hidden, and have derived a new algorithm for that case. We have also derived a pitch pattern generation algorithm based on the MSD-HMM.
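The simplest case of parameter generation, where the state sequence is already known, can be sketched as follows. This sketch assumes a one-dimensional static feature, a delta feature defined as 0.5 * (c[t+1] - c[t-1]) with frame indices clamped at the boundaries, and diagonal covariances. Under those assumptions the most likely static trajectory c solves the linear system (W' P W) c = W' P mu, where W maps statics to static-plus-delta observations and P holds the inverse variances. All function names are hypothetical, and the hidden-state extension described above is not covered.

```python
def build_window_matrix(num_frames):
    """Rows 2t and 2t+1 produce the static and delta feature of frame t."""
    W = [[0.0] * num_frames for _ in range(2 * num_frames)]
    for t in range(num_frames):
        W[2 * t][t] = 1.0
        prev_t = max(t - 1, 0)               # clamp at the first frame
        next_t = min(t + 1, num_frames - 1)  # clamp at the last frame
        W[2 * t + 1][prev_t] -= 0.5
        W[2 * t + 1][next_t] += 0.5
    return W


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - rest) / M[i][i]
    return x


def generate_trajectory(means, variances):
    """means and variances are flat lists [static_0, delta_0, static_1, ...]."""
    num_frames = len(means) // 2
    rows = range(2 * num_frames)
    W = build_window_matrix(num_frames)
    # Normal equations: A = W' P W and b = W' P mu, with P = diag(1 / variance).
    A = [[sum(W[r][i] * W[r][j] / variances[r] for r in rows)
          for j in range(num_frames)] for i in range(num_frames)]
    b = [sum(W[r][i] * means[r] / variances[r] for r in rows)
         for i in range(num_frames)]
    return solve(A, b)


if __name__ == "__main__":
    # Static means step from 0 to 1; delta means are 0 with a small variance.
    step_means = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0]
    print(generate_trajectory(step_means, [1.0, 0.01] * 4))
```

A small delta variance pulls neighbouring frames together, so the generated trajectory is smoother than the step-shaped sequence of static means; a very large delta variance leaves the static means essentially unchanged.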
3. Realization of text-to-speech synthesis system based on HMMs
We have developed a Japanese text-to-speech synthesis system, which works on workstations and PCs, based on the simultaneous modeling of spectrum, pitch, and duration by HMM and the speech parameter generation from HMM.
4. Speech synthesis with various voice characteristics
We have proposed voice characteristics conversion techniques for the HMM-based speech synthesis system that use HMM speaker adaptation techniques such as MAP/VFS and MLLR. We have also proposed a speaker interpolation technique that interpolates HMM parameters among the HMM sets of representative speakers. Using these techniques, we have shown that the HMM-based speech synthesis system can generate speech with various voice characteristics.
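To make the idea of speaker interpolation concrete, the sketch below combines the output Gaussians of corresponding states from several speakers. It assumes diagonal covariances and one simple interpolation rule, in which the new mean is the weighted sum of the speakers' means and the new variance is the sum of the variances weighted by the squared ratios. This rule is only one possible choice and is not necessarily the one used in the project; the function name and values are hypothetical.

```python
def interpolate_gaussians(weights, means, variances):
    """Interpolate diagonal Gaussians of one state across speakers.

    weights:   interpolation ratios, one per speaker, summing to 1
    means:     per-speaker mean vectors (lists of equal length)
    variances: per-speaker diagonal variance vectors
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("interpolation weights must sum to 1")
    dim = len(means[0])
    mean = [sum(a * m[d] for a, m in zip(weights, means)) for d in range(dim)]
    # Assumed rule: variance of a weighted sum of independent variables.
    var = [sum(a * a * v[d] for a, v in zip(weights, variances)) for d in range(dim)]
    return mean, var


if __name__ == "__main__":
    # Two hypothetical speakers mixed in equal proportion.
    mean, var = interpolate_gaussians(
        [0.5, 0.5],
        [[0.0, 2.0], [2.0, 4.0]],
        [[1.0, 1.0], [1.0, 1.0]],
    )
    print(mean, var)
```

Sweeping the weights from one speaker to another moves the synthesized voice gradually between the two, and a weight of 1 on a single speaker reproduces that speaker's model.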

Report

(4 results)
  • 2000 Annual Research Report   Final Research Report Summary
  • 1999 Annual Research Report
  • 1998 Annual Research Report
  • Research Products

    (52 results)

All Publications (52 results)

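  • [Publications] J.Hiroi,K.Tokuda,T.Masuko,T.Kobayashi,T.Kitamura: "Very low bit rate speech coding based on HMMs"IEICE Transactions (Japanese Edition). J82-D-II・11. 1857-1864 (1999)

    • Description
      From "Summary of Research Results Report (Japanese)"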
    • Related Report
      2000 Final Research Report Summary
  • [Publications] T.Yoshimura,K.Tokuda,T.Masuko,T.Kobayashi,T.Kitamura: "Speaker interpolation for HMM-based speech synthesis system"J.of Acoustical Society of Japan (E). 21-4. 199-206 (2000)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
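  • [Publications] K.Tokuda,T.Masuko,N.Miyazaki,T.Kobayashi: "Multi-space probability distribution HMM"IEICE Transactions (Japanese Edition). J83-D-II・7. 1579-1589 (2000)

    • Description
      From "Summary of Research Results Report (Japanese)"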
    • Related Report
      2000 Final Research Report Summary
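  • [Publications] T.Masuko,K.Tokuda,N.Miyazaki,T.Kobayashi: "Pitch pattern generation using multi-space probability distribution HMM"IEICE Transactions (Japanese Edition). J83-D-II・7. 1600-1609 (2000)

    • Description
      From "Summary of Research Results Report (Japanese)"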
    • Related Report
      2000 Final Research Report Summary
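  • [Publications] T.Yoshimura,K.Tokuda,T.Masuko,T.Kobayashi,T.Kitamura: "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis"IEICE Transactions (Japanese Edition). J83-D-II・11. 2099-2107 (2000)

    • Description
      From "Summary of Research Results Report (Japanese)"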
    • Related Report
      2000 Final Research Report Summary
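  • [Publications] T.Masuko,K.Tokuda,T.Kobayashi: "Imposture against a speaker verification system using synthetic speech"IEICE Transactions (Japanese Edition). J83-D-II・11. 2283-2290 (2000)

    • Description
      From "Summary of Research Results Report (Japanese)"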
    • Related Report
      2000 Final Research Report Summary
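  • [Publications] T.Masuko,M.Tamura,K.Tokuda,T.Kobayashi: "Voice characteristics conversion for HMM-based speech synthesis system using MAP-VFS"IEICE Transactions (Japanese Edition). J83-D-II・12. 2509-2516 (2000)

    • Description
      From "Summary of Research Results Report (Japanese)"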
    • Related Report
      2000 Final Research Report Summary
  • [Publications] K.Tokuda,T.Masuko,J.Hiroi,T.Kobayashi,T.Kitamura: "A very low bit rate speech coder using HMM-based speech recognition/synthesis"Proc.of 1998 IEEE, International Conference on Acoustics, Speech, and Signal Processing. 2. 609-612 (1998)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] T.Masuko,T.Kobayashi,M.Tamura,J.Masubuchi,K.Tokuda: "Text-to-visual speech synthesis based on parameter generation from HMM"Proc.of 1998 IEEE, International Conference on Acoustics, Speech, and Signal Processing. 6. 3745-3748 (1998)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] M.Tamura,T.Masuko,K.Tokuda,T.Kobayashi: "Speaker adaptation for HMM-based speech synthesis system using MLLR"Proc.of ESCA/COCOSDA, International Workshop on Speech Synthesis. 273-276 (1998)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] T.Yoshimura,T.Masuko,K.Tokuda,T.Kobayashi,T.Kitamura: "Duration modeling for HMM-based speech synthesis"Proc.of 5th International Conference on Spoken Language Processing. 2. 29-32 (1998)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] T.Masuko,K.Tokuda,T.Kobayashi: "A very low bit rate speech coder using HMM with speaker adaptation"Proc.of 5th International Conference on Spoken Language Processing. 2. 507-510 (1998)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] M.Tamura,T.Masuko,T.Kobayashi,K.Tokuda: "Visual speech synthesis based on parameter generation from HMM: Speech-driven and text-and-speech-driven approaches"Proc.of International Conference on Auditory-Visual Speech Processing. 219-224 (1998)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] K.Tokuda,T.Masuko,N.Miyazaki,T.Kobayashi: "Hidden Markov models based on multi-space probability distribution for pitch pattern modeling"Proc.1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. 1. 229-232 (1999)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] T.Yoshimura,K.Tokuda,T.Masuko,T.Kobayashi,T.Kitamura: "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis"Proc.of 6th European Conf.on Speech Communication and Technology. 6. 2347-2350 (1999)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] M.Tamura,S.Kondo,T.Masuko,T.Kobayashi: "Text-to-audio-visual speech synthesis based on parameter generation from HMM"Proc.of 6th European Conference on Speech Communication and Technology. 1. 959-962 (1999)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] T.Masuko,T.Hitotsumatsu,K.Tokuda,T.Kobayashi: "On the security of HMM-based speaker verification systems against imposture using synthetic speech"Proc.of 6th European Conf.on Speech Communication and Technology. 3. 1223-1226 (1999)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] K.Tokuda,T.Yoshimura,T.Masuko,T.Kobayashi,T.Kitamura: "Speech parameter generation algorithms for HMM-based speech synthesis"Proc.of 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. 3. 1315-1318 (2000)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] T.Masuko,K.Tokuda,T.Kobayashi: "Imposture using synthetic speech against text-prompted speaker verification based on spectrum and pitch"Proc.of 6th International Conference on Spoken Language Processing. 2. 302-305 (2000)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] S.Sako,K.Tokuda,T.Masuko,T.Kobayashi,T.Kitamura: "HMM-based text-to-audio-visual speech synthesis"Proc.of 6th International Conference on Spoken Language Processing. 3. 25-28 (2000)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Jun Hiroi, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura: "Very low bit rate speech coding based on HMMs"IEICE Trans.. D-II,11. 1857-1864 (1999)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Takayoshi Yoshimura, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura: "Speaker interpolation for HMM-based speech synthesis system"J.Acoust. Soc. Jpn. (E). 21,4. 199-206 (2000)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Keiichi Tokuda, Takashi Masuko, Noboru Miyazaki, Takao Kobayashi: "Multi-space probability distribution HMM"IEICE Trans.. D-II,7. 1579-1589 (2000)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Takashi Masuko, Keiichi Tokuda, Noboru Miyazaki, Takao Kobayashi: "Pitch pattern generation using multi-space probability distribution HMM"IEICE Trans.. D-II,7. 1600-1609 (2000)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Takayoshi Yoshimura, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura: "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis"IEICE Trans.. D-II, 11. 2099-2107 (2000)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Takashi Masuko, Keiichi Tokuda, Takao Kobayashi: "Imposture against a speaker verification system using synthetic speech"IEICE Trans.. D-II, 11. 2283-2290 (2000)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Takashi Masuko, Masatsune Tamura, Keiichi Tokuda, Takao Kobayashi: "Voice characteristics conversion for HMM-based speech synthesis system using MAP-VFS"IEICE Trans.. D-II, 12. 2509-2516 (2000)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Keiichi Tokuda, Takashi Masuko, Jun Hiroi, Takao Kobayashi, Tadashi Kitamura: "A very low bit rate speech coder using HMM-based speech recognition/synthesis"Proc. of 1998 IEEE, International Conference on Acoustics, Speech, and Signal Processing. 2. 609-612 (1998)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Takashi Masuko, Takao Kobayashi, Masatsune Tamura, Jun Masubuchi, Keiichi Tokuda: "Text-to-visual speech synthesis based on parameter generation from HMM"Proc. of 1998 IEEE, International Conference on Acoustics, Speech, and Signal Processing. 6. 3745-3748 (1998)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Masatsune Tamura, Takashi Masuko, Keiichi Tokuda, Takao Kobayashi: "Speaker adaptation for HMM-based speech synthesis system using MLLR"Proc. of ESCA/COCOSDA International Workshop on Speech Synthesis. 273-276 (1998)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Takayoshi Yoshimura, Takashi Masuko, Keiichi Tokuda, Takao Kobayashi, Tadashi Kitamura: "Duration modeling for HMM-based speech synthesis"Proc. of 5th International Conference on Spoken Language Processing. 2. 29-32 (1998)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Takashi Masuko, Keiichi Tokuda, Takao Kobayashi: "A very low bit rate speech coder using HMM with speaker adaptation"Proc. of 5th International Conference on Spoken Language Processing. 2. 507-510 (1998)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Masatsune Tamura, Takashi Masuko, Takao Kobayashi, Keiichi Tokuda: "Visual speech synthesis based on parameter generation from HMM: Speech-driven and text-and-speech-driven approaches"Proc. of International Conference on Auditory-Visual Speech Processing. 219-224 (1998)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Keiichi Tokuda, Takashi Masuko, Noboru Miyazaki, Takao Kobayashi: "Hidden Markov models based on multi-space probability distribution for pitch pattern modeling"Proc. 1999 IEEE International Conf. on Acoustics, Speech, and Signal Processing. 1. 229-232 (1999)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Masatsune Tamura, Shigekazu Kondo, Takashi Masuko, Takao Kobayashi: "Text-to-audio-visual speech synthesis based on parameter generation from HMM"Proc. of 6th European Conf. on Speech Communication and Technology. 2. 959-962 (1999)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Takashi Masuko, Takafumi Hitotsumatsu, Keiichi Tokuda, Takao Kobayashi: "On the security of HMM-based speaker verification systems against imposture using synthetic speech"Proc. of 6th European Conf. on Speech Communication and Technology. 3. 1223-1226 (1999)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Takayoshi Yoshimura, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura: "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis"Proc. of 6th European Conf. on Speech Communication and Technology. 6. 2347-2350 (1999)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Keiichi Tokuda, Takayoshi Yoshimura, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura: "Speech parameter generation algorithms for HMM-based speech synthesis"Proc. of 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. 3. 1315-1318 (2000)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Takashi Masuko, Keiichi Tokuda, Takao Kobayashi: "Imposture using synthetic speech against text-prompted speaker verification based on spectrum and pitch"Proc. of 6th International Conference on Spoken Language Processing. 2. 302-305 (2000)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Shuji Sako, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura: "HMM-based text-to-audio-visual speech synthesis"Proc. of 6th International Conference on Spoken Language Processing. 3. 25-28 (2000)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] T.Masuko,M.Tamura,K.Tokuda,T.Kobayashi: "Voice characteristics conversion for HMM-based speech synthesis system using MAP-VFS"IEICE Transactions (Japanese Edition). J83-D-II・12. 2509-2516 (2000)

    • Related Report
      2000 Annual Research Report
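  • [Publications] T.Yoshimura,K.Tokuda,T.Masuko,T.Kobayashi,T.Kitamura: "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis"IEICE Transactions (Japanese Edition). J83-D-II・11. 2099-2107 (2000)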

    • Related Report
      2000 Annual Research Report
  • [Publications] K.Tokuda,T.Masuko,N.Miyazaki,T.Kobayashi: "Multi-space probability distribution HMM"IEICE Transactions (Japanese Edition). J83-D-II・7. 1579-1589 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] T.Masuko,K.Tokuda,N.Miyazaki,T.Kobayashi: "Pitch pattern generation using multi-space probability distribution HMM"IEICE Transactions (Japanese Edition). J83-D-II・7. 1600-1609 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] T.Yoshimura,K.Tokuda,T.Masuko,T.Kobayashi,T.Kitamura: "Speaker interpolation for HMM-based speech synthesis system"J.Acoust.Soc.Jpn.(E). 21・4. 199-206 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] K.Tokuda,T.Yoshimura,T.Masuko,T.Kobayashi,T.Kitamura: "Speech parameter generation algorithms for HMM-based speech synthesis"Proc.IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2000. III. 1315-1318 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] T.Yoshimura,K.Tokuda,T.Masuko,T.Kobayashi,T.Kitamura: "Simultaneous modeling of spectrum,pitch and duration in HMM-based speech synthesis"Proc.6th European Conference on Speech Communication and Technology,EUROSPEECH'99. EUROSPEECH-99・5. 2347-2350 (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] T.Yoshimura,K.Tokuda,T.Masuko,T.Kobayashi,T.Kitamura: "Simultaneous modeling of spectrum, pitch and state duration in HMM-based speech synthesis"IEICE Technical Report (SP). 99・255. 33-38 (1999)

    • Related Report
      1999 Annual Research Report
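  • [Publications] T.Yoshimura,K.Tokuda,T.Masuko,T.Kobayashi,T.Kitamura: "Modeling of spectrum, pitch and state duration for HMM-based speech synthesis"Proc. of the Heisei 11 (1999) Spring Meeting of the Acoustical Society of Japan. 241-242 (1999)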

    • Related Report
      1999 Annual Research Report
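  • [Publications] M.Tamura,T.Masuko,K.Tokuda,T.Kobayashi: "Speaker adaptation for HMM-based speech synthesis using MLLR and MAP/VFS"Proc. of the Heisei 11 (1999) Spring Meeting of the Acoustical Society of Japan. 243-244 (1999)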

    • Related Report
      1999 Annual Research Report
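  • [Publications] T.Hitotsumatsu,T.Masuko,T.Kobayashi,K.Tokuda: "A study on imposture against a text-prompted speaker verification system using synthetic speech"Proc. of the Heisei 11 (1999) Spring Meeting of the Acoustical Society of Japan. 265-266 (1999)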

    • Related Report
      1999 Annual Research Report
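  • [Publications] T.Yoshimura,K.Tokuda,T.Masuko,T.Kobayashi,T.Kitamura: "Effect of dynamic features in HMM-based pitch pattern generation"Proc. of the Heisei 11 (1999) Autumn Meeting of the Acoustical Society of Japan. 215-216 (1999)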

    • Related Report
      1999 Annual Research Report

Published: 1999-04-01   Modified: 2016-04-21  
