Establishment of speech synthesis framework based on Gaussian process regression

Research Project

Project/Area Number 15H02724
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type Single-year Grants
Section General
Research Field Perceptual information processing
Research Institution Tokyo Institute of Technology

Principal Investigator

Kobayashi Takao  Tokyo Institute of Technology, School of Engineering, Professor (70153616)

Co-Investigator (Kenkyū-buntansha) Koriyama Tomoki  Tokyo Institute of Technology, School of Engineering, Assistant Professor (50749124)
Research Collaborator MOUNGSRI Decha  
NAGAHAMA Daiki  
NOSE Takashi  
ARIFIANTO Dhany  
Project Period (FY) 2015-04-01 – 2018-03-31
Project Status Completed (Fiscal Year 2017)
Budget Amount
¥13,000,000 (Direct Cost: ¥10,000,000, Indirect Cost: ¥3,000,000)
Fiscal Year 2017: ¥4,160,000 (Direct Cost: ¥3,200,000, Indirect Cost: ¥960,000)
Fiscal Year 2016: ¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000)
Fiscal Year 2015: ¥4,810,000 (Direct Cost: ¥3,700,000, Indirect Cost: ¥1,110,000)
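Keywords text-to-speech synthesis / statistical parametric speech synthesis / prosody generation / Gaussian process regression / GPR-based speech synthesis / HMM-based speech synthesis / machine learning / deep learning / speech information processing / deep Gaussian process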
Outline of Final Research Achievements

The purpose of this research was to develop a novel statistical parametric speech synthesis framework based on Gaussian process regression (GPR). We proposed GPR-based prosody generation techniques, including pitch pattern prediction and phone duration prediction, as well as a GPR-based spectral parameter generation technique. We developed a GPR-based speech synthesis system and demonstrated its effectiveness through evaluation of synthetic speech quality. Furthermore, we examined the proposed framework for generating expressive speech and for generating more natural-sounding prosody in speech synthesis of a tonal language.
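
For readers unfamiliar with GPR, the following is a minimal, generic sketch of Gaussian process regression on a one-dimensional toy problem. It is not the project's system: the squared-exponential kernel, the scalar inputs, the hyperparameter values, the function names, and the made-up "log-F0-like" data are all assumptions chosen only to illustrate how a GP posterior mean and variance are computed from training pairs.

```python
import math


def rbf_kernel(x1, x2, length_scale=0.25):
    """Squared-exponential kernel between two scalar inputs (unit variance)."""
    return math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def solve_cholesky(l, b):
    """Solve (L L^T) x = b by forward, then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x


def gpr_predict(train_x, train_y, test_x, length_scale=0.25, noise=1e-4):
    """Posterior mean and variance of a GP whose prior mean is the training mean."""
    n = len(train_x)
    mean_y = sum(train_y) / n
    centered = [y - mean_y for y in train_y]
    # Gram matrix of the training inputs plus observation noise on the diagonal.
    gram = [[rbf_kernel(train_x[i], train_x[j], length_scale)
             + (noise if i == j else 0.0) for j in range(n)] for i in range(n)]
    l = cholesky(gram)
    alpha = solve_cholesky(l, centered)
    means, variances = [], []
    for xs in test_x:
        k_star = [rbf_kernel(xs, xi, length_scale) for xi in train_x]
        v = solve_cholesky(l, k_star)
        means.append(mean_y + sum(ks * a for ks, a in zip(k_star, alpha)))
        variances.append(rbf_kernel(xs, xs, length_scale)
                         - sum(ks * vi for ks, vi in zip(k_star, v)))
    return means, variances


if __name__ == "__main__":
    # Made-up toy data: normalized time positions and log-F0-like values.
    xs = [0.0, 0.25, 0.5, 0.75, 1.0]
    ys = [5.0, 5.2, 5.3, 5.1, 4.9]
    means, variances = gpr_predict(xs, ys, [0.4, 0.6])
    for m, v in zip(means, variances):
        print(f"mean={m:.3f}  variance={v:.5f}")
```

In this sketch the predictive variance shrinks near training inputs and returns to the prior variance far from them, which is the property that distinguishes GPR from a point-estimate regressor.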

Report

(4 results)
  • 2017 Annual Research Report
  • Final Research Report (PDF)
  • 2016 Annual Research Report
  • 2015 Annual Research Report

Research Products

(49 results)

Journal Article (25 results) (of which Peer Reviewed: 9 results, Open Access: 5 results, Acknowledgement Compliant: 17 results), Presentation (24 results) (of which Int'l Joint Research: 7 results, Invited: 1 result)

  • [Journal Article] GPR-based Thai speech synthesis using multi-level duration prediction (2018)

    • Author(s)
      Decha Moungsri, Tomoki Koriyama, Takao Kobayashi
    • Journal Title

      Speech Communication

      Volume: 99 Pages: 114-123

    • DOI

      10.1016/j.specom.2018.03.005

    • Related Report
      2017 Annual Research Report
    • Peer Reviewed
  • [Journal Article] A study of statistical speech synthesis based on a GP-DNN hybrid model (2018)

    • Author(s)
      郡山知樹, 小林隆夫
    • Journal Title

      IEICE Technical Report (SP)

      Volume: 117(393) Pages: 5-10

    • NAID

      40021473756

    • Related Report
      2017 Annual Research Report
  • [Journal Article] A study on the use of deep Gaussian processes in GPR-based speech synthesis (2018)

    • Author(s)
      郡山知樹, 小林隆夫
    • Journal Title

      IEICE Technical Report (SP)

      Volume: 117(517) Pages: 27-32

    • NAID

      120006705503

    • Related Report
      2017 Annual Research Report
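  • [Journal Article] A study of data partitioning methods for style adaptation using piecewise linear transformation in GPR-based speech synthesis (2018)

    • Author(s)
      前野雄也, 郡山知樹, 小林隆夫
    • Journal Title

      Proceedings of the 2018 Spring Meeting of the Acoustical Society of Japan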

      Volume: - Pages: 295-296

    • Related Report
      2017 Annual Research Report
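  • [Journal Article] A study on the use of deep architectures in GPR-based speech synthesis (2018)

    • Author(s)
      郡山知樹, 小林隆夫
    • Journal Title

      Proceedings of the 2018 Spring Meeting of the Acoustical Society of Japan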

      Volume: - Pages: 1507-1508

    • NAID

      120006705491

    • Related Report
      2017 Annual Research Report
  • [Journal Article] Speaker Adaptation Using Shared Context Clustering for Cross-lingual Speech Synthesis (2017)

    • Author(s)
      Daiki Nagahama, Takashi Nose, Tomoki Koriyama, Takao Kobayashi
    • Journal Title

      The IEICE Transactions on Information and Systems (Japanese Edition)

      Volume: J100-D Issue: 3 Pages: 385-393

    • DOI

      10.14923/transinfj.2016PDP0020

    • ISSN
      1880-4535, 1881-0225
    • Year and Date
      2017-03-01
    • Related Report
      2016 Annual Research Report
    • Peer Reviewed / Acknowledgement Compliant
  • [Journal Article] Enhanced F0 generation for GPR-based speech synthesis considering syllable-based prosodic features (2017)

    • Author(s)
      Decha Moungsri, Tomoki Koriyama, Takao Kobayashi
    • Journal Title

      Proceedings of APSIPA Annual Summit and Conference 2017

      Volume: - Pages: 1-4

    • Related Report
      2017 Annual Research Report
    • Peer Reviewed / Open Access
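  • [Journal Article] A study of decision tree construction based on frame context kernels for GPR-based speech synthesis (2017)

    • Author(s)
      郡山知樹, 小林隆夫
    • Journal Title

      Proceedings of the 2017 Autumn Meeting of the Acoustical Society of Japan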

      Volume: - Pages: 177-178

    • NAID

      120006705316

    • Related Report
      2017 Annual Research Report
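  • [Journal Article] A study of singing voice synthesis based on Gaussian process regression (2017)

    • Author(s)
      郡山知樹, 岡野祐紀, 小林隆夫
    • Journal Title

      Proceedings of the 2017 Autumn Meeting of the Acoustical Society of Japan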

      Volume: - Pages: 295-296

    • NAID

      120006705394

    • Related Report
      2017 Annual Research Report
  • [Journal Article] Duration prediction using multiple Gaussian process experts for GPR-based speech synthesis (2017)

    • Author(s)
      Decha Moungsri, Tomoki Koriyama, Takao Kobayashi
    • Journal Title

      Proc. 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2017)

      Volume: - Pages: 5945-5948

    • Related Report
      2016 Annual Research Report
    • Peer Reviewed / Acknowledgement Compliant
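  • [Journal Article] A study of the effect of automatic accent information labeling on synthetic speech quality (2017)

    • Author(s)
      増子理菜, 郡山知樹, 小林隆夫
    • Journal Title

      Proceedings of the 2017 Spring Meeting of the Acoustical Society of Japan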

      Volume: CD-ROM Pages: 283-284

    • Related Report
      2016 Annual Research Report
    • Acknowledgement Compliant
  • [Journal Article] Synthesis of audiobook speech based on GPR-based speech synthesis (2017)

    • Author(s)
      津野駿幸, 郡山知樹, 小林隆夫
    • Journal Title

      Proceedings of the 2017 Spring Meeting of the Acoustical Society of Japan

      Volume: CD-ROM Pages: 295-296

    • Related Report
      2016 Annual Research Report
    • Acknowledgement Compliant
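  • [Journal Article] GMM-based voice conversion with non-parallel data based on context-aware phoneme matching (2017)

    • Author(s)
      高橋亮, 郡山知樹, 小林隆夫
    • Journal Title

      Proceedings of the 2017 Spring Meeting of the Acoustical Society of Japan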

      Volume: CD-ROM Pages: 367-368

    • Related Report
      2016 Annual Research Report
    • Acknowledgement Compliant
  • [Journal Article] Tone modeling using Gaussian process latent variable model for statistical speech synthesis (2016)

    • Author(s)
      Decha Moungsri, Tomoki Koriyama, Takao Kobayashi
    • Journal Title

      Proc. the 8th International Conference on Speech Prosody (SPEECH PROSODY 2016)

      Volume: - Pages: 1014-1018

    • DOI

      10.21437/speechprosody.2016-208

    • Related Report
      2016 Annual Research Report
    • Peer Reviewed / Open Access / Acknowledgement Compliant
  • [Journal Article] Unsupervised stress information labeling using Gaussian process latent variable model for statistical speech synthesis (2016)

    • Author(s)
      Decha Moungsri, Tomoki Koriyama, Takao Kobayashi
    • Journal Title

      Proc. 17th Annual Conference of the International Speech Communication Association (INTERSPEECH 2016)

      Volume: - Pages: 1591-1595

    • DOI

      10.21437/interspeech.2016-273

    • Related Report
      2016 Annual Research Report
    • Peer Reviewed / Open Access / Acknowledgement Compliant
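  • [Journal Article] A study of style adaptation using piecewise linear feature transformation in GPR-based speech synthesis (2016)

    • Author(s)
      前野雄也, 郡山知樹, 小林隆夫
    • Journal Title

      Proceedings of the 2016 Autumn Meeting of the Acoustical Society of Japan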

      Volume: CD-ROM Pages: 213-214

    • Related Report
      2016 Annual Research Report
    • Acknowledgement Compliant
  • [Journal Article] A study of GMM-based voice conversion using non-parallel data (2016)

    • Author(s)
      高橋亮, 郡山知樹, 小林隆夫
    • Journal Title

      Proceedings of the 2016 Autumn Meeting of the Acoustical Society of Japan

      Volume: CD-ROM Pages: 267-268

    • Related Report
      2016 Annual Research Report
    • Acknowledgement Compliant
  • [Journal Article] A speaker adaptation technique for Gaussian process regression based speech synthesis using feature space transform (2016)

    • Author(s)
      Tomoki Koriyama, Syohei Oshio, Takao Kobayashi
    • Journal Title

      Proc. 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

      Volume: ICASSP Pages: 5610-5614

    • NAID

      120006704514

    • Related Report
      2015 Annual Research Report
    • Peer Reviewed / Acknowledgement Compliant
  • [Journal Article] Evaluation of automatic accent estimation based on CRF/HMM for speech synthesis (2016)

    • Author(s)
      増子 理菜, 郡山 知樹, 小林 隆夫
    • Journal Title

      IEICE Technical Report (Speech)

      Volume: 115/SP2015-85 Pages: 1-6

    • Related Report
      2015 Annual Research Report
    • Acknowledgement Compliant
  • [Journal Article] A study of style adaptation in GPR-based speech synthesis (2016)

    • Author(s)
      前野 雄也, 郡山 知樹, 小林 隆夫
    • Journal Title

      Proceedings of the 2016 Spring Meeting of the Acoustical Society of Japan

      Volume: CD-ROM Pages: 233-234

    • Related Report
      2015 Annual Research Report
    • Acknowledgement Compliant
  • [Journal Article] A study of GPR-based speech synthesis with diverse styles (2016)

    • Author(s)
      岡元 伶洋, 郡山 知樹, 小林 隆夫
    • Journal Title

      Proceedings of the 2016 Spring Meeting of the Acoustical Society of Japan

      Volume: CD-ROM Pages: 361-362

    • Related Report
      2015 Annual Research Report
    • Acknowledgement Compliant
  • [Journal Article] Duration prediction using multi-level model for GPR-based speech synthesis (2015)

    • Author(s)
      Decha Moungsri, Tomoki Koriyama, Takao Kobayashi
    • Journal Title

      Proc. 16th Annual Conference of the International Speech Communication Association (INTERSPEECH)

      Volume: INTERSPEECH Pages: 1591-1595

    • Related Report
      2015 Annual Research Report
    • Peer Reviewed / Open Access / Acknowledgement Compliant
  • [Journal Article] A comparison of speech synthesis systems based on GPR, HMM, and DNN with a small amount of training data (2015)

    • Author(s)
      Tomoki Koriyama, Takao Kobayashi
    • Journal Title

      Proc. 16th Annual Conference of the International Speech Communication Association (INTERSPEECH)

      Volume: INTERSPEECH Pages: 3496-3500

    • Related Report
      2015 Annual Research Report
    • Peer Reviewed / Open Access / Acknowledgement Compliant
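  • [Journal Article] A study of speaker adaptation techniques in GPR-based speech synthesis (2015)

    • Author(s)
      押尾 翔平, 郡山 知樹, 小林 隆夫
    • Journal Title

      Proceedings of the 2015 Autumn Meeting of the Acoustical Society of Japan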

      Volume: CD-ROM Pages: 219-220

    • Related Report
      2015 Annual Research Report
    • Acknowledgement Compliant
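  • [Journal Article] Evaluation of a speech synthesis system based on Gaussian process regression (2015)

    • Author(s)
      郡山 知樹, 小林 隆夫
    • Journal Title

      Proceedings of the 2015 Autumn Meeting of the Acoustical Society of Japan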

      Volume: CD-ROM Pages: 235-236

    • NAID

      120006704045

    • Related Report
      2015 Annual Research Report
    • Acknowledgement Compliant
  • [Presentation] A study of statistical speech synthesis based on a GP-DNN hybrid model (2018)

    • Author(s)
      郡山知樹
    • Organizer
      IEICE Technical Committee on Speech
    • Related Report
      2017 Annual Research Report
  • [Presentation] A study on the use of deep Gaussian processes in GPR-based speech synthesis (2018)

    • Author(s)
      郡山知樹
    • Organizer
      IEICE Technical Committee on Speech
    • Related Report
      2017 Annual Research Report
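  • [Presentation] A study of data partitioning methods for style adaptation using piecewise linear transformation in GPR-based speech synthesis (2018)

    • Author(s)
      前野雄也
    • Organizer
      2018 Spring Meeting of the Acoustical Society of Japan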
    • Related Report
      2017 Annual Research Report
  • [Presentation] A study on the use of deep architectures in GPR-based speech synthesis (2018)

    • Author(s)
      郡山知樹
    • Organizer
      2018 Spring Meeting of the Acoustical Society of Japan
    • Related Report
      2017 Annual Research Report
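  • [Presentation] A study of the effect of automatic accent information labeling on synthetic speech quality (2017)

    • Author(s)
      増子理菜
    • Organizer
      2017 Spring Meeting of the Acoustical Society of Japan
    • Place of Presentation
      Meiji University Ikuta Campus (Kawasaki, Kanagawa)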
    • Year and Date
      2017-03-15
    • Related Report
      2016 Annual Research Report
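  • [Presentation] Synthesis of audiobook speech based on GPR-based speech synthesis (2017)

    • Author(s)
      津野駿幸
    • Organizer
      2017 Spring Meeting of the Acoustical Society of Japan
    • Place of Presentation
      Meiji University Ikuta Campus (Kawasaki, Kanagawa)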
    • Year and Date
      2017-03-15
    • Related Report
      2016 Annual Research Report
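  • [Presentation] GMM-based voice conversion with non-parallel data based on context-aware phoneme matching (2017)

    • Author(s)
      高橋亮
    • Organizer
      2017 Spring Meeting of the Acoustical Society of Japan
    • Place of Presentation
      Meiji University Ikuta Campus (Kawasaki, Kanagawa)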
    • Year and Date
      2017-03-15
    • Related Report
      2016 Annual Research Report
  • [Presentation] Duration prediction using multiple Gaussian process experts for GPR-based speech synthesis (2017)

    • Author(s)
      Decha Moungsri
    • Organizer
      2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017
    • Place of Presentation
      Hilton New Orleans Riverside (USA)
    • Year and Date
      2017-03-05
    • Related Report
      2016 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Enhanced F0 generation for GPR-based speech synthesis considering syllable-based prosodic features (2017)

    • Author(s)
      Decha Moungsri
    • Organizer
      APSIPA Annual Summit and Conference 2017
    • Related Report
      2017 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Approaches to speech synthesis with diverse speaker characteristics and styles toward expressive speech synthesis (2017)

    • Author(s)
      小林隆夫
    • Organizer
      19th Spoken Language Symposium
    • Related Report
      2017 Annual Research Report
    • Invited
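  • [Presentation] A study of decision tree construction based on frame context kernels for GPR-based speech synthesis (2017)

    • Author(s)
      郡山知樹
    • Organizer
      2017 Autumn Meeting of the Acoustical Society of Japan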
    • Related Report
      2017 Annual Research Report
  • [Presentation] A study of singing voice synthesis based on Gaussian process regression (2017)

    • Author(s)
      郡山知樹
    • Organizer
      2017 Autumn Meeting of the Acoustical Society of Japan
    • Related Report
      2017 Annual Research Report
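  • [Presentation] A study of style adaptation using piecewise linear feature transformation in GPR-based speech synthesis (2016)

    • Author(s)
      前野雄也
    • Organizer
      2016 Autumn Meeting of the Acoustical Society of Japan
    • Place of Presentation
      University of Toyama (Toyama, Toyama)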
    • Year and Date
      2016-09-14
    • Related Report
      2016 Annual Research Report
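  • [Presentation] A study of GMM-based voice conversion using non-parallel data (2016)

    • Author(s)
      高橋亮
    • Organizer
      2016 Autumn Meeting of the Acoustical Society of Japan
    • Place of Presentation
      University of Toyama (Toyama, Toyama)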
    • Year and Date
      2016-09-14
    • Related Report
      2016 Annual Research Report
  • [Presentation] Unsupervised stress information labeling using Gaussian process latent variable model for statistical speech synthesis (2016)

    • Author(s)
      Decha Moungsri
    • Organizer
      17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016
    • Place of Presentation
      Hyatt Regency San Francisco (USA)
    • Year and Date
      2016-09-08
    • Related Report
      2016 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Tone modeling using Gaussian process latent variable model for statistical speech synthesis (2016)

    • Author(s)
      Decha Moungsri
    • Organizer
      the 8th International Conference on Speech Prosody, SPEECH PROSODY 2016
    • Place of Presentation
      Boston University (USA)
    • Year and Date
      2016-05-31
    • Related Report
      2016 Annual Research Report
    • Int'l Joint Research
  • [Presentation] A speaker adaptation technique for Gaussian process regression based speech synthesis using feature space transform (2016)

    • Author(s)
      Tomoki Koriyama, Takao Kobayashi
    • Organizer
      2016 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2016
    • Place of Presentation
      Shanghai International Convention Center (China)
    • Year and Date
      2016-03-20
    • Related Report
      2015 Annual Research Report
    • Int'l Joint Research
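  • [Presentation] A study of GPR-based speech synthesis with diverse styles (2016)

    • Author(s)
      岡元 伶洋, 郡山 知樹, 小林 隆夫
    • Organizer
      2016 Spring Meeting of the Acoustical Society of Japan
    • Place of Presentation
      Toin University of Yokohama (Yokohama, Kanagawa)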
    • Year and Date
      2016-03-09
    • Related Report
      2015 Annual Research Report
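  • [Presentation] A study of style adaptation in GPR-based speech synthesis (2016)

    • Author(s)
      前野 雄也, 郡山 知樹, 小林 隆夫
    • Organizer
      2016 Spring Meeting of the Acoustical Society of Japan
    • Place of Presentation
      Toin University of Yokohama (Yokohama, Kanagawa)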
    • Year and Date
      2016-03-09
    • Related Report
      2015 Annual Research Report
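  • [Presentation] Evaluation of automatic accent estimation based on CRF/HMM for speech synthesis (2016)

    • Author(s)
      増子 理菜, 郡山 知樹, 小林 隆夫
    • Organizer
      IEICE and Acoustical Society of Japan Technical Meeting on Speech
    • Place of Presentation
      Sanpian Kawasaki (Kawasaki, Kanagawa)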
    • Year and Date
      2016-01-14
    • Related Report
      2015 Annual Research Report
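  • [Presentation] A study of speaker adaptation techniques in GPR-based speech synthesis (2015)

    • Author(s)
      押尾 翔平, 郡山 知樹, 小林 隆夫
    • Organizer
      2015 Autumn Meeting of the Acoustical Society of Japan
    • Place of Presentation
      University of Aizu (Aizuwakamatsu, Fukushima)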
    • Year and Date
      2015-09-16
    • Related Report
      2015 Annual Research Report
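  • [Presentation] Evaluation of a speech synthesis system based on Gaussian process regression (2015)

    • Author(s)
      郡山 知樹, 小林 隆夫
    • Organizer
      2015 Autumn Meeting of the Acoustical Society of Japan
    • Place of Presentation
      University of Aizu (Aizuwakamatsu, Fukushima)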
    • Year and Date
      2015-09-16
    • Related Report
      2015 Annual Research Report
  • [Presentation] Duration prediction using multi-level model for GPR-based speech synthesis (2015)

    • Author(s)
      Decha Moungsri, Tomoki Koriyama, Takao Kobayashi
    • Organizer
      16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015
    • Place of Presentation
      International Congress Center Dresden (Germany)
    • Year and Date
      2015-09-06
    • Related Report
      2015 Annual Research Report
    • Int'l Joint Research
  • [Presentation] A comparison of speech synthesis systems based on GPR, HMM, and DNN with a small amount of training data (2015)

    • Author(s)
      Tomoki Koriyama, Takao Kobayashi
    • Organizer
      16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015
    • Place of Presentation
      International Congress Center Dresden (Germany)
    • Year and Date
      2015-09-06
    • Related Report
      2015 Annual Research Report
    • Int'l Joint Research

Published: 2015-04-16   Modified: 2019-03-29  
