
Harnessing Latent Variation in DNN-Based Speech Synthesis

Research Project

Project/Area Number 17K12720
Research Category

Grant-in-Aid for Young Scientists (B)

Allocation Type Multi-year Fund
Research Field Perceptual information processing
Research Institution National Institute of Informatics

Principal Investigator

Henter Gustav  National Institute of Informatics, Digital Content and Media Sciences Research Division, Project Researcher (30793096)

Project Period (FY) 2017-04-01 – 2018-03-31
Project Status Discontinued (Fiscal Year 2017)
Budget Amount
¥3,250,000 (Direct Cost: ¥2,500,000, Indirect Cost: ¥750,000)
Fiscal Year 2018: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2017: ¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000)
Keywords Speech synthesis / Latent variables / Controllable synthesis / Deep learning / Emotional speech / Control
Outline of Annual Research Achievements

With this grant, I have derived and published theoretical connections between common practical (heuristic) methods for unsupervised learning of controllable speech synthesisers and latent variables in Bayesian probability, including how common extensions of the practical approach can be given a probabilistic interpretation. Related work (both published and submitted) explored the best supervised methods for annotating the same data and, separately, considered speech synthesis with multilingual phonetic control. A listening test is currently comparing the aforementioned supervised and unsupervised approaches against variational autoencoders (VAEs). A journal manuscript presenting the results, together with new theoretical connections between VAEs and common synthesis heuristics, is in preparation.

Report

(1 result)
  • 2017 Annual Research Report

Research Products

(3 results)

Presentation (3 results) (of which Int'l Joint Research: 2 results)

  • [Presentation] Cyborg speech: Deep multilingual speech synthesis for generating segmental foreign accent with natural prosody (2018)

    • Author(s)
      Gustav Eje Henter, Jaime Lorenzo-Trueba, Xin Wang, Mariko Kondo, Junichi Yamagishi
    • Organizer
      IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
    • Place of Presentation
      Calgary, Alberta, Canada
    • Year and Date
      2018-04-15
    • Related Report
      2017 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Generating segment-level foreign-accented synthetic speech with natural speech prosody (2018)

    • Author(s)
      Gustav Eje Henter, Jaime Lorenzo-Trueba, Xin Wang, Mariko Kondo, Junichi Yamagishi
    • Organizer
      120th Joint Research Meeting on Spoken Language Information Processing (第120回音声言語情報処理合同研究発表会)
    • Place of Presentation
      Tsukubasan Edoya (Tsukuba City, Ibaraki Prefecture)
    • Year and Date
      2018-02-20
    • Related Report
      2017 Annual Research Report
  • [Presentation] Principles for learning controllable TTS from annotated and latent variation (2017)

    • Author(s)
      Gustav Eje Henter, Jaime Lorenzo-Trueba, Xin Wang, Junichi Yamagishi
    • Organizer
      Annual Conference of the International Speech Communication Association (Interspeech)
    • Place of Presentation
      Stockholm, Sweden
    • Year and Date
      2017-08-20
    • Related Report
      2017 Annual Research Report
    • Int'l Joint Research

Published: 2017-04-28   Modified: 2018-12-17  