
2020 Fiscal Year Final Research Report

Encoder Factorization for Capturing Dialect and Articulation Level in End-to-End Speech Synthesis

Research Project

Project/Area Number 19K24372
Research Category

Grant-in-Aid for Research Activity Start-up

Allocation Type Multi-year Fund
Review Section 1002: Human informatics, applied informatics and related fields
Research Institution National Institute of Informatics

Principal Investigator

Cooper Erica  National Institute of Informatics, Digital Content and Media Sciences Research Division, Project Assistant Professor (30843156)

Project Period (FY) 2019-08-30 – 2021-03-31
Keywords Speech synthesis / Speaker modeling / Deep learning / Neural network
Outline of Final Research Achievements

Synthesizing speech in many voices and styles has long been a goal in speech research. While current state-of-the-art synthesizers can produce very natural-sounding speech, matching the voice of a target speaker when only a small amount of that speaker's data is available is still a challenge, especially for characteristics such as dialect. We conducted experiments to determine what kind of speaker embeddings work best for synthesis in the voice of a new speaker, and found, based on a crowdsourced listening test, that Learnable Dictionary Encoding (LDE) based speaker representations worked well. We also found that obtaining LDE-based dialect representations in the same way helped to improve the dialect of the synthesized speech. Finally, we explored data augmentation techniques using both artificially modified data and real data from non-ideal recording conditions, and found that including the found data in model training could further improve the naturalness of synthesized speech.
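The LDE-based representations mentioned above pool a variable-length sequence of frame-level features into a fixed-size utterance embedding: each frame is soft-assigned to a learned dictionary of components, and the weighted residuals against each component are aggregated and concatenated. The following is a minimal sketch of that pooling step only, in plain Python; in a real model the dictionary centers and assignment scale are learned jointly with the network, and all names here are illustrative, not taken from the report.

```python
import math

def lde_pool(frames, centers, scale=1.0):
    """Learnable Dictionary Encoding (LDE) style pooling.

    frames:  list of frame-level feature vectors (lists of floats)
    centers: list of C dictionary component vectors (the "codebook")
    scale:   sharpness of the soft assignment (learned per component
             in a real model; a single constant here for simplicity)

    Returns the concatenation of the C aggregated residual vectors,
    i.e. a fixed-size utterance embedding of length C * dim.
    """
    dim = len(centers[0])
    C = len(centers)
    residual_sums = [[0.0] * dim for _ in range(C)]
    weight_sums = [0.0] * C
    for x in frames:
        # squared distance from this frame to each dictionary component
        d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers]
        # soft assignment: softmax over -scale * distance (stabilized)
        logits = [-scale * d for d in d2]
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        z = sum(exps)
        w = [e / z for e in exps]
        # accumulate weighted residuals per component
        for c in range(C):
            weight_sums[c] += w[c]
            for i in range(dim):
                residual_sums[c][i] += w[c] * (x[i] - centers[c][i])
    # normalize each component's residual by its total assignment weight
    embedding = []
    for c in range(C):
        denom = weight_sums[c] or 1.0
        embedding.extend(r / denom for r in residual_sums[c])
    return embedding
```

Averaging residuals rather than raw frames makes the embedding describe how the utterance deviates from each dictionary component, which is what lets the same mechanism capture speaker identity or, as in this project, dialect.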

Free Research Field

Text-to-speech synthesis

Academic Significance and Societal Importance of the Research Achievements

In this project, we investigated methods for controlling encoder factors in end-to-end speech synthesis in order to better reproduce speaker identity and dialect in the synthesized speech. By reproducing a speaker's individuality and characteristics more faithfully, a wider range of target speakers can be used in speech synthesis systems, which is expected to broaden the applications of the technology.

Published: 2022-01-27  
