2018 Fiscal Year Final Research Report

Direct modeling of speech waveform using a DNN for text-to-speech synthesis

Research Project

Project/Area Number 16K16096
Research Category

Grant-in-Aid for Young Scientists (B)

Allocation Type Multi-year Fund
Research Field Perceptual information processing
Research Institution National Institute of Informatics

Principal Investigator

Takaki Shinji  National Institute of Informatics, Digital Content and Media Sciences Research Division, Project Assistant Professor (50735090)

Project Period (FY) 2016-04-01 – 2019-03-31
Keywords speech synthesis / DNN
Outline of Final Research Achievements

The purpose of this work is to realize text-to-speech synthesis based on direct modeling of the speech waveform using a deep neural network (DNN). To that end, we removed the heuristic processing used in conventional text-to-speech synthesis. We investigated three approaches: modeling amplitude spectra obtained with simple windowing and the Fourier transform, modeling spectra that include phase information, and modeling the speech waveform directly. As a result, we realized a method for directly modeling the speech waveform for text-to-speech synthesis.
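
As a rough illustration of what "amplitude spectra obtained by simple windowing and Fourier transform" can look like, the sketch below frames a signal, applies a Hann window, and takes the magnitude of a naive DFT. This is not the project's implementation: the Hann window, the frame length, the hop size, the toy sine input, and all function names are assumptions chosen for illustration only.

```python
import cmath
import math


def hann_window(n):
    """Return a periodic Hann window of length n (window choice is an assumption)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames, dropping any incomplete tail."""
    return [
        signal[start:start + frame_len]
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def amplitude_spectrum(frame):
    """Window one frame and return |DFT| for bins 0..N/2 using a naive O(N^2) DFT."""
    n = len(frame)
    windowed = [x * w for x, w in zip(frame, hann_window(n))]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            windowed[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(acc))  # keep the magnitude, discard the phase
    return spectrum


# Toy input (hypothetical values): a 1 kHz sine sampled at 16 kHz.
sample_rate = 16000
signal = [math.sin(2.0 * math.pi * 1000.0 * t / sample_rate) for t in range(256)]

frames = frame_signal(signal, frame_len=64, hop=32)
spectra = [amplitude_spectrum(f) for f in frames]

# Bin spacing is sample_rate / frame_len = 250 Hz, so 1 kHz falls in bin 4.
peak_bin = max(range(len(spectra[0])), key=lambda k: spectra[0][k])
```

Taking `abs()` is the step that discards phase, which is why the outline lists modeling of spectra including phase information as a separate investigation. A practical system would use an FFT routine rather than this quadratic-time DFT.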

Free Research Field

Speech information processing

Academic Significance and Societal Importance of the Research Achievements

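Speech waveform modeling with deep neural networks is being actively studied as a way to improve text-to-speech synthesis, a core technology of speech interfaces. This project addressed this closely watched research topic and improved the performance of text-to-speech synthesis. Many ripple effects can be expected, such as better performance in existing systems that use text-to-speech synthesis and, with that improvement, wider adoption of applications built on it.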

Published: 2020-03-30  
