Research on Speech Synthesis for Simultaneous Interpretation

Research Project

Project/Area Number	14J10354
Research Category	Grant-in-Aid for JSPS Fellows
Allocation Type	Single-year Grants
Section	Domestic
Research Field	Intelligent informatics
Research Institution	Nara Institute of Science and Technology
Principal Investigator	Shinnosuke Takamichi (高道慎之介), Nara Institute of Science and Technology, Graduate School of Information Science, JSPS Research Fellow (DC2)
Project Period (FY)	2014-04-25 – 2016-03-31
Project Status	Completed (Fiscal Year 2015)
Budget Amount	¥1,600,000 (Direct Cost: ¥1,600,000); Fiscal Year 2015: ¥700,000 (Direct Cost: ¥700,000); Fiscal Year 2014: ¥900,000 (Direct Cost: ¥900,000)
Keywords	Speech synthesis / Simultaneous interpretation / HMM-based speech synthesis / GMM-based voice conversion / Statistical parametric speech synthesis
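Outline of Annual Research Achievements	This fiscal year we worked on (1) fast, high-quality speech synthesis and voice conversion, and (2) English speech synthesis that reflects the original speaker's voice quality as closely as possible. (1) is indispensable for realizing simultaneous interpretation. Statistical speech synthesis, the current mainstream approach, has the advantage of synthesizing speech with little computation time, but it produces speech of markedly low quality. To address this quality degradation, by the end of the previous fiscal year we had proposed a filtering method and a speech parameter generation method based on the modulation spectrum and confirmed their effectiveness. However, these methods unavoidably increase generation time. This fiscal year we therefore proposed a method that trains the speech synthesizer under a modulation spectrum constraint. Experimental evaluation confirmed that the method obtains the quality improvement that comes from considering the modulation spectrum while keeping the computation time of the conventional approach. In addition, to verify the effect of the modulation spectrum in multilingual speech synthesis, we took part in an international competition on synthesizing speech in Indian languages. For several of the languages, the modulation spectrum-based method was rated as achieving the highest quality in the world. (2) is a technology for reflecting the voice quality of the original speaker (for example, a lecturer) in the synthetic speech generated during simultaneous interpretation. Cross-lingual speech synthesis, which reflects the voice quality of a speaker of one language in speech of another (target) language, already exists, but the voice quality of its synthetic speech differs greatly from that of the original speaker. As a way to reflect the voice quality as closely as possible, we therefore proposed generating target-language synthetic speech from non-fluent target-language speech uttered by the original speaker. This fiscal year we restricted the study to Japanese speakers as the original speakers and English as the target language. Experimental evaluation confirmed that correcting the phonetic and prosodic characteristics of non-fluent English speech makes it possible to synthesize natural English speech while preserving the voice quality.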
Research Progress Status	Not entered because FY2015 is the final fiscal year of the project.
Strategy for Future Research Activity	Not entered because FY2015 is the final fiscal year of the project.
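
The outline of annual research achievements above relies on the modulation spectrum of a speech parameter trajectory. As a rough illustration only, the Python sketch below assumes that the modulation spectrum is the log power spectrum of one parameter's sequence over time, and it shows on synthetic data that an over-smoothed trajectory loses energy at high modulation frequencies. The function names, the FFT length, the moving-average smoother, and the toy signal are all assumptions made for this illustration; none of them is taken from the project's own implementation.

```python
import numpy as np


def modulation_spectrum(trajectory, n_fft=256):
    """Log power spectrum of a 1-D parameter trajectory (assumed definition)."""
    x = np.asarray(trajectory, dtype=float)
    x = x - x.mean()                    # remove the DC offset before the FFT
    spectrum = np.fft.rfft(x, n=n_fft)  # zero-padded real FFT along time
    return np.log(np.abs(spectrum) ** 2 + 1e-12)  # small floor avoids log(0)


def moving_average(trajectory, width=9):
    """Hypothetical stand-in for the over-smoothing of a statistical model."""
    kernel = np.ones(width) / width
    return np.convolve(trajectory, kernel, mode="same")


# Toy "natural" trajectory: slow drift plus frame-level fluctuation.
rng = np.random.default_rng(0)
natural = 0.1 * np.cumsum(rng.normal(size=200)) + rng.normal(size=200)
smoothed = moving_average(natural)

ms_natural = modulation_spectrum(natural)
ms_smoothed = modulation_spectrum(smoothed)

# Compare the average log power in the upper half of the modulation frequencies.
upper = slice(ms_natural.size // 2, None)
gap = ms_natural[upper].mean() - ms_smoothed[upper].mean()
print(f"mean log-power gap at high modulation frequencies: {gap:.2f}")
```

A positive gap means the smoothed trajectory has less high-frequency modulation energy than the toy natural one, which is the kind of degradation the modulation spectrum-based methods in this project aim to compensate.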

Report

(2 results)

2015 Annual Research Report
2014 Annual Research Report

Research Products
(26 results)


All Int'l Joint Research (1 results) Journal Article (3 results) (of which Int'l Joint Research: 2 results, Peer Reviewed: 2 results, Acknowledgement Compliant: 2 results, Open Access: 1 results) Presentation (20 results) (of which Int'l Joint Research: 5 results) Remarks (2 results)

[Int'l Joint Research] Carnegie Mellon University (USA)
- Related Report
  2015 Annual Research Report
[Journal Article] Post-filters to Modify the Modulation Spectrum for Statistical Parametric Speech Synthesis (2016)
- Author(s)
  Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, Graham Neubig, Sakriani Sakti, and Satoshi Nakamura
- Journal Title
  
  IEEE/ACM Transactions on Audio, Speech, and Language Processing
  
  Volume: 24 Issue: 4 Pages: 755-767
- DOI
  10.1109/taslp.2016.2522655
- Related Report
  2015 Annual Research Report
- Peer Reviewed / Int'l Joint Research / Acknowledgement Compliant
[Journal Article] Post-Filter Using Modulation Spectrum as a Metric to Quantify Over-Smoothing Effects in Statistical Parametric Speech Synthesis (2015)
- Author(s)
  Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, and Satoshi Nakamura
- Journal Title
  
  APSIPA newsletter
  
  Volume: 9 Pages: 14-16
- Related Report
  2015 Annual Research Report 2014 Annual Research Report
- Peer Reviewed / Open Access / Int'l Joint Research / Acknowledgement Compliant
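[Journal Article] Coffee Break: Please Explain the Concept and Way of Thinking Behind the z-Transform (2014)
- Author(s)
  Shinnosuke Takamichi
- Journal Title
  
  The Journal of the Acoustical Society of Japan (日本音響学会誌)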
  
  Volume: 70
- Related Report
  2014 Annual Research Report
[Presentation] The NAIST Text-to-Speech System for the Blizzard Challenge 2015 (2015)
- Author(s)
  Shinnosuke Takamichi, Kazuhiro Kobayashi, Kou Tanaka, Tomoki Toda, and Satoshi Nakamura
- Organizer
  Blizzard Challenge Workshop
- Place of Presentation
  Berlin, Germany
- Year and Date
  2015-09-11
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
[Presentation] Non-native Speech Synthesis Preserving Speaker Individuality Based on Partial Correction of Prosodic and Phonetic Characteristics (2015)
- Author(s)
  Yuji Oshima, Shinnosuke Takamichi, Tomoki Toda, Graham Neubig, Sakriani Sakti, and Satoshi Nakamura
- Organizer
  INTERSPEECH
- Place of Presentation
  Dresden, Germany
- Year and Date
  2015-09-06
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
[Presentation] Modulation Spectrum-Constrained Trajectory Training Algorithm for HMM-Based Speech Synthesis (2015)
- Author(s)
  Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, and Satoshi Nakamura
- Organizer
  INTERSPEECH
- Place of Presentation
  Dresden, Germany
- Year and Date
  2015-09-06
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
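[Presentation] Modulation Spectrum-Based Speech Quality Improvement Methods for Statistical Parametric Speech Synthesis (2015)
- Author(s)
  Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, and Satoshi Nakamura
- Organizer
  Information Processing Society of Japan (IPSJ)
- Place of Presentation
  The University of Electro-Communications, Chofu, Tokyo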
- Year and Date
  2015-05-23
- Related Report
  2015 Annual Research Report
[Presentation] Modulation Spectrum-Constrained Trajectory Training for GMM-Based Voice Conversion (2015)
- Author(s)
  Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, and Satoshi Nakamura
- Organizer
  ICASSP
- Place of Presentation
  Australia
- Year and Date
  2015-04-19 – 2015-04-24
- Related Report
  2014 Annual Research Report
[Presentation] Parameter generation algorithm considering modulation spectrum for HMM-based speech synthesis (2015)
- Author(s)
  Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, and Satoshi Nakamura
- Organizer
  ICASSP
- Place of Presentation
  Australia
- Year and Date
  2015-04-19 – 2015-04-24
- Related Report
  2014 Annual Research Report
[Presentation] Parameter generation algorithm considering modulation spectrum for HMM-based speech synthesis (2015)
- Author(s)
  Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, and Satoshi Nakamura
- Organizer
  ICASSP
- Place of Presentation
  Brisbane, Australia
- Year and Date
  2015-04-19
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
[Presentation] Modulation Spectrum-Constrained Trajectory Training for GMM-Based Voice Conversion (2015)
- Author(s)
  Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, and Satoshi Nakamura
- Organizer
  ICASSP
- Place of Presentation
  Brisbane, Australia
- Year and Date
  2015-04-19
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
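[Presentation] Parameter Generation Method Considering the Modulation Spectrum in Statistical Parametric Speech Synthesis (2015)
- Author(s)
  Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, and Satoshi Nakamura
- Organizer
  ASJ
- Place of Presentation
  Tokyo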
- Year and Date
  2015-03-16 – 2015-03-18
- Related Report
  2014 Annual Research Report
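[Presentation] Modulation Spectrum-Constrained Trajectory Training in Statistical Parametric Speech Synthesis (2015)
- Author(s)
  Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, and Satoshi Nakamura
- Organizer
  ASJ
- Place of Presentation
  Tokyo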
- Year and Date
  2015-03-16 – 2015-03-18
- Related Report
  2014 Annual Research Report
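[Presentation] Prosody Correction Method Preserving Speaker Individuality in English Speech Synthesis for Japanese Speakers, and the Effect of English Proficiency (2015)
- Author(s)
  Yuji Oshima, Shinnosuke Takamichi, Tomoki Toda, Graham Neubig, Sakriani Sakti, and Satoshi Nakamura
- Organizer
  ASJ
- Place of Presentation
  Tokyo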
- Year and Date
  2015-03-16 – 2015-03-18
- Related Report
  2014 Annual Research Report
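[Presentation] Evaluation of Duration Correction for Foreign-Language Speech by Non-native Speakers (2015)
- Author(s)
  倶羅真也, Shinnosuke Takamichi, Tomoki Toda, Graham Neubig, Sakriani Sakti, and Satoshi Nakamura
- Organizer
  ASJ
- Place of Presentation
  Tokyo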
- Year and Date
  2015-03-16 – 2015-03-18
- Related Report
  2014 Annual Research Report
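[Presentation] Modulation Spectrum-Constrained Trajectory Training Algorithm for Statistical Parametric Speech Synthesis (2015)
- Author(s)
  Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, and Satoshi Nakamura
- Organizer
  IEICE
- Place of Presentation
  Okinawa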
- Year and Date
  2015-03-02 – 2015-03-03
- Related Report
  2014 Annual Research Report
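[Presentation] Speech Parameter Generation Algorithm Considering the Modulation Spectrum for Statistical Parametric Speech Synthesis (2015)
- Author(s)
  Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, and Satoshi Nakamura
- Organizer
  IPSJ
- Place of Presentation
  Mie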
- Year and Date
  2015-02-27 – 2015-02-28
- Related Report
  2014 Annual Research Report
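[Presentation] English Speech Synthesis for Japanese Speakers Preserving Speaker Individuality Based on Partial Correction of Prosody and Phonemes, and the Effect of English Proficiency (2015)
- Author(s)
  Yuji Oshima, Shinnosuke Takamichi, Tomoki Toda, Graham Neubig, Sakriani Sakti, and Satoshi Nakamura
- Organizer
  IPSJ
- Place of Presentation
  Mie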
- Year and Date
  2015-02-27 – 2015-02-28
- Related Report
  2014 Annual Research Report
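[Presentation] Prosody Correction Preserving Speaker Individuality in HMM-Based English Speech Synthesis for Japanese Speakers, IEICE Technical Report (2014)
- Author(s)
  Yuji Oshima, Shinnosuke Takamichi, Tomoki Toda, Graham Neubig, Sakriani Sakti, and Satoshi Nakamura
- Organizer
  IEICE
- Place of Presentation
  Tokyo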
- Year and Date
  2014-12-15 – 2014-12-16
- Related Report
  2014 Annual Research Report
[Presentation] Modulation Spectrum-based Post-filter for GMM-based Voice Conversion (2014)
- Author(s)
  Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, and Satoshi Nakamura
- Organizer
  APSIPA ASC
- Place of Presentation
  Cambodia
- Year and Date
  2014-12-10 – 2014-12-12
- Related Report
  2014 Annual Research Report
[Presentation] Modified Modulation Spectrum-based Post-filter for HMM-based Speech Synthesis (2014)
- Author(s)
  Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, and Satoshi Nakamura
- Organizer
  GlobalSIP
- Place of Presentation
  USA
- Year and Date
  2014-12-03 – 2014-12-05
- Related Report
  2014 Annual Research Report
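[Presentation] Prosody Correction Preserving Speaker Individuality in English Speech Synthesis for Japanese Speakers (2014)
- Author(s)
  Yuji Oshima, Shinnosuke Takamichi, Tomoki Toda, Graham Neubig, Sakriani Sakti, and Satoshi Nakamura
- Organizer
  ASJ
- Place of Presentation
  Hokkaido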
- Year and Date
  2014-09-03 – 2014-09-05
- Related Report
  2014 Annual Research Report
[Presentation] A Postfilter to Modify The Modulation Spectrum in HMM-based Speech Synthesis (2014)
- Author(s)
  Shinnosuke Takamichi, Tomoki Toda, Graham Neubig, Sakriani Sakti, and Satoshi Nakamura
- Organizer
  ICASSP
- Place of Presentation
  Italy
- Year and Date
  2014-05-04 – 2014-05-09
- Related Report
  2014 Annual Research Report
[Remarks] Speech synthesis for language learning
- URL
  https://sites.google.com/site/shinnosuketakamichi/research-topics/erj-tts
- Related Report
  2015 Annual Research Report
[Remarks] Blizzard Challenge 2015
- URL
  https://sites.google.com/site/shinnosuketakamichi/research-topics/blizzard-challenge-2015
- Related Report
  2015 Annual Research Report
