Research on Statistical Singing Voice Timbre Control Methods Considering Singing Voice Perception

Research Project

Project/Area Number	16J10726
Research Category	Grant-in-Aid for JSPS Fellows
Allocation Type	Single-year Grants
Section	Domestic
Research Field	Intelligent informatics
Research Institution	Nagoya University (2017); Nara Institute of Science and Technology (2016)
Principal Investigator	Kazuhiro Kobayashi, Nagoya University, Information Technology Center, JSPS Research Fellow (PD)
Project Period (FY)	2016-04-22 – 2018-03-31
Project Status	Completed (Fiscal Year 2017)
Budget Amount	¥1,900,000 (Direct Cost: ¥1,900,000); Fiscal Year 2017: ¥900,000 (Direct Cost: ¥900,000); Fiscal Year 2016: ¥1,000,000 (Direct Cost: ¥1,000,000)
Keywords	perceptual information / sprocket / singing voice conversion / singing voice timbre control / perceived age / Gaussian mixture model
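Outline of Annual Research Achievements	The research achievements for this fiscal year are summarized below.
[Development and release of "sprocket", a freely licensed base framework for voice conversion and control] We developed and released sprocket as open-source software for statistical voice conversion and control, and wrote a paper describing the software. It has been selected as a baseline system for the Voice Conversion Challenge 2018 and is expected to be widely used.
[Writing and submission of a paper on singing voice conversion] We wrote up our results on singing voice conversion based on spectral differential modification and submitted the paper to the journal Speech Communication.
[Writing of an international conference paper on statistical voice timbre control considering perceptual information] As the core method of this project, we compiled our results on how to design voice timbre control parameters for statistical voice timbre control into an international conference paper. The paper proposes a method that ensures independence among multiple control parameters in the vector space that governs voice timbre control, so that the control better matches the user's perception. The evaluation was carried out on speech, but the framework is also applicable to singing voice timbre control. We plan to apply it to singing voice timbre control and evaluate its performance.
[Improving voice conversion and control quality with a WaveNet vocoder] WaveNet is one of the deep-learning-based techniques for speech waveform generation. As a framework that applies the WaveNet network architecture, we proposed a WaveNet vocoder that generates speech waveforms using F0, spectral envelope information, and aperiodicity measures as auxiliary features. The proposed method makes it possible to generate speech waveforms of higher quality than conventional vocoder frameworks.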
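Research Progress Status	Not entered, because fiscal year 2017 (Heisei 29) was the final year of the project.
Strategy for Future Research Activity	Not entered, because fiscal year 2017 (Heisei 29) was the final year of the project.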

Report

(2 results)

2017 Annual Research Report
2016 Annual Research Report

Research Products
(13 results)


Journal Article (3 results) (of which Int'l Joint Research: 3 results, Peer Reviewed: 3 results, Acknowledgement Compliant: 1 result) Presentation (7 results) (of which Int'l Joint Research: 6 results) Remarks (3 results)

[Journal Article] Intra-gender statistical singing voice conversion with direct waveform modification using log-spectral differential (2018)
- Author(s)
  K. Kobayashi, T. Toda, S. Nakamura
- Journal Title
  Speech Communication
  Volume: 99 Pages: 211-220
- Related Report
  2017 Annual Research Report
- Peer Reviewed / Int'l Joint Research
[Journal Article] Articulatory controllable speech modification based on statistical inversion and production mappings (2017)
- Author(s)
  P.L. Tobing, K. Kobayashi, T. Toda
- Journal Title
  IEEE/ACM Transactions on Audio, Speech, and Language Processing
  Volume: 25
- Related Report
  2017 Annual Research Report
- Peer Reviewed / Int'l Joint Research
[Journal Article] Improvements of Voice Timbre Control Based on Perceived Age in Singing Voice Conversion (2016)
- Author(s)
  Kazuhiro Kobayashi, Tomoki Toda, Tomoyasu Nakano, Masataka Goto, Satoshi Nakamura
- Journal Title
  IEICE Transactions on Information and Systems
  Volume: E99.D Issue: 11 Pages: 2767-2777
- DOI
  10.1587/transinf.2016EDP7234
- NAID
  130005268277
- ISSN
  0916-8532, 1745-1361
- Related Report
  2016 Annual Research Report
- Peer Reviewed / Int'l Joint Research / Acknowledgement Compliant
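[Presentation] An investigation of F0 transformation methods for voice conversion based on spectral differential modification (2017)
- Author(s)
  Kazuhiro Kobayashi, Tomoki Toda, Satoshi Nakamura
- Organizer
  Acoustical Society of Japan (ASJ) Spring Meeting
- Place of Presentation
  Meiji University, Ikuta Campus (Kawasaki, Kanagawa, Japan)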
- Year and Date
  2017-03-09
- Related Report
  2016 Annual Research Report
[Presentation] Speaker-dependent WaveNet vocoder (2017)
- Author(s)
  A. Tamamori, K. Kobayashi, T. Hayashi, K. Takeda, T. Toda
- Organizer
  INTERSPEECH
- Related Report
  2017 Annual Research Report
- Int'l Joint Research
[Presentation] Statistical voice conversion with WaveNet-based waveform generation (2017)
- Author(s)
  K. Kobayashi, T. Hayashi, A. Tamamori, T. Toda
- Organizer
  INTERSPEECH
- Related Report
  2017 Annual Research Report
- Int'l Joint Research
[Presentation] An Investigation of how to design control parameters for statistical voice timbre control (2017)
- Author(s)
  K. Kubo, K. Kobayashi, T. Toda, G. Neubig, S. Sakti, S. Nakamura
- Organizer
  APSIPA
- Related Report
  2017 Annual Research Report
- Int'l Joint Research
[Presentation] F0 transformation techniques for statistical voice conversion with direct waveform modification with spectral differential (2016)
- Author(s)
  Kazuhiro Kobayashi, Tomoki Toda and Satoshi Nakamura
- Organizer
  Proc. SLT
- Place of Presentation
  San Diego, USA
- Year and Date
  2016-12-13
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
[Presentation] Low delay statistical singing voice conversion with direct waveform modification based on spectral differential considering global variance (2016)
- Author(s)
  Kazuhiro Kobayashi, Tomoki Toda, Satoshi Nakamura
- Organizer
  5th Joint Meeting of the ASA and the ASJ
- Place of Presentation
  Hawaii, USA
- Year and Date
  2016-11-28
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
[Presentation] The NU-NAIST voice conversion system for the Voice Conversion Challenge 2016 (2016)
- Author(s)
  Kazuhiro Kobayashi, Shinnosuke Takamichi, Tomoki Toda and Satoshi Nakamura
- Organizer
  Proc. INTERSPEECH
- Place of Presentation
  San Francisco, USA
- Year and Date
  2016-09-08
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
[Remarks] Laboratory website
- URL
  https://www.toda.is.i.nagoya-u.ac.jp/publications_FY2017.html
- Related Report
  2017 Annual Research Report
[Remarks] Personal website
- URL
  https://scholar.google.co.jp/citations?user=c-AwXZQAAAAJ&hl=ja
- Related Report
  2017 Annual Research Report
[Remarks] Augmented Human Communication Laboratory website
- URL
  http://ahclab.naist.jp/index.html
- Related Report
  2016 Annual Research Report
