Summary of Research Achievements
In real-life conditions, people often need to express their emotions through speech in noisy environments. Over the past year, we focused on reducing misperception of the emotional content of speech in such environments. We found that speech waveforms generated by a VQ-VAE typically have inappropriate prosodic structure, so we introduced an important extension to the VQ-VAE that learns F0-related suprasegmental information jointly with phone features; we have published a conference paper on this work. We also attempted to convert emotional speech recorded in clean conditions into emotional speech with the Lombard effect within the VQ-VAE framework, and investigated various adversarial networks to improve the emotional intelligibility of the decoded speech.
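The extension described above pairs the usual phone-level codes with a second, F0-related code stream. As a minimal sketch of that idea, the snippet below performs nearest-neighbour vector quantization against two separate codebooks, one segmental (phone-like) and one suprasegmental (F0-like). All sizes, names, and the use of random stand-in encoder outputs are illustrative assumptions, not the published model.

```python
import numpy as np

def quantize(z, codebook):
    """Nearest-neighbour vector quantization: map each frame in z
    (shape (T, D)) to its closest codebook entry (shape (K, D))."""
    # Squared Euclidean distance between every frame and every code
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(axis=1)
    return codebook[idx], idx

rng = np.random.default_rng(0)
T, D_phone, D_f0 = 100, 64, 8                     # hypothetical dimensions
phone_codebook = rng.normal(size=(512, D_phone))  # segmental (phone) codes
f0_codebook = rng.normal(size=(32, D_f0))         # suprasegmental (F0) codes

# Stand-ins for the two encoder output streams
z_phone = rng.normal(size=(T, D_phone))
z_f0 = rng.normal(size=(T, D_f0))

q_phone, phone_ids = quantize(z_phone, phone_codebook)
q_f0, f0_ids = quantize(z_f0, f0_codebook)
# A decoder would condition on both token streams, so prosodic
# (F0) structure is carried explicitly rather than left implicit.
```

The point of the second codebook is that F0 information, which a single phone-level code stream tends to discard, gets its own discrete representation that the decoder can consume alongside the phone codes.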