2020 Fiscal Year Annual Research Report

Can we reduce misperceptions of emotional content of speech in the noisy environments?

Research Project

Project/Area Number	19K24373
Research Institution	National Institute of Informatics
Principal Investigator	Zhao Yi National Institute of Informatics, Digital Content and Media Sciences Research Division, Project Researcher (10843162)
Project Period (FY)	2019-08-30 – 2021-03-31
Keywords	VQVAE / emotional enhancement / neural networks / voice conversion / Lombard speech / Adversarial network
Outline of Annual Research Achievements	In real-life conditions, people often need to convey emotion through speech in noisy environments. Over the past year, we mainly explored ways to reduce misperceptions of the emotional content of speech in noisy environments. We found that speech waveforms reconstructed by VQ-VAE typically have inappropriate prosodic structure. We therefore introduced an important extension to VQ-VAE that learns F0-related suprasegmental information simultaneously with phone features, and we published a conference paper on this work. We also attempted to convert emotional speech recorded in clean environments into emotional speech with the Lombard effect using the VQ-VAE framework. In addition, we investigated various adversarial networks to improve the emotional intelligibility of the decoded speech.

Research Products
(8 results)

All 2021 2020 Other

All Int'l Joint Research (4 results) Journal Article (3 results) (of which Int'l Joint Research: 3 results, Peer Reviewed: 3 results, Open Access: 3 results) Presentation (1 results) (of which Invited: 1 results)

[Int'l Joint Research] Massachusetts Institute of Technology (U.S.A.)
- Country Name
  U.S.A.
- Counterpart Institution
  Massachusetts Institute of Technology
[Int'l Joint Research] University of Edinburgh (United Kingdom)
- Country Name
  UNITED KINGDOM
- Counterpart Institution
  University of Edinburgh
[Int'l Joint Research] National University of Singapore (Singapore)
- Country Name
  SINGAPORE
- Counterpart Institution
  National University of Singapore
[Int'l Joint Research] USTC (China)
- Country Name
  CHINA
- Counterpart Institution
  USTC
[Journal Article] Improved Prosody from Learned F0 Codebook Representations for VQ-VAE Speech Waveform Reconstruction (2020)
- Author(s)
  Zhao Yi, Li Haoyu, Lai Cheng-I, Williams Jennifer, Cooper Erica, Yamagishi Junichi
- Journal Title
  
  Proc. Interspeech 2020
  
  Volume: 2020 Pages: 4417--4421
- DOI
  10.21437/Interspeech.2020-1615
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion (2020)
- Author(s)
  Zhao Yi, Wen-Chin Huang, Xiaohai Tian, Junichi Yamagishi, Rohan Kumar Das, Tomi Kinnunen, Zhen-Hua Ling, Tomoki Toda
- Journal Title
  
  Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020
  
  Volume: 2020 Pages: 80--98
- DOI
  10.21437/VCC_BC.2020-14
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions (2020)
- Author(s)
  Rohan Kumar Das, Tomi Kinnunen, Wen-Chin Huang, Zhen-Hua Ling, Junichi Yamagishi, Zhao Yi, Xiaohai Tian, Tomoki Toda
- Journal Title
  
  Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020
  
  Volume: 2020 Pages: 99--120
- DOI
  10.21437/VCC_BC.2020-15
- Peer Reviewed / Open Access / Int'l Joint Research
[Presentation] Modeling and evaluation methods in current voice conversion tasks (2021)
- Author(s)
  Yi Zhao
- Organizer
  27th Annual Meeting of the Association for Natural Language Processing (言語処理学会第27回年次大会)
- Invited
