2022 Fiscal Year Research-status Report

Speech privacy protection by high-quality, invertible, and extendable speech anonymization and de-anonymization

Research Project

Project/Area Number	21K17775
Research Institution	National Institute of Informatics
Principal Investigator	Wang Xin National Institute of Informatics, Digital Content and Media Sciences Research Division, Project Assistant Professor (60843141)
Project Period (FY)	2021-04-01 – 2024-03-31
Keywords	speech privacy / speaker anonymization / speech waveform modeling / neural network / deep learning
Outline of Annual Research Achievements	The second year's work consisted of three parts. Part 1) Building on the previous year's work, we co-organized the second VoicePrivacy Challenge with other universities. We defined new evaluation frameworks and conducted thorough evaluations. Among many findings, the new baseline, which was the research outcome of the previous year, outperformed the legacy baseline. Some submissions in turn outperformed the new baseline, which indicates that the VoicePrivacy Challenge is advancing the research field. Part 2) Using the framework of the VoicePrivacy Challenge, we analyzed in depth the common approaches to generating an anonymized speaker identity representation (i.e., a pseudo-speaker embedding). Through a large-scale experiment, we identified good strategies for choosing and assigning the pseudo-speaker, including random gender selection and utterance-level anonymization. We also found that a simple percentile-based pitch conversion lowered the risk under the strongest (Semi-Informed) attacker. These findings were published in a top IEEE journal. Part 3) Following the research plan, we extended the language-independent speaker anonymization framework. Although the framework is language-independent, its performance degrades on unseen languages. We found that training the waveform generator on multilingual data helped. We also proposed a correlation-alignment-based strategy to alleviate channel mismatch. Additionally, we extended the framework to hide gender information. Both works were published at top conferences.
Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned. Reason: The VoicePrivacy Challenge 2022 produced good outcomes. The challenge attracted 43 registered teams from 17 countries, which led to 16 successful submissions. We also organized a special session at the Interspeech 2022 satellite workshop, with presentations from participants and ourselves. The results are released on the VoicePrivacy Challenge's official website: https://www.voiceprivacychallenge.org/results-2022/. The experimental study analyzing the shortcomings of and optimal strategies for speaker anonymization (Part 2 of the research outcome) was published in a top IEEE journal. Following the research plan, we investigated the language-independent speaker anonymization framework (Part 3 of the research outcome), and the work was accepted by the Interspeech 2022 conference (CORE rank A) and the ICASSP 2023 conference (CORE rank B).
Strategy for Future Research Activity	Following the research plan made in the previous year, we will continue to work on the language-independent speaker anonymization framework. Although it performs well across different languages (research outcome of the first year) and has been extended to other speaker attributes (Part 3 of the research outcome), two issues remain: 1) The quality of the anonymized voice is still inferior to that of the natural voice. Findings from the research outcome (Part 2) indicate that the selection-based generation of pseudo-speaker embeddings is one bottleneck. We plan to investigate generative approaches for better performance. 2) The optimization of the speaker anonymization framework lacks a solid mathematical description. We plan to derive a unified mathematical description that accounts for the multiple goals of the optimization and to improve the current framework accordingly. The final-year research plan also includes work on the VoicePrivacy Challenge series: 1) a post-challenge analysis of VoicePrivacy Challenge 2022 and of how the research field has progressed since the previous challenge; 2) an investigation of whether stronger attacker models can recognize the speaker identity in the anonymized speech waveforms.
Causes of Carryover	The budget planned for buying GPU devices was not executed because of the price increase in the market. Instead, we used part of the budget to cover the fees for attending international conferences and presenting research outcomes. The remaining budget will be used for attending international conferences, paying for a few listening tests that evaluate the quality of the anonymized voice, and so on.
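
Part 3 of the achievements outline mentions a correlation-alignment-based strategy for alleviating channel mismatch. As a rough illustration only, the sketch below implements the standard correlation alignment (CORAL) transform: source features are whitened with their own covariance and then re-colored with the target covariance, so that the first- and second-order statistics of the two sets match. The report does not say which features are aligned or how the step is integrated into the anonymization framework, so the function name `coral_align`, the regularizer `eps`, and the assumption of frame-level feature matrices of shape (frames, dims) are all hypothetical.

```python
import numpy as np


def coral_align(source, target, eps=1e-5):
    """Map `source` features so their mean and covariance match `target`.

    Both inputs are (num_frames, feat_dim) arrays. This is the generic
    CORAL recipe, not necessarily the exact strategy used in the project.
    """
    src_mean = source.mean(axis=0)
    tgt_mean = target.mean(axis=0)
    dim = source.shape[1]

    # Regularized covariances keep the matrix powers below well defined.
    cov_src = np.cov(source, rowvar=False) + eps * np.eye(dim)
    cov_tgt = np.cov(target, rowvar=False) + eps * np.eye(dim)

    def sym_power(mat, power):
        # Power of a symmetric positive-definite matrix via eigendecomposition.
        vals, vecs = np.linalg.eigh(mat)
        return (vecs * vals ** power) @ vecs.T

    # Whiten with the source covariance, then re-color with the target one.
    whitened = (source - src_mean) @ sym_power(cov_src, -0.5)
    return whitened @ sym_power(cov_tgt, 0.5) + tgt_mean
```

In a channel-mismatch setting, `source` would hold features from the mismatched recordings and `target` features from data resembling the training condition; which side plays which role in the actual system is not stated in this report.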

Research Products
(7 results)


All Int'l Joint Research (1 results) Journal Article (1 results) (of which Int'l Joint Research: 1 results, Peer Reviewed: 1 results, Open Access: 1 results) Presentation (3 results) (of which Int'l Joint Research: 3 results, Invited: 1 results) Remarks (2 results)

[Int'l Joint Research] Avignon University/Inria/University of Lorraine (France)
- Country Name
  FRANCE
- Counterpart Institution
  Avignon University/Inria/University of Lorraine
- # of Other Institutions
  1
[Journal Article] Privacy and Utility of X-Vector Based Speaker Anonymization (2022)
- Author(s)
  Srivastava Brij Mohan Lal, Maouche Mohamed, Sahidullah Md, Vincent Emmanuel, Bellet Aurelien, Tommasi Marc, Tomashenko Natalia, Wang Xin, Yamagishi Junichi
- Journal Title
  IEEE/ACM Transactions on Audio, Speech, and Language Processing
  Volume: 30 Pages: 2383-2395
- DOI
  10.1109/TASLP.2022.3190741
- Peer Reviewed / Open Access / Int'l Joint Research
[Presentation] Hiding Speaker’s Sex in Speech Using Zero-Evidence Speaker Representation in an Analysis/Synthesis Pipeline (2023)
- Author(s)
  Paul-Gauthier Noe, Xiaoxiao Miao, Xin Wang, Junichi Yamagishi, Jean-Francois Bonastre, and Driss Matrouf
- Organizer
  ICASSP 2023
- Int'l Joint Research
[Presentation] Analyzing Language-Independent Speaker Anonymization Framework under Unseen Conditions (2022)
- Author(s)
  Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi, and Natalia Tomashenko
- Organizer
  Interspeech 2022
- Int'l Joint Research
[Presentation] Tutorial on speaker anonymization (software part) (2022)
- Author(s)
  Xin Wang
- Organizer
  2nd Symposium on Security and Privacy in Speech Communication joined with 2nd VoicePrivacy Challenge Workshop
- Int'l Joint Research / Invited
[Remarks] VoicePrivacy Challenge 2022 results and outcomes
- URL
  https://www.voiceprivacychallenge.org/results-2022/
[Remarks] Tutorial on speaker anonymization (software)
- URL
  https://colab.research.google.com/drive/1_zRL_f9iyDvl_5Y2Rdakg0hYAl_5Rgyq
