Language-independent speaker anonymization with multiple privacy-related attributes
Project/Area Number |
22K21319
|
Research Category |
Grant-in-Aid for Research Activity Start-up
|
Allocation Type | Multi-year Fund |
Review Section |
1002:Human informatics, applied informatics and related fields
|
Research Institution | National Institute of Informatics |
Principal Investigator |
Miao Xiaoxiao National Institute of Informatics, Digital Content and Media Sciences Research Division, Project Researcher (10962508)
|
Project Period (FY) |
2022-08-31 – 2024-03-31
|
Project Status |
Discontinued (Fiscal Year 2022)
|
Budget Amount |
¥2,860,000 (Direct Cost: ¥2,200,000, Indirect Cost: ¥660,000)
Fiscal Year 2023: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2022: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
|
Keywords | speaker anonymization / language independent / gender neutral / Speech processing / Speech privacy / Voice transformation / Anonymization / Deep learning |
Outline of Research at the Start |
Exposing speech data without protective measures causes privacy issues. The goal of the project is to develop a user-centric approach that hides multiple privacy-related speech attributes, including speaker identity, age, gender, and dialect, while leaving non-private attributes unchanged.
|
Outline of Annual Research Achievements |
The first year's work on speaker anonymization comprises three parts. Part 1) Following our plan, we proposed modifying the speaker embedding and pitch to conceal the speaker's gender. This work has been accepted by ICASSP 2023. Our code and audio samples are available (see https://github.com/nii-yamagishilab/speaker_sex_attribute_privacy). Part 2) I contributed to the VoicePrivacy 2022 challenge and built a self-supervised learning (SSL)-based speaker anonymization system (SAS). It uses an SSL-based content encoder to extract general context representations regardless of the language of the input speech. The model has been released free of charge (see https://www.voiceprivacychallenge.org). Part 3) We updated the SSL-based SAS described above to make it easier for users to operate. We added several new features, including the ability to adjust pitch and to flexibly select which speakers to anonymize. The updated system was then applied to Japanese speech data. A broadcasting company in Japan has expressed interest in using this model for a real TV program.
|
Current Status of Research Progress |
1: Research has progressed more than it was originally planned.
Reason
1) As planned for the first year, we modified the speaker embedding and pitch to conceal the speaker's gender while preserving speech content. We evaluated the approach on gender recognition and downstream tasks and conducted listening tests. The related work has been accepted by ICASSP 2023. 2) We contributed to the VoicePrivacy 2022 challenge by developing a self-supervised learning (SSL)-based speaker anonymization system (SAS). 3) After several new features were added, the above SAS was shown to be effective for Japanese speaker anonymization. It may be used by a broadcasting company in the future.
|
Strategy for Future Research Activity |
The original research plans were: 1) 2nd year: anonymization of dialect, age, or other speaker-related information; 2) 3rd year: subjective and objective evaluations of our proposed approach to confirm that different privacy-related attributes can be successfully protected. In the first year, we examined protecting speaker privacy with gender as one of the attributes. For the second year, we will extend our latest proposed system, the language-independent speaker anonymization approach, to modify other privacy-related speaker attributes such as dialect and age. The advantage of this new framework is that it does not require language-dependent components such as ASR and can easily be applied to speaker anonymization in unseen languages. Subsequently, we will carry out the third-year plan, conducting both subjective and objective evaluations for an in-depth analysis.
|
Report
(1 result)
Research Products
(6 results)