Project/Area Number |
22K21319
|
Research Category |
Grant-in-Aid for Research Activity Start-up
|
Allocation Type | Multi-year Fund |
Review Section |
1002: Human informatics, applied informatics and related fields
|
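Research Institution | National Institute of Informatics |
Principal Investigator |
Miao Xiaoxiao National Institute of Informatics, Digital Content and Media Sciences Research Division, Project Researcher (10962508)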
|
Project Period (FY) |
2022-08-31 – 2024-03-31
|
Project Status |
Discontinued (Fiscal Year 2022)
|
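Budget Amount *note |
¥2,860,000 (Direct Cost: ¥2,200,000, Indirect Cost: ¥660,000)
Fiscal Year 2023: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2022: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)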
|
Keywords | speaker anonymization / language independent / gender neutral / Speech processing / Speech privacy / Voice transformation / Anonymization / Deep learning |
Outline of Research at the Start |
Exposing speech data without any protective measures causes privacy issues. The goal of the project is to develop a user-centric approach that hides multiple privacy-related speech attributes, including speaker identity, age, gender, and dialect information, while leaving non-private attributes unchanged.
|
Outline of Annual Research Achievements |
The first year's work on speaker anonymization comprised three parts. Part 1) Following our plan, we proposed modifying the speaker embedding and pitch to conceal the speaker's gender. This work has been accepted at ICASSP 2023, and our code and audio samples are available (see https://github.com/nii-yamagishilab/speaker_sex_attribute_privacy). Part 2) I contributed to the VoicePrivacy 2022 challenge and built a self-supervised learning (SSL)-based speaker anonymization system (SAS). It uses an SSL-based content encoder to extract general context representations regardless of the language of the input speech. The model is freely available (see https://www.voiceprivacychallenge.org). Part 3) We modified the SSL-based SAS described above to make it easier for users to operate, adding several new features, including the ability to adjust pitch and to flexibly select speakers to anonymize. The updated program was then applied to Japanese speech data. A broadcasting company in Japan has expressed interest in using this model for a real TV program.
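As a rough illustration of the two adjustments named in this summary (pitch and speaker embedding), the following Python sketch shows one simple way such operations could look. It is not the released system: the function names `shift_f0` and `pseudo_speaker`, the random embedding pool, the embedding dimension, and the choice of averaging randomly selected pool embeddings are all assumptions made for illustration only.

```python
import numpy as np

def shift_f0(f0_hz, semitones):
    """Scale voiced F0 frames by a number of semitones.

    Frames with F0 == 0 are treated as unvoiced and left at 0 (an assumption
    about how the F0 contour is encoded).
    """
    f0_hz = np.asarray(f0_hz, dtype=float)
    factor = 2.0 ** (semitones / 12.0)
    return np.where(f0_hz > 0, f0_hz * factor, 0.0)

def pseudo_speaker(pool, k, rng):
    """Average k randomly chosen pool embeddings and L2-normalise the result.

    Hypothetical selection strategy, used here only to illustrate replacing
    the original speaker embedding with a different one.
    """
    pool = np.asarray(pool, dtype=float)
    idx = rng.choice(len(pool), size=k, replace=False)
    mean = pool[idx].mean(axis=0)
    return mean / np.linalg.norm(mean)

rng = np.random.default_rng(0)
pool = rng.normal(size=(20, 8))          # stand-in for real speaker embeddings
f0 = np.array([0.0, 110.0, 220.0, 0.0])  # toy F0 contour in Hz

shifted = shift_f0(f0, 12)               # raise pitch by one octave
emb = pseudo_speaker(pool, k=5, rng=rng)
```

In a real system the modified F0 contour and the replacement embedding would be passed, together with the content representation, to a speech synthesis model; that stage is omitted here.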
|
Current Status of Research Progress |
1: Research has progressed more than it was originally planned.
Reason
1) As planned for the first year, we modified the speaker embedding and pitch to conceal the speaker's gender while preserving speech content. We evaluated the approach on gender recognition and downstream tasks and conducted listening tests. The related work has been accepted at ICASSP 2023. 2) We contributed to the VoicePrivacy 2022 challenge by developing a self-supervised learning (SSL)-based speaker anonymization system (SAS). 3) After we added several new features, this SAS proved effective for Japanese speaker anonymization. It may be used by a broadcasting company in the future.
|
Strategy for Future Research Activity |
The original research plans were: 1) 2nd year: anonymization of dialect, age, or other speaker-related information; 2) 3rd year: subjective and objective evaluations of our proposed approach to confirm that different privacy-related attributes can be successfully protected. The first year's work addressed speaker privacy protection with gender as one of the attributes. For the second year, we will extend our latest proposed system, a language-independent speaker anonymization approach, to modify other privacy-related speaker attributes such as dialect and age. The advantage of this framework is that it does not require language-dependent components such as ASR and can easily be applied to speaker anonymization in unseen languages. Subsequently, we will carry out the third-year plan, which involves both subjective and objective evaluations for an in-depth analysis.
|