Language-independent speaker anonymization with multiple privacy-related attributes
Project/Area Number |
22K21319
|
Research Category |
Grant-in-Aid for Research Activity Start-up
|
Allocation Type | Multi-year Fund |
Review Section |
1002:Human informatics, applied informatics and related fields
|
Research Institution | National Institute of Informatics |
Principal Investigator |
Miao Xiaoxiao National Institute of Informatics, Digital Content and Media Sciences Research Division, Project Researcher (10962508)
|
Project Period (FY) |
2022-08-31 – 2024-03-31
|
Project Status |
Discontinued (Fiscal Year 2023)
|
Budget Amount |
¥2,860,000 (Direct Cost: ¥2,200,000, Indirect Cost: ¥660,000)
Fiscal Year 2023: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2022: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
|
Keywords | OHNN / VoicePAT / SynVox2 / speaker anonymization / language independent / gender neutral / Speech processing / Speech privacy / Voice transformation / Anonymization / Deep learning |
Outline of Research at the Start |
Exposing speech data without any protective measures causes privacy issues. The goal of the project is to develop a user-centric approach that hides multiple privacy-related speech attributes, including speaker identity, age, gender, and dialect information, while leaving non-private attributes unchanged.
|
Outline of Annual Research Achievements |
Part 1) Improved anonymizer: We proposed an orthogonal Householder neural network (OHNN)-based anonymizer that rotates original speaker vectors into anonymized ones, maintaining the diversity of the speaker vectors while strengthening privacy protection. The related work has been accepted by IEEE/ACM Transactions on Audio, Speech, and Language Processing.
Part 2) User-friendly voice anonymization framework: Although interest in speaker anonymization is growing, comparing and combining different anonymization approaches remains challenging because evaluation is complex and user-friendly research frameworks are lacking. We therefore proposed VoicePAT, an efficient speaker anonymization and evaluation framework with a modular and easily extendable structure. Our code is fully open source. Related work has been submitted to OJSP.
Part 3) ASV speech dataset anonymization: Legal and ethical concerns have led to the withdrawal of VoxCeleb2, a widely used dataset for speaker recognition. We therefore employed our proposed OHNN-based speaker anonymization technique to create a privacy-friendly version of VoxCeleb2 called SynVox2. In addition, we defined several metrics for evaluating the use of SynVox2 in terms of privacy, utility, and fairness. These metrics may serve as a protocol for future research, enabling researchers to assess whether a synthetic dataset is suitable for their ASV research. Furthermore, we discussed the challenges of using synthetic data for the downstream task of speaker verification. Related work has been submitted to ICASSP 2024.
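As a rough illustration of the idea behind Part 1, the sketch below applies a chain of Householder reflections to toy speaker vectors. It is not the project's OHNN implementation: the reflection vectors here are random rather than learned, the training objectives are omitted, and all names (`householder_reflect`, `anonymize`, `reflectors`) are hypothetical. It shows only the generic property that a product of Householder reflections is an orthogonal map, so vector lengths and the angles between speakers are preserved, which is one plausible reading of "maintaining diversity".

```python
import math
import random


def householder_reflect(x, v):
    """Reflect x across the hyperplane orthogonal to v: x - 2 v (v.x)/(v.v)."""
    vv = sum(vi * vi for vi in v)
    vx = sum(vi * xi for vi, xi in zip(v, x))
    scale = 2.0 * vx / vv
    return [xi - scale * vi for xi, vi in zip(x, v)]


def anonymize(x, reflectors):
    """Apply each reflection in turn; the composition is an orthogonal map."""
    for v in reflectors:
        x = householder_reflect(x, v)
    return x


def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))


def cosine(a, b):
    return sum(ai * bi for ai, bi in zip(a, b)) / (norm(a) * norm(b))


# Assumption: random reflection vectors stand in for learned parameters.
rng = random.Random(0)
dim = 8
# An even number of reflections composes to a rotation (determinant +1).
reflectors = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(4)]

# Toy "speaker vectors"; real systems would use learned speaker embeddings.
speaker_a = [rng.gauss(0, 1) for _ in range(dim)]
speaker_b = [rng.gauss(0, 1) for _ in range(dim)]

anon_a = anonymize(speaker_a, reflectors)
anon_b = anonymize(speaker_b, reflectors)

print("norm before/after:", norm(speaker_a), norm(anon_a))
print("cosine(a, b) before/after:", cosine(speaker_a, speaker_b), cosine(anon_a, anon_b))
print("cosine(original a, anonymized a):", cosine(speaker_a, anon_a))
```

The first two printed pairs should agree up to floating-point error, while the last value shows how far the anonymized vector has moved from the original.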
|
Report
(2 results)
Research Products
(9 results)