Project/Area Number |
21K17775
|
Research Category |
Grant-in-Aid for Early-Career Scientists
|
Allocation Type | Multi-year Fund |
Review Section |
Basic Section 61010: Perceptual information processing-related
|
Research Institution | National Institute of Informatics |
Principal Investigator |
Wang Xin, National Institute of Informatics, Digital Content and Media Sciences Research Division, Project Associate Professor (60843141)
|
Project Period (FY) |
2021-04-01 – 2024-03-31
|
Project Status |
Completed (Fiscal Year 2023)
|
Budget Amount |
¥4,550,000 (Direct Cost: ¥3,500,000, Indirect Cost: ¥1,050,000)
Fiscal Year 2023: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000)
Fiscal Year 2022: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000)
Fiscal Year 2021: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
|
Keywords | privacy / speech anonymization / speaker identification / speech information processing / deep learning / speech privacy / speaker anonymization / speech waveform modeling / neural network |
Outline of Research at the Start |
Human speech contains not only verbal content but also private information about the speaker, such as the speaker's identity. This proposal addresses the protection of the speaker's privacy in speech data in two scenarios: 1) Speech anonymization: when the speaker shares speech data in untrusted cyberspace, the data should be protected so that the audience can understand the speech but cannot infer who the speaker is; 2) Speech de-anonymization: when the speaker further shares the speech data with a trusted audience, the original natural speech can be reconstructed from the protected version.
|
Outline of Final Research Achievements |
Protecting the personally identifiable information encoded in speech waveforms is an urgent need for many SNS applications. Although quite a few deep-learning-based methods attempt to protect or anonymize the speaker identity in speech data, their solutions are not satisfactory. The main contributions of this project fall into three areas. First, this project proposed language-independent speaker identity anonymization using self-supervised learning speech models. The proposed system was applied to both English and Mandarin data. Second, this project proposed a new anonymization algorithm based on vector rotation, which alleviates a weakness of the k-anonymity-based anonymization used in existing methods. Third, this project took the initiative to anonymize a real large-scale speech database, VoxCeleb2, and investigated the utility and privacy protection performance of the anonymized data. The research outcomes were published in top journals and conferences in the speech field.
|
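As a rough illustration of why a rotation can anonymize speaker vectors without disturbing their overall distribution, the sketch below rotates toy vectors with a sequence of random plane rotations and checks that the similarity between two speakers is unchanged. It is a minimal sketch under stated assumptions and not the algorithm developed in this project: the helper names (`random_rotation_planes`, `rotate`, `cosine`), the 8-dimensional random vectors, and the choice of plane rotations are all hypothetical.

```python
import math
import random


def random_rotation_planes(dim, n_planes, seed):
    """Draw (i, j, angle) triples, each describing a rotation in one coordinate plane.

    Hypothetical helper: composing such plane rotations yields a rotation of the
    whole space. It stands in for whatever rotation a real system would learn.
    """
    rng = random.Random(seed)
    planes = []
    for _ in range(n_planes):
        i, j = rng.sample(range(dim), 2)
        planes.append((i, j, rng.uniform(0.0, 2.0 * math.pi)))
    return planes


def rotate(vec, planes):
    """Apply each plane rotation in turn and return a new list."""
    out = list(vec)
    for i, j, theta in planes:
        c, s = math.cos(theta), math.sin(theta)
        xi, xj = out[i], out[j]
        out[i] = c * xi - s * xj
        out[j] = s * xi + c * xj
    return out


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "speaker vectors" (assumption: random 8-dimensional data, not real embeddings).
rng = random.Random(0)
speakers = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(3)]

planes = random_rotation_planes(dim=8, n_planes=20, seed=1)
anonymized = [rotate(v, planes) for v in speakers]

# Similarity between two different speakers, before and after the rotation.
sim_before = cosine(speakers[0], speakers[1])
sim_after = cosine(anonymized[0], anonymized[1])

# Similarity between a speaker and that speaker's own rotated vector.
self_sim = cosine(speakers[0], anonymized[0])

print(f"pairwise similarity before: {sim_before:.6f}")
print(f"pairwise similarity after:  {sim_after:.6f}")
print(f"original vs. anonymized:    {self_sim:.6f}")
```

Because every vector is moved by the same rotation, norms and pairwise similarities are preserved, so the set of rotated vectors keeps the shape of the original distribution even though each individual vector points somewhere new.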
Academic Significance and Societal Importance of the Research Achievements |
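In terms of academic achievements, the project focused on the language dependence of existing deep-learning-based speaker anonymization techniques and developed a speaker anonymization technique that can be applied to multiple languages. It also proposed a method that, unlike conventional k-anonymization methods, can anonymize speakers while preserving the overall distribution of speaker vectors. Finally, for the first time in the speech field, it anonymized an entire database and investigated the utility and privacy protection performance. All of these results were published in top journals and conferences in the speech field. In addition, the project contributed to the organization of the international VoicePrivacy Challenge. The proposed technology has also been used in television broadcasting.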
|