Project/Area Number |
22K21304
|
Research Category |
Grant-in-Aid for Research Activity Start-up
|
Allocation Type | Multi-year Fund |
Review Section |
1002: Human informatics, applied informatics and related fields
|
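Research Institution | Japan Advanced Institute of Science and Technology |
Principal Investigator |
MAWALIM CandyOlivia, Japan Advanced Institute of Science and Technology, Graduate School of Advanced Science and Technology, Assistant Professor (10963720)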
|
Project Period (FY) |
2022-08-31 – 2024-03-31
|
Project Status |
Granted (Fiscal Year 2022)
|
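Budget Amount *note |
¥2,860,000 (Direct Cost: ¥2,200,000, Indirect Cost: ¥660,000)
Fiscal Year 2023: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2022: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)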
|
Keywords | voice privacy / time scale modification / phase vocoder / speaker anonymization / gender / HCI / speech security |
Outline of Research at the Start |
This study aims to develop a privacy-aware computing system that assists speech communication. Whereas most existing systems focus only on accuracy, this study builds voice privacy protection into system development through a novel speaker anonymization method.
|
Outline of Annual Research Achievements |
In this fiscal year, we proposed speaker anonymization methods based on time scale modification (TSM) algorithms. We found that phase vocoder-based TSM (PV-TSM) is better suited to speaker anonymization because the human voice has harmonic structure. The proposed method balances privacy and utility metrics better than the baseline systems. We also analyzed the effect of anonymization on perceived gender using a gender classifier built on x-vector speaker embeddings. The results were presented at the Voice Privacy Challenge 2022, held jointly with the Interspeech 2022 conference, and at the 14th annual conference of the Asia-Pacific Signal and Information Processing Association in 2022.
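To illustrate the phase vocoder-based TSM mentioned above, the following is a minimal sketch in Python with NumPy. It is not the implementation used in this project: the function name `pv_tsm`, the parameters `alpha`, `n_fft`, and `syn_hop`, and their default values are assumptions chosen for illustration. The sketch estimates each frequency bin's instantaneous frequency from the phase difference between successive frames and advances the synthesis phase accordingly, which is why sinusoidal (harmonic) components keep their original frequencies after stretching.

```python
import numpy as np

def pv_tsm(x, alpha, n_fft=1024, syn_hop=256):
    """Time-stretch x by factor alpha (>1 lengthens) with a basic phase vocoder.

    Illustrative sketch only; names and default values are assumptions.
    """
    x = np.asarray(x, dtype=float)
    if len(x) < n_fft or not 0 < alpha <= syn_hop:
        raise ValueError("signal too short or alpha out of range")
    ana_hop = syn_hop / alpha                  # analysis hop, may be fractional
    window = np.hanning(n_fft)
    # Bin centre frequencies in radians per sample
    omega = 2 * np.pi * np.arange(n_fft // 2 + 1) / n_fft
    n_frames = int((len(x) - n_fft) / ana_hop) + 1
    y = np.zeros((n_frames - 1) * syn_hop + n_fft)
    norm = np.zeros_like(y)
    prev_start, prev_phase, syn_phase = 0, None, None
    for m in range(n_frames):
        start = int(m * ana_hop)
        spec = np.fft.rfft(x[start:start + n_fft] * window)
        mag, phase = np.abs(spec), np.angle(spec)
        if m == 0:
            syn_phase = phase.copy()           # first frame keeps its own phase
        else:
            hop = start - prev_start           # integer hop actually used
            # Deviation from the phase advance expected at each bin centre
            dphi = phase - prev_phase - omega * hop
            dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))  # wrap to [-pi, pi]
            inst_freq = omega + dphi / hop     # instantaneous frequency estimate
            syn_phase = syn_phase + inst_freq * syn_hop
        prev_start, prev_phase = start, phase
        frame = np.fft.irfft(mag * np.exp(1j * syn_phase), n_fft) * window
        pos = m * syn_hop
        y[pos:pos + n_fft] += frame            # overlap-add at the synthesis hop
        norm[pos:pos + n_fft] += window ** 2   # analysis and synthesis windows
    return y / np.maximum(norm, 1e-8)
```

For example, `pv_tsm(x, 1.5)` returns a signal roughly 1.5 times as long as `x` with its pitch unchanged. Speech with strong transients would likely need refinements such as phase locking, which this sketch omits.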
|
Current Status of Research Progress (Category) |
2: Progressing rather smoothly
Reason
The project is progressing as planned. Speech analysis has been carried out to obtain features related to personally identifiable information (PII). We investigated pitch shifting for speaker anonymization using two major categories of TSM algorithms. Our recent finding is that, because the human voice contains harmonic structures, applying PV-TSM, which is better suited to harmonic components, could benefit speaker anonymization. In addition, the phase adaptation may manipulate not only the fundamental frequency but also PII-related acoustic features. Our method outperformed the x-vector-based speaker anonymization method, which is limited by its complex training process, low privacy in the a-a scenario (anonymized enrollment and anonymized trial utterances), and low voice distinctiveness.
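Pitch shifting with a TSM algorithm is commonly done in two steps: stretch the signal in time by the desired pitch ratio, then resample it back to roughly the original duration. The sketch below shows this with a simplified waveform-similarity overlap-add (WSOLA) routine as a stand-in for a time-domain TSM method. The report does not name the specific algorithms or parameters used, so `wsola`, `pitch_shift`, the frame and search sizes, and the linear-interpolation resampler are all illustrative assumptions.

```python
import numpy as np

def wsola(x, alpha, frame=1024, tol=256):
    """Simplified WSOLA time stretch by factor alpha (>1 lengthens).

    Illustrative sketch only; names and default values are assumptions.
    """
    x = np.asarray(x, dtype=float)
    if len(x) < frame or alpha < 0.5:
        raise ValueError("signal too short or alpha below 0.5")
    syn_hop = frame // 2
    ana_hop = syn_hop / alpha
    window = np.hanning(frame)
    # Zero-pad so every slice below stays inside the array;
    # sample i of x sits at index i + tol of xp.
    xp = np.concatenate([np.zeros(tol), x, np.zeros(2 * frame + tol)])
    n_frames = int((len(x) - frame) / ana_hop) + 1
    y = np.zeros((n_frames - 1) * syn_hop + frame)
    norm = np.zeros_like(y)
    delta = 0                                   # offset chosen for this frame
    for m in range(n_frames):
        start = int(m * ana_hop) + tol + delta
        pos = m * syn_hop
        y[pos:pos + frame] += xp[start:start + frame] * window
        norm[pos:pos + frame] += window
        # Natural continuation of the segment just written
        target = xp[start + syn_hop:start + syn_hop + frame]
        # Search +/- tol samples around the next nominal analysis position
        nxt = int((m + 1) * ana_hop)
        region = xp[nxt:nxt + frame + 2 * tol]
        corr = np.correlate(region, target, mode="valid")
        delta = int(np.argmax(corr)) - tol      # best-matching offset
    return y / np.maximum(norm, 1e-8)

def pitch_shift(x, ratio, tsm=wsola):
    """Shift pitch by `ratio` (>1 raises) while roughly keeping duration."""
    stretched = tsm(x, ratio)
    # Read the stretched signal `ratio` times faster. Linear interpolation
    # is used for brevity and applies no anti-aliasing filter.
    read_pos = np.arange(0, len(stretched) - 1, ratio)
    return np.interp(read_pos, np.arange(len(stretched)), stretched)
```

Because `pitch_shift` takes the TSM routine as the `tsm` argument, passing a phase vocoder-based routine with the same `(x, alpha)` signature would give the PV-TSM variant of the same pitch shifter.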
|
Strategy for Future Research Activity |
Several issues remain in the currently proposed methods. For instance, the target of speaker anonymization is not yet clearly defined; as a result, the methods can be applied only under a limited set of attack models. Future work will focus on developing more secure and robust speaker anonymization that accounts for attack models, so that it can be used in a broader range of applications. Important ethical and privacy concerns will also be considered when developing speaker anonymization techniques.
|