Study on Audio Information Hiding Based on Human Auditory Perception with Phase Modulation

Research Project

Project/Area Number	20J20580
Research Category	Grant-in-Aid for JSPS Fellows
Allocation Type	Single-year Grants
Section	Domestic
Review Section	Basic Section 61010:Perceptual information processing-related
Research Institution	Japan Advanced Institute of Science and Technology
Principal Investigator	MAWALIM CANDY OLIVIA Japan Advanced Institute of Science and Technology, Graduate School of Advanced Science and Technology, Research Fellow (DC1)
Project Period (FY)	2020-04-24 – 2023-03-31
Project Status	Granted (Fiscal Year 2021)
Budget Amount	¥2,500,000 (Direct Cost: ¥2,500,000) Fiscal Year 2021: ¥800,000 (Direct Cost: ¥800,000) Fiscal Year 2020: ¥900,000 (Direct Cost: ¥900,000)
Keywords	information hiding / voice privacy / speaker anonymization / watermarking / authentication / speech coding
Outline of Research at the Start	To achieve the research goal, the following major steps will be carried out. The first step is to design an embedding system for audio information hiding that takes psychoacoustics and phase modulation into account. The second step is to determine how to detect the embedded information; at this step, the transmission medium for audio information hiding, such as VoIP or mobile communication, will also be considered. Lastly, a thorough evaluation will be conducted to confirm that the proposed system satisfies all the requirements and can be applied in speech communication (e.g., for tampering or spoofing detection).
Outline of Annual Research Achievements	The major milestone in FY2021 was the development of a framework to improve the security of speaker anonymization. Speaker anonymization aims to address the voice privacy issue by suppressing the original speaker's personally identifiable information (PII). Authorized parties should be able to authenticate the output anonymized speech. However, since the mapping between speaker and pseudo-speaker is not necessarily one-to-one, recognizing genuine anonymized speech is difficult. To deal with this issue, the proposed framework integrates an information hiding approach to simultaneously secure PII and verify the content via an embedded watermark. The related publications consist of one international conference paper and two journal articles.
Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned. Reason: The study is progressing well, as planned. At this stage, the proposed framework has been developed by integrating an information hiding approach that both protects the content and secures speaker individuality information. The framework consists of an encoder and a decoder. The encoder protects the speaker's identity with an anonymization approach while embedding a parameter that represents a watermark. The decoder supports authentication of the speech by accurately detecting the embedded watermark. An extensive evaluation has been conducted to validate the proposed framework's performance against existing methods. The FY2021 results of this study were reported in the APSIPA 2021 proceedings, the MDPI journal Entropy (2021), and the journal Computer Speech & Language (2022).
Strategy for Future Research Activity	In future work, the remaining issues will be addressed, especially those related to subjective and objective evaluations of intelligibility and naturalness. Existing objective evaluations give general information about a speaker anonymization method, but they are still inadequate to show the significance of each method. In addition, x-vector-based information hiding and other prospective speech features will be investigated. By controlling the less significant eigenstructure of the x-vector, we expect better protection for speech signals. Finally, the workflow for real applications will be investigated as a countermeasure against speech tampering and spoofing.

Report

(2 results)

2021 Annual Research Report
2020 Annual Research Report

Research Products
(8 results)

All Journal Article (4 results) (of which Int'l Joint Research: 4 results, Peer Reviewed: 4 results, Open Access: 3 results) Presentation (4 results) (of which Int'l Joint Research: 3 results)

[Journal Article] Speaker anonymization by modifying fundamental frequency and x-vector singular value (2022)
- Author(s)
  Mawalim Candy Olivia, Galajit Kasorn, Karnjana Jessada, Kidani Shunsuke, Unoki Masashi
- Journal Title
  Computer Speech & Language
  Volume: 73 Pages: 101326-101326
- DOI
  10.1016/j.csl.2021.101326
- Related Report
  2021 Annual Research Report
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] Speech Watermarking Method Using McAdams Coefficient Based on Random Forest Learning (2021)
- Author(s)
  Mawalim Candy Olivia, Unoki Masashi
- Journal Title
  Entropy
  Volume: 23 Issue: 10 Pages: 1246-1246
- DOI
  10.3390/e23101246
- Related Report
  2021 Annual Research Report
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] X-Vector Singular Value Modification and Statistical-Based Decomposition with Ensemble Regression Modeling for Speaker Anonymization System (2020)
- Author(s)
  Mawalim Candy Olivia, Galajit Kasorn, Karnjana Jessada, Unoki Masashi
- Journal Title
  Proc. Interspeech 2020
  Volume: - Pages: 1703-1707
- DOI
  10.21437/interspeech.2020-1887
- Related Report
  2020 Annual Research Report
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] Speech Information Hiding by Modification of LSF Quantization Index in CELP Codec (2020)
- Author(s)
  Candy Olivia Mawalim, Shengbei Wang, Masashi Unoki
- Journal Title
  Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2020, Auckland, New Zealand, December 7-10, 2020
  Volume: - Pages: 1321-1330
- Related Report
  2020 Annual Research Report
- Peer Reviewed / Int'l Joint Research
[Presentation] Improving Security in McAdams Coefficient-Based Speaker Anonymization by Watermarking Method (2021)
- Author(s)
  Candy Olivia Mawalim, Masashi Unoki
- Organizer
  APSIPA2021
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
[Presentation] X-vector anonymization using regression modeling with statistical and singular value decomposition (2021)
- Author(s)
  Candy Olivia Mawalim, Kasorn Galajit, Jessada Karnjana, Masashi Unoki
- Organizer
  IEICE Technical Committee on Enriched MultiMedia (EMM)
- Related Report
  2020 Annual Research Report
[Presentation] X-Vector Singular Value Modification and Statistical-Based Decomposition with Ensemble Regression Modeling for Speaker Anonymization System (2020)
- Author(s)
  Candy Olivia Mawalim, Kasorn Galajit, Jessada Karnjana, Masashi Unoki
- Organizer
  Interspeech2020
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
[Presentation] Speech Information Hiding by Modification of LSF Quantization Index in CELP Codec (2020)
- Author(s)
  Candy Olivia Mawalim, Shengbei Wang, Masashi Unoki
- Organizer
  APSIPA2020
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
