Speech security on human-computer interaction

Research Project

Project/Area Number	22K21304
Research Category	Grant-in-Aid for Research Activity Start-up
Allocation Type	Multi-year Fund
Review Section	1002:Human informatics, applied informatics and related fields
Research Institution	Japan Advanced Institute of Science and Technology
Principal Investigator	MAWALIM Candy Olivia, Japan Advanced Institute of Science and Technology, Graduate School of Advanced Science and Technology, Assistant Professor (10963720)
Project Period (FY)	2022-08-31 – 2024-03-31
Project Status	Completed (Fiscal Year 2023)
Budget Amount	¥2,860,000 (Direct Cost: ¥2,200,000, Indirect Cost: ¥660,000); Fiscal Year 2023: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000); Fiscal Year 2022: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Keywords	voice privacy / phase vocoder / speaker anonymization / speaker verification / spoof attacks / speech intelligibility / auditory model / intelligibility / time scale modification / gender / HCI / speech security
Outline of Research at the Start	This study aims to develop a privacy-aware computing system for assisting speech communication. Unlike most existing systems, which focus only on performance accuracy, this study addresses the protection of voice privacy during system development through a novel speaker anonymization method.
Outline of Final Research Achievements	In FY2022, we developed speaker anonymization methods based on time-scale modification. Among them, the phase vocoder method was the most effective at preserving voice characteristics and offered a better balance between privacy and speech intelligibility. We also analyzed the impact of anonymization on gender perception using a machine learning model. These findings were presented at three international conferences. In FY2023, the research focused on two areas: (1) addressing the unclear goals of speaker anonymization and the variety of speech attacks, for which we developed new methods for tackling spoofing in speaker verification systems and presented the findings at two conferences; and (2) investigating human speech perception to understand how intelligibility is perceived. This second line of work, published in the journal Applied Acoustics, lays the groundwork for detecting changes caused by speech synthesis. Finally, the project is expanding its scope to include the development of a Thai-language spoof database.
Academic Significance and Societal Importance of the Research Achievements	Innovative techniques for speaker anonymization and spoofing detection open up new possibilities for voice privacy and security research. This research will contribute greatly to securing voice communication, strengthening authentication systems, and improving human-computer interaction.

Report

(3 results)

2023 Annual Research Report
Final Research Report (PDF)
2022 Research-status Report

Research Products
(17 results)

Journal Article (3 results) (of which Int'l Joint Research: 1 result, Peer Reviewed: 3 results, Open Access: 3 results) Presentation (14 results) (of which Int'l Joint Research: 12 results, Invited: 2 results)

[Journal Article] A Ranking Model for Evaluation of Conversation Partners Based on Rapport Levels (2023)
- Author(s)
  Hayashi Takato, Mawalim Candy Olivia, Ishii Ryo, Morikawa Akira, Fukayama Atsushi, Nakamura Takao, Okada Shogo
- Journal Title
  IEEE Access
  Volume: 11 Pages: 73024-73035
- DOI
  10.1109/access.2023.3287984
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Non-intrusive speech intelligibility prediction using an auditory periphery model with hearing loss (2023)
- Author(s)
  Mawalim Candy Olivia, Titalim Benita Angela, Okada Shogo, Unoki Masashi
- Journal Title
  Applied Acoustics
  Volume: 214 Pages: 109663-109663
- DOI
  10.1016/j.apacoust.2023.109663
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Personality trait estimation in group discussions using multimodal analysis and speaker embedding (2023)
- Author(s)
  Mawalim Candy Olivia, Okada Shogo, Nakano Yukiko I., Unoki Masashi
- Journal Title
  Journal on Multimodal User Interfaces
  Volume: - Issue: 2 Pages: 47-63
- DOI
  10.1007/s12193-023-00401-0
- Related Report
  2022 Research-status Report
- Peer Reviewed / Open Access / Int'l Joint Research
[Presentation] Analysis of Spectro-Temporal Modulation Representation for Deep-Fake Speech Detection (2023)
- Author(s)
  Haowei Cheng, Candy Olivia Mawalim, Kai Li, Lijun Wang, and Masashi Unoki
- Organizer
  2023 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Auditory Model Optimization with Wavegram-CNN and Acoustic Parameter Models for Nonintrusive Speech Intelligibility Prediction in Hearing Aids (2023)
- Author(s)
  Candy Olivia Mawalim, Benita Angela Titalim, Shogo Okada, and Masashi Unoki
- Organizer
  The 31st European Signal Processing Conference (EUSIPCO 2023)
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Speech signal processing for privacy and security protection (2023)
- Author(s)
  Candy Olivia Mawalim
- Organizer
  APSIPA Workshop on Signal and Information Processing in Indonesia
- Related Report
  2023 Annual Research Report
- Invited
[Presentation] Incorporating the Digit Triplet Test in A Lightweight Speech Intelligibility Prediction for Hearing Aids (2023)
- Author(s)
  Xiajie Zhou, Candy Olivia Mawalim, and Masashi Unoki
- Organizer
  2023 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] AI for Speech Processing: “Threats and Opportunities of Voice-Interactive Applications” (2023)
- Author(s)
  Candy Olivia Mawalim
- Organizer
  AI Talks ITB #10 (Online Webinar)
- Related Report
  2023 Annual Research Report
- Invited
[Presentation] Inter-person Intra-modality Attention Based Model for Dyadic Interaction Engagement Prediction (2023)
- Author(s)
  Xiguang Li, Shogo Okada, and Candy Olivia Mawalim
- Organizer
  The 25th HCI International Conference
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Investigating the Effect of Linguistic Features on Personality and Job Performance Predictions (2023)
- Author(s)
  Hung Le, Sixia Li, Candy Olivia Mawalim, Hung-Hsuan Huang, Chee Wee Leong, and Shogo Okada
- Organizer
  The 25th HCI International Conference
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] ThaiSpoof: A Database for Spoof Detection in Thai Language (2023)
- Author(s)
  Kasorn Galajit, Thunpisit Kosolsriwiwat, Candy Olivia Mawalim, Pakinee Aimmanee, Waree Kongprawechnon, Win Pa Pa, Anuwat Chaiwongyen, Teeradaj Racharak, Hayati Yassin, Jessada Karnjana, Surasak Boonkla, and Masashi Unoki
- Organizer
  The 18th International Joint Symposium on Artificial Intelligence and Natural Language Processing and The International Conference on Artificial Intelligence and Internet of Things (iSAI-NLP 2023)
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Voice Contribution on LFCC feature and ResNet-34 for Spoof Detection (2023)
- Author(s)
  Khaing Zar Mon, Kasorn Galajit, Candy Olivia Mawalim, Jessada Karnjana, Tsuyoshi Isshiki, and Pakinee Aimmanee
- Organizer
  The 18th International Joint Symposium on Artificial Intelligence and Natural Language Processing and The International Conference on Artificial Intelligence and Internet of Things (iSAI-NLP 2023)
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Multimodal Analysis for Communication Skill and Self-Efficacy Level Estimation in Job Interview Scenario (2022)
- Author(s)
  Ohba Tomoya, Mawalim Candy Olivia, Katada Shun, Kuroki Haruki, Okada Shogo
- Organizer
  The 21st International Conference on Mobile and Ubiquitous Multimedia (MUM 2022)
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] F0 Modification via PV-TSM Algorithm for Speaker Anonymization Across Gender (2022)
- Author(s)
  Mawalim Candy Olivia, Okada Shogo, Unoki Masashi
- Organizer
  2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] Speaker Anonymization by Pitch Shifting Based on Time-Scale Modification (2022)
- Author(s)
  Mawalim Candy Olivia, Okada Shogo, Unoki Masashi
- Organizer
  The 2nd SPSC joined with 2nd VoicePrivacy Challenge Workshop, as a satellite to Interspeech 2022
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] Speech Intelligibility Prediction for Hearing Aids Using an Auditory Model and Acoustic Parameters (2022)
- Author(s)
  Titalim Benita Angela, Mawalim Candy Olivia, Okada Shogo, Unoki Masashi
- Organizer
  2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] OBISHI: Objective Binaural Intelligibility Score for the Hearing Impaired (2022)
- Author(s)
  Mawalim Candy Olivia, Titalim Benita Angela, Okada Shogo, Unoki Masashi
- Organizer
  The 18th Australasian International Conference on Speech Science and Technology
- Related Report
  2022 Research-status Report
- Int'l Joint Research
