A detection method using relative phase information for spoofed speech based on speech synthesis, speaker adaptation and edited speech

Research Project

Project/Area Number	16K12461
Research Category	Grant-in-Aid for Challenging Exploratory Research
Allocation Type	Multi-year Fund
Research Field	Perceptual information processing
Research Institution	Chubu University (2018); Toyohashi University of Technology (2016-2017)
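Principal Investigator	NAKAGAWA Seiichi, Chubu University, College of Engineering, Professor (20115893)
Co-Investigator(Kenkyū-buntansha)	WANG Longbiao, Nagaoka University of Technology, Graduate School of Engineering, Associate Professor (30510458); IWAHASHI Masahiro, Nagaoka University of Technology, Graduate School of Engineering, Professor (30251854)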
Project Period (FY)	2016-04-01 – 2019-03-31
Project Status	Completed (Fiscal Year 2018)
Budget Amount	¥3,510,000 (Direct Cost: ¥2,700,000, Indirect Cost: ¥810,000); Fiscal Year 2018: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000); Fiscal Year 2017: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000); Fiscal Year 2016: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
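Keywords	speaker verification / relative phase information / amplitude spectrum / phase spectrum / spoofed speech / replayed speech / spoofed speech challenge / recorded and replayed speech / impersonation speech / speaker recognition / phase information normalization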
Outline of Final Research Achievements	Spoofed speech is a serious problem for speaker verification. It falls into four classes: (1) mimicked speech (impersonation), (2) speech synthesized from the target speaker's speech, (3) speech converted to the target speaker's voice, and (4) recorded and replayed speech of the target speaker. In this study, we improved relative phase information, a feature previously invented by the applicant, for spoofed speech detection. The improvements are the extension of the frequency range used for relative phase extraction to higher frequencies and the optimization of the nonlinear scale of the frequency axis. With the improved relative phase, we obtained what was, as a single feature, the best-performing feature parameter in the world. Combining this relative phase feature with conventional feature parameters raised the detection rate further.
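Academic Significance and Societal Importance of the Research Achievements	Speaker verification is one form of biometric authentication. In this study, we developed a technique that, once each speaker has enrolled about 40 seconds of speech, identifies the correct speaker among 270 speakers with 99.7% accuracy from an utterance of about 4 seconds. This technique enables many applications, such as a voice-based "key". On the other hand, spoofed speech produced by mimicry, by speech synthesis or voice conversion that uses part of the genuine speaker's voice, or by recording and replay can become indistinguishable from the genuine speaker's speech, which hinders practical deployment. In this study, we developed a technique that detects such spoofed speech with high accuracy. This technique also makes it possible to apply speaker verification in the security field.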
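
The relative phase feature named in the outline above can be illustrated with a minimal sketch. This is not the project's implementation: the function name `relative_phase`, the 1000 Hz base frequency, the Hamming window, and the choice to normalize the base-frequency phase to zero are all assumptions made for illustration. The idea sketched here is that the raw phase of a frame depends on where the frame was cut from the waveform, so the phase at one base frequency is fixed and every other bin is shifted in proportion to its frequency; the result is then mapped to cosine and sine values to avoid wrap-around at ±π.

```python
import cmath
import math


def relative_phase(frame, sample_rate, base_freq_hz=1000.0):
    """Return (freqs_hz, cos_phase, sin_phase) for one frame of samples.

    Assumption: the phase at the bin nearest base_freq_hz is shifted to zero,
    and every other bin is shifted in proportion to its frequency.
    """
    n = len(frame)
    # Hamming window to limit spectral leakage (an illustrative choice).
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    num_bins = n // 2 + 1
    freqs = [k * sample_rate / n for k in range(num_bins)]
    # Naive O(n^2) DFT keeps the sketch dependency-free.
    phases = []
    for k in range(num_bins):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))
        phases.append(cmath.phase(coeff))
    # Skip bin 0 (0 Hz) so the frequency ratio below is well defined.
    base_bin = min(range(1, num_bins),
                   key=lambda k: abs(freqs[k] - base_freq_hz))
    shifted = [p - (f / freqs[base_bin]) * phases[base_bin]
               for p, f in zip(phases, freqs)]
    return freqs, [math.cos(p) for p in shifted], [math.sin(p) for p in shifted]


# Usage: a short synthetic frame sampled at 16 kHz.
frame = [math.sin(0.37 * i) + 0.5 * math.cos(1.3 * i + 0.2) for i in range(64)]
freqs, cos_phase, sin_phase = relative_phase(frame, 16000)
```

In a detection system, the cosine and sine values would typically be passed to a classifier alongside amplitude-based features; the frequency-range extension and nonlinear frequency scale mentioned in the outline are not modeled in this sketch.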

Report

(4 results)

2018 Annual Research Report; Final Research Report (PDF)
2017 Research-status Report
2016 Research-status Report

Research Products
(15 results)

Int'l Joint Research (3 results); Journal Article (3 results, of which Int'l Joint Research: 2 results, Peer Reviewed: 3 results, Open Access: 2 results); Presentation (9 results, of which Int'l Joint Research: 6 results)

[Int'l Joint Research] Tianjin University (China)
- Related Report
  2018 Annual Research Report
[Int'l Joint Research] Tianjin University (天津大学) (China)
- Related Report
  2017 Research-status Report
[Int'l Joint Research] Tianjin University (China)
- Related Report
  2016 Research-status Report
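[Journal Article] Recent trends in spoken language processing research: focusing on the author's research on speech recognition, speech translation, and speaker recognition (2019)
- Author(s)
  NAKAGAWA Seiichi
- Journal Title
  中部大学工学部紀要 (Bulletin of the College of Engineering, Chubu University)
  Volume: 54 Pages: 1-14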
- NAID
  120007116371
- Related Report
  2018 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Spoofing speech detection using modified relative phase information (2017)
- Author(s)
  L. Wang, S. Nakagawa, Z. Zhang, Y. Yoshida, Y. Kawakami
- Journal Title
  IEEE Journal of Selected Topics in Signal Processing
  Volume: 11 Issue: 4 Pages: 660-670
- DOI
  10.1109/jstsp.2017.2694139
- Related Report
  2017 Research-status Report 2016 Research-status Report
- Peer Reviewed / Int'l Joint Research
[Journal Article] Noise robust voice activity detection using joint phase and magnitude based feature enhancement (2017)
- Author(s)
  K. Phapatanaburi, L. Wang, Z. Oo, W. Li, S. Nakagawa, M. Iwahashi
- Journal Title
  Journal of Ambient Intelligence and Humanized Computing
  Volume: 8 Issue: 6 Pages: 845-859
- DOI
  10.1007/s12652-017-0482-8
- Related Report
  2017 Research-status Report
- Peer Reviewed / Open Access / Int'l Joint Research
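[Presentation] Comparison of speaker recognition using MFCC and phase information from speech waveforms and residual waveforms (2019)
- Author(s)
  山本滉己, 山本一公, 中川聖一
- Organizer
  Institute of Electronics, Information and Communication Engineers (IEICE), General Conference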
- Related Report
  2018 Annual Research Report
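[Presentation] A study of the effectiveness of relative phase information from residual waveforms for speaker recognition (2019)
- Author(s)
  中川聖一, 山本滉己, 山本一公
- Organizer
  Institute of Electronics, Information and Communication Engineers (IEICE), Technical Committee on Speech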
- Related Report
  2018 Annual Research Report
[Presentation] Replay attack detection using magnitude and phase information with attention-based adaptive filters (2019)
- Author(s)
  M. Liu, L. Wang, J. Dang, S. Nakagawa, H. Guan, X. Li
- Organizer
  IEEE ICASSP
- Related Report
  2018 Annual Research Report
- Int'l Joint Research
[Presentation] Multiple phase information combination for replay attacks detection (2018)
- Author(s)
  D. Li, L. Wang, J. Dang, M. Liu, Z. Oo, S. Nakagawa, H. Guan, X. Li
- Organizer
  ISCA Interspeech
- Related Report
  2018 Annual Research Report
- Int'l Joint Research
[Presentation] Replay attacks detection using phase and magnitude features with various frequency resolutions (2018)
- Author(s)
  M. Liu, L. Wang, Z. Oo, J. Dang, D. Li, S. Nakagawa
- Organizer
  ISCSLP
- Related Report
  2018 Annual Research Report
- Int'l Joint Research
[Presentation] Automatic speaker verification for replay attacks using Mel-scale phase and magnitude features (2018)
- Author(s)
  Z. Oo, L. Wang, L. Meng, S. Nakagawa, M. Iwahashi
- Organizer
  Acoustical Society of Japan (ASJ), Spring Meeting
- Related Report
  2017 Research-status Report
[Presentation] Phase aware deep neural network for noise robust voice activity detection (2017)
- Author(s)
  L. Wang, K. Phapatanaburi, Z. Oo, S. Nakagawa, M. Iwahashi, J. Dang
- Organizer
  IEEE ICME
- Related Report
  2017 Research-status Report
- Int'l Joint Research
[Presentation] Pseudo-pitch-synchronized phase information extraction and its application for robust speaker recognition (2017)
- Author(s)
  L. Wang, S. Nakagawa, J. Dang, J. Wei, T. Shen
- Organizer
  GCCE
- Related Report
  2017 Research-status Report
- Int'l Joint Research
[Presentation] DNN-based amplitude and phase feature enhancement for noise robust speaker identification (2016)
- Author(s)
  Z. Oo, Y. Kawakami, L. Wang, S. Nakagawa, X. Xiao, M. Iwahashi
- Organizer
  Proc. Interspeech, ISCA
- Place of Presentation
  San Francisco, USA
- Year and Date
  2016-09-11
- Related Report
  2016 Research-status Report
- Int'l Joint Research
