Safe and secure speech information processing based on liveness detection and ASVspoof challenge
Project/Area Number |
18KT0051
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Multi-year Fund |
Section | Generative Research Fields |
Research Field |
The Information Society and Trust
|
Research Institution | National Institute of Informatics |
Principal Investigator |
Yamagishi Junichi National Institute of Informatics, Digital Content and Media Sciences Research Division, Professor (70709352)
|
Co-Investigator(Kenkyū-buntansha) |
Ohki Tetsushi Shizuoka University, Faculty of Informatics, Associate Professor (80537407)
|
Project Period (FY) |
2018-07-18 – 2021-03-31
|
Project Status |
Completed (Fiscal Year 2020)
|
Budget Amount |
¥18,460,000 (Direct Cost: ¥14,200,000, Indirect Cost: ¥4,260,000)
Fiscal Year 2020: ¥5,590,000 (Direct Cost: ¥4,300,000, Indirect Cost: ¥1,290,000)
Fiscal Year 2019: ¥5,720,000 (Direct Cost: ¥4,400,000, Indirect Cost: ¥1,320,000)
Fiscal Year 2018: ¥7,150,000 (Direct Cost: ¥5,500,000, Indirect Cost: ¥1,650,000)
|
Keywords | speech information processing / speaker verification / liveness detection / biometric authentication / voice interface / spoofing / ASVspoof challenge |
Outline of Final Research Achievements |
As speech processing has become widespread in society, attacks on speaker verification and speech recognition systems have begun to occur. The purpose of this research was to improve liveness detection technology for speech and thereby present a solution to this problem. Liveness detection is a machine learning technology that distinguishes between "voice obtained by another person without permission, processed, and reproduced by an external device" and "live voice uttered on the spot by a living person". To this end, we built a database containing a large number of audio files synthesized with the latest speech synthesis and voice conversion technologies, held a competition, and advanced liveness detection technology across the field. As a result, it has become possible to detect artificial voices that show no perceptible audible difference, yielding a solution that realizes safe and secure speech information processing.
|
Academic Significance and Societal Importance of the Research Achievements |
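Speech information processing is used in many smart devices and is a fundamental technology that supports society. Liveness detection for speech simultaneously achieves both the convenience of voice interfaces and trust, so its societal significance is high. In fact, the database built and publicly released through this research is used not only by academic organizations around the world but also by many companies. Its academic significance is also high, and many international conference papers use this database. There is currently concern that media generated by AI technology will be misused; such media are sometimes called deepfakes. Although this research targeted speech, its results are considered applicable to video, text, and other media, and further development is expected to be possible.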
|
Report
(4 results)
Research Products
(40 results)
[Journal Article] ASVspoof 2019: a large-scale public database of synthesized, converted and replayed speech (2020)
Author(s)
Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Hector Delgado, Andreas Nautsch, Nicholas Evans, Md Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sebastien Le Maguer, Markus Becker, Fergus Henderson, and others (40 authors in total)
Journal Title
Computer Speech & Language
Volume: 64
Pages: 101114
DOI
Related Report
Peer Reviewed / Open Access / Int'l Joint Research