2021 Fiscal Year Final Research Report

Unification of Deep Learning and Generalized Mathematical Model for Independence-Based Audio Source Separation

Project/Area Number	19K20306
Research Category	Grant-in-Aid for Early-Career Scientists
Allocation Type	Multi-year Fund
Review Section	Basic Section 61010:Perceptual information processing-related
Research Institution	Kagawa National College of Technology
Principal Investigator	Kitamura Daichi, Kagawa National College of Technology, Department of Electrical and Computer Engineering, Lecturer (40804745)
Project Period (FY)	2019-04-01 – 2022-03-31
Keywords	audio source separation / statistical signal processing / array signal processing / deep learning
Outline of Final Research Achievements	This research project aims to extend existing audio source separation techniques. Audio source separation estimates specific audio sources from an observed mixture signal, and it is expected to be useful in many applications that handle audio signals. In particular, the project deepens the mathematical foundations of "independent low-rank matrix analysis (ILRMA)", a method proposed by the principal investigator. The framework established in this project generalizes conventional techniques and improves audio source separation performance. It offers a new examination of how "statistical independence between sources" and "the structure of each source (source model)" can be unified, and it extends the source model through both mathematical generalization and data-driven approaches.
Free Research Field	Acoustic signal processing
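Academic Significance and Societal Importance of the Research Achievements	These results made it possible to propose several algorithms that outperform conventional audio source separation methods. Specifically, new theory was established on three points: (1) the theoretical analysis and establishment of a "generalized Gaussian distribution generative model" and a "source-model plug-and-play optimization method", (2) an extension to a "deep-learning-based source-supervised method", and (3) an "interactive audio source separation method that incorporates user intervention". Item (1) contributes to improving pure source separation performance, and item (2) can lead to effective use of the acoustic-signal training data that has become increasingly available in recent years. Furthermore, item (3) realized an audio source separation algorithm in which humans and machines work together.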