2018 Fiscal Year Final Research Report
Extended theories of audio source separation based on statistical independence and various mathematical structures
Project/Area Number |
17H06572
|
Research Category |
Grant-in-Aid for Research Activity Start-up
|
Allocation Type | Single-year Grants |
Research Field |
Perceptual information processing
|
Research Institution | Kagawa National College of Technology (2018) / The University of Tokyo (2017) |
Principal Investigator |
Kitamura Daichi Kagawa National College of Technology, Department of Electrical and Computer Engineering, Assistant Professor (40804745)
|
Project Period (FY) |
2017-08-25 – 2019-03-31
|
Keywords | Acoustic signal processing / Statistical signal processing / Audio source separation / Deep learning |
Outline of Final Research Achievements |
This research project aimed to improve the performance of conventional audio source separation techniques by extending their theory in both mathematical and practical respects. Audio source separation is a technique for extracting specific audio sources from an observed mixture signal. It can be applied to many devices and systems, including hearing aids, smart speakers, and speech recognition. In this project, the probabilistic model assumed in independent low-rank matrix analysis (ILRMA), a state-of-the-art audio source separation method, was generalized, and the validity of the generalization was confirmed in practical experiments. Various types of mathematical models were also introduced into ILRMA to enhance its separation quality. Furthermore, a data-driven approach was newly incorporated into ILRMA, and the resulting method was named independent deeply learned matrix analysis. The efficacy of the proposed methods was confirmed.
|
Free Research Field |
Acoustic signal processing
|
Academic Significance and Societal Importance of the Research Achievements |
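If the accuracy of audio source separation improves, the technology can be applied directly to assistive devices such as hearing aids, and it is also expected to contribute to the promotion of arts and culture, for example through new ways of enjoying music and through use in VR technology. In addition, speech recognition and smart speakers have become familiar technologies in recent years, and applying audio source separation is essential for these devices to operate robustly even in noisy environments. Audio source separation is thus the most fundamental signal processing required as the front end of all kinds of audio equipment. Moreover, from the viewpoint of "estimating latent factors from a mixture signal," it is expected to be applicable not only to acoustic signals but to all kinds of media, such as images and radio waves.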
|