Research on Speech Signal Processing in Reverberant and Noisy Environments Considering Human Auditory Characteristics

Research Project

Project/Area Number	18J20059
Research Category	Grant-in-Aid for JSPS Fellows
Allocation Type	Single-year Grants
Section	Domestic
Research Field	Perceptual information processing
Research Institution	University of Tsukuba
Principal Investigator	Li Li (李莉), University of Tsukuba, Graduate School of Systems and Information Engineering, JSPS Research Fellow (DC1)
Project Period (FY)	2018-04-25 – 2021-03-31
Project Status	Completed (Fiscal Year 2020)
Budget Amount	¥2,800,000 (Direct Cost: ¥2,800,000); Fiscal Year 2020: ¥900,000 (Direct Cost: ¥900,000); Fiscal Year 2019: ¥900,000 (Direct Cost: ¥900,000); Fiscal Year 2018: ¥1,000,000 (Direct Cost: ¥1,000,000)
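Keywords	multichannel source separation / speech enhancement / multichannel variational autoencoder / independent vector analysis / deep learning / acoustic signal processing / monaural speech enhancement / non-negative matrix factorization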
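Outline of Annual Research Achievements	The ultimate goal of this research is to build a source separation system that delivers high quality both for human listening and for machine recognition. To this end, we constructed and extended mathematical models spanning signal processing, machine learning, and auditory perception. In the final fiscal year, we mainly worked on the following topics.
  1. We improved the fast algorithm (the FastMVAE method) for the multichannel variational autoencoder method, a multichannel source separation method we had proposed by the previous fiscal year. The improvement mitigates the performance degradation on unseen data observed with the conventional FastMVAE method and yields a more accurate and faster algorithm. The results were published in IEEE Access. This work was highly regarded by the IEEE Signal Processing Society Japan Chapter and received its Student Conference Paper Award.
  2. Using additional experimental data, we verified the performance and behavior of discriminative non-negative matrix factorization (DNMF), a speech enhancement method based on non-negative matrix factorization that we developed in the first fiscal year. A paper summarizing the results was published in IEEE Access.
  3. In the previous fiscal year, we proposed GCIVA, which adds geometric regularization based on spatial information about the microphones and speakers to independent vector analysis with the auxiliary function approach (AuxIVA), a multichannel blind source separation method. In this fiscal year, with practical applications in mind, we developed an online algorithm for the proposed method and verified in simulation experiments that it achieves high-performance speech enhancement in real time. A paper summarizing the results was presented at INTERSPEECH2020, a top conference. We also verified the effectiveness of the proposed method in a real environment using data recorded in a car cabin.
  4. Aiming at practical applications, we implemented the online algorithms of AuxIVA and GCIVA on a Jetson Nano small computer and confirmed that they run.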
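Research Progress Status	Not entered, because fiscal year 2020 (Reiwa 2) is the final year of the project.
Strategy for Future Research Activity	Not entered, because fiscal year 2020 (Reiwa 2) is the final year of the project.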

Report

(3 results)

Research Products
(41 results)

Journal Article (4 results) (of which Peer Reviewed: 4 results, Open Access: 4 results) / Presentation (36 results) (of which Int'l Joint Research: 22 results, Invited: 5 results) / Remarks (1 result)

[Journal Article] FastMVAE: A Fast Optimization Algorithm for the Multichannel Variational Autoencoder Method (2020)
- Author(s)
  Li Li, Hirokazu Kameoka, Shota Inoue, Shoji Makino
- Journal Title
  IEEE Access
  Volume: 8 Pages: 228740-228753
- DOI
  10.1109/access.2020.3045704
- Related Report
  2020 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Majorization-Minimization Algorithm for Discriminative Non-Negative Matrix Factorization (2020)
- Author(s)
  Li Li, Hirokazu Kameoka, Shoji Makino
- Journal Title
  IEEE Access
  Volume: 8 Pages: 227399-227408
- DOI
  10.1109/access.2020.3045791
- Related Report
  2020 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Supervised determined source separation with multichannel variational autoencoder (2019)
- Author(s)
  Hirokazu Kameoka, Li Li, Shota Inoue, Shoji Makino
- Journal Title
  Neural Computation
  Volume: 31 Issue: 9 Pages: 1891-1914
- DOI
  10.1162/neco_a_01217
- Related Report
  2019 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Underdetermined source separation based on generalized multichannel variational autoencoder (2019)
- Author(s)
  Shogo Seki, Hirokazu Kameoka, Li Li, Tomoki Toda, Kazuya Takeda
- Journal Title
  IEEE Access
  Volume: 7 Issue: 1 Pages: 168104-168115
- DOI
  10.1109/access.2019.2954120
- Related Report
  2019 Annual Research Report
- Peer Reviewed / Open Access
[Presentation] Single-channel multi-speaker separation via discriminative training of variational autoencoder spectrogram model (2021)
- Author(s)
  Naoya Murashima, Hirokazu Kameoka, Li Li, Shogo Seki, Shoji Makino
- Organizer
  RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP2021), pp. 149-152
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
[Presentation] VMInNet: Interpolation of virtual microphones in optimal latent space explored by autoencoder (2021)
- Author(s)
  Riki Takahashi, Li Li, Shoji Makino, Takeshi Yamada
- Organizer
  RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP2021), pp. 93-96
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
[Presentation] Teacher-student learning for low-latency online speech enhancement using wave-U-net (2021)
- Author(s)
  Sotaro Nakaoka, Li Li, Shota Inoue, Shoji Makino
- Organizer
  2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2021)
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
[Presentation] SepNet: A deep separation matrix prediction network for multichannel audio source separation (2021)
- Author(s)
  Shota Inoue, Hirokazu Kameoka, Li Li, Shoji Makino
- Organizer
  2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2021)
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
[Presentation] A study of noise removal using Wave-U-Net for in-car environments [車室内環境を想定したWave-U-Netによる雑音除去の検討] (2021)
- Author(s)
  樋口隼太, 李莉, 井上翔太, 牧野昭二, 山田武志
- Organizer
  Proceedings of the IEICE General Conference, A-5-1
- Related Report
  2020 Annual Research Report
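[Presentation] Application of virtual microphone techniques to a triangular microphone array in a car cabin [車室内の三角マイクロフォンアレイへのヴァーチャルマイクロフォン技術の適用] (2021)
- Author(s)
  瀬川華子, 髙橋理希, 李莉, 陣在遼河, 牧野昭二, 山田武志
- Organizer
  Proceedings of the 2021 Spring Meeting of the Acoustical Society of Japan, 2-1-14, pp. 253-256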
- Related Report
  2020 Annual Research Report
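[Presentation] Application of geometrically constrained independent vector analysis based on the auxiliary function approach to in-car speech enhancement [補助関数法に基づく幾何学的制約付き独立ベクトル分析の車室内音声強調への適用] (2021)
- Author(s)
  後藤加奈, 李莉, 高橋理希, 牧野昭二, 山田武志
- Organizer
  Proceedings of the 2021 Spring Meeting of the Acoustical Society of Japan, 2-1-13, pp. 249-252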
- Related Report
  2020 Annual Research Report
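[Presentation] Low-latency real-time speech enhancement with Wave-U-net using teacher-student learning [Teacher-Student学習を用いたWave-U-netによる低遅延リアルタイム音声強調] (2021)
- Author(s)
  中岡想太郎, 井上翔太, 李莉, 牧野昭二
- Organizer
  Proceedings of the 2021 Spring Meeting of the Acoustical Society of Japan, 2-1-6, pp. 225-228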
- Related Report
  2020 Annual Research Report
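[Presentation] SepNet: A separation matrix prediction network for fast multichannel source separation [SepNet: 高速多チャンネル音源分離のための分離行列予測ネットワーク] (2021)
- Author(s)
  井上翔太, 亀岡弘和, 李莉, 牧野昭二
- Organizer
  Proceedings of the 2021 Spring Meeting of the Acoustical Society of Japan, 2-1-5, pp. 221-224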
- Related Report
  2020 Annual Research Report
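[Presentation] Speaker-specific monaural speech separation via discriminative training of a variational autoencoder [識別的変分自己符号化器学習による特定話者モノラル音声分離] (2021)
- Author(s)
  村島允也, 牧野昭二, 亀岡弘和, 李莉, 関翔悟
- Organizer
  Proceedings of the 2021 Spring Meeting of the Acoustical Society of Japan, 2-1-1, pp. 205-208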
- Related Report
  2020 Annual Research Report
[Presentation] Geometrically constrained independent vector analysis for directional speech enhancement (2020)
- Author(s)
  Li Li, Kazuhito Koishida
- Organizer
  2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2020), pp. 846-850
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
[Presentation] Determined audio source separation with multichannel star generative adversarial network (2020)
- Author(s)
  Li Li, Hirokazu Kameoka, Shoji Makino
- Organizer
  The 30th IEEE International Workshop on Machine Learning for Signal Processing (MLSP2020)
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
[Presentation] Online directional speech enhancement using geometrically constrained independent vector analysis (2020)
- Author(s)
  Li Li, Kazuhito Koishida, Shoji Makino
- Organizer
  The 21st Annual Conference of the International Speech Communication Association (Interspeech2020), pp. 61-65
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
[Presentation] Study on geometrically constrained IVA with auxiliary function approach and VCD for in-car communication (2020)
- Author(s)
  Kana Goto, Li Li, Riki Takahashi, Shoji Makino, Takeshi Yamada
- Organizer
  The 12th annual conference of Asia-Pacific Signal and Information Processing Association (APSIPA2020), pp. 858-862
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
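[Presentation] Generation of singing F0 patterns based on a variational autoencoder using a generalized command-response model [一般化指令応答モデルを用いた変分自己符号化器に基づく歌唱F0パターンの生成] (2020)
- Author(s)
  多賀遥香，関翔悟，李莉，武田一哉，戸田智基
- Organizer
  Proceedings of the 2020 Autumn Meeting of the Acoustical Society of Japan, 1-2-16, pp. 731-732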
- Related Report
  2020 Annual Research Report
[Presentation] Underdetermined multichannel speech enhancement using time-frequency-bin-wise switching beamformer and gated CNN-based time-frequency mask for reverberant environments (2020)
- Author(s)
  Riki Takahashi, Kouei Yamaoka, Li Li, Shoji Makino, Takeshi Yamada, Mitsuo Matsumoto
- Organizer
  RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP2020)
- Related Report
  2019 Annual Research Report
- Int'l Joint Research
[Presentation] Geometrically constrained independent vector analysis for directional speech enhancement (2020)
- Author(s)
  Li Li, Kazuhito Koishida
- Organizer
  2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2020)
- Related Report
  2019 Annual Research Report
- Int'l Joint Research
[Presentation] Fast MVAE: Joint separation and classification of mixed sources based on multichannel variational autoencoder with auxiliary classifier (2019)
- Author(s)
  Li Li, Hirokazu Kameoka, Shoji Makino
- Organizer
  2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2019), pp. 546-550
- Related Report
  2019 Annual Research Report
- Int'l Joint Research
[Presentation] Joint separation and dereverberation of reverberant mixtures with multichannel variational autoencoder (2019)
- Author(s)
  Shota Inoue, Hirokazu Kameoka, Li Li, Shogo Seki, Shoji Makino
- Organizer
  2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2019), pp. 56-60
- Related Report
  2019 Annual Research Report
- Int'l Joint Research
[Presentation] Voice activity detection under high levels of noise using gated convolutional neural networks (2019)
- Author(s)
  Li Li, Kouei Yamaoka, Yuki Koshino, Mitsuo Matsumoto, Shoji Makino
- Organizer
  International Congress on Acoustics (ICA2019), pp. 6988-6995
- Related Report
  2019 Annual Research Report
- Int'l Joint Research / Invited
[Presentation] Generalized multichannel variational autoencoder for underdetermined source separation (2019)
- Author(s)
  Shogo Seki, Hirokazu Kameoka, Li Li, Tomoki Toda, Kazuya Takeda
- Organizer
  The 2019 European Signal Processing Conference (EUSIPCO2019), pp. 1973-1977
- Related Report
  2019 Annual Research Report
- Int'l Joint Research
[Presentation] Joint separation, dereverberation and classification of mixed sources using multichannel variational autoencoder with auxiliary classifier (2019)
- Author(s)
  Shota Inoue, Li Li, Hirokazu Kameoka, Shoji Makino
- Organizer
  International Congress on Acoustics (ICA2019), pp. 6988-6995
- Related Report
  2019 Annual Research Report
- Int'l Joint Research / Invited
[Presentation] CNN-based virtual microphone signal estimation for MPDR Beamforming in underdetermined situations (2019)
- Author(s)
  Kouei Yamaoka, Li Li, Nobutaka Ono, Shoji Makino, Takeshi Yamada
- Organizer
  The 2019 European Signal Processing Conference (EUSIPCO2019), pp. 1049-1053
- Related Report
  2019 Annual Research Report
- Int'l Joint Research / Invited
[Presentation] Improving singing aid system for laryngectomees with statistical voice conversion and VAE-SPACE (2019)
- Author(s)
  Li Li, Tomoki Toda, Kazuho Morikawa, Kazuhiro Kobayashi, Shoji Makino
- Organizer
  20th International Society for Music Information Retrieval Conference (ISMIR2019), pp. 784-790
- Related Report
  2019 Annual Research Report
- Int'l Joint Research
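[Presentation] Source separation of arbitrary speakers with the multichannel variational autoencoder method [多チャンネル変分自己符号化器法による任意話者の音源分離] (2019)
- Author(s)
  李莉，亀岡弘和，井上翔太，牧野昭二
- Organizer
  IEICE Technical Report, vol. 119, no. 334, EA2019-77, pp. 79-84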
- Related Report
  2019 Annual Research Report
[Presentation] Fast MVAE: Joint separation and classification of mixed sources based on multichannel variational autoencoder with auxiliary classifier (2019)
- Author(s)
  Li Li, Hirokazu Kameoka, Shoji Makino
- Organizer
  2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2019), pp. 546-550
- Related Report
  2018 Annual Research Report
- Int'l Joint Research
[Presentation] Joint separation and dereverberation of reverberant mixtures with multichannel variational autoencoder (2019)
- Author(s)
  Shota Inoue, Hirokazu Kameoka, Li Li, Shogo Seki, Shoji Makino
- Organizer
  2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2019), pp. 96-100
- Related Report
  2018 Annual Research Report
- Int'l Joint Research
[Presentation] Voice activity detection under high levels of noise using gated convolutional neural networks (2019)
- Author(s)
  Li Li, Kouei Yamaoka, Yuki Koshino, Mitsuo Matsumoto, Shoji Makino
- Organizer
  International Congress on Acoustics (ICA2019)
- Related Report
  2018 Annual Research Report
- Int'l Joint Research / Invited
[Presentation] Joint separation, dereverberation and classification of mixed sources using multichannel variational autoencoder with auxiliary classifier (2019)
- Author(s)
  Shota Inoue, Li Li, Hirokazu Kameoka, Shoji Makino
- Organizer
  International Congress on Acoustics (ICA2019)
- Related Report
  2018 Annual Research Report
- Int'l Joint Research / Invited
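[Presentation] Fast semi-blind source separation using a multichannel variational autoencoder with a source class classifier [音源クラス識別器つき多チャンネル変分自己符号化器を用いた高速セミブラインド音源分離] (2019)
- Author(s)
  李莉，亀岡弘和，牧野昭二
- Organizer
  2019 Spring Meeting of the Acoustical Society of Japan, 1-6-10, pp. 201-204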
- Related Report
  2018 Annual Research Report
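[Presentation] Underdetermined source separation using a multichannel variational autoencoder [多チャンネル変分自己符号化器を用いた劣決定音源分離] (2019)
- Author(s)
  関翔悟，亀岡弘和，李莉，戸田智基，武田一哉
- Organizer
  2019 Spring Meeting of the Acoustical Society of Japan, 1-6-20, pp. 229-230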
- Related Report
  2018 Annual Research Report
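[Presentation] A unified approach to source separation and dereverberation using a multichannel variational autoencoder [多チャンネル変分自己符号化器を用いた音源分離と残響除去の統合的アプローチ] (2019)
- Author(s)
  井上翔太，亀岡弘和，李莉，関翔悟，牧野昭二
- Organizer
  2019 Spring Meeting of the Acoustical Society of Japan, 2-Q-32, pp. 399-402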
- Related Report
  2018 Annual Research Report
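[Presentation] Underdetermined speech enhancement combining a time-frequency switching beamformer with a gated-CNN-based time-frequency mask [時間周波数スイッチングビームフォーマとGated CNNを用いた時間周波数マスクの組み合わせによる劣決定音声強調] (2019)
- Author(s)
  髙橋理希，山岡洸瑛，李莉，牧野昭二，山田武志
- Organizer
  2019 Spring Meeting of the Acoustical Society of Japan, 1-6-5, pp. 181-184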
- Related Report
  2018 Annual Research Report
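[Presentation] Voice activity detection in adverse noise environments using gated CNNs [Gated CNNを用いた劣悪な雑音環境下における音声区間検出] (2019)
- Author(s)
  李莉，越野ゆき，松本光雄，牧野昭二
- Organizer
  IEICE Technical Committee on Engineering Acoustics (EA), EA2018-102, pp. 19-24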
- Related Report
  2018 Annual Research Report
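[Presentation] Evaluation of underdetermined source separation using a multichannel variational autoencoder [多チャンネル変分自己符号化器を用いた劣決定音源分離の評価] (2019)
- Author(s)
  関翔悟，亀岡弘和，李莉，戸田智基，武田一哉
- Organizer
  IEICE Technical Committee on Engineering Acoustics (EA), EA2018-154, pp. 323-328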
- Related Report
  2018 Annual Research Report
[Presentation] Deep clustering with gated convolutional networks (2018)
- Author(s)
  Li Li, Hirokazu Kameoka
- Organizer
  2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2018), pp. 16-20
- Related Report
  2018 Annual Research Report
- Int'l Joint Research
[Remarks] Open-source code for the MVAE and FastMVAE methods
- URL
  https://github.com/lili-0805/MVAE
- Related Report
  2020 Annual Research Report
