2021 Fiscal Year Annual Research Report

Research on Innovative Microphone Array Fundamental Technologies for Recognizing and Understanding Sound Environments

Research Project

Project/Area Number	19H04131
Research Institution	Waseda University
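Principal Investigator	Shoji Makino, Waseda University, Faculty of Science and Engineering (Graduate School and Research Center of Information, Production and Systems), Specially Appointed Professor (60396190)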
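Co-Investigator(Kenkyū-buntansha)	Hiroshi Saruwatari, The University of Tokyo, Graduate School of Information Science and Technology, Professor (30324974); Takeshi Yamada, University of Tsukuba, Faculty of Engineering, Information and Systems, Associate Professor (20312829)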
Project Period (FY)	2019-04-01 – 2022-03-31
Keywords	Blind source separation / Sound event detection / Acoustic scene analysis
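Outline of Annual Research Achievements	[Item 1] We studied unified array signal processing that yields high-quality output regardless of the number of sound sources. The approach interpolates the observed signals based on a physical model of sound propagation to create virtual observations, that is, signals that do not physically exist, and thereby virtually increases the number of array elements. The amplitude of the virtual observations was estimated by nonlinear interpolation. We carried out a basic verification of the virtual observations through an underdetermined extension of speech enhancement that uses them. We also worked to clarify the operating principle of the virtual microphone and to improve its performance. This fiscal year's results were 2 journal articles, 1 international conference presentation, and 1 domestic conference presentation. [Item 2] We developed multichannel signal processing algorithms that exploit information from the sound environment. We generalized existing algorithms to support distributed microphone arrays and introduced more powerful optimization criteria. We developed a method for synchronizing the subarrays of a distributed microphone array. We developed blind source separation/extraction algorithms and multichannel dereverberation algorithms that support distributed microphone arrays. We also studied a microphone selection method that optimizes performance while minimizing the number of microphones required and so reducing the computational cost. This fiscal year's results were 1 journal article, 8 international conference presentations, and 2 domestic conference presentations. [Item 3] We analyzed and interpreted the sound environment based on features extracted from the enhanced source signals. We exploited prior knowledge about the source signals and also used classification methods in the feature domain. To improve classification accuracy, we applied state-of-the-art speech recognition techniques such as deep learning. This fiscal year's results were 1 journal article, 2 international conference presentations, and 3 domestic conference presentations.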
Research Progress Status	Not entered because FY2021 (Reiwa 3) is the final year of the project.
Strategy for Future Research Activity	Not entered because FY2021 (Reiwa 3) is the final year of the project.

Research Products
(21 results)

All Journal Article (4 results) (of which Peer Reviewed: 4 results, Open Access: 4 results) Presentation (17 results) (of which Int'l Joint Research: 10 results, Invited: 2 results)

[Journal Article] Time-frequency-bin-wise linear combination of beamformers for distortionless signal enhancement (2021)
- Author(s)
  K. Yamaoka, N. Ono, and S. Makino
- Journal Title
  IEEE/ACM Trans. Audio, Speech and Language Processing
  Volume: 29 Pages: 3461-3475
- DOI
  10.1109/TASLP.2021.3126950
- Peer Reviewed / Open Access
[Journal Article] Single-channel multispeaker separation with variational autoencoder spectrogram model (2021)
- Author(s)
  N. Murashima, H. Kameoka, L. Li, S. Seki, and S. Makino
- Journal Title
  Journal of Signal Processing
  Volume: 25 Pages: 145-149
- DOI
  10.2299/jsp.25.145
- Peer Reviewed / Open Access
[Journal Article] VMInNet: Interpolation of virtual microphones in optimal latent space explored by autoencoder (2021)
- Author(s)
  R. Takahashi, L. Li, S. Makino, and T. Yamada
- Journal Title
  Journal of Signal Processing
  Volume: 25 Pages: 245-250
- DOI
  10.2299/jsp.25.245
- Peer Reviewed / Open Access
[Journal Article] Monitoring of domestic activities using multiple beamformers and attention mechanism (2021)
- Author(s)
  Y. Kaneko, T. Yamada, and S. Makino
- Journal Title
  Journal of Signal Processing
  Volume: 25 Pages: 239-243
- DOI
  10.2299/jsp.25.239
- Peer Reviewed / Open Access
[Presentation] Blind Source Separation of Moving Sound Sources in Reverberant Indoor Environments (2022)
- Author(s)
  T. Yu, T. Ueda, and S. Makino
- Organizer
  RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP)
- Int'l Joint Research
[Presentation] Semi-Supervised Learning Using Weakly Labeled Data Generated by GAN in Sound Event Detection (2022)
- Author(s)
  K. Ouma, T. Yamada, and S. Makino
- Organizer
  RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP)
- Int'l Joint Research
[Presentation] Neutral/Emotional Speech Classification Using Autoencoder and Output of Intermediate Layer in Emotion Recognizer (2022)
- Author(s)
  J. Santoso, T. Yamada, K. Ishizuka, T. Hashimoto, and S. Makino
- Organizer
  Proceedings of the 2022 Spring Meeting of the Acoustical Society of Japan
[Presentation] A study of acoustic scene classification by end-to-end training of Wave-U-Net and a classifier (2022)
- Author(s)
  山田友紀, 山田武志, 牧野昭二
- Organizer
  Proceedings of the 2022 Spring Meeting of the Acoustical Society of Japan
[Presentation] Reducing algorithmic delay using low-overlap window for online Wave-U-Net (2021)
- Author(s)
  S. Nakaoka, L. Li, S. Makino, and T. Yamada
- Organizer
  Proc. APSIPA (invited)
- Int'l Joint Research / Invited
[Presentation] Extension of virtual microphone technique to multiple real microphones and investigation of the impact of phase and amplitude interpolation on speech enhancement (2021)
- Author(s)
  H. Segawa, L. Li, S. Makino, and T. Yamada
- Organizer
  Proc. APSIPA
- Int'l Joint Research
[Presentation] Speech enhancement by noise self-supervised rank-constrained spatial covariance matrix estimation via independent deeply learned matrix analysis (2021)
- Author(s)
  S. Misawa, N. Takamune, T. Nakamura, D. Kitamura, H. Saruwatari, M. Une, and S. Makino
- Organizer
  Proc. APSIPA
- Int'l Joint Research
[Presentation] Speech emotion recognition based on attention weight correction using word-level confidence measure (2021)
- Author(s)
  J. Santoso, T. Yamada, S. Makino, K. Ishizuka, and T. Hiramura
- Organizer
  Proc. INTERSPEECH
- Int'l Joint Research
[Presentation] Low latency online source separation and noise reduction based on joint optimization with dereverberation (2021)
- Author(s)
  T. Ueda, T. Nakatani, R. Ikeshita, K. Kinoshita, S. Araki, and S. Makino
- Organizer
  Proc. EUSIPCO (invited)
- Int'l Joint Research
[Presentation] SepNet: A deep separation matrix prediction network for multichannel audio source separation (2021)
- Author(s)
  S. Inoue, H. Kameoka, L. Li, and S. Makino
- Organizer
  Proc. ICASSP 2021
- Int'l Joint Research
[Presentation] Low latency online blind source separation based on joint optimization with blind dereverberation (2021)
- Author(s)
  T. Ueda, T. Nakatani, R. Ikeshita, K. Kinoshita, S. Araki, and S. Makino
- Organizer
  Proc. ICASSP 2021
- Int'l Joint Research / Invited
[Presentation] Teacher-student learning for low-latency online speech enhancement using Wave-U-Net (2021)
- Author(s)
  S. Nakaoka, L. Li, S. Inoue, and S. Makino
- Organizer
  Proc. ICASSP 2021
- Int'l Joint Research
[Presentation] FastMVAE2: On improving and accelerating the fast variational autoencoder-based source separation algorithm for determined mixtures (2021)
- Author(s)
  L. Li, H. Kameoka, and S. Makino
- Organizer
  arXiv:2109.13496
[Presentation] Fast multichannel variational autoencoder method using ChimeraACVAE (2021)
- Author(s)
  L. Li, H. Kameoka, and S. Makino
- Organizer
  Proceedings of the 2021 Autumn Meeting of the Acoustical Society of Japan
[Presentation] Reducing the algorithmic delay of online Wave-U-Net using a low-overlap window (2021)
- Author(s)
  S. Nakaoka, L. Li, S. Makino, and T. Yamada
- Organizer
  Proceedings of the 2021 Autumn Meeting of the Acoustical Society of Japan
[Presentation] Evaluation of the impact of phase and amplitude interpolation on speech enhancement performance in virtual microphone interpolation (2021)
- Author(s)
  H. Segawa, L. Li, S. Makino, and T. Yamada
- Organizer
  Proceedings of the 2021 Autumn Meeting of the Acoustical Society of Japan
[Presentation] Semi-supervised learning by generating weakly labeled data with a GAN in sound event detection (2021)
- Author(s)
  K. Ouma, T. Yamada, and S. Makino
- Organizer
  Proceedings of the 2021 Autumn Meeting of the Acoustical Society of Japan
