2019 Fiscal Year Annual Research Report

Research on innovative microphone array fundamental technologies for the recognition and understanding of sound environments

Research Project

Project/Area Number	19H04131
Research Institution	University of Tsukuba
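Principal Investigator	牧野昭二 (Shoji Makino), University of Tsukuba, Faculty of Engineering, Information and Systems, Professor (60396190)
Co-Investigator(Kenkyū-buntansha)	猿渡洋 (Hiroshi Saruwatari), The University of Tokyo, Graduate School of Information Science and Technology, Professor (30324974); 山田武志 (Takeshi Yamada), University of Tsukuba, Faculty of Engineering, Information and Systems, Associate Professor (20312829)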
Project Period (FY)	2019-04-01 – 2022-03-31
Keywords	Blind source separation / Acoustic event detection / Acoustic scene analysis
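Outline of Annual Research Achievements	[Research Item 1] We investigated unified array signal processing that yields high-quality output regardless of the number of sound sources. The approach interpolates the observed signals based on a physical model of sound propagation, creating virtual observations that do not physically exist and thereby virtually increasing the number of array elements. The amplitude of the virtual observations was estimated by nonlinear interpolation. We carried out a basic verification of the virtual observations through an underdetermined extension of speech enhancement that uses them. We also worked to clarify the operating principle of the virtual microphone and to improve its performance. This period yielded 4 international conference presentations and 1 domestic conference presentation. [Research Item 2] We developed multichannel signal processing algorithms that exploit information from the sound environment. We generalized existing algorithms to support distributed microphone arrays and introduced more powerful optimization criteria. We developed a synchronization method for the subarrays of a distributed microphone array. We developed blind source separation/extraction algorithms and multichannel dereverberation algorithms so that they support distributed microphone arrays. We also examined microphone selection methods that optimize performance while minimizing the number of microphones required and thus reducing the computational cost. This period yielded 1 journal article, 7 international conference presentations, and 4 domestic conference presentations. [Research Item 3] We analyzed and interpreted the sound environment based on features extracted from the enhanced source signals. We used prior knowledge about the source signals and also applied classification in the feature domain. To improve classification accuracy, we used state-of-the-art speech recognition techniques such as deep learning. This period yielded 2 international conference presentations and 6 domestic conference presentations.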
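The virtual-observation idea in Research Item 1 can be illustrated with a minimal sketch. The report does not state the interpolation formula, so the code below assumes one common choice: the amplitude is interpolated in the log domain (a geometric mean, which is nonlinear in the amplitudes) and the phase is interpolated linearly along the wrapped inter-channel phase difference. The function names `virtual_bin` and `virtual_microphone` and the parameter `alpha` are hypothetical and are not taken from the project.

```python
import cmath
import math


def virtual_bin(x1, x2, alpha, eps=1e-12):
    """Interpolate one time-frequency bin of a virtual microphone.

    x1, x2 : complex STFT coefficients observed at two real microphones.
    alpha  : assumed position parameter; 0 reproduces x1, 1 reproduces x2.
    eps    : small constant that avoids log(0) in silent bins.
    """
    # Assumed nonlinear amplitude rule: |x1|^(1 - alpha) * |x2|^alpha.
    log_amp = ((1.0 - alpha) * math.log(abs(x1) + eps)
               + alpha * math.log(abs(x2) + eps))
    # Wrapped phase difference between the two channels, in (-pi, pi].
    phase_diff = cmath.phase(x2 * x1.conjugate())
    # Linear phase interpolation starting from the phase of x1.
    phase = cmath.phase(x1) + alpha * phase_diff
    return cmath.rect(math.exp(log_amp), phase)


def virtual_microphone(ch1, ch2, alpha):
    """Apply virtual_bin to two equally long sequences of STFT bins."""
    return [virtual_bin(a, b, alpha) for a, b in zip(ch1, ch2)]


if __name__ == "__main__":
    ch1 = [cmath.rect(2.0, 0.1), 1 + 1j]
    ch2 = [cmath.rect(8.0, 0.5), 2 - 1j]
    for v in virtual_microphone(ch1, ch2, 0.5):
        print(abs(v), cmath.phase(v))
```

Under these assumptions, a virtual channel generated for some `alpha` could be appended to the real channels before beamforming, which is how a two-microphone array would be treated as if it had more elements. Whether the project used this exact rule is not stated in the report.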
Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned. Reason: The research progressed smoothly and yielded 1 journal article, 13 international conference presentations, and 11 domestic conference presentations. Budget execution has been delayed by the impact of COVID-19.
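Strategy for Future Research Activity	[Research Item 1] We will investigate unified array signal processing that yields high-quality output regardless of the number of sound sources. The approach interpolates the observed signals based on a physical model of sound propagation, creating virtual observations that do not physically exist and thereby virtually increasing the number of array elements. The amplitude of the virtual observations will be estimated by nonlinear interpolation. We will carry out a basic verification of the virtual observations through an underdetermined extension of speech enhancement that uses them. We will also work to clarify the operating principle of the virtual microphone and to improve its performance. [Research Item 2] We will develop multichannel signal processing algorithms that exploit information from the sound environment. We will generalize existing algorithms to support distributed microphone arrays and introduce more powerful optimization criteria. We will develop a synchronization method for the subarrays of a distributed microphone array. We will develop blind source separation/extraction algorithms and multichannel dereverberation algorithms so that they support distributed microphone arrays. We will also examine microphone selection methods that optimize performance while minimizing the number of microphones required and thus reducing the computational cost. [Research Item 3] We will analyze and interpret the sound environment based on features extracted from the enhanced source signals. We will use prior knowledge about the source signals and also apply classification in the feature domain. To improve classification accuracy, we will use state-of-the-art speech recognition techniques such as deep learning.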

Research Products
(25 results)

Journal Article (1 result; of which Peer Reviewed: 1, Open Access: 1) Presentation (24 results; of which Int'l Joint Research: 13, Invited: 5)

[Journal Article] Supervised determined source separation with multichannel variational autoencoder (2019)
- Author(s)
  H. Kameoka, L. Li, S. Inoue, and S. Makino
- Journal Title
  Neural Computation, vol. 31, no. 9, pp. 1891-1914
- DOI
  10.1162/neco_a_01217
- Peer Reviewed / Open Access
[Presentation] Blind source separation with low latency for in-car communication (2020)
- Author(s)
  T. Ueda, S. Inoue, S. Makino, M. Matsumoto, and T. Yamada
- Organizer
  RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP2020)
- Int'l Joint Research
[Presentation] Underdetermined multichannel speech enhancement using time-frequency-bin-wise switching beamformer and gated CNN-based time-frequency mask for reverberant environments (2020)
- Author(s)
  R. Takahashi, K. Yamaoka, L. Li, S. Makino, T. Yamada, and M. Matsumoto
- Organizer
  RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP2020)
- Int'l Joint Research
[Presentation] Spatial feature extraction based on convolutional neural network with multiple microphone inputs for monitoring of domestic activities (2020)
- Author(s)
  Y. Kaneko, R. Kurosawa, T. Yamada, and S. Makino
- Organizer
  RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP2020)
- Int'l Joint Research
[Presentation] A study of a low-latency source separation method for in-car communication (2020)
- Author(s)
  上田哲也, 井上翔太, 牧野昭二, 松本光雄, 山田武志
- Organizer
  Acoustical Society of Japan, 2020 Spring Meeting (proceedings)
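[Presentation] Simultaneous source separation, dereverberation, and denoising with a convolutional beamformer based on DNN mask estimation (2020)
- Author(s)
  髙橋理希, 中谷智広, 落合翼, 木下慶介, 池下林太郎, Marc Delcroix, 荒木章子, 牧野昭二
- Organizer
  Acoustical Society of Japan, 2020 Spring Meeting (proceedings)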
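[Presentation] Multichannel hearing aid system based on basis-shared semi-supervised independent low-rank matrix analysis (2020)
- Author(s)
  宇根昌和, 久保優騎, 高宗典玄, 北村大地, 猿渡洋, 牧野昭二
- Organizer
  Acoustical Society of Japan, 2020 Spring Meeting (proceedings)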
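[Presentation] A study of speech recognition error segment estimation focusing on the temporal variation of utterances (2020)
- Author(s)
  舒禹清, 山田武志, 牧野昭二
- Organizer
  Acoustical Society of Japan, 2020 Spring Meeting (proceedings)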
[Presentation] A study of acoustic event detection combining spatial and acoustic features (2020)
- Author(s)
  陳軼夫, 山田武志, 牧野昭二
- Organizer
  Acoustical Society of Japan, 2020 Spring Meeting (proceedings)
[Presentation] A study of acoustic scene classification by automatic estimation of spatial filters (2020)
- Author(s)
  大野泰己, 山田武志, 牧野昭二
- Organizer
  IEICE General Conference 2020
[Presentation] Application of semi-supervised learning using generative adversarial networks to acoustic event detection (2020)
- Author(s)
  合馬一弥, 山田武志, 牧野昭二
- Organizer
  IEICE General Conference 2020
[Presentation] Time-frequency-bin-wise switching of minimum variance distortionless response beamformer for underdetermined situations (2019)
- Author(s)
  K. Yamaoka, N. Ono, S. Makino, and T. Yamada
- Organizer
  International Conference on Acoustics, Speech, and Signal Processing (ICASSP2019)
- Int'l Joint Research / Invited
[Presentation] Fast MVAE: Joint separation and classification of mixed sources based on multichannel variational autoencoder with auxiliary classifier (2019)
- Author(s)
  L. Li, H. Kameoka, and S. Makino
- Organizer
  International Conference on Acoustics, Speech, and Signal Processing (ICASSP2019)
- Int'l Joint Research
[Presentation] Joint separation and dereverberation of reverberant mixtures with multichannel variational autoencoder (2019)
- Author(s)
  S. Inoue, H. Kameoka, L. Li, S. Seki, and S. Makino
- Organizer
  International Conference on Acoustics, Speech, and Signal Processing (ICASSP2019)
- Int'l Joint Research
[Presentation] CNN-based virtual microphone signal estimation for MPDR beamforming in underdetermined situations (2019)
- Author(s)
  K. Yamaoka, L. Li, N. Ono, S. Makino, and T. Yamada
- Organizer
  European Signal Processing Conference (EUSIPCO 2019)
- Int'l Joint Research / Invited
[Presentation] Wavelength proportional arrangement of virtual microphones based on interpolation/extrapolation for underdetermined speech enhancement (2019)
- Author(s)
  R. Jinzai, K. Yamaoka, M. Matsumoto, S. Makino, and T. Yamada
- Organizer
  European Signal Processing Conference (EUSIPCO 2019)
- Int'l Joint Research / Invited
[Presentation] Gated convolutional neural network-based voice activity detection under high-level noise environments (2019)
- Author(s)
  L. Li, K. Yamaoka, Y. Koshino, M. Matsumoto, and S. Makino
- Organizer
  International Congress on Acoustics (ICA2019)
- Int'l Joint Research
[Presentation] Joint separation, dereverberation and classification of multiple sources using multichannel variational autoencoder with auxiliary classifier (2019)
- Author(s)
  S. Inoue, H. Kameoka, L. Li, and S. Makino
- Organizer
  International Congress on Acoustics (ICA2019)
- Int'l Joint Research / Invited
[Presentation] Improving singing aid system for laryngectomees with statistical voice conversion and VAE-SPACE (2019)
- Author(s)
  L. Li, T. Toda, K. Morikawa, K. Kobayashi, and S. Makino
- Organizer
  Annual Conference of the International Society for Music Information Retrieval (ISMIR2019)
- Int'l Joint Research
[Presentation] Evaluation of multichannel hearing aid system by rank-constrained spatial covariance matrix estimation (2019)
- Author(s)
  M. Une, Y. Kubo, N. Takamune, D. Kitamura, H. Saruwatari, and S. Makino
- Organizer
  Asia-Pacific Signal and Information Processing Association (APSIPA 2019)
- Int'l Joint Research / Invited
[Presentation] Classification of causes of speech recognition errors using attention-based bidirectional long short-term memory and modulation spectrum (2019)
- Author(s)
  J. Santoso, T. Yamada, and S. Makino
- Organizer
  Asia-Pacific Signal and Information Processing Association (APSIPA 2019)
- Int'l Joint Research
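[Presentation] Evaluation of a multichannel hearing aid system using rank-constrained spatial covariance model estimation (2019)
- Author(s)
  宇根昌和, 久保優騎, 高宗典玄, 北村大地, 猿渡洋, 牧野昭二
- Organizer
  Acoustical Society of Japan, 2019 Autumn Meeting (proceedings)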
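[Presentation] A study of utterance characteristic classification using BLSTM and modulation spectrum (2019)
- Author(s)
  サントソジェニファー, 山田武志, 牧野昭二
- Organizer
  Acoustical Society of Japan, 2019 Autumn Meeting (proceedings)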
[Presentation] A study of speech recognition error segment estimation using BLSTM (2019)
- Author(s)
  舒禹清, 山田武志, 牧野昭二
- Organizer
  Acoustical Society of Japan, 2019 Autumn Meeting (proceedings)
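[Presentation] Source separation of arbitrary speakers by the multichannel variational autoencoder method (2019)
- Author(s)
  李莉, 亀岡弘和, 井上翔太, 牧野昭二
- Organizer
  IEICE Technical Committee on Engineering Acoustics, 2019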
