Acoustic scene classification based on spatial attention mechanism

Research Project

Project/Area Number	20K11880
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Review Section	Basic Section 61010: Perceptual information processing-related
Research Institution	University of Tsukuba
Principal Investigator	YAMADA Takeshi, University of Tsukuba, Faculty of Engineering, Information and Systems, Professor (20312829)
Project Period (FY)	2020-04-01 – 2024-03-31
Project Status	Completed (Fiscal Year 2023)
Budget Amount	¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000); Fiscal Year 2022: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000); Fiscal Year 2021: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000); Fiscal Year 2020: ¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000)
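Keywords	Acoustic scene classification / Spatial attention mechanism / Beamformer / Spatial filter / Neural network / Loss function / Spatial signal processing / Sound event detection / Microphone array / Spatial information / Attention mechanism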
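Outline of Research at the Start	In acoustic scene classification, feeding the recordings from multiple microphones as input makes it possible to exploit spatial features such as the direction of each sound source, which is expected to further improve classification performance. The purpose of this research is to establish a new acoustic scene classification method that fuses spatial signal processing with a classifier. Specifically, we will develop a new neural network with a function (a spatial attention mechanism) that automatically focuses on the more important sound sources among the multiple sources present in an acoustic scene.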
Outline of Final Research Achievements	In order to improve the performance of acoustic scene classification that uses a beamformer as preprocessing, this study introduced a new idea of a spatial attention mechanism that automatically focuses on the sound of interest (useful for classification) among multiple sounds present in the acoustic scene. To realize this idea, we proposed a classification method based on automatic weighting of multiple spatial filter outputs and, as its extension, a classification method based on automatic estimation of spatial filters, and demonstrated their effectiveness through experiments.
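Academic Significance and Societal Importance of the Research Achievements	The academic originality and creativity of this research lie in realizing the new idea of a spatial attention mechanism. This makes it possible, without requiring prior information such as the direction of the target sound, to automatically find which sound deserves attention and to automatically estimate the spatial filter that enhances it. This was achieved through the organic integration of signal processing and classification technologies, and it is expected to extend beyond acoustic scene classification to various other tasks, such as speech recognition in noise.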
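
As a rough illustration of the "automatic weighting of multiple spatial filter outputs" mentioned in the outline of final achievements, the sketch below fuses feature vectors from several beamformer outputs with softmax attention weights. It is a minimal sketch, not the project's implementation: the function names, the linear scoring vector `scoring_weights`, and the toy feature values are all assumptions made for illustration, and a real system would learn the scoring parameters jointly with the classifier.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(beam_features, scoring_weights):
    """Fuse per-beamformer feature vectors into a single vector.

    beam_features: list of B feature vectors (one per spatial filter
        output), each a list of D floats.
    scoring_weights: hypothetical learned vector of D floats used to
        score each beam (an assumption, not taken from the report).
    Returns (attention_weights, fused_vector).
    """
    # One scalar relevance score per beam (dot product with the scorer).
    scores = [
        sum(w * x for w, x in zip(scoring_weights, feat))
        for feat in beam_features
    ]
    # Normalise the scores so the attention weights sum to 1.
    weights = softmax(scores)
    # Attention-weighted sum of the beam features, dimension by dimension.
    dim = len(beam_features[0])
    fused = [
        sum(weights[b] * beam_features[b][d] for b in range(len(beam_features)))
        for d in range(dim)
    ]
    return weights, fused


if __name__ == "__main__":
    # Three toy beams; the second carries the strongest features.
    beams = [[0.1, 0.2], [2.0, 1.5], [0.3, 0.1]]
    weights, fused = attention_pool(beams, [1.0, 1.0])
    print("attention weights:", weights)
    print("fused features:", fused)
```

In this toy setup the beam with the highest score receives the largest weight, so the fused vector passed to a downstream classifier is dominated by that beam without any prior knowledge of the source direction.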

Report

(5 results)

2023 Annual Research Report, Final Research Report (PDF)
2022 Research-status Report
2021 Research-status Report
2020 Research-status Report

Research Products
(9 results)

Journal Article (1 result) (of which Peer Reviewed: 1 result, Open Access: 1 result), Presentation (8 results) (of which Int'l Joint Research: 3 results)

[Journal Article] Monitoring of Domestic Activities Using Multiple Beamformers and Attention Mechanism (2021)
- Author(s)
  Kaneko Yuki, Yamada Takeshi, Makino Shoji
- Journal Title
  Journal of Signal Processing
  Volume: 25 Issue: 6 Pages: 239-243
- DOI
  10.2299/jsp.25.239
- NAID
  130008110097
- ISSN
  1342-6230, 1880-1013
- Year and Date
  2021-11-01
- Related Report
  2021 Research-status Report
- Peer Reviewed / Open Access
[Presentation] Evaluation of the effectiveness under reverberation of a neural beamformer that automatically detects notable sounds (2024)
- Author(s)
  Sota Ichikawa, Takeshi Yamada
- Organizer
  Acoustical Society of Japan Spring Meeting
- Related Report
  2023 Annual Research Report
[Presentation] Neural beamformer with automatic detection of notable sounds for acoustic scene classification (2022)
- Author(s)
  Sota Ichikawa, Takeshi Yamada, Shoji Makino
- Organizer
  APSIPA ASC (Asia-Pacific Signal and Information Processing Association Annual Summit and Conference) 2022
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] A study of a neural beamformer with automatic detection of notable sounds for acoustic scene classification (2022)
- Author(s)
  Sota Ichikawa, Takeshi Yamada, Shoji Makino
- Organizer
  Ongaku Symposium 2022 (音学シンポジウム2022)
- Related Report
  2022 Research-status Report
[Presentation] Semi-supervised learning using weakly labeled data generated by GAN in sound event detection (2022)
- Author(s)
  Kazuya Ouma, Takeshi Yamada, Shoji Makino
- Organizer
  RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing 2022 (NCSP'22)
- Related Report
  2021 Research-status Report
- Int'l Joint Research
[Presentation] A study of acoustic scene classification by end-to-end training of Wave-U-Net and a classifier (2022)
- Author(s)
  山田友紀, Takeshi Yamada, Shoji Makino
- Organizer
  Acoustical Society of Japan Spring Meeting
- Related Report
  2021 Research-status Report
[Presentation] Semi-supervised learning using weakly labeled data generated by GAN in sound event detection (2021)
- Author(s)
  Kazuya Ouma, Takeshi Yamada, Shoji Makino
- Organizer
  Acoustical Society of Japan Autumn Meeting
- Related Report
  2021 Research-status Report
[Presentation] Monitoring of domestic activities using multiple beamformers and attention mechanism (2021)
- Author(s)
  Yuki Kaneko, Takeshi Yamada, Shoji Makino
- Organizer
  RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing 2021 (NCSP'21)
- Related Report
  2020 Research-status Report
- Int'l Joint Research
[Presentation] Verification of the effect of transfer learning in sound event detection and localization (2021)
- Author(s)
  陳軼夫, Takeshi Yamada, Shoji Makino
- Organizer
  Acoustical Society of Japan 2021 Spring Meeting
- Related Report
  2020 Research-status Report
