Research on acoustic scene analysis by integrating time-domain deep learning and multiresolution analysis

Research Project

Project/Area Number 20K19818
Research Category

Grant-in-Aid for Early-Career Scientists

Allocation Type Multi-year Fund
Review Section Basic Section 61010: Perceptual information processing-related
Research Institution The University of Tokyo

Principal Investigator

Nakamura Tomohiko  The University of Tokyo, Graduate School of Information Science and Technology, Project Assistant Professor (50866308)

Project Period (FY) 2020-04-01 – 2023-03-31
Project Status Completed (Fiscal Year 2022)
Budget Amount
¥4,160,000 (Direct Cost: ¥3,200,000, Indirect Cost: ¥960,000)
Fiscal Year 2022: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2021: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2020: ¥2,340,000 (Direct Cost: ¥1,800,000, Indirect Cost: ¥540,000)
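Keywords acoustic scene analysis / time-domain deep learning / multiresolution analysis / audio source separation / acoustic signal processing / deep learning / machine learning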
Outline of Research at the Start

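In audio source separation, the task of separating individual source signals from a mixture, time-domain deep learning, which takes signal waveforms directly as input and output, has recently shown promising results. However, because the parameters of each component are learned so as to achieve high separation performance, the function of each component is unclear, and research in this area currently proceeds heuristically. In contrast, multiresolution analysis, proposed in the field of acoustic signal processing, is built from components with well-defined functions and is designed so that the system as a whole has the desired signal analysis capability. This project aims to fuse time-domain deep learning with multiresolution analysis and to create a new audio source separation method (multiresolution deep-layered analysis) that combines the advantages of both.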

Outline of Final Research Achievements

In this study, we proposed an audio source separation method, multiresolution deep-layered analysis. It comes from our finding that a waveform-domain audio source separation model, Wave-U-Net, resembles multiresolution analysis in downsampling (DS) architecture. Inspired by the resemblance, we developed a DS layer using the discrete wavelet transform. Music source separation experiments showed that the proposed method achieves higher separation performance than conventional waveform-based methods. We also extended the proposed layer so that its wavelets can be trained together with the other components of a deep neural network. This extension paves the way for obtaining suitable wavelets for target tasks in an end-to-end manner. Finally, we applied the proposed methods to monaural vocal ensemble separation and multi-channel audio source separation tasks and demonstrated the effectiveness of the proposed methods through experiments on these tasks.
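As a rough illustration of a downsampling layer built on the discrete wavelet transform, the NumPy sketch below uses the Haar wavelet. The wavelet choice, the function names `dwt_downsample` and `dwt_upsample`, and the (batch, channels, time) layout are assumptions made for this example, not details taken from the project's code. The sketch halves the time resolution, stacks the low- and high-frequency subbands along the channel axis, and then inverts the operation.

```python
import numpy as np

def dwt_downsample(x):
    """Haar DWT along the last (time) axis; the time length must be even.

    Returns the low- and high-frequency subbands stacked along the channel
    axis, so the time length is halved while the channel count doubles.
    """
    even, odd = x[..., 0::2], x[..., 1::2]
    low = (even + odd) / np.sqrt(2.0)    # scaling (approximation) coefficients
    high = (even - odd) / np.sqrt(2.0)   # wavelet (detail) coefficients
    return np.concatenate([low, high], axis=-2)

def dwt_upsample(y):
    """Inverse Haar DWT: exactly undoes dwt_downsample."""
    low, high = np.split(y, 2, axis=-2)
    even = (low + high) / np.sqrt(2.0)
    odd = (low - high) / np.sqrt(2.0)
    x = np.empty(low.shape[:-1] + (2 * low.shape[-1],), dtype=y.dtype)
    x[..., 0::2] = even                  # re-interleave even and odd samples
    x[..., 1::2] = odd
    return x

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4, 16))      # (batch, channels, time), an assumed layout
y = dwt_downsample(x)                    # shape (2, 8, 8)
x_rec = dwt_upsample(y)                  # shape (2, 4, 16)
```

Because the Haar transform is orthogonal, `x_rec` matches `x` up to floating-point error and the signal energy is preserved. A trainable variant would replace the fixed Haar filters with learned ones, which this sketch does not attempt.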

Academic Significance and Societal Importance of the Research Achievements

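This study created a cross-disciplinary methodology that fuses deep audio source separation models operating directly in the time domain (time-domain deep learning) with multiresolution analysis, which has been developed in signal processing and wavelet analysis. In time-domain deep learning, the parameters of each component are learned so as to achieve high separation performance, so the function of each component was unclear. Multiresolution analysis, on the other hand, uses components with well-defined functions, although it must be designed appropriately for each type of source. By integrating the two, the results of this study are a first step toward combining the high performance of deep learning with the high interpretability of signal processing.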

Report

(4 results)
  • 2022 Annual Research Report   Final Research Report ( PDF )
  • 2021 Research-status Report
  • 2020 Research-status Report
  • Research Products

    (18 results)

Journal Article (3 results) (of which Peer Reviewed: 3 results, Open Access: 3 results) Presentation (11 results) (of which Int'l Joint Research: 4 results) Remarks (4 results)

  • [Journal Article] Sampling-Frequency-Independent Convolutional Layer and its Application to Audio Source Separation (2022)

    • Author(s)
      Koichi Saito, Tomohiko Nakamura, Kohei Yatabe, Hiroshi Saruwatari
    • Journal Title

      IEEE/ACM Transactions on Audio, Speech, and Language Processing

      Volume: 30 Pages: 2928-2943

    • DOI

      10.1109/taslp.2022.3203907

    • Related Report
      2022 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Time-Domain Audio Source Separation With Neural Networks Based on Multiresolution Analysis (2021)

    • Author(s)
      Nakamura Tomohiko, Kozuka Shihori, Saruwatari Hiroshi
    • Journal Title

      IEEE/ACM Transactions on Audio, Speech, and Language Processing

      Volume: 29 Pages: 1687-1701

    • DOI

      10.1109/taslp.2021.3072496

    • Related Report
      2021 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Harmonic-Temporal Factor Decomposition for Unsupervised Monaural Separation of Harmonic Sounds (2021)

    • Author(s)
      Nakamura Tomohiko, Kameoka Hirokazu
    • Journal Title

      IEEE/ACM Transactions on Audio, Speech, and Language Processing

      Volume: 29 Pages: 68-82

    • DOI

      10.1109/taslp.2020.3037487

    • Related Report
      2020 Research-status Report
    • Peer Reviewed / Open Access
  • [Presentation] jaCappella corpus: A Japanese a cappella vocal ensemble corpus (2023)

    • Author(s)
      Tomohiko Nakamura, Shinnosuke Takamichi, Naoko Tanji, Satoru Fukayama, Hiroshi Saruwatari
    • Organizer
      IEEE International Conference on Acoustics, Speech, and Signal Processing
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
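  • [Presentation] jaCappella corpus: A Japanese a cappella singing corpus for vocal ensemble separation and synthesis (2022)

    • Author(s)
      Tomohiko Nakamura, Shinnosuke Takamichi, Naoko Tanji, Satoru Fukayama, Hiroshi Saruwatari
    • Organizer
      Acoustical Society of Japan 148th Meeting (Autumn 2022)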
    • Related Report
      2022 Annual Research Report
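  • [Presentation] Experimental evaluation of music source separation using multiresolution deep-layered analysis (2021)

    • Author(s)
      Tomohiko Nakamura, Hiroshi Saruwatari
    • Organizer
      Ongaku Symposium 2021 (131st Meeting of the Special Interest Group on Music and Computer)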
    • Related Report
      2021 Research-status Report
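  • [Presentation] DNN-based audio source separation using a sampling-frequency-independent convolutional layer based on frequency-domain filter design (2021)

    • Author(s)
      Koichi Saito, Tomohiko Nakamura, Kohei Yatabe, Hiroshi Saruwatari
    • Organizer
      Ongaku Symposium 2021 (131st Meeting of the Special Interest Group on Music and Computer)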
    • Related Report
      2021 Research-status Report
  • [Presentation] Sampling-frequency-independent Audio Source Separation Using Convolution Layer Based on Impulse Invariant Method (2021)

    • Author(s)
      Koichi Saito, Tomohiko Nakamura, Kohei Yatabe, Hiroshi Saruwatari
    • Organizer
      European Signal Processing Conference 2021
    • Related Report
      2021 Research-status Report
    • Int'l Joint Research
  • [Presentation] Independent Deeply Learned Tensor Analysis for Determined Audio Source Separation (2021)

    • Author(s)
      Naoki Narisawa, Rintaro Ikeshita, Norihiro Takamune, Daichi Kitamura, Tomohiko Nakamura, Hiroshi Saruwatari, Tomohiro Nakatani
    • Organizer
      European Signal Processing Conference 2021
    • Related Report
      2021 Research-status Report
    • Int'l Joint Research
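  • [Presentation] Independent deeply learned tensor analysis based on a heavy-tailed generative model (2021)

    • Author(s)
      Naoki Narisawa, Rintaro Ikeshita, Norihiro Takamune, Daichi Kitamura, Tomohiko Nakamura, Hiroshi Saruwatari, Tomohiro Nakatani
    • Organizer
      Acoustical Society of Japan 2021 Autumn Meeting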
    • Related Report
      2021 Research-status Report
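  • [Presentation] Experimental evaluation of music source separation using a sampling-frequency-independent audio source separation model (2021)

    • Author(s)
      Koichi Saito, Tomohiko Nakamura, Kohei Yatabe, Hiroshi Saruwatari
    • Organizer
      Acoustical Society of Japan 2021 Autumn Meeting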
    • Related Report
      2021 Research-status Report
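  • [Presentation] Sampling-frequency-independent DNN-based audio source separation using a convolutional layer based on a latent analog filter representation (2021)

    • Author(s)
      Koichi Saito, Tomohiko Nakamura, Kohei Yatabe, Yuma Koizumi, Hiroshi Saruwatari
    • Organizer
      Acoustical Society of Japan 2021 Spring Meeting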
    • Related Report
      2020 Research-status Report
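  • [Presentation] Audio source separation using a sampling-frequency-independent convolutional layer with an anti-aliasing mechanism (2021)

    • Author(s)
      Koichi Saito, Tomohiko Nakamura, Kohei Yatabe, Yuma Koizumi, Hiroshi Saruwatari
    • Organizer
      Information Processing Society of Japan, 130th Meeting of the Special Interest Group on Music and Computer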
    • Related Report
      2020 Research-status Report
  • [Presentation] Investigation on Wavelet Basis Function of DNN-based Time Domain Audio Source Separation Inspired by Multiresolution Analysis (2020)

    • Author(s)
      Shihori Kozuka, Tomohiko Nakamura, Hiroshi Saruwatari
    • Organizer
      49th International Congress and Exposition on Noise Control Engineering
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Remarks] Demo page for vocal ensemble separation

    • URL

      https://tomohikonakamura.github.io/Tomohiko-Nakamura/demo/jaCappella_sep

    • Related Report
      2022 Annual Research Report
  • [Remarks] Code release page for the proposed method for vocal ensemble separation

    • URL

      https://github.com/TomohikoNakamura/asteroid_jaCappella

    • Related Report
      2022 Annual Research Report
  • [Remarks] Demo page of separated sounds

    • URL

      https://tomohikonakamura.github.io/Tomohiko-Nakamura/demo/MRDLA/

    • Related Report
      2021 Research-status Report
  • [Remarks] Code for the proposed method (GitHub)

    • URL

      https://github.com/TomohikoNakamura/dwtls

    • Related Report
      2021 Research-status Report

Published: 2020-04-28   Modified: 2024-01-30  
