Hose-Shaped Rescue Robot System Equipped with Robot Audition That Operates in Extreme Environments

Research Project

Project/Area Number	15J08765
Research Category	Grant-in-Aid for JSPS Fellows
Allocation Type	Single-year Grants
Section	Domestic
Research Field	Intelligent robotics
Research Institution	Kyoto University
Principal Investigator	Yoshiaki Bando (坂東宜昭), Kyoto University, Graduate School of Informatics, JSPS Research Fellow (DC1)
Project Period (FY)	2015-04-24 – 2018-03-31
Project Status	Completed (Fiscal Year 2017)
Budget Amount	¥2,500,000 (Direct Cost: ¥2,500,000) Fiscal Year 2017: ¥800,000 (Direct Cost: ¥800,000) Fiscal Year 2016: ¥800,000 (Direct Cost: ¥800,000) Fiscal Year 2015: ¥900,000 (Direct Cost: ¥900,000)
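Keywords	Robot audition / Speech enhancement / Deep generative model / Blind multichannel speech enhancement / Bayesian low-rank and sparse decomposition / Rescue robotics / Self-localization / Statistical signal processing / Multimodal signal processing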
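Outline of Annual Research Achievements	The speech enhancement method based on low-rank and sparse decomposition that we had developed so far extracted speech signals under an assumption that captures only one aspect of speech, namely sparsity, which limited its enhancement performance. Meanwhile, high-quality speech enhancement has recently become achievable by using deep neural networks (DNNs) to learn, in a supervised manner, a mapping from noisy speech signals to clean speech signals. This approach, however, requires a large amount of training data and generalizes poorly to unseen noise environments. In FY2017 (Heisei 29), to achieve high-quality speech enhancement without pre-training on noise, we developed a semi-supervised speech enhancement method that probabilistically integrates a deep-learning-based speech model with a noise model based on a conventional statistical model. The method assumes that the speech spectrogram is generated stochastically from a deep generative model, that the noise spectrogram is generated from a non-negative matrix factorization (NMF) model, and that the mixture spectrogram is generated by superimposing the two. Once the deep generative model of speech spectra has been trained in advance, without supervision, on a large amount of clean speech, the speech spectrum actually contained in a given mixture can be estimated by Bayesian inference. Because the NMF model in this framework adaptively estimates the noise component to fit the observation, no noise training data are needed. In evaluation experiments with simulated mixtures, the method outperformed the conventional low-rank and sparse decomposition method. It also outperformed a conventional DNN-based supervised speech enhancement method in noise environments unseen by the supervised method.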
Research Progress Status	Not entered because FY2017 (Heisei 29) was the final fiscal year.
Strategy for Future Research Activity	Not entered because FY2017 (Heisei 29) was the final fiscal year.
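
The generative assumption described in the Outline of Annual Research Achievements can be illustrated with a small NumPy sketch. This is not the project's implementation. It assumes zero-mean complex Gaussian spectrogram bins whose variances add (the outline only says the two spectrograms are superimposed), replaces the trained deep generative model with a hypothetical fixed random decoder (`toy_decoder`), and recovers speech with a Wiener-style mask computed from the true variances. The actual method infers the latent variables and the NMF parameters from the mixture by Bayesian inference, which is not implemented here. All function names and sizes are illustrative.

```python
import numpy as np


def toy_decoder(z, A, b):
    """Hypothetical stand-in for a trained speech decoder.

    Maps latent vectors z (D x T) to positive speech variances (F x T).
    """
    return np.exp(A @ z + b[:, None])


def complex_gaussian(rng, var):
    """Draw zero-mean circular complex Gaussian samples with variance `var`."""
    real = rng.standard_normal(var.shape)
    imag = rng.standard_normal(var.shape)
    return np.sqrt(var / 2.0) * (real + 1j * imag)


def sample_mixture(rng, F=64, T=50, D=8, K=4):
    """Sample a mixture spectrogram from the assumed generative process."""
    # Speech: latent z_t ~ N(0, I) passed through the (toy) decoder.
    A = 0.3 * rng.standard_normal((F, D))
    b = rng.standard_normal(F)
    z = rng.standard_normal((D, T))
    speech_var = toy_decoder(z, A, b)

    # Noise: low-rank NMF model with K bases W and activations H.
    W = rng.gamma(2.0, 0.5, size=(F, K))
    H = rng.gamma(2.0, 0.5, size=(K, T))
    noise_var = W @ H

    # Mixture: speech and noise spectrograms are superimposed.
    S = complex_gaussian(rng, speech_var)
    N = complex_gaussian(rng, noise_var)
    return S + N, S, speech_var, noise_var


def wiener_enhance(X, speech_var, noise_var):
    """Wiener-style estimate of speech given speech and noise variances."""
    mask = speech_var / (speech_var + noise_var)
    return mask * X


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, S, speech_var, noise_var = sample_mixture(rng)
    S_hat = wiener_enhance(X, speech_var, noise_var)
    print("error before:", np.mean(np.abs(X - S) ** 2))
    print("error after: ", np.mean(np.abs(S_hat - S) ** 2))
```

Because the sketch uses the true variances, it shows only the model structure and the final reconstruction step. In practice both the speech variances and the noise factors `W` and `H` would have to be estimated from the observed mixture alone.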

Report

(3 results)

Research Products
(25 results)


All Journal Article (2 results) (of which Peer Reviewed: 2 results, Open Access: 1 result, Acknowledgement Compliant: 1 result) Presentation (22 results) (of which Int'l Joint Research: 8 results) Patent (Industrial Property Rights) (1 result)

[Journal Article] Speech enhancement based on Bayesian low-rank and sparse decomposition of multichannel magnitude spectrograms (2018)
- Author(s)
  Y.Bando, K.Itoyama, M.Konyo, S.Tadokoro, K.Nakadai, K.Yoshii, T.Kawahara, and H.G.Okuno
- Journal Title
  IEEE/ACM Trans. Audio, Speech & Language Processing
  Volume: 26 Issue: 2 Pages: 215-230
- DOI
  10.1109/taslp.2017.2772340
- Related Report
  2017 Annual Research Report
- Peer Reviewed
[Journal Article] Low Latency and High Quality Two-Stage Human-Voice-Enhancement System for a Hose-Shaped Rescue Robot (2017)
- Author(s)
  Yoshiaki Bando, Hiroshi Saruwatari, Nobutaka Ono, Shoji Makino, Katsutoshi Itoyama, Daichi Kitamura, Masaru Ishimura, Moe Takakusaki, Narumi Mae, Kouei Yamaoka, Yutaro Matsui, Yuichi Ambe, Masashi Konyo, Satoshi Tadokoro, Kazuyoshi Yoshii, Hiroshi G. Okuno
- Journal Title
  Journal of Robotics and Mechatronics
  Volume: 29 Issue: 1 Pages: 198-212
- DOI
  10.20965/jrm.2017.p0198
- NAID
  130007519848
- ISSN
  0915-3942, 1883-8049
- Year and Date
  2017-02-20
- Related Report
  2016 Annual Research Report
- Peer Reviewed / Open Access / Acknowledgement Compliant
[Presentation] Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization (2018)
- Author(s)
  Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara
- Organizer
  IEEE International Conference on Acoustics, Speech and Signal Processing
- Related Report
  2017 Annual Research Report
- Int'l Joint Research
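[Presentation] 3D Position Estimation of an In-Pipe Inspection Snake Robot Using Acoustic Sensors (2017)
- Author(s)
  Yoshiaki Bando, Hiroki Suhara, Tetsushi Kamegawa, Katsutoshi Itoyama, Kazuyoshi Yoshii, Fumitoshi Matsuno, Hiroshi G. Okuno
- Organizer
  Annual Conference of the Robotics Society of Japan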
- Related Report
  2017 Annual Research Report
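[Presentation] Unsupervised Speech Enhancement Using a Deep Generative Model as a Prior (2017)
- Author(s)
  Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara
- Organizer
  IEICE Technical Committee on Speech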
- Related Report
  2017 Annual Research Report
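[Presentation] Real-Time Speech Enhancement for a Hose-Shaped Rescue Robot Based on Multichannel Low-Rank and Sparse Decomposition (2017)
- Author(s)
  Yoshiaki Bando, Yuichi Ambe, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Hiroshi G. Okuno
- Organizer
  Conference on Robotics and Mechatronics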
- Related Report
  2017 Annual Research Report
[Presentation] Sound-based Online Localization for an In-pipe Snake Robot (2016)
- Author(s)
  Yoshiaki Bando, Hiroki Suhara, Motoyasu Tanaka, Tetsushi Kamegawa, Katsutoshi Itoyama, Kazuyoshi Yoshii, Fumitoshi Matsuno, Hiroshi G. Okuno
- Organizer
  IEEE International Symposium on Safety, Security, and Rescue Robotics
- Place of Presentation
  EPFL, Lausanne, Switzerland
- Year and Date
  2016-10-23
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
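[Presentation] Speech Enhancement for a Hose-Shaped Rescue Robot Based on Variational Bayesian Multichannel RNMF (2016)
- Author(s)
  Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Hiroshi G. Okuno
- Organizer
  The 34th Annual Conference of the Robotics Society of Japan
- Place of Presentation
  Yamagata University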
- Year and Date
  2016-09-07
- Related Report
  2016 Annual Research Report
[Presentation] Variational Bayesian Multi-channel Robust NMF for Human-voice Enhancement with a Deformable and Partially-occluded Microphone Array (2016)
- Author(s)
  Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Hiroshi G. Okuno
- Organizer
  European Signal Processing Conference
- Place of Presentation
  Budapest, Hungary
- Year and Date
  2016-08-29
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
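[Presentation] Speech Enhancement Tolerant of Microphone Movement and Occlusion Based on Variational Bayesian Multichannel Robust NMF (2016)
- Author(s)
  Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Tatsuya Kawahara, Hiroshi G. Okuno
- Organizer
  Technical Meeting on Speech
- Place of Presentation
  Kyoto University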
- Year and Date
  2016-08-24
- Related Report
  2016 Annual Research Report
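[Presentation] 3D Posture Estimation Using a Microphone and Accelerometer Array for a Hose-Shaped Rescue Robot (2016)
- Author(s)
  Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Hiroshi G. Okuno
- Organizer
  JSME Conference on Robotics and Mechatronics
- Place of Presentation
  Pacifico Yokohama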
- Year and Date
  2016-07-08
- Related Report
  2015 Annual Research Report
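[Presentation] Multichannel Non-Negative Matrix Factorization Based on the Complex t-Distribution for Microphone-Array Sound Source Separation (2016)
- Author(s)
  Koichi Kitamura (北村昂一), Yoshiaki Bando, Katsutoshi Itoyama, Kazuyoshi Yoshii
- Organizer
  The 78th National Convention of IPSJ
- Place of Presentation
  Keio University, Yagami Campus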
- Year and Date
  2016-03-10
- Related Report
  2015 Annual Research Report
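[Presentation] Online Estimation of the Positions of Multiple Asynchronous Microphone Arrays Using Directions of Arrival and Time Differences of Sound Sources (2016)
- Author(s)
  Kouhei Sekiguchi, Yoshiaki Bando, Keisuke Nakamura (中村圭佑), Kazuhiro Nakadai, Katsutoshi Itoyama, Kazuyoshi Yoshii
- Organizer
  The 78th National Convention of IPSJ
- Place of Presentation
  Keio University, Yagami Campus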
- Year and Date
  2016-03-10
- Related Report
  2015 Annual Research Report
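[Presentation] Multichannel Sound Source Localization and Separation Based on NMF-LDA Considering the Low-Rankness and Sparsity of Source Spectrograms (2016)
- Author(s)
  Kousuke Itakura, Yoshiaki Bando, Eita Nakamura (中村栄太), Katsutoshi Itoyama, Kazuyoshi Yoshii
- Organizer
  The 78th National Convention of IPSJ
- Place of Presentation
  Keio University, Yagami Campus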
- Year and Date
  2016-03-10
- Related Report
  2015 Annual Research Report
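[Presentation] Layout Optimization Using Separation-Accuracy Prediction for Cooperative Sound Source Separation by Multiple Mobile Robots (2015)
- Author(s)
  Kouhei Sekiguchi, Yoshiaki Bando, Katsutoshi Itoyama, Kazuyoshi Yoshii
- Organizer
  The 42nd Meeting of the JSAI Special Interest Group on AI Challenges
- Place of Presentation
  Keio University, Hiyoshi Campus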
- Year and Date
  2015-11-12
- Related Report
  2015 Annual Research Report
[Presentation] Human-Voice Enhancement based on Online RPCA for a Hose-shaped Rescue Robot with a Microphone Array (2015)
- Author(s)
  Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Hiroshi G. Okuno
- Organizer
  IEEE International Symposium on Safety, Security, and Rescue Robotics 2015
- Place of Presentation
  Indiana, USA
- Year and Date
  2015-10-18
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
[Presentation] Microphone-accelerometer based 3D Posture Estimation for a Hose-shaped Rescue Robot (2015)
- Author(s)
  Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Hiroshi G. Okuno
- Organizer
  IEEE/RSJ International Conference on Intelligent Robots and Systems 2015
- Place of Presentation
  Hamburg, Germany
- Year and Date
  2015-09-28
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
[Presentation] Audio-Visual Beat Tracking Based on a State-Space Model for a Music Robot Dancing with Humans (2015)
- Author(s)
  Misato Ohkita, Yoshiaki Bando, Yukara Ikemiya, Katsutoshi Itoyama, Kazuyoshi Yoshii
- Organizer
  IEEE/RSJ International Conference on Intelligent Robots and Systems 2015
- Place of Presentation
  Hamburg, Germany
- Year and Date
  2015-09-28
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
[Presentation] Optimizing the Layout of Multiple Mobile Robots for Cooperative Sound Source Separation (2015)
- Author(s)
  Kouhei Sekiguchi, Yoshiaki Bando, Katsutoshi Itoyama, Kazuyoshi Yoshii
- Organizer
  IEEE/RSJ International Conference on Intelligent Robots and Systems 2015
- Place of Presentation
  Hamburg, Germany
- Year and Date
  2015-09-28
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
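[Presentation] Speech Recognition Considering the Uncertainty of Source Signals Based on a Bayesian Model for Sound Source Separation (2015)
- Author(s)
  Kousuke Itakura, Yoshiaki Bando, Katsutoshi Itoyama, Kazuyoshi Yoshii
- Organizer
  Acoustical Society of Japan 2015 Autumn Meeting
- Place of Presentation
  University of Aizu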
- Year and Date
  2015-09-16
- Related Report
  2015 Annual Research Report
[Presentation] Bayesian Integration of Sound Source Separation and Speech Recognition: A New Approach to Simultaneous Speech Recognition (2015)
- Author(s)
  Kousuke Itakura, Izaya Nishimuta, Yoshiaki Bando, Katsutoshi Itoyama, and Kazuyoshi Yoshii
- Organizer
  Interspeech 2015
- Place of Presentation
  Dresden, Germany
- Year and Date
  2015-09-06
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
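[Presentation] Speech Enhancement for a Hose-Shaped Rescue Robot Based on Motion-Noise Suppression Using Robust Principal Component Analysis (2015)
- Author(s)
  Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Hiroshi G. Okuno
- Organizer
  The 33rd Annual Conference of the Robotics Society of Japan
- Place of Presentation
  Tokyo Denki University, Tokyo Senju Campus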
- Year and Date
  2015-09-03
- Related Report
  2015 Annual Research Report
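[Presentation] Search for the Optimal Robot Layout According to Source Positions in Sound Source Separation Using Multiple Mobile Robots (2015)
- Author(s)
  Kouhei Sekiguchi, Yoshiaki Bando, Katsutoshi Itoyama, Kazuyoshi Yoshii
- Organizer
  The 33rd Annual Conference of the Robotics Society of Japan
- Place of Presentation
  Tokyo Denki University, Tokyo Senju Campus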
- Year and Date
  2015-09-03
- Related Report
  2015 Annual Research Report
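[Presentation] Applying the Binaural Robot Audition Software HARK-Binaural to a Humanoid Robot Using Raspberry Pi 2 (2015)
- Author(s)
  Yoshiaki Bando, Ui-Hyun Kim (金宜鉉), Katsutoshi Itoyama, Kazuyoshi Yoshii, Kazuhiro Nakadai, Hiroshi G. Okuno
- Organizer
  Ongaku Symposium 2015
- Place of Presentation
  The University of Electro-Communications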
- Year and Date
  2015-05-23
- Related Report
  2015 Annual Research Report
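[Patent(Industrial Property Rights)] Target Acoustic Signal Restoration System and Method (2016)
- Inventor(s)
  Yoshiaki Bando, Kazuyoshi Yoshii, Katsutoshi Itoyama, Hiroshi G. Okuno
- Industrial Property Rights Holder
  Kyoto University (National University Corporation)
- Industrial Property Rights Type
  Patent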
- Filing Date
  2016-05-23
- Related Report
  2016 Annual Research Report
