2010 Fiscal Year Annual Research Report

Auditory Scene Recognition in Dynamically Changing Environments through Active Audio-Visual Integration

Research Project

Project/Area Number	22700165
Research Institution	Tokyo Institute of Technology
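Principal Investigator	Kazuhiro Nakadai (中臺一博), Tokyo Institute of Technology, Graduate School of Information Science and Engineering, Visiting Professor (連携教授) (70436715)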
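Keywords	audio-visual integration / speech recognition / voice activity detection / robot audition / sound source identification / noise suppression / software architecture / reliability-weighted features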
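Research Abstract	In FY2010 (H22), research proceeded almost without delay against the original plan, which covered (1) construction of an audio-visual integration model, (2) suppression of self-generated noise (ego noise), (3) sound source identification and environmental sound recognition, and (4) study of a software architecture for real robots and simulators. For (1), we proposed and implemented a two-layered audio-visual integration scheme that applies audio-visual integration to each of the two main processes of speech recognition, namely voice activity detection and decoding, and demonstrated its effectiveness. We also devised a model that dynamically changes the reliability assigned to the features according to the signal-to-noise ratio and the image resolution, and verified its effect offline. In addition, evidence is emerging that modality selection is more effective than modality integration when the reliability of the information differs greatly between modalities; we plan to verify this in FY2011 (H23). For (2), we established an ego-noise suppression method based on noise templates and showed its effectiveness. For (3), we constructed a sound source identification method using hierarchical GMMs; in FY2011 we plan to integrate it into the system of (1). For (4), we implemented the methods as modules that run on the robot audition software HARK, which made integration easier. In FY2011 we plan to carry out verification on a real robot.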
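The reliability-dependent integration and the modality-selection idea in item (1) can be illustrated with a minimal sketch. The report does not give the actual model, so everything below is an assumption made for illustration: the sigmoid mapping from SNR to an audio weight, the parameter names (`midpoint_db`, `slope`, `select_margin`), and the use of a weighted sum of per-class log-likelihoods, which is a common stream-weighting scheme and not necessarily the one used in this project.

```python
import math


def audio_weight(snr_db, midpoint_db=10.0, slope=0.3):
    """Map SNR in dB to an audio stream weight in (0, 1).

    Hypothetical sigmoid mapping; the report does not specify one.
    """
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - midpoint_db)))


def fuse(audio_loglik, visual_loglik, snr_db, select_margin=0.4):
    """Combine per-class log-likelihoods from the audio and visual streams.

    Integration: weighted sum w * audio + (1 - w) * visual.
    Selection: when the reliabilities differ strongly
    (|w - 0.5| > select_margin), only the more reliable stream is used.
    """
    w = audio_weight(snr_db)
    if abs(w - 0.5) > select_margin:
        w = 1.0 if w > 0.5 else 0.0
    return {c: w * audio_loglik[c] + (1.0 - w) * visual_loglik[c]
            for c in audio_loglik}


# Illustrative scores only, not measured data.
audio = {"speech": -2.0, "non_speech": -1.0}
visual = {"speech": -0.5, "non_speech": -3.0}

clean = fuse(audio, visual, snr_db=30.0)   # high SNR: audio stream selected
noisy = fuse(audio, visual, snr_db=-10.0)  # low SNR: visual stream selected
mixed = fuse(audio, visual, snr_db=10.0)   # comparable reliability: integration
```

In this sketch the same function covers both regimes: near the midpoint SNR the two streams are blended, while at the extremes the weight is clamped so that a single modality decides. The image-resolution dependence mentioned in the abstract could be added as a second reliability term in the same way.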

Research Products
(14 results)

All Journal Article (3 results) (of which Peer Reviewed: 3 results) Presentation (11 results)

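[Journal Article] A Study of a Speech Recognition System Using Two-Layered Audio-Visual Information Integration for Robot Audition (2010)
- Author(s)
  Takami Yoshida, Kazuhiro Nakadai, Hiroshi G. Okuno
- Journal Title
  Journal of the Robotics Society of Japan (日本ロボット学会誌)
  Volume: 28 Pages: 56-63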
- URL
  https://www.jstage.jst.go.jp/article/jrsj/28/8/28_8_970/_pdf
- Peer Reviewed
[Journal Article] Robust Ego Noise Suppression of a Robot (2010)
- Author(s)
  Gokhan Ince, Kazuhiro Nakadai, Tobias Rodemann, Hiroshi Tsujino, Jun-ichi Imura
- Journal Title
  Trends in Applied Intelligent Systems, Lecture Notes in Computer Science
  Volume: 6096/2010 Pages: 62-71
- Peer Reviewed
[Journal Article] An Improvement in Audio-Visual Voice Activity Detection for Automatic Speech Recognition (2010)
- Author(s)
  Takami Yoshida, Kazuhiro Nakadai, Hiroshi G. Okuno
- Journal Title
  Trends in Applied Intelligent Systems, Lecture Notes in Computer Science
  Volume: 6096/2010 Pages: 51-61
- Peer Reviewed
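[Presentation] Sound Source Localization with an Asynchronous Distributed Microphone Array Based on SLAM (2011)
- Author(s)
  Hiroki Miura, Takami Yoshida, Kazuhiro Nakadai
- Organizer
  73rd National Convention of the Information Processing Society of Japan (情報処理学会第73回全国大会)
- Place of Presentation
  Tokyo Institute of Technology, Tokyo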
- Year and Date
  2011-03-02
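[Presentation] Robot Audition: Hands-Free Speech Recognition under High Noise (2011)
- Author(s)
  Kazuhiro Nakadai, Hiroshi G. Okuno
- Organizer
  IEICE Technical Committee on Speech (電子情報通信学会音声研究会)
- Place of Presentation
  ATR, Kyoto (invited talk)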
- Year and Date
  2011-01-27
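[Presentation] Overview of HARK 1.0.0, Open-Source Software for Robot Audition (2010)
- Author(s)
  Kazuhiro Nakadai, Hiroshi G. Okuno
- Organizer
  11th SICE System Integration Division Annual Conference (第11回計測自動制御学会システムインテグレーション部門講演会)
- Place of Presentation
  Tohoku University, Sendai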
- Year and Date
  2010-12-25
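[Presentation] A Study of Modality Selection Based on a Hybrid Dynamical System for Voice Activity Detection by Robots (2010)
- Author(s)
  Takami Yoshida, Kazuhiro Nakadai
- Organizer
  11th SICE System Integration Division Annual Conference (第11回計測自動制御学会システムインテグレーション部門講演会)
- Place of Presentation
  Tohoku University, Sendai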
- Year and Date
  2010-12-23
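[Presentation] Speech Recognition Technology in Robot Audition: Toward Robot Intelligence (2010)
- Author(s)
  Kazuhiro Nakadai
- Organizer
  Robotics Society of Japan Robotics Seminar "Robot Intelligence Technologies Based on Symbols and Language" (日本ロボット学会ロボット工学セミナー「記号・言語を基盤としたロボットの知能化技術」)
- Place of Presentation
  The University of Tokyo, Tokyo (seminar lecturer)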
- Year and Date
  2010-11-29
[Presentation] Multi-talker Speech Recognition under Ego-motion Noise using Missing Feature Theory (2010)
- Author(s)
  Gokhan Ince, Kazuhiro Nakadai, Tobias Rodemann, Hiroshi Tsujino, Jun-ichi Imura
- Organizer
  IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2010)
- Place of Presentation
  Taipei, Taiwan
- Year and Date
  2010-10-19
[Presentation] Two-Layered Audio-Visual Speech Recognition for Robots in Noisy Environments (2010)
- Author(s)
  Takami Yoshida, Kazuhiro Nakadai, Hiroshi G. Okuno
- Organizer
  IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2010)
- Place of Presentation
  Taipei, Taiwan
- Year and Date
  2010-10-19
[Presentation] Audio-visual speech recognition system for a robot (2010)
- Author(s)
  Takami Yoshida, Kazuhiro Nakadai
- Organizer
  International Conference on Auditory-Visual Speech Processing (AVSP 2010)
- Place of Presentation
  Hakone, Kanagawa
- Year and Date
  2010-10-01
[Presentation] Two-layered audio-visual integration in voice activity detection and automatic speech recognition for robots (2010)
- Author(s)
  Takami Yoshida, Kazuhiro Nakadai
- Organizer
  International Conference on Spoken Language Processing (Interspeech 2010)
- Place of Presentation
  Makuhari, Chiba
- Year and Date
  2010-09-30
[Presentation] A Robust Speech Recognition System against the Ego Noise of a Robot (2010)
- Author(s)
  Gokhan Ince, Kazuhiro Nakadai, Tobias Rodemann, Hiroshi Tsujino, Jun-ichi Imura
- Organizer
  International Conference on Spoken Language Processing (Interspeech 2010)
- Place of Presentation
  Makuhari, Chiba
- Year and Date
  2010-09-29
[Presentation] A Hybrid Framework for Ego Noise Cancellation of a Robot (2010)
- Author(s)
  Gokhan Ince, Kazuhiro Nakadai, Tobias Rodemann, Yuji Hasegawa, Hiroshi Tsujino, Jun-ichi Imura
- Organizer
  IEEE-RAS International Conference on Robotics and Automation (ICRA 2010)
- Place of Presentation
  Anchorage, USA
- Year and Date
  2010-05-06
