2009 Fiscal Year Annual Research Report

Research on Multimodal Recognition for Retrieval and Summarization of Human Communication

Research Project

Project/Area Number	20300063
Research Institution	Tokyo Institute of Technology
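Principal Investigator	Koichi Shinoda, Tokyo Institute of Technology, Graduate School of Information Science and Engineering, Associate Professor (10343097)
Co-Investigator(Kenkyū-buntansha)	Sadaoki Furui, Tokyo Institute of Technology, Graduate School of Information Science and Engineering, Professor (90293076)
Keywords	Multimodal recognition / Human communication / Dialogue mining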
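Research Abstract	This research aims to automatically extract useful information from human communication in small communities such as workplaces and homes and, to that end, to develop a system that recognizes and retrieves multimodal information, comprising verbal and nonverbal modes, with high accuracy. In this fiscal year, the second year of the project, we conducted a preliminary evaluation using the evaluation database recorded last year and, based on the results, carried out further data recording and annotation. We also continued developing the component technologies and began integrating them. For the verbal mode, we continued to improve the performance of speech recognition: we developed an acoustic model training method based on active learning with multiple recognizers, and began applying the active sentence selection method developed in the previous year to acoustic model training. For speech in the nonverbal mode, we proposed a method for building acoustic models that draws on the findings of the previous year's study on clustering of impression ratings. We continued to improve gait recognition performance and began evaluating human action classification using particle filters. We also began research on gesture recognition for sign language. Furthermore, we began investigating event detection from video that integrates the verbal and nonverbal modes. We annotated the database recorded last year and, on that basis, began investigating a multichannel speech detection method. Because ambient noise was found to degrade speech quality, we also began analyzing speech in noise.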

Research Products
(12 results)

Journal Article (1 result; of which Peer Reviewed: 1) / Presentation (11 results)

[Journal Article] Automatic recognition of Indonesian declarative questions and statements using polynomial coefficients of the pitch contours (2009)
- Author(s)
  Nazrul Effendy, Koichi Shinoda, Sadaoki Furui, Somchai Jitapunkul
- Journal Title
  Acoust. Sci. & Tech. (The Acoustical Society of Japan, 2009)
  Volume: 30 Pages: 249-256
- Peer Reviewed
[Presentation] Speech Modeling Based on Committee-Based Active Learning (2010)
- Author(s)
  Yuzo Hamanaka, Koichi Shinoda, Sadaoki Furui, Tadashi Emori, Takafumi Koshinaka
- Organizer
  ICASSP 2010
- Place of Presentation
  Dallas, U.S.A.
- Year and Date
  2010-03-14
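[Presentation] A study on event detection from video using acoustic features (2010)
- Author(s)
  Tatsuhiko Saito, Nakamasa Inoue, Koichi Shinoda, Sadaoki Furui
- Organizer
  Acoustical Society of Japan 2010 Spring Meeting
- Place of Presentation
  Tokyo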
- Year and Date
  2010-03-08
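[Presentation] Active learning using multiple recognizers for speech recognition (2009)
- Author(s)
  Yuzo Hamanaka, Tadashi Emori, Takafumi Koshinaka, Koichi Shinoda, Sadaoki Furui
- Organizer
  Information Processing Society of Japan (IPSJ), Spoken Language Processing meeting
- Place of Presentation
  Tokyo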
- Year and Date
  2009-12-21
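[Presentation] High-level feature detection from video using SIFT Gaussian mixture models and acoustic features (2009)
- Author(s)
  Nakamasa Inoue, Tatsuhiko Saito, Koichi Shinoda, Sadaoki Furui
- Organizer
  IEICE Technical Committee on Pattern Recognition and Media Understanding (PRMU)
- Place of Presentation
  Kanazawa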
- Year and Date
  2009-11-26
[Presentation] TITGT at TRECVID 2009 Workshop (2009)
- Author(s)
  Nakamasa Inoue, Shanshan Hao, Tatsuhiko Saito, Koichi Shinoda, Ilseo Kim, Chin-Hui Lee
- Organizer
  TRECVID Workshop (TRECVID 2009)
- Place of Presentation
  Gaithersburg
- Year and Date
  2009-11-16
[Presentation] Robust Speech Recognition In The Car Environment (2009)
- Author(s)
  Agnieszka Betkowska Cavalcante, Koichi Shinoda, Sadaoki Furui
- Organizer
  the 4th Language and Technology Conference (LTC'09)
- Place of Presentation
  Poznan, Poland
- Year and Date
  2009-11-06
[Presentation] Noise robust speech recognition using spectral subtraction and F0 information extracted by Hough transform (2009)
- Author(s)
  Hideki Yasui, Koichi Shinoda, Sadaoki Furui, Koji Iwano
- Organizer
  Asia-Pacific Signal and Information Processing Association 2009 Annual Summit and Conference
- Place of Presentation
  Sapporo, Japan
- Year and Date
  2009-10-05
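[Presentation] Committee-based active learning for speech recognition (2009)
- Author(s)
  Yuzo Hamanaka, Tadashi Emori, Takafumi Koshinaka, Koichi Shinoda, Sadaoki Furui
- Organizer
  Acoustical Society of Japan Autumn Meeting
- Place of Presentation
  Koriyama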
- Year and Date
  2009-09-15
[Presentation] Speaker Adaptation Based on Two-Step Active Learning (2009)
- Author(s)
  Hiroko Murakami, Koichi Shinoda, Sadaoki Furui
- Organizer
  INTERSPEECH 2009 BRIGHTON
- Place of Presentation
  Brighton, UK
- Year and Date
  2009-09-06
[Presentation] Independent component analysis for noisy speech recognition (2009)
- Author(s)
  Hsin-Lung Hsieh, Jen-tzung Chien, Koichi Shinoda, Sadaoki Furui
- Organizer
  ICASSP 2009
- Place of Presentation
  Taipei
- Year and Date
  2009-04-19
[Presentation] Online speaker clustering using incremental learning of an ergodic hidden Markov model (2009)
- Author(s)
  Takafumi Koshinaka, Kentaro Nagatomo, Koichi Shinoda
- Organizer
  IEEE ICASSP 2009
- Place of Presentation
  Taipei
- Year and Date
  2009-04-19
