2010 Fiscal Year Annual Research Report

Research on Multimodal Recognition for Retrieval and Summarization of Human Communication

Research Project

Project/Area Number	20300063
Research Institution	Tokyo Institute of Technology
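Principal Investigator	Koichi Shinoda, Tokyo Institute of Technology, Graduate School of Information Science and Engineering, Associate Professor (10343097)
Co-Investigator (Kenkyū-buntansha)	Sadaoki Furui, Tokyo Institute of Technology, Graduate School of Information Science and Engineering, Professor (90293076)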
Keywords	Multimodal recognition / Human communication / Dialogue mining
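Research Abstract	This research aims to automatically extract useful information from human communication in small communities such as workplaces and homes. To that end, it develops a system that recognizes and retrieves, with high accuracy, multimodal information comprising verbal and nonverbal modes. In this fiscal year, the final year of the project, we used the database we had built to refine each of the component technologies developed so far, and we also integrated them. First, in the verbal-mode research, we continued to improve speech recognition performance. We confirmed the effectiveness of acoustic model training using an active sentence selection method. In addition, based on last year's analysis of speech in noise, we developed a new noise-robust method that exploits the shrinkage rate of the spectral space. For the speech part of the nonverbal mode, we developed a method for building acoustic models that uses the results of clustering impression ratings, and confirmed its effectiveness. We also evaluated three lines of work continued from last year and confirmed their effectiveness: gait recognition robust to changes in walking speed, human action classification using particle filters, and gesture recognition for sign language. In research using the recorded database, we confirmed the effectiveness of a multichannel speech detection method, and we newly conducted speaker recognition research to identify speakers, confirming its effectiveness as well. We continued to improve our method for automatic information extraction from video, which integrates the verbal and nonverbal modes. At the TRECVID workshop in the United States, we ranked 4th among 50 teams worldwide (1st in Japan).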

Research Products
(24 results)

Journal Article (4 results, of which Peer Reviewed: 4 results) / Presentation (20 results)

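[Journal Article] Play-type classification of baseball broadcast video by learning symbol-sequenced scenes and two kinds of play-type correlation (2010)
- Author(s)
  Takahiro Mochizuki, Mahito Fujii, Koichi Shinoda, Yoshinori Sakai
- Journal Title
  IEICE Transactions on Information and Systems (Japanese Edition)
  Volume: Vol.J93-D, No.6 Pages: 1009-1023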
- Peer Reviewed
[Journal Article] Acoustic Model Adaptation for Speech Recognition (2010)
- Author(s)
  Koichi Shinoda
- Journal Title
  IEICE Transactions on Information and Systems
  Volume: Vol.E93-D, No.9 Pages: 2348-2362
- Peer Reviewed
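[Journal Article] Multimodal high-level feature detection for large-scale video resources (2010)
- Author(s)
  Nakamasa Inoue, Tatsuhiko Saito, Koichi Shinoda, Sadaoki Furui
- Journal Title
  IEICE Transactions on Information and Systems (Japanese Edition)
  Volume: Vol.J93-D, No.12 Pages: 2633-2644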
- Peer Reviewed
[Journal Article] Semi-synchronous speech and pen input for mobile user interfaces (2010)
- Author(s)
  Koichi Shinoda, Yasushi Watanabe, Kenji Iwata, Yuan Liang, Ryuta Nakagawa, Sadaoki Furui
- Journal Title
  Speech Communication
  Volume: Vol.53 Pages: 283-291
- Peer Reviewed
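[Presentation] A training sentence selection method using relative entropy for acoustic model training (2011)
- Author(s)
  Hiroko Murakami, Koichi Shinoda, Sadaoki Furui
- Organizer
  Acoustical Society of Japan 2011 Spring Meeting
- Place of Presentation
  Tokyo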
- Year and Date
  2011-03-09
[Presentation] Voting Approach in SMAP Adaptation for Speaker Verification (2011)
- Author(s)
  Sangeeta Biswas, Marc Ferras, Koichi Shinoda, Sadaoki Furui
- Organizer
  Acoustical Society of Japan 2011 Spring Meeting
- Place of Presentation
  Tokyo
- Year and Date
  2011-03-09
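[Presentation] Analysis of spectral shrinkage in noisy speech and its use for noise-robust speech recognition (2011)
- Author(s)
  Mayumi Beppu, Koichi Shinoda, Sadaoki Furui
- Organizer
  IEICE Technical Committee on Speech (SP)
- Place of Presentation
  Tokyo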
- Year and Date
  2011-03-04
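[Presentation] TRECVID semantic indexing using a multimodal, multi-frame method (2011)
- Author(s)
  Nakamasa Inoue, Yusuke Kamishima, Koichi Shinoda
- Organizer
  IEICE Technical Committee on Pattern Recognition and Media Understanding (PRMU)
- Place of Presentation
  Saitama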
- Year and Date
  2011-02-17
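[Presentation] Overview of TRECVID 2010, a workshop on video analysis and retrieval evaluation (2011)
- Author(s)
  Shin'ichi Satoh, Koichi Shinoda
- Organizer
  IEICE Technical Committee on Pattern Recognition and Media Understanding (PRMU)
- Place of Presentation
  Saitama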
- Year and Date
  2011-02-17
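[Presentation] Training sentence selection using relative entropy for acoustic model training (2011)
- Author(s)
  Hiroko Murakami, Koichi Shinoda, Sadaoki Furui
- Organizer
  IPSJ Special Interest Group on Spoken Language Processing (SIG-SLP)
- Place of Presentation
  Fukuyama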
- Year and Date
  2011-02-04
[Presentation] Inter-speaker weighted MAP adaptation for GMM-supervector speaker recognition (2010)
- Author(s)
  Marc Ferras, Koichi Shinoda, Sadaoki Furui
- Organizer
  IPSJ Special Interest Group on Spoken Language Processing (SIG-SLP)
- Place of Presentation
  Tokyo
- Year and Date
  2010-12-20
[Presentation] Optimal use of trees in structural MAP adaptation for speaker verification (2010)
- Author(s)
  Sangeeta Biswas, Marc Ferras, Koichi Shinoda, Sadaoki Furui
- Organizer
  IPSJ Special Interest Group on Spoken Language Processing (SIG-SLP)
- Place of Presentation
  Tokyo
- Year and Date
  2010-12-20
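[Presentation] Dynamic language model adaptation based on category estimation (2010)
- Author(s)
  Hitoshi Yamamoto, Ken Hanazawa, Kiyokazu Miki, Koichi Shinoda
- Organizer
  IPSJ Special Interest Group on Spoken Language Processing (SIG-SLP)
- Place of Presentation
  Tokyo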
- Year and Date
  2010-12-20
[Presentation] TT+GT at TRECVID 2010 Workshop (2010)
- Author(s)
  Nakamasa Inoue, Toshiya Wada, Yusuke Kamishima, Koichi Shinoda, Ilseo Kim, Byungki Byun, Chin-Hui Lee
- Organizer
  TRECVID 2010 workshop
- Place of Presentation
  Gaithersburg
- Year and Date
  2010-11-15
[Presentation] Gait-based Person Identification Robust against Speed Variation using CHLAC features and HMMs (2010)
- Author(s)
  Muhammad Rasyid Aqmar, Koichi Shinoda, Sadaoki Furui
- Organizer
  IEICE Technical Committee on Pattern Recognition and Media Understanding (PRMU)
- Place of Presentation
  Chiba
- Year and Date
  2010-10-08
[Presentation] Dynamic Language Model Adaptation Using Keyword Category Classification (2010)
- Author(s)
  Hitoshi Yamamoto, Ken Hanazawa, Kiyokazu Miki, Koichi Shinoda
- Organizer
  INTERSPEECH 2010
- Place of Presentation
  Chiba
- Year and Date
  2010-09-26
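[Presentation] Source separation based on spectral subtraction for meeting speech recognition (2010)
- Author(s)
  Yu Nasu, Koichi Shinoda, Sadaoki Furui
- Organizer
  Acoustical Society of Japan 2010 Autumn Meeting
- Place of Presentation
  Osaka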
- Year and Date
  2010-09-14
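[Presentation] Analysis of the effect of speaking-style differences on spectral features in French (2010)
- Author(s)
  Mayumi Beppu, Jean-Luc Rouas, Martine Adda-Decker, Koichi Shinoda, Sadaoki Furui
- Organizer
  Acoustical Society of Japan 2010 Autumn Meeting
- Place of Presentation
  Osaka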
- Year and Date
  2010-09-14
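[Presentation] Multiple kernel learning for generic object recognition using SIFT Gaussian mixture models (2010)
- Author(s)
  Nakamasa Inoue, Yusuke Kamishima, Koichi Shinoda, Sadaoki Furui
- Organizer
  IEICE Technical Committee on Pattern Recognition and Media Understanding (PRMU)
- Place of Presentation
  Fukuoka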
- Year and Date
  2010-09-05
[Presentation] Robust Gait Recognition against Speed Variation (2010)
- Author(s)
  Muhammad Rasyid Aqmar, Koichi Shinoda, Sadaoki Furui
- Organizer
  ICPR2010
- Place of Presentation
  Istanbul
- Year and Date
  2010-08-23
[Presentation] High-Level Feature Extraction Using SIFT GMMs and Audio Models (2010)
- Author(s)
  Nakamasa Inoue, Tatsuhiko Saito, Koichi Shinoda, Sadaoki Furui
- Organizer
  ICPR2010
- Place of Presentation
  Istanbul
- Year and Date
  2010-08-23
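[Presentation] 3D sign language recognition using a ToF camera (2010)
- Author(s)
  Arata Sato, Koichi Shinoda, Sadaoki Furui
- Organizer
  Meeting on Image Recognition and Understanding (MIRU)
- Place of Presentation
  Kushiro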
- Year and Date
  2010-07-27
[Presentation] NIST SRE 2010: Tokyo Tech Speaker Recognition (2010)
- Author(s)
  Marc Ferras, Sangeeta Biswas, Koichi Shinoda, Sadaoki Furui
- Organizer
  NIST 2010 Speaker recognition evaluation workshop
- Place of Presentation
  Brno
- Year and Date
  2010-06-24
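[Presentation] Online source separation based on spectral subtraction for meeting speech recognition (2010)
- Author(s)
  Yu Nasu, Koichi Shinoda, Sadaoki Furui
- Organizer
  IEICE Technical Committee on Speech (SP)
- Place of Presentation
  Kobe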
- Year and Date
  2010-05-26
