2008 Fiscal Year Annual Research Report

Construction of a Common Evaluation Framework for Real-Environment Bimodal Speech Recognition

Research Project

Project/Area Number	19700163
Research Institution	Nagoya University
Principal Investigator	Chiyomi Miyajima (宮島千代美) Nagoya University, Graduate School of Information Science, Assistant Professor (90335092)
Keywords	bimodal speech recognition / speech recognition in noise / in-car noise / database / near-infrared video / principal component analysis / optical flow
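Research Abstract	This fiscal year, we prepared the bimodal speech data collected through the previous fiscal year for release as a common evaluation framework. Two frameworks are to be released: CENSREC-1-AV, a set of simulated data in which noise is added after the fact to audio and video recorded indoors, together with its speech recognition evaluation scripts; and CENSREC-2-AV, a set of real-environment audio and video data recorded in a car, together with its evaluation scripts. Of these, we first worked on preparing the CENSREC-1-AV data. The face video recorded with the color camera and that recorded with the near-infrared camera differ in their start and end times. To synchronize them, we computed the correlation between the amplitudes of the audio tracks recorded in sync with each video, aligned the recordings at the time offset that maximizes the correlation, and trimmed the data accordingly. Utterance segments were then extracted with a voice activity detection (VAD) algorithm, leaving a margin of about 0.5 seconds before and after each utterance. In addition, we visually checked the accuracy of the lip positions detected automatically from the video and, because optical flow is used as a feature, cropped the images at a position where the lips stay within the frame throughout the entire utterance. We also estimated the size of the data to be stored on the DVDs that will ultimately be released and distributed through NII, and confirmed that the entire common evaluation framework fits on about two DVDs. After the research resumes, we plan to continue preparing CENSREC-1-AV for release and then to prepare CENSREC-2-AV by the same procedure.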
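The synchronization step described in the abstract (correlating the amplitudes of the two audio tracks and aligning at the offset with the highest correlation) can be illustrated with a minimal sketch. This is not the project's actual script: the function names `estimate_lag` and `align_by_lag`, the `max_lag` search window, and the use of a raw (unnormalized) correlation of absolute sample values are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def estimate_lag(ref: Sequence[float], other: Sequence[float], max_lag: int) -> int:
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive result d means other[n + d] lines up with ref[n], i.e. the
    `other` recording started d samples earlier than `ref`.
    Hypothetical helper; the report does not specify its implementation.
    """
    # Compare amplitudes (absolute sample values) rather than raw waveforms.
    ref_amp = [abs(x) for x in ref]
    other_amp = [abs(x) for x in other]

    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Raw cross-correlation at this lag over the overlapping samples.
        score = 0.0
        for n, r in enumerate(ref_amp):
            m = n + lag
            if 0 <= m < len(other_amp):
                score += r * other_amp[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def align_by_lag(
    ref: Sequence[float], other: Sequence[float], lag: int
) -> Tuple[List[float], List[float]]:
    """Trim both signals to their overlapping, time-aligned part."""
    ref_list, other_list = list(ref), list(other)
    if lag >= 0:
        other_list = other_list[lag:]   # drop the extra lead-in of `other`
    else:
        ref_list = ref_list[-lag:]      # drop the extra lead-in of `ref`
    n = min(len(ref_list), len(other_list))
    return ref_list[:n], other_list[:n]


# Toy example: the same burst appears 2 samples later in `nir_audio`.
color_audio = [0, 0, 0, 1, 3, 2, 0, 0, 0, 0]
nir_audio = [0, 0, 0, 0, 0, 1, 3, 2, 0, 0]
lag = estimate_lag(color_audio, nir_audio, max_lag=4)
aligned_color, aligned_nir = align_by_lag(color_audio, nir_audio, lag)
```

For real recordings the same lag, converted to video frames using the audio sampling rate and the frame rate, would give the cut points for the two video streams. An FFT-based correlation would be the usual choice for long signals; the explicit loop is used here only to keep the sketch dependency-free.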

Research Products
(5 results)

All Presentation (5 results)

[Presentation] CENSREC-AV: Evaluation frameworks for audio-visual speech recognition (2008)
- Author(s)
  S. Tamura, C. Miyajima, N. Kitaoka, S. Hayamizu, K. Takeda
- Organizer
  International Conference on Auditory and Visual Speech Processing
- Place of Presentation
  Tangalooma, Australia
- Year and Date
  2008-09-27
[Presentation] CENSREC-4: Development of evaluation framework for distant-talking speech recognition under reverberant environments (2008)
- Author(s)
  M. Nakayama, T. Nishiura, Y. Denda, N. Kitaoka, K. Yamamoto, T. Yamada, S. Tsuge, C. Miyajima, M. Fujimoto, T. Takiguchi, S. Tamura, T. Ogawa, S. Matsuda, S. Kuroiwa, K. Takeda, S. Nakamura
- Organizer
  International Conference on Spoken Language Processing
- Place of Presentation
  Brisbane, Australia
- Year and Date
  2008-09-24
[Presentation] Multi-modal real-world driving data collection, transcription, and integration using Bayesian network (2008)
- Author(s)
  L. Malta, P. Angkititrakul, C. Miyajima, K. Takeda
- Organizer
  Intelligent Vehicles Symposium
- Place of Presentation
  Marrakech, Morocco
- Year and Date
  2008-06-05
[Presentation] CENSREC-4: Development of evaluation framework for distant-talking speech recognition under reverberant environments (2008)
- Author(s)
  T. Nishiura, M. Nakayama, Y. Denda, N. Kitaoka, K. Yamamoto, T. Yamada, S. Tsuge, C. Miyajima, M. Fujimoto, T. Takiguchi, S. Tamura, S. Kuroiwa, K. Takeda, S. Nakamura
- Organizer
  Language Resources and Evaluation Conference
- Place of Presentation
  Marrakech, Morocco
- Year and Date
  2008-05-29
[Presentation] In-car speech data collection along with various multimodal signals (2008)
- Author(s)
  A. Ozaki, S. Hara, T. Kusakawa, C. Miyajima, T. Nishino, N. Kitaoka, K. Itou, K. Takeda
- Organizer
  Language Resources and Evaluation Conference
- Place of Presentation
  Marrakech, Morocco
- Year and Date
  2008-05-28
