2009 Fiscal Year Final Research Report

Audio-visual speech corpus for evaluating speech recognition performance in noisy environments

Research Project

Project/Area Number	19700163
Research Category	Grant-in-Aid for Young Scientists (B)
Allocation Type	Single-year Grants
Research Field	Perception information processing/Intelligent robotics
Research Institution	Nagoya University
Principal Investigator	MIYAJIMA Chiyomi Nagoya University, Graduate School of Information Science, Assistant Professor (90335092)
Project Period (FY)	2007 – 2009
Keywords	Bimodal speech recognition / Database / Noisy environments / In-car noise
Research Abstract	Audio-visual speech data were collected in a silent room and in a vehicle to develop an audio-visual speech corpus for evaluating speech recognition performance in noisy environments, especially in-car environments. Acoustic noise and image gamma values are used to simulate in-car environments on the data recorded in the silent room. Baseline audio and visual features and an integration method were calibrated through experimental evaluations. The corpus will be made publicly available for research purposes, together with database manuals.
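The abstract names two knobs for simulating in-car conditions on silent-room recordings: acoustic noise and image gamma values. The sketch below illustrates one common way such a simulation can be done, under assumptions not stated in the report: images are taken as 8-bit pixel values, noise is mixed additively at a chosen signal-to-noise ratio, and the function names `apply_gamma` and `add_noise_at_snr` are hypothetical. It is not the procedure used to build the corpus.

```python
import math


def apply_gamma(pixels, gamma):
    """Gamma-transform 8-bit pixel values: out = 255 * (in / 255) ** gamma.

    With this convention (an assumption), gamma > 1 darkens mid-tones
    and gamma < 1 brightens them; 0 and 255 are left unchanged.
    """
    return [round(255.0 * (p / 255.0) ** gamma) for p in pixels]


def add_noise_at_snr(speech, noise, snr_db):
    """Add noise to speech so the speech-to-noise power ratio is snr_db."""
    # Loop the noise recording so it covers the whole utterance.
    noise = [noise[i % len(noise)] for i in range(len(speech))]
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    # Choose the gain so that speech_power / (gain**2 * noise_power)
    # equals 10 ** (snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(speech, noise)]


if __name__ == "__main__":
    # Synthetic stand-ins for a clean utterance and a noise recording.
    speech = [math.sin(0.1 * i) for i in range(1000)]
    noise = [math.cos(0.37 * i) for i in range(300)]
    noisy = add_noise_at_snr(speech, noise, snr_db=10.0)
    darker = apply_gamma([0, 64, 128, 255], gamma=2.0)
    print(len(noisy), darker)
```

Sweeping `snr_db` and `gamma` over a small grid would give a set of degraded audio and video conditions from a single clean recording, which is the general idea behind simulated test sets of this kind.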

Research Products
(14 results)


Journal Article (3 results, of which Peer Reviewed: 3 results), Presentation (10 results), Book (1 result)

[Journal Article] CENSREC-1-C: An evaluation framework for voice activity detection under noisy environments (2009)
- Author(s)
  N. Kitaoka, T. Yamada, S. Tsuge, C. Miyajima, K. Yamamoto, T. Nishiura, M. Nakayama, Y. Denda, M. Fujimoto, T. Takiguchi, S. Tamura, S. Matsuda, T. Ogawa, S. Kuroiwa, K. Takeda, S. Nakamura
- Journal Title
  Acoustical Science and Technology
  Pages: 363-371
- Peer Reviewed
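[Journal Article] Detection of driver speech segments by integrating audio and images (2008)
- Author(s)
  二宮芳樹, 坂義秀, 前野俊希, 根木大輔, 宮島千代美, 森健策, 北坂孝幸, 末永康仁
- Journal Title
  映像情報メディア学会誌 (Journal of the Institute of Image Information and Television Engineers) vol. 62, no. 3
  Pages: 435-441
- Peer Reviewed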
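[Journal Article] A data collection system for use with speech recognition systems under diverse acoustic environments (2007)
- Author(s)
  原直, 宮島千代美, 伊藤克亘, 武田一哉
- Journal Title
  電子情報通信学会論文誌 (IEICE Transactions, Japanese Edition) vol. J90-D, no. 10
  Pages: 1115-1123
- Peer Reviewed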
[Presentation] CENSREC-1-AV: Construction of a multimodal speech recognition corpus (2010)
- Author(s)
  田村哲嗣, 宮島千代美, 北岡教英, 武田一哉, 山田武志, 滝口哲也, 柘植覚, 山本一公, 西浦敬信, 中山雅人, 傳田遊亀, 藤本雅清, 松田繁樹, 小川哲司, 黒岩眞吾, 中村哲
- Organizer
  2010 Spring Meeting of the Acoustical Society of Japan
- Place of Presentation
  Chofu
- Year and Date
  2010-03
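[Presentation] Speech recognition by optimal selection from multiple acoustic models (2009)
- Author(s)
  伊藤新, 原直, 宮島千代美, 北岡教英, 武田一哉
- Organizer
  2009 Tokai-Section Joint Conference on Electrical and Related Engineering
- Place of Presentation
  Toyota
- Year and Date
  2009-09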
[Presentation] Integration and use of behavior observation signals in an automobile driving corpus (2009)
- Author(s)
  武田一哉, 尾崎晃, マルタルーカス, 西脇由博, 宮島千代美, 北岡教英
- Organizer
  2009 Multimedia, Distributed, Cooperative, and Mobile Symposium
- Place of Presentation
  Beppu
- Year and Date
  2009-07
[Presentation] CENSREC-AV: Evaluation frameworks for audio-visual speech recognition (2008)
- Author(s)
  S. Tamura, C. Miyajima, N. Kitaoka, S. Hayamizu, K. Takeda
- Organizer
  2008 International Conference on Auditory and Visual Speech Processing
- Place of Presentation
  Australia
- Year and Date
  2008-09
[Presentation] CENSREC-4: Development of evaluation framework for distant-talking speech recognition under reverberant environments (2008)
- Author(s)
  M. Nakayama, T. Nishiura, Y. Denda, N. Kitaoka, K. Yamamoto, T. Yamada, S. Tsuge, C. Miyajima, M. Fujimoto, T. Takiguchi, S. Tamura, T. Ogawa, S. Matsuda, S. Kuroiwa, K. Takeda, S. Nakamura
- Organizer
  2008 International Conference on Spoken Language Processing
- Place of Presentation
  Australia
- Year and Date
  2008-09
[Presentation] In-car speech data collection along with various multimodal signals (2008)
- Author(s)
  L. Malta, P. Angkititrakul, C. Miyajima, K. Takeda
- Organizer
  2008 IEEE Intelligent Vehicles Symposium
- Place of Presentation
  Netherlands
- Year and Date
  2008-06
[Presentation] In-car speech data collection along with various multimodal signals (2008)
- Author(s)
  A. Ozaki, S. Hara, T. Kusakawa, C. Miyajima, T. Nishino, N. Kitaoka, K. Itou, K. Takeda
- Organizer
  2008 Language Resources and Evaluation Conference
- Place of Presentation
  Morocco
- Year and Date
  2008-05
[Presentation] CENSREC-4: Development of evaluation framework for distant-talking speech recognition under reverberant environments (2008)
- Author(s)
  T. Nishiura, M. Nakayama, Y. Denda, N. Kitaoka, K. Yamamoto, T. Yamada, S. Tsuge, C. Miyajima, M. Fujimoto, T. Takiguchi, S. Tamura, S. Kuroiwa, K. Takeda, S. Nakamura
- Organizer
  2008 Language Resources and Evaluation Conference
- Place of Presentation
  Morocco
- Year and Date
  2008-05
[Presentation] Development of VAD evaluation framework CENSREC-1-C and investigation of relationship between VAD and speech recognition performance (2007)
- Author(s)
  N. Kitaoka, K. Yamamoto, T. Kusamizu, S. Nakagawa, T. Yamada, S. Tsuge, C. Miyajima, T. Nishiura, M. Nakayama, Y. Denda, M. Fujimoto, T. Takiguchi, S. Tamura, S. Kuroiwa, K. Takeda, S. Nakamura
- Organizer
  2007 IEEE Workshop on Automatic Speech Recognition and Understanding
- Place of Presentation
  Kyoto
- Year and Date
  2007-12
[Presentation] On-going data collection for driving behavior signal (2007)
- Author(s)
  C. Miyajima, T. Kusakawa, T. Nishino, N. Kitaoka, K. Itou, K. Takeda
- Organizer
  2007 Biennial on DSP for in-Vehicle and Mobile Systems
- Place of Presentation
  Turkey
- Year and Date
  2007-06
[Book] Multimodal Speech Corpora for Robust Japanese Speech Recognition in Noisy Environments (S. Itahashi and C.Y. Tseng eds., Computer Processing of Asian Spoken Languages, Section 4.9(3)) (2010)
- Author(s)
  S. Tamura, C. Miyajima
- Total Pages
  5
