Audio-visual speech corpus for evaluating speech recognition performance in noisy environments

Research Project

Project/Area Number	19700163
Research Category	Grant-in-Aid for Young Scientists (B)
Allocation Type	Single-year Grants
Research Field	Perception information processing/Intelligent robotics
Research Institution	Nagoya University
Principal Investigator	MIYAJIMA Chiyomi, Nagoya University, Graduate School of Information Science, Assistant Professor (90335092)
Project Period (FY)	2007 – 2009
Project Status	Completed (Fiscal Year 2009)
Budget Amount	¥3,843,644 (Direct Cost: ¥3,325,880, Indirect Cost: ¥517,764); Fiscal Year 2009: ¥813,644 (Direct Cost: ¥625,880, Indirect Cost: ¥187,764); Fiscal Year 2008: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000); Fiscal Year 2007: ¥1,600,000 (Direct Cost: ¥1,600,000)
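Keywords	bimodal speech recognition / database / noisy environment / in-car noise / speech recognition in noise / speech recognition performance evaluation / near-infrared video / in-vehicle noise / principal component analysis / optical flow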
Research Abstract	Audio-visual speech data were collected in a quiet room and in a vehicle to build an audio-visual speech corpus for evaluating speech recognition performance in noisy environments, in particular inside a car. In-car conditions are simulated on the quiet-room recordings by adding acoustic noise and by changing the gamma values of the images. Baseline audio and visual features and an integration method were calibrated through experimental evaluations. The corpus will be released to the public for research purposes, together with database manuals.
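
The simulation step described in the abstract (adding acoustic noise to clean recordings and changing the gamma value of images) can be sketched roughly as follows. This is a minimal illustration only, not the project's actual procedure: the function names `mix_at_snr` and `apply_gamma`, the SNR-based scaling of the noise, and the 8-bit power-law gamma mapping are all assumptions, since the record does not state how noise levels or gamma values were chosen.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaling the noise to reach a target SNR in dB.

    Hypothetical helper: speech and noise are lists of float samples.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must have non-zero power")
    # Gain g such that p_speech / (g^2 * p_noise) == 10^(snr_db / 10)
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def apply_gamma(pixels, gamma):
    """Apply a power-law (gamma) mapping to 8-bit pixel intensities.

    Hypothetical helper: out = 255 * (in / 255) ** gamma, so gamma > 1
    darkens the image and gamma < 1 brightens it.
    """
    return [round(255 * (p / 255) ** gamma) for p in pixels]
```

Under these assumptions, a lower `snr_db` gives a noisier utterance, and varying `gamma` changes image brightness non-linearly, which is one plausible way to imitate changing lighting inside a vehicle.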

Report

(4 results)

2009 Annual Research Report, Final Research Report (PDF)
2008 Annual Research Report
2007 Annual Research Report

Research Products
(28 results)

Journal Article (6 results, of which Peer Reviewed: 6 results), Presentation (20 results), Book (2 results)

[Journal Article] CENSREC-1-C: An evaluation framework for voice activity detection under noisy environments (2009)
- Author(s)
  N. Kitaoka, T. Yamada, S. Tsuge, C. Miyajima, K. Yamamoto, T. Nishiura, M. Nakayama, Y. Denda, M. Fujimoto, T. Takiguchi, S. Tamura, S. Matsuda, T. Ogawa, S. Kuroiwa, K. Takeda, S. Nakamura
- Journal Title
  Acoustical Science and Technology
  Pages: 363-371
- NAID
  10025992968
- Related Report
  2009 Final Research Report
- Peer Reviewed
[Journal Article] CENSREC-1-C: An evaluation framework for voice activity detection under noisy environments (2009)
- Author(s)
  N. Kitaoka, T. Yamada, S. Tsuge, C. Miyajima, K. Yamamoto, T. Nishiura, M. Nakayama, Y. Denda, M. Fujimoto, T. Takiguchi, S. Tamura, S. Matsuda, T. Ogawa, S. Kuroiwa, K. Takeda, S. Nakamura
- Journal Title
  Acoustical Science and Technology, vol. 30
  Pages: 363-371
- NAID
  10025992968
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] Detection of driver speech intervals by integrating audio and images (2008)
- Author(s)
  二宮芳樹, 坂義秀, 前野俊希, 根木大輔, 宮島千代美, 森健策, 北坂孝幸, 末永康仁
- Journal Title
  The Journal of the Institute of Image Information and Television Engineers, vol. 62, no. 3
  Pages: 435-441
- NAID
  110006855164
- Related Report
  2009 Final Research Report
- Peer Reviewed
[Journal Article] A data collection system for speech recognition system use under diverse acoustic environments (2007)
- Author(s)
  原直, 宮島千代美, 伊藤克亘, 武田一哉
- Journal Title
  IEICE Transactions on Information and Systems (Japanese Edition), vol. J90-D, no. 10
  Pages: 1115-1123
- NAID
  110007380588
- Related Report
  2009 Final Research Report
- Peer Reviewed
[Journal Article] Detection of driver speech intervals by integrating audio and images (2007)
- Author(s)
  二宮芳樹, 坂義秀, 前野俊希, 根木大輔, 宮島千代美, 森健策, 北坂孝幸, 末永康仁
- Journal Title
  The Journal of the Institute of Image Information and Television Engineers, vol. 62
  Pages: 435-441
- NAID
  110006855164
- Related Report
  2007 Annual Research Report
- Peer Reviewed
[Journal Article] A data collection system for speech recognition system use under diverse acoustic environments (2007)
- Author(s)
  原直, 宮島千代美, 伊藤克亘, 武田一哉
- Journal Title
  IEICE Transactions on Information and Systems (Japanese Edition), vol. J90-D
  Pages: 1115-1123
- NAID
  110007380588
- Related Report
  2007 Annual Research Report
- Peer Reviewed
[Presentation] CENSREC-1-AV: Construction of a multimodal speech recognition corpus (2010)
- Author(s)
  田村哲嗣, 宮島千代美, 北岡教英, 武田一哉, 山田武志, 滝口哲也, 柘植覚, 山本一公, 西浦敬信, 中山雅人, 傳田遊亀, 藤本雅清, 松田繁樹, 小川哲司, 黒岩眞吾, 中村哲
- Organizer
  2010 Spring Meeting of the Acoustical Society of Japan
- Place of Presentation
  The University of Electro-Communications (Tokyo)
- Year and Date
  2010-03-08
- Related Report
  2009 Annual Research Report
[Presentation] CENSREC-1-AV: Construction of a multimodal speech recognition corpus (2010)
- Author(s)
  田村哲嗣, 宮島千代美, 北岡教英, 武田一哉, 山田武志, 滝口哲也, 柘植覚, 山本一公, 西浦敬信, 中山雅人, 傳田遊亀, 藤本雅清, 松田繁樹, 小川哲司, 黒岩眞吾, 中村哲
- Organizer
  2010 Spring Meeting of the Acoustical Society of Japan
- Place of Presentation
  Chofu
- Related Report
  2009 Final Research Report
[Presentation] Speech recognition by optimal selection from multiple acoustic models (2009)
- Author(s)
  伊藤新, 原直, 宮島千代美, 北岡教英, 武田一哉
- Organizer
  2009 Tokai-Section Joint Conference on Electrical and Related Engineering
- Place of Presentation
  Aichi Institute of Technology (Aichi Prefecture)
- Year and Date
  2009-09-10
- Related Report
  2009 Annual Research Report
[Presentation] Integration and use of behavior observation signals in a vehicle driving corpus (2009)
- Author(s)
  武田一哉, 尾崎晃, マルタルーカス, 西脇由博, 宮島千代美, 北岡教英
- Organizer
  2009 Multimedia, Distributed, Cooperative, and Mobile Symposium
- Place of Presentation
  Suginoi Hotel (Oita Prefecture)
- Year and Date
  2009-07-08
- Related Report
  2009 Annual Research Report
[Presentation] Speech recognition by optimal selection from multiple acoustic models (2009)
- Author(s)
  伊藤新, 原直, 宮島千代美, 北岡教英, 武田一哉
- Organizer
  2009 Tokai-Section Joint Conference on Electrical and Related Engineering
- Place of Presentation
  Toyota
- Related Report
  2009 Final Research Report
[Presentation] Integration and use of behavior observation signals in a vehicle driving corpus (2009)
- Author(s)
  武田一哉, 尾崎晃, マルタルーカス, 西脇由博, 宮島千代美, 北岡教英
- Organizer
  2009 Multimedia, Distributed, Cooperative, and Mobile Symposium
- Place of Presentation
  Beppu
- Related Report
  2009 Final Research Report
[Presentation] CENSREC-AV: Evaluation frameworks for audio-visual speech recognition (2008)
- Author(s)
  S. Tamura, C. Miyajima, N. Kitaoka, S. Hayamizu, K. Takeda
- Organizer
  International Conference on Auditory and Visual Speech Processing
- Place of Presentation
  Tangalooma, Australia
- Year and Date
  2008-09-27
- Related Report
  2008 Annual Research Report
[Presentation] CENSREC-4: Development of evaluation framework for distant-talking speech recognition under reverberant environments (2008)
- Author(s)
  M. Nakayama, T. Nishiura, Y. Denda, N. Kitaoka, K. Yamamoto, T. Yamada, S. Tsuge, C. Miyajima, M. Fujimoto, T. Takiguchi, S. Tamura, T. Ogawa, S. Matsuda, S. Kuroiwa, K. Takeda, S. Nakamura
- Organizer
  International Conference on Spoken Language Processing
- Place of Presentation
  Brisbane, Australia
- Year and Date
  2008-09-24
- Related Report
  2008 Annual Research Report
[Presentation] Multi-modal real-world driving data collection, transcription, and integration using Bayesian network (2008)
- Author(s)
  L. Malta, P. Angkititrakul, C. Miyajima, K. Takeda
- Organizer
  Intelligent Vehicles Symposium
- Place of Presentation
  Marrakech, Morocco
- Year and Date
  2008-06-05
- Related Report
  2008 Annual Research Report
[Presentation] CENSREC-4: Development of evaluation framework for distant-talking speech recognition under reverberant environments (2008)
- Author(s)
  T. Nishiura, M. Nakayama, Y. Denda, N. Kitaoka, K. Yamamoto, T. Yamada, S. Tsuge, C. Miyajima, M. Fujimoto, T. Takiguchi, S. Tamura, S. Kuroiwa, K. Takeda, and S. Nakamura
- Organizer
  Language Resources and Evaluation Conference
- Place of Presentation
  Marrakech, Morocco
- Year and Date
  2008-05-29
- Related Report
  2008 Annual Research Report
[Presentation] In-car speech data collection along with various multimodal signals (2008)
- Author(s)
  A. Ozaki, S. Hara, T. Kusakawa, C. Miyajima, T. Nishino, N. Kitaoka, K. Itou, K. Takeda
- Organizer
  Language Resources and Evaluation Conference
- Place of Presentation
  Marrakech, Morocco
- Year and Date
  2008-05-28
- Related Report
  2008 Annual Research Report
[Presentation] CENSREC-AV: Evaluation frameworks for audio-visual speech recognition (2008)
- Author(s)
  S. Tamura, C. Miyajima, N. Kitaoka, S. Hayamizu, K. Takeda
- Organizer
  2008 International Conference on Auditory and Visual Speech Processing
- Place of Presentation
  Australia
- Related Report
  2009 Final Research Report
[Presentation] CENSREC-4: Development of evaluation framework for distant-talking speech recognition under reverberant environments (2008)
- Author(s)
  M. Nakayama, T. Nishiura, Y. Denda, N. Kitaoka, K. Yamamoto, T. Yamada, S. Tsuge, C. Miyajima, M. Fujimoto, T. Takiguchi, S. Tamura, T. Ogawa, S. Matsuda, S. Kuroiwa, K. Takeda, S. Nakamura
- Organizer
  2008 International Conference on Spoken Language Processing
- Place of Presentation
  Australia
- Related Report
  2009 Final Research Report
[Presentation] In-car speech data collection along with various multimodal signals (2008)
- Author(s)
  L. Malta, P. Angkititrakul, C. Miyajima, K. Takeda
- Organizer
  2008 IEEE Intelligent Vehicles Symposium
- Place of Presentation
  Netherlands
- Related Report
  2009 Final Research Report
[Presentation] In-car speech data collection along with various multimodal signals (2008)
- Author(s)
  A. Ozaki, S. Hara, T. Kusakawa, C. Miyajima, T. Nishino, N. Kitaoka, K. Itou, K. Takeda
- Organizer
  2008 Language Resources and Evaluation Conference
- Place of Presentation
  Morocco
- Related Report
  2009 Final Research Report
[Presentation] CENSREC-4: Development of evaluation framework for distant-talking speech recognition under reverberant environments (2008)
- Author(s)
  T. Nishiura, M. Nakayama, Y. Denda, N. Kitaoka, K. Yamamoto, T. Yamada, S. Tsuge, C. Miyajima, M. Fujimoto, T. Takiguchi, S. Tamura, S. Kuroiwa, K. Takeda, S. Nakamura
- Organizer
  2008 Language Resources and Evaluation Conference
- Place of Presentation
  Morocco
- Related Report
  2009 Final Research Report
[Presentation] Development of VAD evaluation framework CENSREC-1-C and investigation of relationship between VAD and speech recognition performance (2007)
- Author(s)
  N. Kitaoka, K. Yamamoto, T. Kusamizu, S. Nakagawa, T. Yamada, S. Tsuge, C. Miyajima, T. Nishiura, M. Nakayama, Y. Denda, M. Fujimoto, T. Takiguchi, S. Tamura, S. Kuroiwa, K. Takeda, S. Nakamura
- Organizer
  2007 IEEE workshop on Automatic Speech Recognition and Understanding
- Place of Presentation
  Kyoto
- Related Report
  2009 Final Research Report
[Presentation] On-going data collection for driving behavior signal (2007)
- Author(s)
  C. Miyajima, T. Kusakawa, T. Nishino, N. Kitaoka, K. Itou, K. Takeda
- Organizer
  2007 Biennial on DSP for in-Vehicle and Mobile Systems
- Place of Presentation
  Turkey
- Related Report
  2009 Final Research Report
[Presentation] Development of VAD evaluation framework CENSREC-1-C and investigation of relationship between VAD and speech recognition performance (2007)
- Author(s)
  N. Kitaoka, K. Yamamoto, T. Kusamizu, S. Nakagawa, T. Yamada, S. Tsuge, C. Miyajima, T. Nishiura, M. Nakayama, Y. Denda, M. Fujimoto, T. Takiguchi, S. Tamura, S. Kuroiwa, K. Takeda, and S. Nakamura
- Organizer
  Proc. IEEE workshop on Automatic Speech Recognition and Understanding
- Place of Presentation
  Kyoto, Japan
- Related Report
  2007 Annual Research Report
[Presentation] On-going data collection for driving behavior signal (2007)
- Author(s)
  C. Miyajima, T. Kusakawa, T. Nishino, N. Kitaoka, K. Itou, K. Takeda
- Organizer
  Proc. 2007 Biennial on DSP for in-Vehicle and Mobile Systems
- Place of Presentation
  Istanbul, Turkey
- Related Report
  2007 Annual Research Report
[Book] Multimodal Speech Corpora for Robust Japanese Speech Recognition in Noisy Environments (S. Itahashi and C.Y. Tseng eds., Computer Processing of Asian Spoken Languages, Section 4.9(3)) (2010)
- Author(s)
  S. Tamura, C. Miyajima
- Total Pages
  5
- Related Report
  2009 Final Research Report
[Book] Computer Processing of Asian Spoken Languages (Section 4.10) (S. Itahashi, C.Y. Tseng eds., Multimodal Speech Corpora for Robust Japanese Speech Recognition in Noisy Environments) (2010)
- Author(s)
  S. Tamura, C. Miyajima
- Total Pages
  5
- Publisher
  Japanese Writer's House
- Related Report
  2009 Annual Research Report
