
Audio-visual speech corpus for evaluating speech recognition performance in noisy environments

Research Project

Project/Area Number 19700163
Research Category

Grant-in-Aid for Young Scientists (B)

Allocation Type Single-year Grants
Research Field Perception information processing/Intelligent robotics
Research Institution Nagoya University

Principal Investigator

MIYAJIMA Chiyomi  Nagoya University, Graduate School of Information Science, Assistant Professor (90335092)

Project Period (FY) 2007 – 2009
Project Status Completed (Fiscal Year 2009)
Budget Amount
¥3,843,644 (Direct Cost: ¥3,325,880, Indirect Cost: ¥517,764)
Fiscal Year 2009: ¥813,644 (Direct Cost: ¥625,880, Indirect Cost: ¥187,764)
Fiscal Year 2008: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2007: ¥1,600,000 (Direct Cost: ¥1,600,000)
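Keywords bimodal speech recognition / database / noisy environments / in-car noise / speech recognition in noise / speech recognition performance evaluation / near-infrared video / in-vehicle noise / principal component analysis / optical flow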
Research Abstract

Audio-visual speech data were collected in a silent room and in a vehicle to build an audio-visual speech corpus for evaluating speech recognition performance in noisy environments, particularly in-car environments. In-car conditions were simulated on the silent-room recordings by adding acoustic noise and by changing the gamma values of the images. Baseline audio and visual features and an integration method were calibrated through experimental evaluations. The corpus will be released to the public for research purposes, together with database manuals.
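
The abstract does not say how the in-car simulation was implemented. As a minimal sketch only, assuming additive noise mixed at a target signal-to-noise ratio and a power-law gamma transform on 8-bit grayscale pixels, the two operations could look as follows. The function names (`mix_at_snr`, `apply_gamma`, `measured_snr_db`) and all parameters are hypothetical and are not taken from the corpus tools.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR (dB)."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("speech and noise must be non-silent")
    # Choose gain g so that speech_power / (g^2 * noise_power) = 10^(snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixture):
    """Recover the SNR (dB) of a mixture, given the clean speech it contains."""
    residual = [m - s for s, m in zip(speech, mixture)]
    speech_power = sum(s * s for s in speech) / len(speech)
    residual_power = sum(r * r for r in residual) / len(residual)
    return 10 * math.log10(speech_power / residual_power)


def apply_gamma(pixels, gamma):
    """Gamma-transform 8-bit grayscale values: out = 255 * (in / 255) ** gamma."""
    return [round(255 * (p / 255) ** gamma) for p in pixels]


if __name__ == "__main__":
    random.seed(0)
    clean = [math.sin(0.1 * i) for i in range(1000)]        # stand-in for speech
    car_noise = [random.gauss(0, 1) for _ in range(1000)]   # stand-in for in-car noise
    noisy = mix_at_snr(clean, car_noise, snr_db=10.0)
    print(round(measured_snr_db(clean, noisy), 3))
    print(apply_gamma([0, 64, 255], gamma=2.0))             # gamma > 1 darkens mid-tones
```

In this sketch, a lower `snr_db` gives a noisier recording, and a `gamma` above 1 darkens the image while a value below 1 brightens it, which is one plausible way to imitate changing illumination inside a vehicle.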

Report

(4 results)
  • 2009 Annual Research Report   Final Research Report ( PDF )
  • 2008 Annual Research Report
  • 2007 Annual Research Report
  • Research Products

    (28 results)


Journal Article (6 results, of which Peer Reviewed: 6 results) / Presentation (20 results) / Book (2 results)

  • [Journal Article] CENSREC-1-C: An evaluation framework for voice activity detection under noisy environments (2009)

    • Author(s)
      N. Kitaoka, T. Yamada, S. Tsuge, C. Miyajima, K. Yamamoto, T. Nishiura, M. Nakayama, Y. Denda, M. Fujimoto, T. Takiguchi, S. Tamura, S. Matsuda, T. Ogawa, S. Kuroiwa, K. Takeda, S. Nakamura
    • Journal Title

      Acoustical Science and Technology

      Pages: 363-371

    • NAID

      10025992968

    • Related Report
      2009 Final Research Report
    • Peer Reviewed
  • [Journal Article] CENSREC-1-C: An evaluation framework for voice activity detection under noisy environments (2009)

    • Author(s)
      N. Kitaoka, T. Yamada, S. Tsuge, C. Miyajima, K. Yamamoto, T. Nishiura, M. Nakayama, Y. Denda, M. Fujimoto, T. Takiguchi, S. Tamura, S. Matsuda, T. Ogawa, S. Kuroiwa, K. Takeda, S. Nakamura
    • Journal Title

      Acoustical Science and Technology, vol. 30

      Pages: 363-371

    • NAID

      10025992968

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Driver speech-interval detection by integrating audio and images (2008)

    • Author(s)
      二宮芳樹, 坂義秀, 前野俊希, 根木大輔, 宮島千代美, 森健策, 北坂孝幸, 末永康仁
    • Journal Title

      The Journal of the Institute of Image Information and Television Engineers, vol. 62, no. 3

      Pages: 435-441

    • NAID

      110006855164

    • Related Report
      2009 Final Research Report
    • Peer Reviewed
  • [Journal Article] A data collection system for speech recognition system use under diverse acoustic environments (2007)

    • Author(s)
      原直, 宮島千代美, 伊藤克亘, 武田一哉
    • Journal Title

      IEICE Transactions on Information and Systems (Japanese Edition), vol. J90-D, no. 10

      Pages: 1115-1123

    • NAID

      110007380588

    • Related Report
      2009 Final Research Report
    • Peer Reviewed
  • [Journal Article] Driver speech-interval detection by integrating audio and images (2007)

    • Author(s)
      二宮芳樹, 坂義秀, 前野俊希, 根木大輔, 宮島千代美, 森健策, 北坂孝幸, 末永康仁
    • Journal Title

      The Journal of the Institute of Image Information and Television Engineers, vol. 62

      Pages: 435-441

    • NAID

      110006855164

    • Related Report
      2007 Annual Research Report
    • Peer Reviewed
  • [Journal Article] A data collection system for speech recognition system use under diverse acoustic environments (2007)

    • Author(s)
      原直, 宮島千代美, 伊藤克亘, 武田一哉
    • Journal Title

      IEICE Transactions on Information and Systems (Japanese Edition), vol. J90-D

      Pages: 1115-1123

    • NAID

      110007380588

    • Related Report
      2007 Annual Research Report
    • Peer Reviewed
  • [Presentation] CENSREC-1-AV: Construction of a multimodal speech recognition corpus (2010)

    • Author(s)
      田村哲嗣, 宮島千代美, 北岡教英, 武田一哉, 山田武志, 滝口哲也, 柘植覚, 山本一公, 西浦敬信, 中山雅人, 傳田遊亀, 藤本雅清, 松田繁樹, 小川哲司, 黒岩眞吾, 中村哲
    • Organizer
      2010 Spring Meeting of the Acoustical Society of Japan
    • Place of Presentation
      The University of Electro-Communications (Tokyo)
    • Year and Date
      2010-03-08
    • Related Report
      2009 Annual Research Report
  • [Presentation] CENSREC-1-AV: Construction of a multimodal speech recognition corpus (2010)

    • Author(s)
      田村哲嗣, 宮島千代美, 北岡教英, 武田一哉, 山田武志, 滝口哲也, 柘植覚, 山本一公, 西浦敬信, 中山雅人, 傳田遊亀, 藤本雅清, 松田繁樹, 小川哲司, 黒岩眞吾, 中村哲
    • Organizer
      2010 Spring Meeting of the Acoustical Society of Japan
    • Place of Presentation
      Chofu
    • Related Report
      2009 Final Research Report
  • [Presentation] Speech recognition by optimal selection from multiple acoustic models (2009)

    • Author(s)
      伊藤新, 原直, 宮島千代美, 北岡教英, 武田一哉
    • Organizer
      2009 Tokai-Section Joint Conference on Electrical and Related Engineering
    • Place of Presentation
      Aichi Institute of Technology (Aichi)
    • Year and Date
      2009-09-10
    • Related Report
      2009 Annual Research Report
  • [Presentation] Integration and use of behavior observation signals in a driving corpus (2009)

    • Author(s)
      武田一哉, 尾崎晃, マルタルーカス, 西脇由博, 宮島千代美, 北岡教英
    • Organizer
      2009 Multimedia, Distributed, Cooperative, and Mobile Symposium
    • Place of Presentation
      Suginoi Hotel (Oita)
    • Year and Date
      2009-07-08
    • Related Report
      2009 Annual Research Report
  • [Presentation] Speech recognition by optimal selection from multiple acoustic models (2009)

    • Author(s)
      伊藤新, 原直, 宮島千代美, 北岡教英, 武田一哉
    • Organizer
      2009 Tokai-Section Joint Conference on Electrical and Related Engineering
    • Place of Presentation
      Toyota
    • Related Report
      2009 Final Research Report
  • [Presentation] Integration and use of behavior observation signals in a driving corpus (2009)

    • Author(s)
      武田一哉, 尾崎晃, マルタルーカス, 西脇由博, 宮島千代美, 北岡教英
    • Organizer
      2009 Multimedia, Distributed, Cooperative, and Mobile Symposium
    • Place of Presentation
      Beppu
    • Related Report
      2009 Final Research Report
  • [Presentation] CENSREC-AV: Evaluation frameworks for audio-visual speech recognition (2008)

    • Author(s)
      S. Tamura, C. Miyajima, N. Kitaoka, S. Hayamizu, K. Takeda
    • Organizer
      International Conference on Auditory and Visual Speech Processing
    • Place of Presentation
      Tangalooma, Australia
    • Year and Date
      2008-09-27
    • Related Report
      2008 Annual Research Report
  • [Presentation] CENSREC-4: Development of evaluation framework for distant-talking speech recognition under reverberant environments (2008)

    • Author(s)
      M. Nakayama, T. Nishiura, Y. Denda, N. Kitaoka, K. Yamamoto, T. Yamada, S. Tsuge, C. Miyajima, M. Fujimoto, T. Takiguchi, S. Tamura, T. Ogawa, S. Matsuda, S. Kuroiwa, K. Takeda, S. Nakamura
    • Organizer
      International Conference on Spoken Language Processing
    • Place of Presentation
      Brisbane, Australia
    • Year and Date
      2008-09-24
    • Related Report
      2008 Annual Research Report
  • [Presentation] Multi-modal real-world driving data collection, transcription, and integration using Bayesian network (2008)

    • Author(s)
      L. Malta, P. Angkititrakul, C. Miyajima, K. Takeda
    • Organizer
      Intelligent Vehicles Symposium
    • Place of Presentation
      Eindhoven, Netherlands
    • Year and Date
      2008-06-05
    • Related Report
      2008 Annual Research Report
  • [Presentation] CENSREC-4: Development of evaluation framework for distant-talking speech recognition under reverberant environments (2008)

    • Author(s)
      T. Nishiura, M. Nakayama, Y. Denda, N. Kitaoka, K. Yamamoto, T. Yamada, S. Tsuge, C. Miyajima, M. Fujimoto, T. Takiguchi, S. Tamura, S. Kuroiwa, K. Takeda, S. Nakamura
    • Organizer
      Language Resources and Evaluation Conference
    • Place of Presentation
      Marrakech, Morocco
    • Year and Date
      2008-05-29
    • Related Report
      2008 Annual Research Report
  • [Presentation] In-car speech data collection along with various multimodal signals (2008)

    • Author(s)
      A. Ozaki, S. Hara, T. Kusakawa, C. Miyajima, T. Nishino, N. Kitaoka, K. Itou, K. Takeda
    • Organizer
      Language Resources and Evaluation Conference
    • Place of Presentation
      Marrakech, Morocco
    • Year and Date
      2008-05-28
    • Related Report
      2008 Annual Research Report
  • [Presentation] CENSREC-AV: Evaluation frameworks for audio-visual speech recognition (2008)

    • Author(s)
      S. Tamura, C. Miyajima, N. Kitaoka, S. Hayamizu, K. Takeda
    • Organizer
      2008 International Conference on Auditory and Visual Speech Processing
    • Place of Presentation
      Australia
    • Related Report
      2009 Final Research Report
  • [Presentation] CENSREC-4: Development of evaluation framework for distant-talking speech recognition under reverberant environments (2008)

    • Author(s)
      M. Nakayama, T. Nishiura, Y. Denda, N. Kitaoka, K. Yamamoto, T. Yamada, S. Tsuge, C. Miyajima, M. Fujimoto, T. Takiguchi, S. Tamura, T. Ogawa, S. Matsuda, S. Kuroiwa, K. Takeda, S. Nakamura
    • Organizer
      2008 International Conference on Spoken Language Processing
    • Place of Presentation
      Australia
    • Related Report
      2009 Final Research Report
  • [Presentation] Multi-modal real-world driving data collection, transcription, and integration using Bayesian network (2008)

    • Author(s)
      L. Malta, P. Angkititrakul, C. Miyajima, K. Takeda
    • Organizer
      2008 IEEE Intelligent Vehicles Symposium
    • Place of Presentation
      Netherlands
    • Related Report
      2009 Final Research Report
  • [Presentation] In-car speech data collection along with various multimodal signals (2008)

    • Author(s)
      A. Ozaki, S. Hara, T. Kusakawa, C. Miyajima, T. Nishino, N. Kitaoka, K. Itou, K. Takeda
    • Organizer
      2008 Language Resources and Evaluation Conference
    • Place of Presentation
      Morocco
    • Related Report
      2009 Final Research Report
  • [Presentation] CENSREC-4: Development of evaluation framework for distant-talking speech recognition under reverberant environments (2008)

    • Author(s)
      T. Nishiura, M. Nakayama, Y. Denda, N. Kitaoka, K. Yamamoto, T. Yamada, S. Tsuge, C. Miyajima, M. Fujimoto, T. Takiguchi, S. Tamura, S. Kuroiwa, K. Takeda, S. Nakamura
    • Organizer
      2008 Language Resources and Evaluation Conference
    • Place of Presentation
      Morocco
    • Related Report
      2009 Final Research Report
  • [Presentation] Development of VAD evaluation framework CENSREC-1-C and investigation of relationship between VAD and speech recognition performance (2007)

    • Author(s)
      N. Kitaoka, K. Yamamoto, T. Kusamizu, S. Nakagawa, T. Yamada, S. Tsuge, C. Miyajima, T. Nishiura, M. Nakayama, Y. Denda, M. Fujimoto, T. Takiguchi, S. Tamura, S. Kuroiwa, K. Takeda, S. Nakamura
    • Organizer
      2007 IEEE Workshop on Automatic Speech Recognition and Understanding
    • Place of Presentation
      Kyoto
    • Related Report
      2009 Final Research Report
  • [Presentation] On-going data collection for driving behavior signal (2007)

    • Author(s)
      C. Miyajima, T. Kusakawa, T. Nishino, N. Kitaoka, K. Itou, K. Takeda
    • Organizer
      2007 Biennial Workshop on DSP for In-Vehicle and Mobile Systems
    • Place of Presentation
      Turkey
    • Related Report
      2009 Final Research Report
  • [Presentation] Development of VAD evaluation framework CENSREC-1-C and investigation of relationship between VAD and speech recognition performance (2007)

    • Author(s)
      N. Kitaoka, K. Yamamoto, T. Kusamizu, S. Nakagawa, T. Yamada, S. Tsuge, C. Miyajima, T. Nishiura, M. Nakayama, Y. Denda, M. Fujimoto, T. Takiguchi, S. Tamura, S. Kuroiwa, K. Takeda, S. Nakamura
    • Organizer
      Proc. IEEE Workshop on Automatic Speech Recognition and Understanding
    • Place of Presentation
      Kyoto, Japan
    • Related Report
      2007 Annual Research Report
  • [Presentation] On-going data collection for driving behavior signal (2007)

    • Author(s)
      C. Miyajima, T. Kusakawa, T. Nishino, N. Kitaoka, K. Itou, K. Takeda
    • Organizer
      Proc. 2007 Biennial Workshop on DSP for In-Vehicle and Mobile Systems
    • Place of Presentation
      Istanbul, Turkey
    • Related Report
      2007 Annual Research Report
  • [Book] Multimodal Speech Corpora for Robust Japanese Speech Recognition in Noisy Environments (S. Itahashi and C.Y. Tseng eds., Computer Processing of Asian Spoken Languages, Section 4.9(3)) (2010)

    • Author(s)
      S. Tamura, C. Miyajima
    • Total Pages
      5
    • Related Report
      2009 Final Research Report
  • [Book] Computer Processing of Asian Spoken Languages (Section 4.10) (S. Itahashi, C.Y. Tseng eds., Multimodal Speech Corpora for Robust Japanese Speech Recognition in Noisy Environments) (2010)

    • Author(s)
      S. Tamura, C. Miyajima
    • Total Pages
      5
    • Publisher
      Japanese Writer's House
    • Related Report
      2009 Annual Research Report


Published: 2007-04-01   Modified: 2016-04-21  
