2007 Fiscal Year Annual Research Report

Analysis of Speech Based on Structural Representation and Its Application to Highly Robust Speech Recognition

Research Project

Project/Area Number	06J11711
Research Institution	The University of Tokyo
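Principal Investigator	朝川智 The University of Tokyo, Graduate School of Frontier Sciences, Research Fellow (DC1)
Keywords	Speech analysis / Speech recognition / Structural representation of speech
Research Abstract	Conventional speech engineering, following acoustic phonetics, has used the spectrum (voiceprint) as the physical representation of speech. The spectrum, however, unavoidably carries non-linguistic features as acoustic distortion, such as physiological characteristics (gender, age) and acoustic characteristics (differences in recording equipment), and this is one cause of reduced robustness in speech recognition. This research does not directly use the absolute acoustic properties of individual events, such as spectra. Instead it extracts only the relative relations among acoustic events, that is, the dynamics of speech, in order to eliminate distortion caused by differences in non-linguistic features. On this basis it proposes a more stable and robust method of speech matching and examines a speech recognition framework entirely different from conventional methodology. The plan for this fiscal year was to run word recognition experiments based solely on structural representations, using no acoustic substance at all, under a variety of conditions. Specifically, recognition experiments were run on speech from a speaker whose characteristics differ extremely from those the system assumes (speech simulating a speaker of extremely small build). Even in cases where the conventional approach of directly comparing acoustic features could hardly recognize the speech at all, the proposed method reached a recognition rate of about 80%, which experimentally confirmed its robustness to non-linguistic features. This fiscal year also examined how to improve the accuracy of the classifier used in recognition. Recognition had previously been based on the likelihood of the input under the feature distribution obtained from training data; a structure discrimination method based on linear discriminant analysis was proposed, and it improved the recognition rate by a few percent to about 10% over the previous method.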
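
The idea of keeping only the relative relations among acoustic events can be illustrated with a minimal sketch. The abstract does not say how the relations are measured, so everything below is an assumption made for illustration: each acoustic event is modeled as a Gaussian, the relation between two events is their Bhattacharyya distance, and the non-linguistic distortion is modeled as one invertible affine map applied to the whole feature space. Under those assumptions the matrix of pairwise distances is unchanged by the distortion, even though every individual mean and covariance changes. The function names (`bhattacharyya`, `structure`, `affine`) are hypothetical.

```python
import numpy as np

def bhattacharyya(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two Gaussians."""
    cov = 0.5 * (cov1 + cov2)
    diff = mu1 - mu2
    mean_term = 0.125 * diff @ np.linalg.solve(cov, diff)
    _, logdet = np.linalg.slogdet(cov)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    cov_term = 0.5 * (logdet - 0.5 * (logdet1 + logdet2))
    return mean_term + cov_term

def structure(means, covs):
    """Pairwise distance matrix over all events (the 'structure')."""
    n = len(means)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = bhattacharyya(means[i], covs[i], means[j], covs[j])
            dist[i, j] = dist[j, i] = d
    return dist

def affine(means, covs, A, b):
    """Apply x -> A x + b to every Gaussian (assumed distortion model)."""
    return [A @ m + b for m in means], [A @ c @ A.T for c in covs]

# Synthetic events; these are made-up numbers, not data from the report.
rng = np.random.default_rng(0)
dim, n_events = 3, 5
means = [rng.normal(size=dim) for _ in range(n_events)]
covs = []
for _ in range(n_events):
    m = rng.normal(size=(dim, dim))
    covs.append(m @ m.T + dim * np.eye(dim))  # symmetric positive definite

# A fixed invertible matrix (determinant 1.596) and a shift.
A = np.array([[1.5, 0.2, 0.0],
              [0.1, 0.8, 0.3],
              [0.0, -0.4, 1.2]])
b = np.array([2.0, -1.0, 0.5])

original = structure(means, covs)
warped_means, warped_covs = affine(means, covs, A, b)
warped = structure(warped_means, warped_covs)

print("max change in structure:", np.abs(original - warped).max())
```

The printed difference should be at the level of floating-point rounding, while the warped means and covariances themselves differ substantially from the originals. Real speaker differences need not be exactly affine, so this shows only why a purely relational representation can be insensitive to such distortion, not how the reported experiments were implemented.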

Research Products

(4 results)

All Presentation (4 results)

[Presentation] Random discriminant structure analysis for continuous Japanese vowel recognition (2007)
- Author(s)
  Y. Qiao
- Organizer
  ASRU 2007
- Place of Presentation
  Kyoto, Japan
- Year and Date
  2007-12-12
[Presentation] Recognition of connected Japanese vowel utterances using random discriminant structure analysis (2007)
- Author(s)
  Y. Qiao
- Organizer
  IEICE Technical Committee on Speech
- Place of Presentation
  Chiba Institute of Technology
- Year and Date
  2007-11-28
[Presentation] Feature space partitioning and its effect in speech recognition using the structural representation of speech (2007)
- Author(s)
  朝川智
- Organizer
  Acoustical Society of Japan Autumn Meeting
- Place of Presentation
  University of Yamanashi
- Year and Date
  2007-09-27
[Presentation] Automatic recognition of connected vowels only using speaker-invariant representation of speech dynamics (2007)
- Author(s)
  S. Asakawa
- Organizer
  Interspeech 2007
- Place of Presentation
  Antwerp, Belgium
- Year and Date
  2007-08-29
