2011 Fiscal Year Annual Research Report

Advancement of Speech Recognition Using WFST

Research Project

Project/Area Number	21300062
Research Institution	Tokyo Institute of Technology
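Principal Investigator	Sadaoki Furui, Tokyo Institute of Technology, Professor Emeritus (90293076)
Co-Investigator(Kenkyū-buntansha)	Koichi Shinoda, Tokyo Institute of Technology, Graduate School of Information Science and Engineering, Associate Professor (10343097); Takahiro Shinozaki, Chiba University, Graduate School of Advanced Integration Science, Assistant Professor (80447903)
Keywords	Speech recognition / WFST / Decoder
Research Abstract	We advanced the capabilities of a WFST-based speech recognition decoder and worked toward realizing and applying a flexible decoder that can serve diverse purposes, with the following results.
1. Improvement and evaluation of a decoder incorporating speech/non-speech features: To achieve speech recognition that operates robustly in noisy environments, we continued improving a decoder that incorporates a score indicating the degree of speech versus non-speech into the hypothesis evaluation score, and confirmed the effectiveness of the proposed method in evaluation experiments on a standard database.
2. Application to recognition of mixed-language speech: In Indonesian speech recognition, to handle situations in which English and Indonesian alternate between or within sentences (code-switching), we examined two methods of combining a code-switching language model with monolingual language models, and confirmed that each has advantages depending on the characteristics of the recognition task.
3. An interface for easy correction of speech recognition errors: For input interfaces based on speech recognition, we proposed a method that makes error correction easier. As the user corrects errors one by one while consulting the recognition candidates, an updated language model is used to automatically raise the rank of the correct word in the candidate word list. We confirmed its effectiveness experimentally.
4. Application to an electrooculography (EOG) input interface: Based on the fact that in amyotrophic lateral sclerosis (ALS) eye movement alone remains unimpaired until the end, we studied a method of recognizing eye movements from EOG signals. We ran recognition experiments applying the speech recognition decoder to potential inputs from multiple electrodes and confirmed the feasibility of EOG-based communication.
5. Release of the decoder: We proceeded with the release of the T3 speech recognition decoder we developed from NICT to researchers in Japan, and made it possible to release the source code and to release the decoder to overseas researchers.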

Research Products
(25 results)


Journal Article (2 results, of which Peer Reviewed: 2 results), Presentation (22 results), Book (1 result)

[Journal Article] Multimodal speech recognition using lightweight image features (2012)
- Author(s)
  吉川正祥, Takahiro Shinozaki, Koji Iwano, Sadaoki Furui
- Journal Title
  Transactions of the IEICE (電子情報通信学会論文誌)
  Volume: J95-D Pages: 618-627
- Peer Reviewed
[Journal Article] Committee-Based Active Learning for Speech Recognition (2011)
- Author(s)
  Yuzo Hamanaka, Koichi Shinoda, Takuya Tsutaoka, Sadaoki Furui, Tadashi Emori, Takafumi Koshinaka
- Journal Title
  IEICE Trans. Inf. & Syst.
  Volume: E94-D Pages: 2015-2023
- Peer Reviewed
[Presentation] Unsupervised CV language model adaptation based on direct likelihood maximization sentence selection (2012)
- Author(s)
  Takahiro Shinozaki, Yasuo Horiuchi, Shingo Kuroiwa
- Organizer
  ICASSP 2012
- Place of Presentation
  Kyoto (Kyoto Prefecture)
- Year and Date
  2012-03-27
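[Presentation] Proposal of an electrooculography (EOG) input speech synthesis interface and a study of user adaptation (2012)
- Author(s)
  Fuming Fang, Takahiro Shinozaki, Yasuo Horiuchi, Shingo Kuroiwa, Sadaoki Furui, Toshimitsu Musha
- Organizer
  Intelligent Systems Symposium
- Place of Presentation
  Chiba (Chiba Prefecture)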
- Year and Date
  2012-03-16
[Presentation] MAP adaptation using multiple priors for speaker verification (2012)
- Author(s)
  Sangeeta Biswas, Johan Rohdin, Koichi Shinoda, Sadaoki Furui
- Organizer
  Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Yokohama (Kanagawa Prefecture)
- Year and Date
  2012-03-15
[Presentation] Language model for efficient error correction in speech recognition (2012)
- Author(s)
  Yuan Liang, Koichi Shinoda, Sadaoki Furui
- Organizer
  Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Yokohama (Kanagawa Prefecture)
- Year and Date
  2012-03-15
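[Presentation] Application of unsupervised cross-validation adaptation to forward maximum-likelihood sentence selection adaptation of language models (2012)
- Author(s)
  Takahiro Shinozaki, Yasuo Horiuchi, Shingo Kuroiwa
- Organizer
  Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Yokohama (Kanagawa Prefecture)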
- Year and Date
  2012-03-15
[Presentation] Recognition of Indonesian code-switching speech (2012)
- Author(s)
  Yonatan Andy Fajar Nugraha, Koichi Shinoda, Sadaoki Furui, Koji Iwano
- Organizer
  Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Yokohama (Kanagawa Prefecture)
- Year and Date
  2012-03-14
[Presentation] Two-pass approach for recognizing code-switching speech (2012)
- Author(s)
  Yonatan Andy Fajar Nugraha, Koichi Shinoda, Sadaoki Furui, Koji Iwano
- Organizer
  IEICE Speech Technical Committee Meeting
- Place of Presentation
  Sendai (Miyagi Prefecture)
- Year and Date
  2012-02-09
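[Presentation] A study of electrooculogram (EOG) recognition using hidden Markov models (2012)
- Author(s)
  Fuming Fang, Takahiro Shinozaki, Yasuo Horiuchi, Shingo Kuroiwa, Sadaoki Furui, Toshimitsu Musha
- Organizer
  IEICE Speech Technical Committee Meeting
- Place of Presentation
  Sendai (Miyagi Prefecture)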
- Year and Date
  2012-02-09
[Presentation] Speaker verification using MMAP adaptation (2011)
- Author(s)
  Sangeeta Biswas, Johan Rohdin, Koichi Shinoda, Sadaoki Furui
- Organizer
  IEICE Speech Technical Committee Meeting
- Place of Presentation
  Tokyo (Tokyo Metropolis)
- Year and Date
  2011-12-19
[Presentation] Designing text corpus using phone-error distribution for acoustic modeling (2011)
- Author(s)
  Hiroko Murakami, Koichi Shinoda, Sadaoki Furui
- Organizer
  2011 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
- Place of Presentation
  Hawaii (USA)
- Year and Date
  2011-12-11
[Presentation] Person Authentication using 3D Human Motion (2011)
- Author(s)
  Felipe Gomez-Caballero, Takahiro Shinozaki, Sadaoki Furui, Koichi Shinoda
- Organizer
  ACM Multimedia'11
- Place of Presentation
  Scottsdale (USA)
- Year and Date
  2011-11-28
[Presentation] Compact speech decoder based on pure functional programming (2011)
- Author(s)
  Takahiro Shinozaki, Masakazu Sekijima, Shigeki Hagihara, Sadaoki Furui
- Organizer
  APSIPA ASC 2011
- Place of Presentation
  Xi'an (China)
- Year and Date
  2011-10-18
[Presentation] Strategies for model training and adaptation based on data dependency control (2011)
- Author(s)
  Takahiro Shinozaki, Sadaoki Furui
- Organizer
  APSIPA ASC 2011
- Place of Presentation
  Xi'an (China) (invited talk)
- Year and Date
  2011-10-18
[Presentation] Noise Robust Speech Recognition based on Spectral Reduction Measure (2011)
- Author(s)
  Mayumi Beppu, Koichi Shinoda, Sadaoki Furui
- Organizer
  APSIPA ASC 2011
- Place of Presentation
  Xi'an (China)
- Year and Date
  2011-10-18
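[Presentation] Noise-robust speech recognition using GMM likelihood compensation (2011)
- Author(s)
  Yu Nasu, Koichi Shinoda, Sadaoki Furui
- Organizer
  Acoustical Society of Japan Autumn Meeting
- Place of Presentation
  Shimane University (Shimane Prefecture)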
- Year and Date
  2011-09-20
[Presentation] Speech processing tools - An introduction to interoperability (2011)
- Author(s)
  Christoph Draxler, Thomas Altosaar, Sadaoki Furui, Mark Liberman, Peter Wittenburg
- Organizer
  INTERSPEECH 2011
- Place of Presentation
  Florence (Italy) (invited talk)
- Year and Date
  2011-08-28
[Presentation] Structural Joint Factor Analysis for Speaker Recognition (2011)
- Author(s)
  Marc Ferras, Koichi Shinoda, Sadaoki Furui
- Organizer
  INTERSPEECH 2011
- Place of Presentation
  Florence (Italy)
- Year and Date
  2011-08-28
[Presentation] Sentence selection by direct likelihood maximization for language model adaptation (2011)
- Author(s)
  Takahiro Shinozaki, Yu Kubota, Sadaoki Furui, Eiji Utsunomiya, Yasutaka Shindoh
- Organizer
  INTERSPEECH 2011
- Place of Presentation
  Florence (Italy)
- Year and Date
  2011-08-28
[Presentation] Acoustic Forest for SMAP-based Speaker Verification (2011)
- Author(s)
  Sangeeta Biswas, Marc Ferras, Koichi Shinoda, Sadaoki Furui
- Organizer
  INTERSPEECH 2011
- Place of Presentation
  Florence (Italy)
- Year and Date
  2011-08-28
[Presentation] A compact speech decoder based on pure functional programming (2011)
- Author(s)
  Takahiro Shinozaki, Masakazu Sekijima, Shigeki Hagihara, Sadaoki Furui
- Organizer
  IPSJ-SIGPRO
- Place of Presentation
  Hakodate (Hokkaido)
- Year and Date
  2011-06-14
[Presentation] Cross-channel spectral subtraction for meeting speech recognition (2011)
- Author(s)
  Yu Nasu, Koichi Shinoda, Sadaoki Furui
- Organizer
  ICASSP 2011
- Place of Presentation
  Prague (Czech Republic)
- Year and Date
  2011-05-22
[Presentation] Structural MAP adaptation in GMM-supervector based speaker recognition (2011)
- Author(s)
  Marc Ferras, Koichi Shinoda, Sadaoki Furui
- Organizer
  ICASSP 2011
- Place of Presentation
  Prague (Czech Republic)
- Year and Date
  2011-05-22
[Book] Robust speech recognition in the car environment, in LTC 2009, LNAI 6562 (2011)
- Author(s)
  Agnieszka Betkowska Cavalcante, Koichi Shinoda, Sadaoki Furui
- Total Pages
  24-34
- Publisher
  Springer
