2002 Fiscal Year Final Research Report Summary

Asynchronous-Transition Hidden Markov Model with State-Tying across Time for Automatic Speech Recognition

Research Project

Project/Area Number	12680375
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	Japan Advanced Institute of Science and Technology
Principal Investigator	SHIMODAIRA Hiroshi, JAIST, School of Information Science, Associate Professor (30206239)
Co-Investigator (Kenkyū-buntansha)	NAKAI Mitsuru, JAIST, School of Information Science, Research Associate (60283149); SAGAYAMA Shigeki, The University of Tokyo, Graduate School of Information Science and Technology, Professor (00303321)
Project Period (FY)	2000 – 2002
Keywords	hidden Markov model / HMM / asynchronous-transition / AT-HMM
Research Abstract	This project aimed to improve acoustic models for speech recognition systems. State-of-the-art acoustic models based on the hidden Markov model (HMM) usually treat the acoustic features as a chain of stationary signal sources, and the observed values of these features are represented by vectors. We assumed that the features might be better modeled at the level of individual vector components, and we studied two methods based on this assumption. In the first method, we tried to model asynchronous changes of individual acoustic vector components. The conventional HMM implicitly assumes that all components change their statistical properties simultaneously. This assumption may not hold: the temporal patterns of change in individual acoustic components do not necessarily synchronize with each other. We proposed a new HMM that allows asynchronous state transitions between individual vector components, and we demonstrated that it outperformed the conventional HMM in a speaker-dependent speech recognition task. In the second method, we tried to model the phoneme context dependency of individual acoustic vector components. Conventional parameter tying techniques impose a common tying structure on all vector components, regardless of how much the components differ in complexity and phoneme context dependency. We proposed a new parameter tying technique that allows a distinct tying structure for each component. Our experimental results showed that the proposed HMM with feature-dependent tying worked better than the conventional HMM with a common tying structure.
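
The synchrony assumption discussed in the abstract can be illustrated with a toy sketch. This is not the project's AT-HMM: it assumes a two-state left-to-right model with diagonal-covariance Gaussian outputs, ignores transition probabilities, and simply compares the best score when all components share one state-change frame against the best score when each component picks its own. All names (`component_score`, `best_synchronous`, `best_asynchronous`) and the toy data are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def component_score(frames, d, means, variances, change):
    """Log-likelihood of component d when frames before `change`
    are emitted by state 0 and the remaining frames by state 1."""
    total = 0.0
    for t, frame in enumerate(frames):
        s = 0 if t < change else 1
        total += log_gauss(frame[d], means[s][d], variances[s][d])
    return total


def best_synchronous(frames, means, variances):
    """Best score when every component changes state at the same frame."""
    dims = len(frames[0])
    return max(
        sum(component_score(frames, d, means, variances, c) for d in range(dims))
        for c in range(1, len(frames))
    )


def best_asynchronous(frames, means, variances):
    """Best score when each component chooses its own change frame."""
    dims = len(frames[0])
    return sum(
        max(component_score(frames, d, means, variances, c)
            for c in range(1, len(frames)))
        for d in range(dims)
    )


# Toy data (assumed): component 0 changes at frame 2, component 1 at frame 4.
frames = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (1.0, 1.0), (1.0, 1.0)]
means = [(0.0, 0.0), (1.0, 1.0)]      # means[state][component]
variances = [(0.1, 0.1), (0.1, 0.1)]  # variances[state][component]

sync_score = best_synchronous(frames, means, variances)
async_score = best_asynchronous(frames, means, variances)

if __name__ == "__main__":
    print("synchronous: ", sync_score)
    print("asynchronous:", async_score)
```

Because the asynchronous search contains every synchronous alignment as a special case, its best score can never be lower. A practical model would also need transition probabilities and some constraint on how far the components may drift apart, which this sketch leaves out.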

Research Products
(12 results)

[Publications] S.Matsuda: "Asynchronous-Transition HMM" Proc. 2000 International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2. 1001-1004 (2000)
- Description
  From the Research Report Summary (Japanese)
[Publications] S.Matsuda: "Feature-dependent Allophone Clustering" Proc. International Conference on Spoken Language Processing (ICSLP2000). 1. 413-416 (2000)
- Description
  From the Research Report Summary (Japanese)
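[Publications] S.Matsuda: "Asynchronous-transition Hidden Markov Models with Multiple Mixtures" Proc. 2000 Autumn Meeting of the Acoustical Society of Japan (in Japanese). 1-5-11. 21-22 (2000)
- Description
  From the Research Report Summary (Japanese)
[Publications] S.Matsuda: "Generation of phoneme environment clusters with multiple trajectories" Proc. 2001 Autumn Meeting of the Acoustical Society of Japan (in Japanese). 1-1-10. 19-20 (2001)
- Description
  From the Research Report Summary (Japanese)
[Publications] S.Matsuda: "Automatic generation of multiple-path HMM based on phoneme-environment clustering" Proc. 2002 Autumn Meeting of the Acoustical Society of Japan (in Japanese). 81-82 (2002)
- Description
  From the Research Report Summary (Japanese)
[Publications] S.Matsuda: "Speech Recognition Using Asynchronous Transition HMM" IEICE Trans. D-II (in Japanese). J86-D-II, 6. 741-754 (2003)
- Description
  From the Research Report Summary (Japanese)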
[Publications] S.Matsuda, M.Nakai, H.Shimodaira, S.Sagayama: "Asynchronous-Transition HMM" Proc. 2000 International Conference on Acoustics, Speech and Signal Processing (ICASSP) (Istanbul, Turkey). Vol.II (Jun). 1001-1004 (2000)
- Description
  From the Research Report Summary (English)
[Publications] S.Matsuda, M.Nakai, H.Shimodaira, S.Sagayama: "Feature-dependent Allophone Clustering" Proc. International Conference on Spoken Language Processing (ICSLP2000). (Oct). 413-416 (2000)
- Description
  From the Research Report Summary (English)
[Publications] S.Matsuda, M.Nakai, H.Shimodaira, S.Sagayama: "Asynchronous-transition Hidden Markov Models with Multiple Mixtures" The 2000 Autumn Meeting of The Acoustical Society of Japan (in Japanese). 1-5-11 (Sep). 21-22 (2000)
- Description
  From the Research Report Summary (English)
[Publications] S.Matsuda, M.Nakai, H.Shimodaira, S.Sagayama: "Generation of phoneme environment clusters with multiple trajectories" The 2001 Autumn Meeting of The Acoustical Society of Japan (in Japanese). 1-1-10 (Oct). 19-20 (2001)
- Description
  From the Research Report Summary (English)
[Publications] S.Matsuda, M.Nakai, H.Shimodaira, S.Sagayama: "Automatic generation of multiple-path HMM based on phoneme-environment clustering" The 2002 Autumn Meeting of The Acoustical Society of Japan (in Japanese). (Sep). (2002)
- Description
  From the Research Report Summary (English)
[Publications] S.Matsuda, M.Nakai, H.Shimodaira, S.Sagayama: "Speech Recognition Using Asynchronous Transition HMM" IEICE Trans. D-II (in Japanese). vol.J86-D-II, no.6 (Jun). 741-754 (2003)
- Description
  From the Research Report Summary (English)
