High performance speech and gesture recognition based on the stochastic model with mutual state-observation-dependencies
Project/Area Number | 12680399 |
Research Category | Grant-in-Aid for Scientific Research (C) |
Allocation Type | Single-year Grants |
Section | General |
Research Field | Intelligent informatics |
Research Institution | Waseda University |
Principal Investigator | KOBAYASHI Tetsunori, School of Science and Engineering, Professor (30162001) |
Project Period (FY) | 2000 – 2002 |
Project Status | Completed (Fiscal Year 2002) |
Budget Amount |
¥3,600,000 (Direct Cost: ¥3,600,000)
Fiscal Year 2002: ¥800,000 (Direct Cost: ¥800,000)
Fiscal Year 2001: ¥900,000 (Direct Cost: ¥900,000)
Fiscal Year 2000: ¥1,900,000 (Direct Cost: ¥1,900,000)
|
Keywords | stochastic model / acoustic model / PHMM / SPHMM / speech recognition / gesture recognition / time-series pattern recognition |
Research Abstract |
Aiming at modeling more complicated temporal changes of stochastic phenomena, we proposed the Partly-Hidden Markov Model (PHMM) and applied it to speech and gesture recognition. The PHMM can capture observation-dependent behavior in both the output observations and the state transitions. Simulation experiments demonstrated the high potential of the PHMM, and in gesture recognition and isolated spoken word recognition experiments it outperformed the HMM. In the original PHMM formulation, a common pair of hidden state and observable state determined both the observation and the state transition probabilities. In the formulation modified here, the two processes share the hidden state but use separate observable states for the observation and for the state transition. This slight modification brought great flexibility to the modeling of phenomena and reduced word errors on continuous speech compared with the HMM and the traditional PHMM. We also proposed the Smoothed Partly-Hidden Markov Model (SPHMM), in which the observation and state transition probabilities are defined as geometric means of the PHMM-based and the HMM-based probabilities. Continuous speech recognition experiments showed that the SPHMM gave the best performance among the HMM, the PHMM, and the SPHMM when the smoothing weight was set appropriately.
|
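To make the geometric-mean smoothing mentioned in the abstract concrete, the following LaTeX sketch writes out one possible formulation. The notation (hidden state q_t, observable companion state x_t, observation o_t, smoothing weight λ, and the symbols a, b for the transition and observation probabilities) is assumed here for illustration and may differ from the exact conditioning used in the project's publications.

```latex
% Illustrative sketch only; notation assumed, not taken from the project's papers.
% q_t: hidden state, x_t: observable companion state, o_t: observation,
% lambda: smoothing weight with 0 <= lambda <= 1.
\documentclass{article}
\usepackage{amsmath}
\begin{document}
In the PHMM, both the output and the transition may depend on an observable
state $x_{t-1}$ in addition to the hidden state, e.g.\ an observation term
$b^{\mathrm{PHMM}}(o_t \mid q_t, x_{t-1})$ and a transition term
$a^{\mathrm{PHMM}}(q_t \mid q_{t-1}, x_{t-1})$.
The SPHMM smooths each PHMM term with its HMM counterpart by a geometric mean:
\begin{align}
  b^{\mathrm{SPHMM}}(o_t \mid q_t, x_{t-1})
    &= \bigl[b^{\mathrm{PHMM}}(o_t \mid q_t, x_{t-1})\bigr]^{\lambda}
       \bigl[b^{\mathrm{HMM}}(o_t \mid q_t)\bigr]^{1-\lambda},\\
  a^{\mathrm{SPHMM}}(q_t \mid q_{t-1}, x_{t-1})
    &= \bigl[a^{\mathrm{PHMM}}(q_t \mid q_{t-1}, x_{t-1})\bigr]^{\lambda}
       \bigl[a^{\mathrm{HMM}}(q_t \mid q_{t-1})\bigr]^{1-\lambda}.
\end{align}
Setting $\lambda = 0$ recovers the HMM and $\lambda = 1$ the PHMM; intermediate
values interpolate between the two (renormalization may be required to keep
proper distributions).
\end{document}
```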
Report (4 results)
Research Products (20 results)