2004 Fiscal Year Final Research Report Summary

Hands free speech recognition method based on auditory characteristics

Research Project

Project/Area Number	15500106
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Perception information processing/Intelligent robotics
Research Institution	Shinshu University
Principal Investigator	MATSUMOTO Hiroshi Shinshu University, Faculty of Engineering, Professor (60005452)
Co-Investigator(Kenkyū-buntansha)	YAMAMOTO Kazumasa Shinshu University, Faculty of Engineering, Assistant (40324230)
Project Period (FY)	2003 – 2004
Keywords	Hands free speech recognition / Aurora-2 database / Generalized logarithmic scale / Mel-LPC analysis / Wiener filter / Dereverberation / Distant speech recognition / Forward Masking
Research Abstract	First, we proposed forward masking of the Mel-LPC based spectrum on a generalized logarithmic scale. In addition, variance normalization and masking control using the estimated SNR were examined to improve noise robustness. Experimental results on the Aurora-2 database showed that the Mel-LPC based cepstrum on the generalized log scale, with cepstral mean and variance normalization at γ=0.1, performs best, outperforming the normalized forward masking parameter under every condition. Second, we developed a frequency-warped Wiener filter to enhance Mel-LPC spectra in the presence of additive noise. The proposed filter is estimated directly from the signal on the linear frequency scale and is then implemented efficiently in the autocorrelation domain without denoising the input speech. Evaluation on the Aurora-2 database shows that the optimum filter order is comparable to that of Mel-LPC analysis, so the filtering is computationally inexpensive. The proposed Wiener filter improves word accuracy by up to about 20%. Third, to reduce the influence of reverberation, we examined a reverberation model in the power trajectory domain at the output of a mel filter in MFCC analysis. The model parameters consist of the decay rate representing reverberation, the ratio of reverberant power to the direct sound, and the frequency response of the channel, including part of the coloration. Recognition experiments show that the dereverberation method based on this model attains about a 10% improvement in accuracy (Acc.) compared with the unprocessed condition.
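
The feature normalization named in the abstract can be illustrated with a minimal sketch. The report does not define its generalized logarithmic scale, so the Box-Cox-style form used here, (x^γ - 1)/γ, is an assumption, as are the function names `generalized_log` and `cmvn`. The normalization shown is the standard per-utterance cepstral mean and variance normalization and is not taken from the authors' implementation.

```python
import math


def generalized_log(x, gamma):
    """Assumed generalized logarithm: (x**gamma - 1) / gamma.

    Reduces to the natural logarithm as gamma approaches 0.
    This definition is an illustration, not the report's own formula.
    """
    if gamma == 0:
        return math.log(x)
    return (x ** gamma - 1.0) / gamma


def cmvn(frames, eps=1e-8):
    """Per-utterance cepstral mean and variance normalization.

    frames: list of feature vectors (one list of floats per frame).
    Each coefficient is shifted to zero mean and scaled to unit
    variance over the utterance; eps guards against division by zero.
    """
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dim)]
        for f in frames
    ]


if __name__ == "__main__":
    # Toy two-frame, two-coefficient "utterance".
    features = [[1.0, 10.0], [3.0, 30.0]]
    print(cmvn(features))
    print(generalized_log(2.0, 0.1))
```

Normalizing per utterance removes stationary channel offsets (mean) and reduces the range mismatch that additive noise introduces (variance), which is the usual motivation for applying both together.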

Research Products
(8 results)

[Journal Article] Reverberation modeling on power spectral trajectory for distant speech recognition (2005)
- Author(s)
  H.Matsumoto, T.Takei, K.Yamamoto
- Journal Title
  
  Proc.of 2005 Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA05)
  
  Pages: B9
- Description
  From "Summary of Research Results Report (Japanese)"
[Journal Article] Frequency Warped Wiener Filtering for Mel-LPC Based Speech Recognition (2005)
- Author(s)
  Md.Babul Islam, H.Matsumoto, K.Yamamoto
- Journal Title
  
  Proc.of International Workshop on Nonlinear Signal and Image Processing (NSIP2005)
  
  Pages: 19PM2D-1
- Description
  From "Summary of Research Results Report (Japanese)"
[Journal Article] Reverberation modeling on power spectral trajectory for distant speech recognition (2005)
- Author(s)
  H.Matsumoto, T.Takei, K.Yamamoto
- Journal Title
  
  Proc.of 2005 Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA05)
  
  Pages: B9
- Description
  From "Summary of Research Results Report (English)"
[Journal Article] Frequency Warped Wiener Filtering for Mel-LPC Based Speech Recognition (2005)
- Author(s)
  Md.Babul Islam, H.Matsumoto, K.Yamamoto
- Journal Title
  
  Proc.of International Workshop on Nonlinear Signal and Image Processing (NSIP2005)
  
  Pages: 19PM2D-1
- Description
  From "Summary of Research Results Report (English)"
[Journal Article] Improved forward masking on a generalized logarithmic scale for robust speech recognition (2004)
- Author(s)
  H.Matsumoto, T.Ichikawa, K.Yamamoto
- Journal Title
  
  Proc.of 18th International Congress on Acoustics
  
  Pages: Th4.H.4
- Description
  From "Summary of Research Results Report (Japanese)"
[Journal Article] Syllable-connected models for Japanese speech recognition (2004)
- Author(s)
  K.Yamamoto, T.Ikeda, H.Matsumoto, et al.
- Journal Title
  
  Proc.of 18th International Congress on Acoustics
  
  Pages: Fr2.H.4
- Description
  From "Summary of Research Results Report (Japanese)"
[Journal Article] Improved forward masking on a generalized logarithmic scale for robust speech recognition (2004)
- Author(s)
  H.Matsumoto, T.Ichikawa, K.Yamamoto
- Journal Title
  
  Proc.of 18th International Congress on Acoustics
  
  Pages: Th4.H.4
- Description
  From "Summary of Research Results Report (English)"
[Journal Article] Syllable-connected models for Japanese speech recognition (2004)
- Author(s)
  K.Yamamoto, T.Ikeda, H.Matsumoto, et al.
- Journal Title
  
  Proc.of 18th International Congress on Acoustics
  
  Pages: Fr2.H.2
- Description
  From "Summary of Research Results Report (English)"
