2002 Fiscal Year Final Research Report Summary

A Study on Hands-Free Speech Recognition Using Microphone Array

Research Project

Project/Area Number	11480077
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	NARA INSTITUTE OF SCIENCE AND TECHNOLOGY
Principal Investigator	SARUWATARI Hiroshi Nara Institute of Science and Technology, Graduate School of Information Science, Associate Professor (30324974)
Co-Investigator(Kenkyū-buntansha)	LEE Akinobu Nara Institute of Science and Technology, Graduate School of Information Science, Assistant Professor (80332766); SHIKANO Kiyohiro Nara Institute of Science and Technology, Graduate School of Information Science, Professor (00263426)
Project Period (FY)	1999 – 2002
Keywords	Microphone array / Speech recognition / Hands-free / Source localization / Super directivity / Noise reduction / Real environments / Beamforming
Research Abstract	In recent years, the accuracy of speech recognition systems has improved remarkably through the use of hidden Markov models and neural networks. In real environments, however, a significant problem remains: recognition performance degrades because of additive noise and room reverberation. In this study, we introduce microphone array technology with which sound sources can be accurately detected and localized in a three-dimensional acoustic field. The study produced the following final results. (1) A database of real acoustic environments was constructed using a 56-channel microphone array system and has been widely distributed. (2) An accurate direction-of-arrival (DOA) estimation technique based on the CSP (cross-power spectrum phase) method was realized and applied to a mobile robot navigation problem. (3) As new array signal processing methods, we proposed a multi-beamforming technique and blind source separation based on independent component analysis (ICA); their effectiveness was demonstrated through experiments in real environments. (4) We proposed a new approach that combines a three-dimensional Viterbi search in speech recognition with DOA estimation. The research results have been published as follows: 12 journal papers, 30 international conference papers, 4 invited talks, 11 technical reports, and 15 domestic workshop papers.
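
As an illustration of result (2), the following is a minimal sketch of CSP-based delay estimation for a single microphone pair, followed by conversion of the delay to an arrival angle. It is not the project's implementation: the function names, the far-field (plane-wave) model, the 343 m/s speed of sound, and the microphone spacing are all assumptions made for this example.

```python
import numpy as np


def csp_delay(x1, x2, fs, max_tau=None):
    """Estimate the delay (seconds) of x2 relative to x1 using CSP.

    A positive result means x2 lags x1. Hypothetical helper for illustration.
    """
    n = len(x1) + len(x2)                 # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)              # cross-power spectrum
    cross /= np.abs(cross) + 1e-12        # keep only the phase (whitening)
    csp = np.fft.irfft(cross, n=n)        # CSP coefficients over lag

    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that index max_shift corresponds to zero lag.
    csp = np.concatenate((csp[-max_shift:], csp[:max_shift + 1]))
    shift = int(np.argmax(csp)) - max_shift
    return shift / fs


def doa_from_delay(tau, mic_distance, c=343.0):
    """Convert a pairwise delay to an arrival angle in degrees.

    Assumes a far-field source; 0 degrees is broadside to the microphone pair.
    """
    sin_theta = np.clip(c * tau / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(sin_theta)))
```

For a two-channel recording sampled at `fs`, `csp_delay(left, right, fs)` gives the inter-channel delay and `doa_from_delay(tau, mic_distance)` maps it to an angle. With more than two microphones, one would typically combine the estimates from several pairs.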

Research Products
(24 results)

All Publications (24 results)

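[Publications] Takanori Nishiura: "Multiple Beamforming with Source Localization Based on CSP Analysis (in Japanese)" IEICE Trans. on Information and Systems. Vol.J83-D-II, No.7. 1610-1619 (2000)
- Description
  From the Research Report Summary (Japanese)
[Publications] Takanori Nishiura: "Localization of Multiple Sound Sources Based on CSP Analysis with a Microphone Array (in Japanese)" IEICE Trans. on Information and Systems. Vol.J83-D-II, No.8. 1713-1721 (2000)
- Description
  From the Research Report Summary (Japanese)
[Publications] Takanori Nishiura: "Speech Recognition by Multiple Beamforming Utilizing Reflection Signals (in Japanese)" IEICE Trans. on Information and Systems. Vol.J83-D-II, No.11. 2198-2205 (2000)
- Description
  From the Research Report Summary (Japanese)
[Publications] Kazuhiro Miki: "Speech Recognition Based on HMM Decomposition and Composition Method with a Microphone Array in Noisy Reverberant Environments (in Japanese)" IEICE Trans. on Information and Systems. Vol.J83-D-II, No.11. 2206-2214 (2000)
- Description
  From the Research Report Summary (Japanese)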
[Publications] Tetsuya Takiguchi: "HMM-Separation-Based Speech Recognition for a Distant Moving Speaker" IEEE Transactions on Speech and Audio Processing. Vol.9, No.2. 127-140 (2001)
- Description
  From the Research Report Summary (Japanese)
[Publications] Hidekazu Kamiyanagida: "Direction of Arrival Estimation Using Nonlinear Microphone Array" IEICE Trans. Fundamentals. Vol.E84-A, No.4. 999-1010 (2001)
- Description
  From the Research Report Summary (Japanese)
[Publications] Takeshi Yamada: "Distant-Talking Speech Recognition Based on a 3-D Viterbi search using a microphone array" IEEE Transactions on Speech and Audio Processing. Vol.10, No.2. 48-56 (2002)
- Description
  From the Research Report Summary (Japanese)
[Publications] Panikos Heracleous: "A Microphone Array-based 3-D N-best Search for Simultaneous Recognition of Multiple Sound Sources" IEICE Trans. Information and Systems. Vol.E85-D, No.6. 994-1002 (2002)
- Description
  From the Research Report Summary (Japanese)
[Publications] Yuko Okada: "A design of adaptive beamformer based on average speech spectrum for noisy speech recognition" Acoustical Science and Technology. Vol.23, No.6. 323-327 (2002)
- Description
  From the Research Report Summary (Japanese)
[Publications] Hiroshi Saruwatari: "Fast-Convergence Algorithm for Blind Source Separation Based on Array Signal Processing" IEICE Trans. Fundamentals. Vol.E86-A, No.3. 286-291 (2003)
- Description
  From the Research Report Summary (Japanese)
[Publications] Tsuyoki Nishikawa: "Blind source separation of acoustic signals based on multistage ICA combining frequency-domain ICA and time-domain ICA" IEICE Trans. Fundamentals. Vol.E86-A, No.4. 846-858 (2003)
- Description
  From the Research Report Summary (Japanese)
[Publications] Tsuyoki Nishikawa: "Stable learning algorithm for blind separation of temporally correlated acoustic signals combining multistage ICA and Linear Prediction" IEICE Trans. Fundamentals. Vol.E86-A, No.8 (in press). (2003)
- Description
  From the Research Report Summary (Japanese)
[Publications] Takanori Nishiura: "Multiple Beamforming with Source Localization Based on CSP Analysis (in Japanese)" IEICE Trans. on Information and Systems. Vol.J83-D-II, No.7. 1610-1619 (2000)
- Description
  From the Research Report Summary (English)
[Publications] Takanori Nishiura: "Localization of Multiple Sound Sources Based on CSP Analysis with a Microphone Array (in Japanese)" IEICE Trans. on Information and Systems. Vol.J83-D-II, No.8. 1713-1721 (2000)
- Description
  From the Research Report Summary (English)
[Publications] Takanori Nishiura: "Speech Recognition by Multiple Beamforming Utilizing Reflection Signals (in Japanese)" IEICE Trans. on Information and Systems. Vol.J83-D-II, No.11. 2198-2205 (2000)
- Description
  From the Research Report Summary (English)
[Publications] Kazuhiro Miki: "Speech Recognition Based on HMM Decomposition and Composition Method with a Microphone Array in Noisy Reverberant Environments (in Japanese)" IEICE Trans. on Information and Systems. Vol.J83-D-II, No.11. 2206-2214 (2000)
- Description
  From the Research Report Summary (English)
[Publications] Tetsuya Takiguchi: "HMM-Separation-Based Speech Recognition for a Distant Moving Speaker" IEEE Transactions on Speech and Audio Processing. Vol.9, No.2. 127-140 (2001)
- Description
  From the Research Report Summary (English)
[Publications] Hidekazu Kamiyanagida: "Direction of Arrival Estimation Using Nonlinear Microphone Array" IEICE Trans. Fundamentals. Vol.E84-A, No.4. 999-1010 (2001)
- Description
  From the Research Report Summary (English)
[Publications] Takeshi Yamada: "Distant-Talking Speech Recognition Based on a 3-D Viterbi search using a microphone array" IEEE Transactions on Speech and Audio Processing. Vol.10, No.2. 48-56 (2002)
- Description
  From the Research Report Summary (English)
[Publications] Panikos Heracleous: "A Microphone Array-based 3-D N-best Search for Simultaneous Recognition of Multiple Sound Sources" IEICE Trans. Information and Systems. Vol.E85-D, No.6. 994-1002 (2002)
- Description
  From the Research Report Summary (English)
[Publications] Yuko Okada: "A design of adaptive beamformer based on average speech spectrum for noisy speech recognition" Acoustical Science and Technology. Vol.23, No.6. 323-327 (2002)
- Description
  From the Research Report Summary (English)
[Publications] Hiroshi Saruwatari: "Fast-Convergence Algorithm for Blind Source Separation Based on Array Signal Processing" IEICE Trans. Fundamentals. Vol.E86-A, No.3. 286-291 (2003)
- Description
  From the Research Report Summary (English)
[Publications] Tsuyoki Nishikawa: "Blind source separation of acoustic signals based on multistage ICA combining frequency-domain ICA and time-domain ICA" IEICE Trans. Fundamentals. Vol.E86-A, No.4. 846-858 (2003)
- Description
  From the Research Report Summary (English)
[Publications] Tsuyoki Nishikawa: "Stable learning algorithm for blind separation of temporally correlated acoustic signals combining multistage ICA and Linear Prediction" IEICE Trans. Fundamentals. Vol.E86-A, No.8 (in press). (2003)
- Description
  From the Research Report Summary (English)
