Distant-talking speech recognition based on spectral subtraction by multi-channel least mean square approach

Research Project

Project/Area Number	22700169
Research Category	Grant-in-Aid for Young Scientists (B)
Allocation Type	Single-year Grants
Research Field	Perception information processing/Intelligent robotics
Research Institution	Nagaoka University of Technology (2012), Shizuoka University (2010-2011)
Principal Investigator	WANG Longbiao, Nagaoka University of Technology, Top Runner Incubation Center for Academia-Industry Fusion, Specially Appointed Associate Professor (30510458)
Project Period (FY)	2010 – 2012
Project Status	Completed (Fiscal Year 2012)
Budget Amount	¥3,900,000 (Direct Cost: ¥3,000,000, Indirect Cost: ¥900,000); Fiscal Year 2012: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000); Fiscal Year 2011: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000); Fiscal Year 2010: ¥1,690,000 (Direct Cost: ¥1,300,000, Indirect Cost: ¥390,000)
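Keywords	generalized spectral subtraction / hands-free speech recog[nition] / missing feature theory / multi-channel LMS / blind dereverberation / hands-free speech recognition / sound source separation / independent component analysis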
Research Abstract	We proposed a blind dereverberation method based on spectral subtraction using a multi-channel least mean square (MCLMS) algorithm. This method was evaluated in simulated and real noisy reverberant environments with stationary noise. In this study, we also evaluated it in a noisy reverberant environment with non-stationary noise such as music. The music is first suppressed by blind source separation based on the Efficient FastICA (independent component analysis) algorithm; the spectral-subtraction-based dereverberation method is then applied to reduce late reverberation. In a real environment, the proposed method achieves average relative word error rate reductions of 41.9% and 7.9% compared to the baseline method and to state-of-the-art dereverberation based on multi-step linear prediction (MSLP), respectively.
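The two quantities named in the abstract can be illustrated with a minimal sketch. It is not the project's implementation: it assumes an estimate of the late-reverberation magnitude spectrum is already available (in the project this estimate comes from MCLMS, which is omitted here), and the function names, the parameters `alpha`, `beta` and `exponent`, and all numeric inputs are hypothetical.

```python
def spectral_subtraction(observed, interference, alpha=1.0, beta=0.1, exponent=2.0):
    """Subtract an interference estimate from one frame of magnitude spectra.

    observed, interference: per-bin magnitudes (non-negative floats).
    alpha scales the subtracted estimate, beta sets a spectral floor,
    exponent selects the domain (2.0 corresponds to power subtraction).
    """
    cleaned = []
    for obs, est in zip(observed, interference):
        diff = obs ** exponent - alpha * est ** exponent
        # Floor the result so a bin never becomes negative or exactly zero.
        floor = (beta * obs) ** exponent
        cleaned.append(max(diff, floor) ** (1.0 / exponent))
    return cleaned


def relative_error_reduction(baseline_wer, proposed_wer):
    """Relative word error rate reduction in percent."""
    return 100.0 * (baseline_wer - proposed_wer) / baseline_wer


# Hypothetical inputs, chosen only to exercise the functions.
frame = [2.0, 1.0, 0.5]
late_reverb_estimate = [1.0, 1.0, 0.1]
print(spectral_subtraction(frame, late_reverb_estimate))
print(relative_error_reduction(50.0, 30.0))  # 40.0
```

A relative reduction compares the error rates as a ratio, so a drop from 50% to 30% word error rate counts as a 40% relative reduction, not 20%.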

Report

(4 results)

2012 Annual Research Report, Final Research Report (PDF)
2011 Annual Research Report
2010 Annual Research Report

Research Products
(48 results)


Journal Article (7 results, of which Peer Reviewed: 2), Presentation (34 results, of which Invited: 1), Book (4 results), Remarks (3 results)

[Journal Article] Speaker identification and verification by combining MFCC and phase information (2012)
- Author(s)
  S. Nakagawa, L. Wang and S. Ohtsuka
- Journal Title
  IEEE Transactions on Audio, Speech, and Language Processing
  Volume: 20 Issue: 4 Pages: 1085-1095
- DOI
  10.1109/tasl.2011.2172422
- Related Report
  2012 Annual Research Report 2012 Final Research Report
[Journal Article] Dereverberation and Denoising Based on Generalized Spectral Subtraction by Multi-channel LMS Algorithm Using a Small-scale Microphone Array (2012)
- Author(s)
  L. Wang, K. Odani and A. Kai
- Journal Title
  EURASIP Journal on Advances in Signal Processing
  Volume: 2012 Issue: 1
- DOI
  10.1186/1687-6180-2012-12
- Related Report
  2012 Final Research Report 2011 Annual Research Report
[Journal Article] Identification of a distant speaker and its robustness (2011)
- Author(s)
  Y. Jiang, Z. Tang and L. Wang
- Journal Title
  Chinese Journal of Electronics
  Volume: 20 Issue: 2 Pages: 278-282
- URL
  http://www.ejournal.org.cn/Jweb_cje/EN/abstract/abstract1109.shtml
- Related Report
  2012 Final Research Report 2011 Annual Research Report
[Journal Article] Distant-talking speech recognition based on spectral subtraction by multi-channel LMS algorithm (2011)
- Author(s)
  L. Wang, N. Kitaoka, S. Nakagawa
- Journal Title
  IEICE Trans. on Information and Systems
  Volume: E94-D Issue: 3 Pages: 659-667
- URL
  http://search.ieice.org/bin/summary.php?id=e94-d_3_659
- Related Report
  2012 Final Research Report
[Journal Article] Distant-talking speech recognition based on spectral subtraction by multi-channel LMS algorithm (2011)
- Author(s)
  L. Wang, N. Kitaoka, S. Nakagawa
- Journal Title
  IEICE Trans. on Information and Systems
  Volume: E94-D Issue: 3 Pages: 659-667
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Journal Article] Speaker recognition by combining MFCC and phase information in noisy conditions (2010)
- Author(s)
  L. Wang, K. Minami, K. Yamamoto, S. Nakagawa
- Journal Title
  IEICE Trans. on Information and Systems
  Volume: E93-D Issue: 9 Pages: 2397-2406
- URL
  http://search.ieice.org/bin/summary.php?id=e93-d_9_2397
- Related Report
  2012 Final Research Report
[Journal Article] Speaker recognition by combining MFCC and phase information in noisy conditions (2010)
- Author(s)
  L. Wang, K. Minami, K. Yamamoto, S. Nakagawa
- Journal Title
  IEICE Trans. on Information and Systems
  Volume: E93-D Issue: 9 Pages: 2397-2406
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Presentation] Single-sided Approach to Discriminative PLDA Training for Text-Independent Speaker Verification (2013)
- Author(s)
  Zhaofeng Zhang, Lee Kong Aik, Longbiao Wang, Atsuhiko Kai, Ma Bin
- Organizer
  Proc. of the 2013 Spring Meeting of the ASJ
- Related Report
  2012 Final Research Report
[Presentation] Single-sided Approach to Discriminative PLDA Training for Text-Independent Speaker Verification (2013)
- Author(s)
  Z. Zhang, L. Lee, L. Wang, A. Kai, B. Ma
- Organizer
  2013 Spring Meeting of the Acoustical Society of Japan
- Place of Presentation
  Tokyo University of Technology, Hachioji Campus (Tokyo)
- Related Report
  2012 Annual Research Report
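[Presentation] Current status and challenges of speaker recognition technology (2013)
- Author(s)
  網野加苗、石原俊一、小川哲司、長内隆、黒岩眞吾、越仲孝文、篠田浩一、柘植覚、西田昌史、松井知子、王龍標
- Organizer
  Speech Research Group meeting (音声研究会)
- Place of Presentation
  Daido University (Aichi)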
- Related Report
  2012 Annual Research Report
- Invited
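[Presentation] A study on constructing a POMDP model using estimated speech recognition error rates (2012)
- Author(s)
  西島祥悟、甲斐充彦、小暮悟、王龍標
- Organizer
  64th Meeting of the Special Interest Group on Spoken Language Understanding and Dialogue Processing
- Place of Presentation
  University of Tokyo, Hongo Campus (Tokyo)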
- Year and Date
  2012-03-26
- Related Report
  2011 Annual Research Report
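[Presentation] Analysis of the factors behind individual differences in recognition performance, focusing on differences in speaker- and utterance-specific characteristics (2012)
- Author(s)
  赤尾佳彦、王龍標、甲斐充彦
- Organizer
  Proceedings of the 2012 Spring Meeting of the Acoustical Society of Japan
- Place of Presentation
  Kanagawa University, Yokohama Campus (Yokohama)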
- Year and Date
  2012-03-15
- Related Report
  2011 Annual Research Report
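[Presentation] Evaluation of a blind dereverberation method based on spectral subtraction on speech recorded in a real environment (2012)
- Author(s)
  小谷恭平、王龍標、甲斐充彦
- Organizer
  Proceedings of the 2012 Spring Meeting of the Acoustical Society of Japan
- Place of Presentation
  Kanagawa University, Yokohama Campus (Yokohama)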
- Year and Date
  2012-03-13
- Related Report
  2011 Annual Research Report
[Presentation] Distant-talking speaker identification using a reverberation model with various artificial room impulse responses (2012)
- Author(s)
  L. Wang, Z. Zhang, A. Kai and Y. Kishi
- Organizer
  Proc. of APSIPA ASC 2012
- Related Report
  2012 Final Research Report
[Presentation] Dereverberation based on Generalized Spectral Subtraction for Distant-talking Speaker Recognition (2012)
- Author(s)
  Z. Zhang, L. Wang and A. Kai
- Organizer
  Proc. of APSIPA ASC 2012
- Related Report
  2012 Final Research Report
[Presentation] On the Use of Phase Information-based Joint Factor Analysis for Speaker Verification under Channel Mismatch Condition (2012)
- Author(s)
  Y. Hirano, L. Wang, A. Kai and S. Nakagawa
- Organizer
  Proc. of APSIPA ASC 2012
- Related Report
  2012 Final Research Report
[Presentation] Speech Recognition by Denoising and Dereverberation Based on Spectral Subtraction in a Real Noisy Reverberant Environment (2012)
- Author(s)
  K. Odani, L. Wang and A. Kai
- Organizer
  Proc. of Interspeech 2012
- Related Report
  2012 Final Research Report
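[Presentation] Prototyping and evaluation of a Twitter client designed for voice tweets, and a study on the use of utterance features (2012)
- Author(s)
  進士智也、甲斐充彦、王龍標、小暮悟
- Organizer
  14th Spoken Language Symposium
- Place of Presentation
  Tokyo Institute of Technology, Ookayama Campus (Tokyo)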
- Related Report
  2012 Annual Research Report
[Presentation] A study of speaker diarization using acoustic and spatial information (2012)
- Author(s)
  倉島諒、兼子史聖、王龍標、甲斐充彦
- Organizer
  2012 Autumn Meeting of the Acoustical Society of Japan
- Place of Presentation
  Shinshu University (Nagano)
- Related Report
  2012 Annual Research Report
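[Presentation] Distant-talking speaker recognition using dereverberation based on generalized spectral subtraction (2012)
- Author(s)
  張兆峰、奥和紀、小谷恭平、王龍標、甲斐充彦
- Organizer
  2012 Autumn Meeting of the Acoustical Society of Japan
- Place of Presentation
  Shinshu University (Nagano)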
- Related Report
  2012 Annual Research Report
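[Presentation] Speaker verification under channel mismatch conditions by joint factor analysis using MFCC and phase information (2012)
- Author(s)
  平野郁也、王龍標、甲斐充彦、中川聖一
- Organizer
  2012 Autumn Meeting of the Acoustical Society of Japan
- Place of Presentation
  Shinshu University (Nagano)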
- Related Report
  2012 Annual Research Report
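[Presentation] Evaluation of source separation and dereverberation methods using speech with superimposed music (2012)
- Author(s)
  小谷恭平、王龍標、甲斐充彦
- Organizer
  2012 Autumn Meeting of the Acoustical Society of Japan
- Place of Presentation
  Shinshu University (Nagano)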
- Related Report
  2012 Annual Research Report
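[Presentation] Evaluation of a denoising and dereverberation method based on spectral subtraction in a real environment (2012)
- Author(s)
  小谷恭平、王龍標、甲斐充彦
- Organizer
  IEICE Technical Report
- Place of Presentation
  Osaka University Nakanoshima Center (Osaka)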
- Related Report
  2012 Annual Research Report
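[Presentation] Implementation and evaluation of a speech input interface that allows selection among word-fragment candidates (2011)
- Author(s)
  張用起、甲斐充彦、王龍標
- Organizer
  Special Interest Group on Spoken Language Processing meeting (音声言語情報処理研究会)
- Place of Presentation
  Shibaura Institute of Technology (Tokyo)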
- Year and Date
  2011-12-20
- Related Report
  2011 Annual Research Report
[Presentation] Blind Dereverberation Based on Generalized Spectral Subtraction by Multi-channel LMS Algorithm (2011)
- Author(s)
  K. Odani, L. Wang, A. Kai
- Organizer
  APSIPA ASC 2011
- Place of Presentation
  Grand New World Hotel Xi'an (Xi'an, China)
- Year and Date
  2011-10-20
- Related Report
  2011 Annual Research Report
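[Presentation] Distant-talking speaker recognition using a reverberation model built from multiple artificial room impulse responses (2011)
- Author(s)
  王龍標、岸良樹、張兆峰、甲斐充彦
- Organizer
  Proceedings of the 2011 Autumn Meeting of the Acoustical Society of Japan
- Place of Presentation
  Shimane University (Shimane)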
- Year and Date
  2011-09-21
- Related Report
  2011 Annual Research Report
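[Presentation] Speech recognition in noisy reverberant conditions using a blind dereverberation method based on spectral subtraction (2011)
- Author(s)
  小谷恭平、王龍標、甲斐充彦
- Organizer
  Proceedings of the 2011 Autumn Meeting of the Acoustical Society of Japan
- Place of Presentation
  Shimane University (Shimane)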
- Year and Date
  2011-09-21
- Related Report
  2011 Annual Research Report
[Presentation] Evaluation of hands-free large vocabulary continuous speech recognition by blind dereverberation based on spectral subtraction by multi-channel LMS algorithm (2011)
- Author(s)
  L. Wang, K. Odani, A. Kai
- Organizer
  International conference on Text, Speech and Dialogue 2011
- Place of Presentation
  University of West Bohemia (Pilsen, Czech Republic)
- Year and Date
  2011-09-05
- Related Report
  2011 Annual Research Report
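[Presentation] Improvement of a dereverberation method based on the multi-channel LMS algorithm for distant-talking speech recognition (2011)
- Author(s)
  小谷恭平、王龍標、甲斐充彦
- Organizer
  IEICE Technical Report
- Place of Presentation
  Ritsumeikan University, Osaka Campus (Osaka)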
- Year and Date
  2011-05-12
- Related Report
  2011 Annual Research Report
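[Presentation] Evaluation of large-vocabulary speech recognition with blind dereverberation based on the multi-channel LMS algorithm (2011)
- Author(s)
  小谷恭平、王龍標、甲斐充彦
- Organizer
  2011 Spring Meeting of the Acoustical Society of Japan
- Place of Presentation
  Waseda University, Nishi-Waseda Campus (Tokyo)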
- Year and Date
  2011-03-10
- Related Report
  2010 Annual Research Report
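[Presentation] Analysis of the effect of environmental differences simulated with artificial reverberation models on distant-talking speaker recognition (2011)
- Author(s)
  岸良樹、王龍標、甲斐充彦
- Organizer
  2011 Spring Meeting of the Acoustical Society of Japan
- Place of Presentation
  Waseda University, Nishi-Waseda Campus (Tokyo)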
- Year and Date
  2011-03-10
- Related Report
  2010 Annual Research Report
[Presentation] Blind Dereverberation Based on Generalized Spectral Subtraction by Multi-channel LMS Algorithm (2011)
- Author(s)
  Kyohei Odani, Longbiao Wang and Atsuhiko Kai
- Organizer
  Proc. of APSIPA ASC 2011
- Related Report
  2012 Final Research Report
[Presentation] Evaluation of Hands-free Large Vocabulary Continuous Speech Recognition by Blind Dereverberation Based on Spectral Subtraction by Multi-channel LMS Algorithm (2011)
- Author(s)
  Longbiao Wang, Kyohei Odani and Atsuhiko Kai
- Organizer
  Proc. of Text, Speech and Dialogue
- Related Report
  2012 Final Research Report
[Presentation] Multimodal interface with N-best display including candidates of spoken word fragments (2010)
- Author(s)
  Y. Jang, A. Kai, L. Wang
- Organizer
  APSIPA ASC 2010
- Place of Presentation
  Biopolis, Singapore
- Year and Date
  2010-12-16
- Related Report
  2010 Annual Research Report
[Presentation] Investigation of driving-behavior modeling for recognition of a driving situation (2010)
- Author(s)
  J. Ema, L. Wang, A. Kai, T. Itoh
- Organizer
  APSIPA ASC 2010
- Place of Presentation
  Biopolis, Singapore
- Year and Date
  2010-12-15
- Related Report
  2010 Annual Research Report
[Presentation] Compensation approaches for distant Speaker identification under reverberant environments (2010)
- Author(s)
  Y. Jiang, Z. Tang, L. Wang
- Organizer
  CCPR 2010
- Place of Presentation
  Chongqing University, Chongqing, China
- Year and Date
  2010-10-23
- Related Report
  2010 Annual Research Report
[Presentation] A study of driving-behavior models for recognizing driving situations (2010)
- Author(s)
  江間旬記、王龍標、甲斐充彦、伊藤敏彦
- Organizer
  IEICE Society Conference 2010
- Place of Presentation
  Osaka Prefecture University (Osaka)
- Year and Date
  2010-09-16
- Related Report
  2010 Annual Research Report
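[Presentation] Multimodal word input interface with dynamically constructed multiple candidates including word fragments (2010)
- Author(s)
  張用起、甲斐充彦、王龍標
- Organizer
  2010 Autumn Meeting of the Acoustical Society of Japan
- Place of Presentation
  Kansai University (Osaka)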
- Year and Date
  2010-09-16
- Related Report
  2010 Annual Research Report
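[Presentation] A study of distant-talking speaker recognition robust to environmental differences using artificial reverberation models (2010)
- Author(s)
  岸良樹、王龍標、甲斐充彦
- Organizer
  2010 Autumn Meeting of the Acoustical Society of Japan
- Place of Presentation
  Kansai University (Osaka)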
- Year and Date
  2010-09-14
- Related Report
  2010 Annual Research Report
[Presentation] Multimodal interface with N-best display including candidates of spoken word fragments (2010)
- Author(s)
  Y. Jang, A. Kai and L. Wang
- Organizer
  Proc. of APSIPA ASC 2010
- Related Report
  2012 Final Research Report
[Presentation] Compensation approaches for distant Speaker identification under reverberant environments (2010)
- Author(s)
  Y. Jiang, Z. Tang and L. Wang
- Organizer
  Proc. of CCPR 2010
- Related Report
  2012 Final Research Report
[Book] Dereverberation Based on Spectral Subtraction by Multi-channel LMS Algorithm for Hands-free Speech Recognition (2012)
- Author(s)
  Longbiao Wang, Kyohei Odani, Atsuhiko Kai, Norihide Kitaoka and Seiichi Nakagawa
- Publisher
  Chapter in Modern Speech Recognition Approaches with Case Studies, S. Ramakrishnan (Ed.), IN-TECH
- Related Report
  2012 Final Research Report
[Book] “Dereverberation Based on Spectral Subtraction by Multi-channel LMS Algorithm for Hands-free Speech Recognition”, Chapter in Modern Speech Recognition Approaches with Case Studies, S. Ramakrishnan (Ed.) (2012)
- Author(s)
  L. Wang, K. Odani, A. Kai, N. Kitaoka and S. Nakagawa
- Publisher
  IN-TECH
- Related Report
  2012 Annual Research Report
[Book] Evaluation of hands-free large vocabulary continuous speech recognition by blind dereverberation based on spectral subtraction by multi-channel LMS algorithm (2011)
- Author(s)
  Longbiao Wang, Kyohei Odani and Atsuhiko Kai
- Publisher
  Ivan Habernal, Vaclav Matousek (Eds.), Lecture Notes in Artificial Intelligence, Springer LNAI 6836
- Related Report
  2012 Final Research Report
[Book] (Chapter) "Evaluation of hands-free large vocabulary continuous speech recognition by blind dereverberation based on spectral subtraction by multi-channel LMS algorithm" in LNAI 6836, Text, Speech and Dialogue (2011)
- Author(s)
  L. Wang, K. Odani, A. Kai
- Total Pages
  8
- Publisher
  Springer-Verlag Berlin Heidelberg
- Related Report
  2011 Annual Research Report
[Remarks]
- URL
  http://sip.nagaokaut.ac.jp/wang-j.html
- Related Report
  2012 Final Research Report
[Remarks]
- URL
  http://ssp.sys.eng.shizuoka.ac.jp/wang-j.html
- Related Report
  2011 Annual Research Report
[Remarks]
- URL
  http://ssp.sys.eng.shizuoka.ac.jp/wang-j.html
- Related Report
  2010 Annual Research Report
