Noise source detection and recognition for audio indexing

Research Project

Project/Area Number	17500114
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Perception information processing/Intelligent robotics
Research Institution	Nagasaki University
Principal Investigator	MATSUNAGA Shoichi Nagasaki University, Faculty of Engineering, Professor (90380815)
Project Period (FY)	2005 – 2007
Project Status	Completed (Fiscal Year 2007)
Budget Amount	¥3,840,000 (Direct Cost: ¥3,600,000, Indirect Cost: ¥240,000); Fiscal Year 2007: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000); Fiscal Year 2006: ¥1,000,000 (Direct Cost: ¥1,000,000); Fiscal Year 2005: ¥1,800,000 (Direct Cost: ¥1,800,000)
Keywords	noise / acoustic feature / sound source detection / acoustic model / clustering / sound source identification / parameter / speech / spectrum / emotion
Research Abstract	We studied a stochastic approach to audio source detection that classifies segments as speech, noise, music, or silence. In addition to conventional surface acoustic features such as signal energy and pitch frequency, the approach uses new features based on spectral correlation. Experiments on broadcast news showed that these feature parameters capture the audio source more accurately. We also proposed a sound source detection approach for audio indexing based on elaborate noise modeling, and devised two clustering-based methods for generating multiple noise models. The first is based on frame-wise data similarity: it employs K-means clustering together with a smoothing technique to avoid inaccurate segmentation. The second is based on noise source similarity: it builds noise models from a tree structure generated by progressively merging noise clusters. Classification experiments show that the proposed methods detect audio sources more accurately than conventional methods. With four noise models generated by the second method, noise detection performance improved by 3.9% for periods in which sound sources did not overlap. For an audio stream that included overlapped segments, noise detection performance improved by 1.2% with no decrease in speech detection performance.
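
The frame-wise clustering idea in the abstract can be illustrated with a minimal sketch. It is not the project's implementation: the feature vectors, the evenly spaced centroid initialization, the majority-vote smoothing window, and the function names `kmeans` and `smooth_labels` are all assumptions made for illustration, and the abstract does not say which smoothing technique was actually used.

```python
from collections import Counter


def sq_dist(x, y):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def kmeans(frames, k, iters=20):
    """Plain K-means over frame-level feature vectors.

    Assumption: centroids are initialized from evenly spaced frames,
    which keeps the sketch deterministic.
    """
    centroids = [list(frames[i * len(frames) // k]) for i in range(k)]
    labels = [0] * len(frames)
    for _ in range(iters):
        # Assignment step: each frame goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(f, centroids[j]))
                  for f in frames]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [f for f, lab in zip(frames, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


def smooth_labels(labels, window=3):
    """Majority vote over a sliding window (assumed smoothing scheme).

    Isolated frame labels are replaced by the dominant label of their
    neighbours, which avoids very short spurious segments.
    """
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        segment = labels[max(0, i - half): i + half + 1]
        smoothed.append(Counter(segment).most_common(1)[0][0])
    return smoothed


# Toy two-dimensional "features": one noise type, then another,
# with a single outlier frame at index 3.
frames = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (9.8, 10.1),
          (0.0, 0.3), (0.3, 0.1), (10.0, 10.0), (10.2, 9.9),
          (9.9, 10.3), (10.1, 10.1), (9.7, 9.8), (10.3, 10.0)]

labels, _ = kmeans(frames, k=2)
print(labels)                    # [0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1]
print(smooth_labels(labels, 3))  # [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
```

In this toy run the outlier frame is first assigned to the second cluster and then absorbed into its neighbours by the smoothing pass, leaving two contiguous segments from which separate noise models could be trained.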

Report

(4 results)

2007 Annual Research Report, Final Research Report Summary
2006 Annual Research Report
2005 Annual Research Report

Research Products
(17 results)

Journal Article (4 results, of which Peer Reviewed: 2 results); Presentation (13 results)

[Journal Article] Sound source detection using multiple noise models (2008)
- Author(s)
  Shoichi Matsunaga
- Journal Title
  
  Proc. of ICASSP 2008 (accepted for publication)
- Related Report
  2007 Annual Research Report
- Peer Reviewed
[Journal Article] Emotion clustering using the results of subjective opinion tests for emotion recognition in infant's cries (2007)
- Author(s)
  Noriko Satoh
- Journal Title
  
  Proc. of INTERSPEECH 2007
  
  Pages: 2229-2232
- Related Report
  2007 Annual Research Report
- Peer Reviewed
[Journal Article] Emotion detection in infants' cries based on a maximum likelihood approach (2006)
- Author(s)
  Shoichi Matsunaga
- Journal Title
  
  Interspeech 2006
  
  Pages: 1834-1837
- Related Report
  2006 Annual Research Report
[Journal Article] Spectral Cross-Correlation Features for Audio Indexing of Broadcast News and Meetings (2005)
- Author(s)
  Masahide Yamaguchi
- Journal Title
  
  Interspeech 2005
- Related Report
  2005 Annual Research Report
[Presentation] Sound source detection using multiple noise models (2008)
- Author(s)
  Shoichi Matsunaga
- Organizer
  IEEE ICASSP 2008
- Place of Presentation
  Las Vegas, USA
- Year and Date
  2008-04-03
- Description
  From the "Final Research Report Summary (Japanese)"
- Related Report
  2007 Final Research Report Summary
[Presentation] Sound source detection using multiple noise models (2008)
- Author(s)
  Shoichi Matsunaga
- Organizer
  ICASSP 2008
- Place of Presentation
  Las Vegas, USA
- Year and Date
  2008-04-03
- Description
  From the "Final Research Report Summary (English)"
- Related Report
  2007 Final Research Report Summary
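[Presentation] Effect of environmental sound clustering for sound source identification (2007)
- Author(s)
  Shoichi Matsunaga
- Organizer
  IEICE Technical Meeting on Speech
- Place of Presentation
  Nagasaki
- Year and Date
  2007-10-25
- Description
  From the "Final Research Report Summary (Japanese)"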
- Related Report
  2007 Final Research Report Summary
[Presentation] Noise source clustering for audio source detection (2007)
- Author(s)
  Shoichi Matsunaga
- Organizer
  IEICE Technical report
- Place of Presentation
  Nagasaki
- Year and Date
  2007-10-25
- Description
  From the "Final Research Report Summary (English)"
- Related Report
  2007 Final Research Report Summary
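[Presentation] Effect of environmental sound clustering for sound source identification (2007)
- Author(s)
  Shoichi Matsunaga
- Organizer
  IEICE Technical Meeting on Speech
- Place of Presentation
  Nagasaki University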
- Year and Date
  2007-10-25
- Related Report
  2007 Annual Research Report
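[Presentation] Effect of sound source identification using noise clusters (2007)
- Author(s)
  Shoichi Matsunaga
- Organizer
  Acoustical Society of Japan (ASJ) Autumn Meeting
- Place of Presentation
  Kofu
- Year and Date
  2007-09-21
- Description
  From the "Final Research Report Summary (Japanese)"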
- Related Report
  2007 Final Research Report Summary
[Presentation] Noise-source clustering for accurate audio source detection (2007)
- Author(s)
  Shoichi Matsunaga
- Organizer
  ASJ Autumn Meeting
- Place of Presentation
  Kofu
- Year and Date
  2007-09-21
- Description
  From the "Final Research Report Summary (English)"
- Related Report
  2007 Final Research Report Summary
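[Presentation] A study of a news topic segmentation method based on detection of characteristic sound information (2007)
- Author(s)
  金城潤
- Organizer
  IEICE Kyushu Section Student Branch Conference
- Place of Presentation
  Okinawa
- Year and Date
  2007-09-20
- Description
  From the "Final Research Report Summary (Japanese)"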
- Related Report
  2007 Final Research Report Summary
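[Presentation] Language models using topic-dependent concatenated word units for speech recognition (2006)
- Author(s)
  古賀亮二
- Organizer
  IEICE Kyushu Section Student Branch Conference
- Place of Presentation
  Miyazaki
- Year and Date
  2006-09-27
- Description
  From the "Final Research Report Summary (Japanese)"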
- Related Report
  2007 Final Research Report Summary
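[Presentation] Effect of automatic silence labeling in acoustic model training (2005)
- Author(s)
  岡真樹
- Organizer
  IEICE Kyushu Section Student Branch Conference
- Place of Presentation
  Fukuoka
- Year and Date
  2005-09-28
- Description
  From the "Final Research Report Summary (Japanese)"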
- Related Report
  2007 Final Research Report Summary
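[Presentation] A study of the usefulness of multiple acoustic feature parameters for sound source identification (2005)
- Author(s)
  Masahide Yamaguchi
- Organizer
  IEICE Kyushu Section Student Branch Conference
- Place of Presentation
  Fukuoka
- Year and Date
  2005-09-28
- Description
  From the "Final Research Report Summary (Japanese)"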
- Related Report
  2007 Final Research Report Summary
[Presentation] Spectral cross-correlation features for audio indexing of broadcast news and meetings (2005)
- Author(s)
  Shoichi Matsunaga
- Organizer
  Interspeech 2005
- Place of Presentation
  Lisbon, Portugal
- Year and Date
  2005-09-05
- Description
  From the "Final Research Report Summary (Japanese)"
- Related Report
  2007 Final Research Report Summary
[Presentation] Spectral cross-correlation features for audio indexing of broadcast news and meetings (2005)
- Author(s)
  Shoichi Matsunaga
- Organizer
  Interspeech 2005
- Place of Presentation
  Lisbon, Portugal
- Year and Date
  2005-09-05
- Description
  From the "Final Research Report Summary (English)"
- Related Report
  2007 Final Research Report Summary
