2007 Fiscal Year Final Research Report Summary
Noise source detection and recognition for audio indexing
Project/Area Number |
17500114
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Perception information processing/Intelligent robotics
|
Research Institution | Nagasaki University |
Principal Investigator |
MATSUNAGA Shoichi Nagasaki University, Faculty of Engineering, Professor (90380815)
|
Project Period (FY) |
2005 – 2007
|
Keywords | noise / acoustic feature / sound source detection / acoustic model / clustering |
Research Abstract |
We studied a stochastic audio source detection approach that distinguishes speech, noise, music, and silence. In addition to conventional surface acoustic features such as signal energy and pitch frequency, the approach uses new features based on spectral correlation for more accurate detection. Experiments on broadcast news demonstrated that these feature parameters allow the audio source to be identified more accurately. This research also proposed a sound source detection approach for audio indexing based on elaborate noise-modeling techniques. For accurate detection, we devised two methods that generate multiple noise models through clustering. One method is based on frame-wise data similarity, and the other on noise source similarity. The former employs K-means clustering together with a smoothing technique to avoid inaccurate segmentation. The latter builds noise models from a tree data structure generated by progressively merging noise clusters. Classification experiments show that the proposed methods detect audio sources more accurately than conventional methods. When four noise models generated by the latter method were used, noise detection performance improved by 3.9% for periods in which the sound sources did not overlap. In experiments on an audio stream that included overlapped segments, noise detection performance improved by 1.2% without a decrease in speech detection performance.
|
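The frame-wise method in the abstract (K-means clustering followed by smoothing) can be illustrated with a minimal sketch. This is not the project's implementation: the report does not specify the features, distance measure, initialization, or smoothing technique. The sketch assumes small numeric feature vectors per frame, squared Euclidean distance, evenly spaced initial centroids, and a sliding-window majority vote as the smoothing step. The names `kmeans` and `smooth_labels` are hypothetical.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(frames, centroids):
    """Label each frame with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: sq_dist(f, centroids[c]))
        for f in frames
    ]


def kmeans(frames, k, iters=20):
    """Plain K-means over per-frame feature vectors.

    Assumption: centroids start from evenly spaced frames so that the
    result is deterministic; the report does not state an initialization.
    """
    n = len(frames)
    centroids = [list(frames[i * n // k]) for i in range(k)]
    for _ in range(iters):
        labels = assign(frames, centroids)
        for c in range(k):
            members = [f for f, lab in zip(frames, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(frames, centroids), centroids


def smooth_labels(labels, half_window=2):
    """Majority vote over a sliding window (assumed smoothing step).

    Isolated label flips shorter than the window are absorbed into the
    surrounding segment, which reduces spurious segment boundaries.
    """
    smoothed = []
    for i in range(len(labels)):
        window = labels[max(0, i - half_window): i + half_window + 1]
        smoothed.append(Counter(window).most_common(1)[0][0])
    return smoothed


# Toy data: two well-separated groups of 2-dimensional frames.
frames = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
          [5.0, 5.1], [5.1, 5.0], [5.0, 5.0]]
labels, centroids = kmeans(frames, k=2)
print(smooth_labels(labels, half_window=1))
```

In a setting like the one described, each resulting cluster would presumably be used to train its own noise model; that step is outside the scope of this sketch.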
Research Products
(12 results)