Project/Area Number |
10680374
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | School of Information Science, Japan Advanced Institute of Science and Technology |
Principal Investigator |
AKAGI Masato Japan Advanced Institute of Science and Technology, School of Information Science, Professor (20242571)
|
Co-Investigator (Kenkyū-buntansha) |
IWAKI Mamoru Niigata University, Graduate School of Science and Technology, Associate Professor (20262595)
|
Project Period (FY) |
1998 – 2000
|
Project Status |
Completed (Fiscal Year 2000)
|
Budget Amount |
¥3,100,000 (Direct Cost: ¥3,100,000)
Fiscal Year 2000: ¥500,000 (Direct Cost: ¥500,000)
Fiscal Year 1999: ¥1,200,000 (Direct Cost: ¥1,200,000)
Fiscal Year 1998: ¥1,400,000 (Direct Cost: ¥1,400,000)
|
Keywords | noise / reverberation / auditory mechanism / nerve firing / interaural time difference (ITD) / signal direction estimation / fundamental frequency / cancellation / noise suppression / peripheral auditory system |
Research Abstract |
This research discusses models of speech enhancement and sound segregation based on knowledge of human psychoacoustics and auditory physiology. The cancellation model is used to enhance speech. Particular attention is paid to reducing noise with a spatial filtering technique and to making fundamental frequency estimation more robust with a frequency filtering technique; both techniques adopt concepts from the cancellation model. In addition, constraints related to the heuristic regularities proposed by Bregman are used to overcome the problem of segregating two acoustic sources. Simulation results show that both spatial and frequency filtering are useful for enhancing speech, so these filtering methods can be used effectively at the front end of automatic speech recognition systems and for speech feature extraction. The sound segregation model can precisely extract a desired signal from a noisy signal, even at the waveform level. This research also discusses models of sound source direction estimation based on physiological data on mammalian hearing. The model can explain the relationship between the transmission of temporal and phase information by nerve firing and the accuracy of interaural time difference (ITD) estimation.
|
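The abstract does not give the project's algorithms, so the following is only an illustrative sketch of the general cancellation idea applied to fundamental frequency estimation. It is not the investigators' method. It assumes a delay-and-subtract comb filter, x(t) - x(t - T), and takes as the period the shortest delay T that leaves the least residual energy. The function name `cancellation_f0`, its parameters, the search range, and the tolerance are all hypothetical choices made for this example.

```python
import math


def cancellation_f0(signal, sample_rate, f0_min=80.0, f0_max=400.0, tol=1e-6):
    """Estimate F0 as the shortest delay whose comb filter cancels the signal.

    Illustrative assumption: a periodic signal with period T is cancelled by
    x(t) - x(t - T), so the residual energy dips at T (and at multiples of T).
    """
    lag_min = int(sample_rate / f0_max)   # shortest candidate period, in samples
    lag_max = int(sample_rate / f0_min)   # longest candidate period, in samples
    n = len(signal) - lag_max             # same frame length for every lag
    if n <= 0:
        raise ValueError("signal is too short for the requested f0_min")

    # Energy of the unfiltered frame, used to scale the tie-break tolerance.
    energy = sum(signal[i + lag_max] ** 2 for i in range(n))

    # Residual energy after delay-and-subtract, for each candidate lag.
    residuals = {}
    for lag in range(lag_min, lag_max + 1):
        residuals[lag] = sum(
            (signal[i + lag_max] - signal[i + lag_max - lag]) ** 2
            for i in range(n)
        )

    # Multiples of the true period cancel equally well, so prefer the
    # shortest lag whose residual is within a small tolerance of the minimum.
    threshold = min(residuals.values()) + tol * energy
    best_lag = min(lag for lag, r in residuals.items() if r <= threshold)
    return sample_rate / best_lag


if __name__ == "__main__":
    sr = 8000
    # Synthetic harmonic test tone: 200 Hz fundamental plus its second harmonic.
    tone = [
        math.sin(2 * math.pi * 200 * i / sr) + 0.5 * math.sin(2 * math.pi * 400 * i / sr)
        for i in range(800)
    ]
    print(cancellation_f0(tone, sr))
```

The estimate is quantised to whole-sample delays, and on noisy speech the fixed tolerance would probably need tuning. Both are limitations of this sketch, not claims about the project's models.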