Project/Area Number |
01850068
|
Research Category |
Grant-in-Aid for Developmental Scientific Research
|
Allocation Type | Single-year Grants |
Research Field |
Electronic and Communication Systems Engineering
|
Research Institution | University of Tsukuba |
Principal Investigator |
ITAHASHI Shuichi University of Tsukuba, Institute of Information Sciences and Electronics, Professor (70151454)
|
Co-Investigator(Kenkyū-buntansha) |
KUREMATSU Akira ATR Interpreting Telephone Research Laboratories, President
MAKINO Shuzo Tohoku University, Research Center for Applied Information Sciences, Associate Professor (00089806)
KOBATAKE Hidefumi Tokyo University of Agriculture and Technology, Faculty of Technology, Professor (80013720)
SHIRAI Katsuhiko Waseda University, School of Science and Engineering, Department of Electrical Engineering, Professor (10063702)
FUJISAKI Hiroya Science University of Tokyo, Faculty of Engineering, Professor (80010776)
|
Project Period (FY) |
1989 – 1991
|
Keywords | CD-ROM / DAT / Japanese / Noise / Speech / Speech corpus / Speech database / Speech processing |
Research Abstract |
The speech material comprised 216 phonetically balanced words, 110 monosyllables, 70 short sentences, 11 interrogative sentences, 7 sentences for speech quality measurement, one folk tale, weather forecast sentences, and narrative sentences. Speech samples based on this material were recorded onto digital audio tape (DAT) from ten male and ten female speakers. Noise in a computer room was also recorded to investigate the influence of noise on speech processing. Noise data of two hours' duration were recorded onto DAT under 4 conditions that varied the number and kinds of machines in operation. The speech data of the 20 speakers (4 utterances per item, 2 hours per speaker, 40 hours in all) were checked by listening, and the data of 12 speakers with good speech and recording quality (6 male and 6 female, 24 hours in all) were selected. Master tapes for the speech database were produced with start IDs and program numbers assigned to the major items so that required items can be retrieved easily. Checklists describing pronunciation and recording conditions in detail were prepared. For the 7 kinds of continuous speech data mentioned above, one utterance of good quality was selected from the 4 repetitions and recorded onto CD-ROM. This appears to be the first attempt to create a CD-ROM speech database of Japanese sentence speech. The best utterances of the 110 monosyllables and the 216 phonetically balanced words were then also recorded onto CD-ROM. One of the major purposes of a speech database is to support the development, comparison, and evaluation of techniques for speech analysis, synthesis, and recognition. Several kinds of speech analysis and recognition experiments were therefore performed. Some methods proved useful for speech/non-speech discrimination and for speech recognition in noisy environments.
|