1992 Fiscal Year Final Research Report Summary
ASSESSMENT OF SPEECH SYNTHESIS TECHNOLOGY AS A HUMAN INTERFACE
Project/Area Number |
03650293
|
Research Category |
Grant-in-Aid for General Scientific Research (C)
|
Allocation Type | Single-year Grants |
Research Field |
Information engineering
|
Research Institution | UTSUNOMIYA UNIVERSITY |
Principal Investigator |
KASUYA Hideki UTSUNOMIYA UNIVERSITY, FACULTY OF ENGINEERING, PROFESSOR (20006240)
|
Co-Investigator(Kenkyū-buntansha) |
HIKI Shizuo WASEDA UNIVERSITY, FACULTY OF HUMAN SCIENCES, PROFESSOR (50006227)
|
Project Period (FY) |
1991 – 1992
|
Keywords | Speech synthesis / quality evaluation / human factors / human interface / speech perception |
Research Abstract |
1) Synthetic speech quality was assessed from three viewpoints: (a) intelligibility, (b) naturalness, and (c) suitability for individual applications. Suitability can be stated in terms of the purpose of the application, the users, and the environment. 2) The intelligibility study yielded the following findings: (1) Intelligibility scores of phonemes in bisyllables are more essential than those in monosyllables, in the sense that the former are more closely related to sentence intelligibility scores than the latter. (2) Word intelligibility should be measured in anomalous sentence contexts, as shown at Haskins Laboratories, U.S.A. We also found an adequate procedure for perceptual judgments of anomalous sentences. 3) We found that the naturalness of synthetic speech should be assessed not only by ordinary listeners but also by subjects trained in phonetics, since the naturalness test measures the degree of actual phonetic manifestation in synthetic speech. 4) The suitability of synthetic speech was studied experimentally in terms of the voice quality appropriate for particular applications. Voice quality was varied in terms of the average spectral shape and the fundamental frequency. We found that the appropriate average spectral shape depends on whether the synthetic speech is to be heard in a quiet room or in an office, and that the adequate average fundamental frequency depends on whether the speech is to be used for long hours or for a short period of time.
|
Research Products
(8 results)