Parameter-control system of low bit rate for improving the quality of formant synthetic sounds

Research Project

Project/Area Number	60550302
Research Category	Grant-in-Aid for General Scientific Research (C)
Allocation Type	Single-year Grants
Research Field	Measurement and control engineering
Research Institution	Kumamoto University
Principal Investigator	WATANABE Akira, Faculty of Engineering, Kumamoto University, Professor (50040382)
Co-Investigator (Kenkyū-buntansha)	UEDA Yuichi, Faculty of Engineering, Kumamoto University, Assistant (00141961)
Project Period (FY)	1985 – 1986
Project Status	Completed (Fiscal Year 1986)
Budget Amount	¥1,900,000 (Direct Cost: ¥1,900,000); Fiscal Year 1986: ¥100,000 (Direct Cost: ¥100,000); Fiscal Year 1985: ¥1,800,000 (Direct Cost: ¥1,800,000)
Keywords	Formant analysis-synthesis system / Improvement of quality / Differentiated glottal wave / Low bit rate
Research Abstract	A new formant analysis-synthesis system in which several parameters of the differentiated glottal wave are controlled has been proposed to improve the naturalness of synthetic speech. This study mainly investigated the relationship between those parameters and naturalness. The results are summarized as follows: 1. According to a preliminary listening test, controlling the parameters of the differentiated glottal wave and inserting a random signal in unvoiced segments are effective for synthesizing speech of good quality. 2. The parameters of the differentiated glottal wave should be extracted after compensating for the phase distortion that the recording system introduces into the speech wave. 3. When the differentiated glottal wave is smoothed by fitting a 10th-order polynomial to each pitch period, its positive and negative peaks and their periods change very continuously in connected speech. It is therefore inferred that the continuity of the parameters has a strong influence on the generation of natural speech. 4. Among the parameters of the differentiated glottal wave, correlations are especially prominent between the periods of the negative peaks (the vocal-tract excitation points) and the other parameters. Speech synthesized using parameters estimated from the periods alone is almost the same in quality as speech synthesized using the many extracted parameters. In this case, the transmitted information can be compressed to a low bit rate of 750 bits/s.
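
The smoothing step mentioned in result 3 can be illustrated with a minimal sketch. It assumes that one pitch period of the differentiated glottal wave is already available as an array of samples; the function names, the sample rate, and the sine-shaped test signal are hypothetical and are not taken from the report. NumPy's least-squares polynomial fit stands in for whatever fitting procedure the authors actually used, and the peak positions returned here are offsets within a single period, not the peak-to-peak periods the abstract refers to.

```python
import numpy as np
from numpy.polynomial import Polynomial


def smooth_pitch_period(period, order=10):
    """Smooth one pitch period with a least-squares polynomial fit.

    `period` holds the samples of one pitch period (assumed input);
    `order` defaults to 10, the polynomial order named in the abstract.
    """
    period = np.asarray(period, dtype=float)
    n = np.arange(len(period))
    # Polynomial.fit rescales n to [-1, 1] internally, which keeps a
    # 10th-order fit numerically well conditioned.
    fit = Polynomial.fit(n, period, deg=order)
    return fit(n)


def peak_parameters(smoothed, sample_rate):
    """Return the positive and negative peak values of the smoothed
    period and their time offsets (seconds) from the period start."""
    i_pos = int(np.argmax(smoothed))
    i_neg = int(np.argmin(smoothed))
    return {
        "pos_peak": float(smoothed[i_pos]),
        "neg_peak": float(smoothed[i_neg]),
        "pos_time": i_pos / sample_rate,
        "neg_time": i_neg / sample_rate,
    }


if __name__ == "__main__":
    # Hypothetical stand-in signal: one sine cycle, 100 samples at 10 kHz.
    fs = 10_000
    n = np.arange(100)
    period = np.sin(2 * np.pi * n / 100)
    print(peak_parameters(smooth_pitch_period(period), fs))
```

Applying such a fit period by period and tracking the resulting peak values over time is one way to examine the continuity described in result 3.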

Report

(1 result)

1986 Final Research Report Summary