
Speech Synthesis Method for Flexible Voice Quality Control

Research Project

Project/Area Number 08680386
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Single-year Grants
Section General
Research Field Intelligent informatics
Research Institution Utsunomiya University

Principal Investigator

KASUYA Hideki  Utsunomiya University, Faculty of Engineering, Professor (20006240)

Co-Investigator (Kenkyū-buntansha) YANG Chang-Sheng  Utsunomiya University, Faculty of Engineering, Assistant (80272219)
Project Period (FY) 1996 – 1997
Project Status Completed (Fiscal Year 1997)
Budget Amount
¥2,600,000 (Direct Cost: ¥2,600,000)
Fiscal Year 1997: ¥700,000 (Direct Cost: ¥700,000)
Fiscal Year 1996: ¥1,900,000 (Direct Cost: ¥1,900,000)
Keywords Speech Synthesis / Voice Quality / Individuality / ARX Analysis / Formant / Voice Source Characteristics / Hoarse Voice / Whisper / Formant Synthesis
Research Abstract

Flexible voice quality control in speech synthesis covers not only qualities such as whispered, breathy, and tense voice but also talker individuality resulting from physiological differences in the speech organs. The major aim of this research project is to establish a basis for realizing such control in speech synthesis. This year we focused on synthesis strategies for generating whispered, breathy, harsh, and tense voice qualities as well as various talker individualities, using the ARX (auto-regressive with exogenous input) speech analysis-synthesis method developed last year. For whispered voice, we investigated the acoustic mechanism behind its characteristic formant structure and arrived at a new theory that explains the frequency shift of the lower formants, based on MRI (magnetic resonance imaging) measurements of the larynx and computer simulation of the acoustic resonance of the vocal tract. To produce breathy voice, we proposed a method for controlling the voicing source parameters and the amount of laryngeal noise. For harsh voice, we first developed an analysis-conversion-synthesis system that allows us to manipulate jitter, shimmer, spectral fluctuation, and laryngeal noise, and then studied the contributions of these parameters to the perception of harsh voice. The experiments showed that cross effects exist among these parameters in generating harsh voice quality. Tense voice was successfully generated by controlling the open quotient and spectral tilt of the voicing source waveform. Talker individuality was found to be related largely to the static nature of formant trajectories and less to their dynamics.
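As an illustration of two of the harsh-voice parameters named in the abstract, the following minimal sketch perturbs the period length (jitter) and peak amplitude (shimmer) of an impulse train. It is not the project's analysis-conversion-synthesis system: the function names, the independent Gaussian perturbation model, and the bare impulse source are all assumptions made for this example, and spectral fluctuation and laryngeal noise are left out.

```python
import random


def perturbed_periods(f0_hz, n_periods, jitter, shimmer, seed=0):
    """Return per-period (duration_s, amplitude) pairs.

    jitter and shimmer are relative standard deviations (0.02 means 2 %)
    of the period length and peak amplitude. Drawing them independently
    per period from a Gaussian is an assumption of this sketch.
    """
    rng = random.Random(seed)
    t0 = 1.0 / f0_hz
    periods = []
    for _ in range(n_periods):
        # Clamp so a period never collapses and an amplitude never goes negative.
        duration = t0 * max(0.1, 1.0 + rng.gauss(0.0, jitter))
        amplitude = max(0.0, 1.0 + rng.gauss(0.0, shimmer))
        periods.append((duration, amplitude))
    return periods


def mean_relative_perturbation(values):
    """Mean absolute difference of consecutive values, relative to their mean."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def pulse_train(periods, sample_rate=16000):
    """Render one impulse per period (a stand-in for a real glottal source)."""
    total = sum(duration for duration, _ in periods)
    signal = [0.0] * (int(total * sample_rate) + 1)
    t = 0.0
    for duration, amplitude in periods:
        signal[int(t * sample_rate)] = amplitude
        t += duration
    return signal


if __name__ == "__main__":
    rough = perturbed_periods(120.0, 200, jitter=0.03, shimmer=0.10, seed=1)
    print("jitter  :", mean_relative_perturbation([d for d, _ in rough]))
    print("shimmer :", mean_relative_perturbation([a for _, a in rough]))
    print("samples :", len(pulse_train(rough)))
```

Applying `mean_relative_perturbation` to the durations gives a jitter-style measure and applying it to the amplitudes gives a shimmer-style measure. Varying the two inputs together is one simple way to probe the kind of cross effects the abstract reports.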

Report

(3 results)
  • 1997 Annual Research Report   Final Research Report Summary
  • 1996 Annual Research Report
  • Research Products

    (17 results)

All Publications (17 results)

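  • [Publications] M.Matsuda and H.Kasuya: "An acoustic theory about whisper" (in Japanese) Proceedings of the 1997 Autumn Meeting of the Acoustical Society of Japan. I. 299-300 (1997)

    • Description
      From the Final Research Report Summary (Japanese)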
    • Related Report
      1997 Final Research Report Summary
  • [Publications] C.S.Yang and H.Kasuya: "Automatic estimation of formant and voice source parameters using a subspace based algorithm" Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing. (in press). (1998)

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      1997 Final Research Report Summary
  • [Publications] W.Zhu and H.Kasuya: "Perceptual contributions of static and dynamic features of vocal tract characteristics to talker individuality" IEICE Trans.,Fundamentals of Electronics,Communications and Computer Sciences. E81-A,2. 268-274 (1998)

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      1997 Final Research Report Summary
  • [Publications] W.Zhu and H.Kasuya: "A speech analysis-synthesis-editing system based on the ARX speech production model" J.Acoust.Soc.Jpn.(E). 19,3 (in press). (1998)

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      1997 Final Research Report Summary
  • [Publications] T.Ohtsuka, C.S.Yang and H.Kasuya: "Automatic creation of CV templates for formant type speech synthesis based on HMM-based segmentation and syllable boundary detection" Proceedings of International Congress on Acoustics. (in press). (1998)

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      1997 Final Research Report Summary
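  • [Publications] Y.Endo and H.Kasuya: "A speech analysis-conversion-synthesis system that takes period-to-period fluctuations into account" (in Japanese) Transactions of the IEICE A. (accepted for publication). (1998)

    • Description
      From the Final Research Report Summary (Japanese)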
    • Related Report
      1997 Final Research Report Summary
  • [Publications] C.S.Yang and H.Kasuya: "Automatic estimation of formant and voice source parameters using a subspace based algorithm" Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing. (in press). (1998)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      1997 Final Research Report Summary
  • [Publications] W.Zhu and H.Kasuya: "Perceptual contributions of static and dynamic features of vocal tract characteristics to talker individuality" IEICE Trans., Fundamentals of Electronics, Communications and Computer Sciences. E81-A,No.2. 268-274 (1998)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      1997 Final Research Report Summary
  • [Publications] W.Zhu and H.Kasuya: "A speech analysis-synthesis-editing system based on the ARX production model" J.Acoust.Soc.Jpn.(E). Vol.19, No.3 (in press). (1998)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      1997 Final Research Report Summary
  • [Publications] T.Ohtsuka, C.S.Yang and H.Kasuya: "Automatic creation of CV templates for formant type speech synthesis based on HMM-based segmentation and syllable boundary detection" Proceedings of Int.Congress Acoust. (in press). (1998)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      1997 Final Research Report Summary
  • [Publications] M.Matsuda and H.Kasuya: "An acoustic theory about whisper" Proceedings of The 1997 Autumn Meeting of the Acoustical Society of Japan. 299-300 (1997)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      1997 Final Research Report Summary
  • [Publications] C.S.Yang and H.Kasuya: "Automatic estimation of formant and voice source parameters using a subspace based algorithm" Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing. (in press). (1998)

    • Related Report
      1997 Annual Research Report
  • [Publications] W.Zhu and H.Kasuya: "Perceptual contributions of static and dynamic features of vocal tract characteristics to talker individuality" IEICE Trans.,Fundamentals of Electronics,Communications and Computer Sciences. E81-A,2. 268-274 (1998)

    • Related Report
      1997 Annual Research Report
  • [Publications] W.Zhu and H.Kasuya: "A speech analysis-synthesis-editing system based on the ARX speech production model" J.Acoust.Soc.Jpn.(E). 19,3 (in press). (1998)

    • Related Report
      1997 Annual Research Report
  • [Publications] T.Ohtsuka, C.S.Yang and H.Kasuya: "Automatic creation of CV templates for formant type speech synthesis based on HMM-based segmentation and syllable boundary detection" Proceedings of Int.Congress Acoust. (in press). (1998)

    • Related Report
      1997 Annual Research Report
  • [Publications] M.Matsuda and H.Kasuya: "An acoustic theory about whisper" (in Japanese) Proceedings of the 1997 Autumn Meeting of the Acoustical Society of Japan. 1. 299-300 (1997)

    • Related Report
      1997 Annual Research Report
  • [Publications] W.Ding,et al.: "Fast and robust joint estimation of vocal tract and voice source parameters" Proc. ICASSP97. (1997)

    • Related Report
      1996 Annual Research Report

Published: 1996-04-01   Modified: 2016-04-21  
