1996 Fiscal Year Final Research Report Summary

Estimation of vocal tract configurations from magnetic resonance images and synthesizing speech sounds

Research Project

Project/Area Number 07650506
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Single-year Grants
Section General
Research Field Measurement and control engineering
Research Institution Kumamoto University

Principal Investigator

SONODA Yorinobu  Kumamoto University, Faculty of Engineering, Professor (70037836)

Co-Investigator (Kenkyū-buntansha) OGATA Kohichi  Kumamoto University, Graduate School of Science and Technology, Research Associate (10264277)
Project Period (FY) 1995 – 1996
Keywords magnetic resonance image / vocal tract configuration / speech synthesis / image processing / vocal tract simulator
Research Abstract

The main aim of this project is to investigate vocal tract configurations estimated from (1) magnetic resonance images (MRIs) and (2) real speech signals. In this term in particular, experiments were conducted on estimating the vocal tract configuration from real speech signals.
A computer simulator analogous to the human speech production mechanism was developed, and the configuration of the tract was estimated with an "Analysis by Synthesis" algorithm. The simulator consists of three parts: vocal source, vocal tract, and lip radiation. The parts were combined into a hybrid system represented in both the frequency domain and the time domain, which simplifies inserting a loss term into the tract model. The vocal tract model consists of 20 cylindrical tubes that are equal in length and differ in cross-sectional area.
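
As a rough illustration of the tube model described above, the following Python sketch computes resonance (formant) frequencies of a chain of equal-length cylindrical sections. It is not the project's simulator: it is lossless, works only in the frequency domain, and assumes an ideal open end at the lips instead of a lip radiation model. The physical constants, the 17 cm tract length, the uniform 3 cm² area function, and the function names `chain_d` and `formants` are all assumptions made for illustration.

```python
import math

C_SOUND = 35000.0   # speed of sound in cm/s (assumed)
RHO = 0.00114       # air density in g/cm^3 (assumed)

def chain_d(freq_hz, areas_cm2, section_len_cm):
    """Return the D element of the cascaded lossless chain (ABCD) matrix."""
    k = 2.0 * math.pi * freq_hz / C_SOUND
    cs, sn = math.cos(k * section_len_cm), math.sin(k * section_len_cm)
    a, b, c, d = 1.0 + 0j, 0j, 0j, 1.0 + 0j      # identity matrix
    for area in areas_cm2:                       # glottis end first, lips last
        z0 = RHO * C_SOUND / area                # characteristic impedance
        tb, tc = 1j * z0 * sn, 1j * sn / z0      # off-diagonal section terms
        a, b, c, d = (a * cs + b * tc, a * tb + b * cs,
                      c * cs + d * tc, c * tb + d * cs)
    # With zero pressure at the lips, U_glottis = D * U_lips,
    # so resonances are the frequencies where D crosses zero.
    return d.real

def formants(areas_cm2, section_len_cm, f_max=5000.0, step=5.0):
    """Find zero crossings of D on a frequency grid and refine by bisection."""
    found = []
    f_prev = step
    d_prev = chain_d(f_prev, areas_cm2, section_len_cm)
    f = f_prev + step
    while f <= f_max:
        d_cur = chain_d(f, areas_cm2, section_len_cm)
        if d_prev * d_cur < 0.0:
            lo, hi, d_lo = f_prev, f, d_prev
            for _ in range(40):
                mid = 0.5 * (lo + hi)
                d_mid = chain_d(mid, areas_cm2, section_len_cm)
                if d_lo * d_mid <= 0.0:
                    hi = mid
                else:
                    lo, d_lo = mid, d_mid
            found.append(0.5 * (lo + hi))
        f_prev, d_prev = f, d_cur
        f += step
    return found

N_SECTIONS = 20                    # number of tubes, as in the report
TRACT_LEN_CM = 17.0                # assumed; not given in the report
areas = [3.0] * N_SECTIONS         # hypothetical uniform area function (cm^2)
result = formants(areas, TRACT_LEN_CM / N_SECTIONS)
print([round(f) for f in result])
```

For a uniform tube the result should agree with the quarter-wavelength resonances, which are odd multiples of c/(4L), about 515 Hz for the assumed values. Replacing `areas` with a non-uniform list of 20 values gives the formants of that shape under the same simplifications.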
Five Japanese vowels were synthesized with the vocal tract simulator, using vocal tract configurations estimated from the MRIs, and their formant patterns (frequencies) were estimated. The first formant frequencies of the vowels /a/ and /i/ were estimated to be lower than those of real speech by about 120 Hz and 70 Hz, respectively. Otherwise, relative errors were within 5%, except for the fourth formant frequency of /i/.
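
The relative errors quoted in this summary are presumably of the usual form (synthesized minus real, divided by real). A minimal Python sketch of that arithmetic follows; the 800 Hz reference value is hypothetical and is not taken from the report, and only the 120 Hz offset comes from the text above.

```python
def relative_error_percent(synth_hz, real_hz):
    """Signed relative error of a synthesized formant frequency, in percent."""
    return 100.0 * (synth_hz - real_hz) / real_hz

# Hypothetical reference value for illustration only; the report does not
# list the measured formant frequencies.
real_f1 = 800.0
synth_f1 = real_f1 - 120.0      # "about 120 Hz lower", as stated for /a/
err = relative_error_percent(synth_f1, real_f1)
print(f"{err:.1f} %")           # prints "-15.0 %"
```

The same absolute offset gives a larger relative error for a low formant than for a high one, which is one reason first-formant errors stand out in percentage terms.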
For sounds synthesized from the shapes estimated from real speech sounds, the experimental results approximated the spectral patterns of the real sounds rather well, with relative errors within 3%. However, errors in the first formant frequencies of /a/, /u/, and /e/ were relatively large, ranging from 8% to 9%.
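
The Analysis by Synthesis idea used for estimation from speech can be sketched as a loop that synthesizes formants from a candidate area function, compares them with target formants, and keeps changes that reduce the mismatch. The Python sketch below is a toy under stated assumptions: it uses four sections instead of 20, a lossless tube with an ideal open lip end, a simple coordinate search, and target formants generated from a hypothetical area function in place of recorded speech. The report does not state which error measure or search procedure was actually used, and all names here are illustrative.

```python
import math

C_SOUND = 35000.0  # speed of sound in cm/s (assumed)

def d_element(freq_hz, areas, sec_len):
    """D element of the cascaded lossless chain matrix (impedances normalized)."""
    k = 2.0 * math.pi * freq_hz / C_SOUND
    cs, sn = math.cos(k * sec_len), math.sin(k * sec_len)
    a, b, c, d = 1.0 + 0j, 0j, 0j, 1.0 + 0j
    for area in areas:                        # glottis end first
        z0 = 1.0 / area                       # rho*c dropped: D does not depend on it
        tb, tc = 1j * z0 * sn, 1j * sn / z0
        a, b, c, d = (a * cs + b * tc, a * tb + b * cs,
                      c * cs + d * tc, c * tb + d * cs)
    return d.real

def formants(areas, sec_len, count=3, f_max=5000.0, step=10.0):
    """First `count` zero crossings of D, located by linear interpolation."""
    found = []
    f_prev, d_prev = 0.0, 1.0                 # D equals 1 at 0 Hz
    f = step
    while f <= f_max and len(found) < count:
        d_cur = d_element(f, areas, sec_len)
        if d_prev != 0.0 and d_prev * d_cur <= 0.0:
            found.append(f_prev + step * d_prev / (d_prev - d_cur))
        f_prev, d_prev = f, d_cur
        f += step
    return found

def cost(areas, sec_len, target):
    """Sum of squared relative formant errors; infinite if formants are missing."""
    synth = formants(areas, sec_len, count=len(target))
    if len(synth) < len(target):
        return math.inf
    return sum(((s - t) / t) ** 2 for s, t in zip(synth, target))

def estimate_areas(target, n_sections=4, sec_len=4.25, sweeps=20):
    """Coordinate search: scale one section at a time, keep improvements."""
    areas = [3.0] * n_sections                # start from a uniform tube
    best = cost(areas, sec_len, target)
    factor = 1.3
    for _ in range(sweeps):
        improved = False
        for i in range(n_sections):
            for scale in (factor, 1.0 / factor):
                trial = list(areas)
                trial[i] = min(10.0, max(0.2, trial[i] * scale))
                trial_cost = cost(trial, sec_len, target)
                if trial_cost < best:
                    areas, best, improved = trial, trial_cost, True
        if not improved:
            factor = 1.0 + 0.5 * (factor - 1.0)   # shrink the step size
    return areas, best

SEC_LEN = 4.25                                # 4 sections x 4.25 cm = 17 cm (assumed)
hidden_areas = [1.0, 1.0, 6.0, 6.0]           # hypothetical "true" shape, cm^2
target = formants(hidden_areas, SEC_LEN)      # stands in for measured formants
initial_cost = cost([3.0] * 4, SEC_LEN, target)
est_areas, final_cost = estimate_areas(target)
print("target formants (Hz):", [round(f) for f in target])
print("estimated areas (cm^2):", [round(a, 2) for a in est_areas])
print("cost:", initial_cost, "->", final_cost)
```

Because different area functions can produce nearly the same formants, the recovered areas need not equal the hidden ones; the search only drives the formant mismatch down.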

  • Research Products

    (4 results)

All Publications (4 results)

  • [Publications] Keisuke Mori: "Relationship between lip shapes and acoustical characteristics during speech" Proceedings of 3rd Joint meeting of ASA and ASJ. 879-882 (1996)

    • Description
      From the Summary of Research Results Report (Japanese)
  • [Publications] Kohichi Ogata: "Development of articulatory measuring system by using magnetometer and optical sensors" Proceedings of 3rd Joint meeting of ASA and ASJ. 889-894 (1996)

    • Description
      From the Summary of Research Results Report (Japanese)
  • [Publications] K. Mori and Y. Sonoda: "Relationship between lip shapes and acoustical characteristics during speech" Proceedings of 3rd Joint meeting of Acoustical Society of America (ASA) and Acoustical Society of Japan (ASJ). 879-882 (1996)

    • Description
      From the Summary of Research Results Report (English)
  • [Publications] K. Ogata and Y. Sonoda: "Development of articulatory measuring system by using magnetometer and optical sensors" Proceedings of 3rd Joint meeting of Acoustical Society of America (ASA) and Acoustical Society of Japan (ASJ). 889-894 (1996)

    • Description
      From the Summary of Research Results Report (English)

Published: 1999-03-09  
