1998 Fiscal Year Final Research Report Summary

Model and example based prosodic feature extraction and its efficient integration for speech recognition along with phoneme-based recognition

Research Project

Project/Area Number 08680391
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Single-year Grants
Section General
Research Field Intelligent informatics
Research Institution Japan Advanced Institute of Science and Technology, Hokuriku

Principal Investigator

SHIMODAIRA Hiroshi  JAIST, School of Information Science, Associate Professor (30206239)

Co-Investigator (Kenkyū-buntansha) NAKAI Mitsuru  JAIST, School of Information Science, Research Associate (60283149)
Project Period (FY) 1996 – 1998
Keywords prosody / prosodic-boundary / pitch pattern / speech recognition
Research Abstract

This research aims to exploit the prosodic information contained in speech for automatic speech recognition, on the premise that prosodic information, like phonemic information, plays an important role in recognition.
(a) Robust pitch determination algorithm: In contrast to conventional pitch trackers based on numerical curve-fitting, the proposed method estimates a continuous F_0 pattern with a quantitative pitch generation model of the kind often used to synthesize F_0 contours from prosodic event commands. An inverse filtering technique provides the initial candidates for the prosodic commands. A beam-search algorithm and an N-best technique are then used to find the optimal command sequence among these candidates efficiently. Preliminary experiments on a male speaker from the ATR B-set database showed promising results in both the quality of the restored pattern and the estimation of the prosodic events.
Alongside the improved F_0 smoothing technique above, a novel frame-wise pitch determination algorithm that also provides a reliability measure for the estimated pitch frequency was proposed.
(b) Prosodically guided speech recognition:
i. As a first step toward speech recognition based on prosodic information, an isolated word recognition task in a noisy environment was studied. Experiments showed that the word pitch pattern helps reduce ambiguity when discriminating between similar words.
ii. It was shown that the dependencies between consecutive phrases can be measured by means of prosodic features; an accuracy of 87% was obtained on the ATR read speech data.
iii. A prototype of a prosodically guided speech recognition system was developed, in which phrase hypotheses produced by phoneme recognition are rescored on the basis of the likelihood of phrase boundaries measured from prosodic features.
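
The rescoring step in (b) iii can be illustrated with a minimal sketch. The summary does not describe the prototype's actual scoring formula, so everything below is an assumption made for illustration: the PhraseHypothesis structure, the per-frame boundary probabilities, the log-linear combination, and the weight parameter are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class PhraseHypothesis:
    """Hypothetical container for one phrase-sequence hypothesis."""
    phrases: List[str]        # phrase sequence proposed by phoneme recognition
    phoneme_logprob: float    # log-likelihood from the phoneme-based recognizer
    boundaries: List[int]     # frame indices of the hypothesized phrase boundaries


def boundary_logprob(boundaries: List[int],
                     boundary_prob: Dict[int, float],
                     floor: float = 1e-6) -> float:
    """Sum the log-probabilities of the hypothesized boundaries.

    boundary_prob maps a frame index to an assumed prosodic boundary
    probability; frames without an entry are floored to avoid log(0).
    """
    return sum(math.log(max(boundary_prob.get(b, 0.0), floor))
               for b in boundaries)


def rescore(hyps: List[PhraseHypothesis],
            boundary_prob: Dict[int, float],
            weight: float = 1.0) -> List[Tuple[float, PhraseHypothesis]]:
    """Rank hypotheses by phoneme score plus weighted prosodic boundary score."""
    scored = [(h.phoneme_logprob
               + weight * boundary_logprob(h.boundaries, boundary_prob), h)
              for h in hyps]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Usage: the second hypothesis has a slightly worse phoneme score, but its
# boundary falls where the (made-up) prosodic evidence is strong.
hyps = [
    PhraseHypothesis(["a"], phoneme_logprob=-10.0, boundaries=[50]),
    PhraseHypothesis(["b"], phoneme_logprob=-10.5, boundaries=[30]),
]
probs = {30: 0.9, 50: 0.1}
best_score, best_hyp = rescore(hyps, probs)[0]
print(best_hyp.phrases, round(best_score, 3))
```

Summing boundary log-probabilities penalizes hypotheses with more boundaries, so a real system would likely normalize by boundary count or also score non-boundary frames; that refinement is left out of this sketch.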

  • Research Products

    (18 results)

All Publications (18 results)

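  • [Publications] 漢野 救泰 (S. Kanno): "Detection of Voiced Segments in Non-stationary, High-Noise Environments Using the Prediction Residual of the Low-Frequency Spectrum" (in Japanese) IEICE Transactions D-II. J80-DII,1. 26-35 (1997)

    • Description
      From the "Final Research Report Summary (Japanese)"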
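  • [Publications] 中井 満 (Mitsuru Nakai): "Phrase Boundary Detection in Continuous Speech Based on Templates Using an F_0 Generation Model" (in Japanese) IEICE Transactions D-II. J80-DII,10. 2605-2614 (1997)

    • Description
      From the "Final Research Report Summary (Japanese)"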
  • [Publications] Mitsuru Nakai: "Accent Phrase Segmentation by F_0 Clustering Using Superpositional Modelling" Computing Prosody, Springer. 343-359 (1997)

    • Description
      From the "Final Research Report Summary (Japanese)"
  • [Publications] Mitsuru Nakai: "On Representation of Fundamental Frequency of Speech for Prosody Analysis Using Reliability Function" Proc.EuroSpeech'97. 243-246 (1997)

    • Description
      From the "Final Research Report Summary (Japanese)"
  • [Publications] Hiroshi Shimodaira: "Restoration of Pitch Pattern of Speech Based on a Pitch Generation Model" Proc.EuroSpeech'97. 521-524 (1997)

    • Description
      From the "Final Research Report Summary (Japanese)"
  • [Publications] Hiroshi Shimodaira: "Modified Minimum Classification Error Learning and Its Application to Neural Networks" Advances in Pattern Recognition. 785-794 (1998)

    • Description
      From the "Final Research Report Summary (Japanese)"
  • [Publications] Jun Rokui: "Improving the Generalization Performance of the Minimum Classification Error Learning and Its Application to Neural Networks" The Fifth International Conference on Neural Information Processing (ICONIP'98). 63-67 (1998)

    • Description
      From the "Final Research Report Summary (Japanese)"
  • [Publications] Mitsuru Nakai: "The Use of F_0 Reliability Function for Prosodic Command Analysis on F_0 Contour Generation Model" Proc. of the 5th International Conference on Spoken Language Processing (ICSLP98). #998 (1998)

    • Description
      From the "Final Research Report Summary (Japanese)"
  • [Publications] Hiroshi Shimodaira: "Improving the Generalization Performance of the MCE/GPD Learning" Proc. of the 5th International Conference on Spoken Language Processing (ICSLP98). #795 (1998)

    • Description
      From the "Final Research Report Summary (Japanese)"
  • [Publications] S.Kanno and H.Shimodaira: "Voiced Sound Detection under Non-stationary and Heavy Noisy Environment Using the Prediction of Low-Frequency Spectrum" IEICE Trans.D-II,Vol.J80-DII,No.1. 26-35 (1997)

    • Description
      From the "Final Research Report Summary (English)"
  • [Publications] M.Nakai, H.Singer, Y.Sagisaka and H.Shimodaira: "Accent Phrase Segmentation on F_0 Templates Using a Superpositional Prosodic Model" IEICE Trans.D-II,Vol.J80-DII,No.10. 2605-2614 (1997)

    • Description
      From the "Final Research Report Summary (English)"
  • [Publications] Mitsuru Nakai, Harald Singer, Yoshinori Sagisaka, Hiroshi Shimodaira: "Accent Phrase Segmentation by F_0 Clustering Using Superpositional Modelling" Computing Prosody (Y. Sagisaka, N. Campbell, N. Higuchi, Eds.) Springer. 343-359 (1997)

    • Description
      From the "Final Research Report Summary (English)"
  • [Publications] Mitsuru Nakai and Hiroshi Shimodaira: "On Representation of Fundamental Frequency of Speech for Prosody Analysis Using Reliability Function" Proc. EuroSpeech'97. 243-246 (1997)

    • Description
      From the "Final Research Report Summary (English)"
  • [Publications] Hiroshi Shimodaira, Mitsuru Nakai and Akihiro Kumata: "Restoration of Pitch Pattern of Speech Based on a Pitch Generation Model" Proc. EuroSpeech'97. 521-524 (1997)

    • Description
      From the "Final Research Report Summary (English)"
  • [Publications] Hiroshi Shimodaira, Jun Rokui and Mitsuru Nakai: "Modified Minimum Classification Error Learning and Its Application to Neural Networks" Advances in Pattern Recognition (Joint IAPR International Workshops SSPR'98 and SPR'98). 785-794 (1998)

    • Description
      From the "Final Research Report Summary (English)"
  • [Publications] Jun Rokui and Hiroshi Shimodaira: "Improving the Generalization Performance of the Minimum Classification Error Learning and Its Application to Neural Networks" The Fifth International Conference on Neural Information Processing (ICONIP'98). 63-67 (1998)

    • Description
      From the "Final Research Report Summary (English)"
  • [Publications] Mitsuru Nakai and Hiroshi Shimodaira: "The Use of F_0 Reliability Function for Prosodic Command Analysis on F_0 Contour Generation Model" The 5th International Conference on Spoken Language Processing (ICSLP98). #998 (1998)

    • Description
      From the "Final Research Report Summary (English)"
  • [Publications] Hiroshi Shimodaira and Jun Rokui: "Improving the Generalization Performance of the MCE/GPD Learning" The 5th International Conference on Spoken Language Processing (ICSLP98). #795 (1998)

    • Description
      From the "Final Research Report Summary (English)"

Published: 1999-12-08  
