
A scheme for continuous speech recognition in a large context based on the human process of spoken language recognition

Research Project

Project/Area Number 03452164
Research Category

Grant-in-Aid for General Scientific Research (B)

Allocation Type Single-year Grants
Research Field Information Engineering
Research Institution Science University of Tokyo

Principal Investigator

FUJISAKI Hiroya  Science University of Tokyo, Faculty of Industrial Science and Technology, Dept. of Applied Electronics, Professor (80010776)

Co-Investigator (Kenkyū-buntansha) HARADA Tetsuya  Science University of Tokyo, Faculty of Industrial Science and Technology, Dept. of Applied Electronics, Lecturer (80189703)
ITOH Kohji  Science University of Tokyo, Faculty of Industrial Science and Technology, Dept. of Applied Electronics, Professor (20013683)
HIROSE Keikichi  University of Tokyo, Faculty of Engineering, Dept. of Electronic Engineering, Associate Professor (50111472)
Project Period (FY) 1991 – 1992
Project Status Completed (Fiscal Year 1992)
Budget Amount
¥7,000,000 (Direct Cost: ¥7,000,000)
Fiscal Year 1992: ¥1,600,000 (Direct Cost: ¥1,600,000)
Fiscal Year 1991: ¥5,400,000 (Direct Cost: ¥5,400,000)
Keywords Spoken Language / Human Processes of Recognition / Large Context / Continuous Speech / Speech Recognition System / Syntactic Information / Semantic Information / Discourse Information / Recognition Process / Human / Mental Lexicon / Lexicon Search
Research Abstract

Most current systems for automatic speech recognition fail to achieve recognition performance comparable to that of human listeners, since they are constructed without regard to the human processes of spoken language recognition. From this point of view, the present study investigates the human processes and incorporates the findings into a scheme for automatic recognition of continuous speech in a large context. The main results are as follows:
1. Experimental investigation and modeling of the human processes of spoken language recognition
Using as stimuli natural utterances with controlled acoustic, syntactic and semantic information, the following findings were obtained on the human processes of spoken language recognition.
(1) The unit of speech recognition varies widely from phones and syllables to words and phrases depending on the experimental condition and context.
(2) Larger units generally require less accuracy of representation for correct recognition.
(3) The amount of acoustic information necessary for recognition of a given unit varies widely depending on the size of the context and the listener's prior knowledge.
(4) The accuracy and speed of access to the mental lexicon vary dynamically depending on the acoustic, syntactic, semantic and discourse information available to the listener.
Based on these findings, a model has been constructed for the human processes of spoken language recognition.
2. Proposal and implementation of a scheme for automatic recognition of spoken language
Based upon the above findings and the model, a scheme for automatic recognition of continuous speech in a large context has been proposed, featuring (1) use of units of multiple sizes and multiple accuracies of acoustic feature representation, (2) use of prosodic features for word and phrase boundary detection, and (3) extraction of syntactic, semantic, and idiosyncratic information from a large context. The main components of the system have been implemented.
3. Demonstration of the validity of the proposed scheme
The proposed scheme has been tested in recognition experiments on phones, syllables and words in continuous speech with a large context, and the results have confirmed its essential validity and feasibility.

Report

(3 results)
  • 1992 Annual Research Report   Final Research Report Summary
  • 1991 Annual Research Report
Research Products

(20 results)

All Publications (20 results)

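  • [Publications] Fujisaki Hiroya: "A study on the size of the temporal unit for representing the acoustic features in automatic speech recognition" Reports of 1991 Autumn Meeting of the Acoustical Society of Japan. 1. 153-154 (1991)

    • Description
      From "Final Research Report Summary (Japanese)"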
    • Related Report
      1992 Final Research Report Summary
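  • [Publications] Minematsu, Nobuaki: "Automatic speech recognition using multiple temporal units and accuracy of representation for the acoustic features" Reports of 1992 Spring Meeting of the Acoustical Society of Japan. 1. 31-32 (1992)

    • Description
      From "Final Research Report Summary (Japanese)"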
    • Related Report
      1992 Final Research Report Summary
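  • [Publications] Ohno Sumio: "Utilization of lexical information at multiple levels in template matching of words and phrases in continuous speech" Reports of 1992 Spring Meeting of the Acoustical Society of Japan. 1. 95-96 (1992)

    • Description
      From "Final Research Report Summary (Japanese)"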
    • Related Report
      1992 Final Research Report Summary
  • [Publications] Fujisaki Hiroya: "The influence of semantic and syntactic information on spoken sentence recognition" Proceedings of the 1992 International Conference on Spoken Language Processing. 1. 153-156 (1992)

    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      1992 Final Research Report Summary
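  • [Publications] Minematsu Nobuaki: "The influence of higher-level linguistic information on continuous speech perception" Transactions of Committee on Hearing Research, Acoustical Society of Japan. H-92-56. 1-6 (1992)

    • Description
      From "Final Research Report Summary (Japanese)"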
    • Related Report
      1992 Final Research Report Summary
  • [Publications] Fujisaki Hiroya: "A scheme for automatic recognition of continuous speech in a large context based on human processes of spoken language recognition" Proceedings of EUROSPEECH 93. (1993)

    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      1992 Final Research Report Summary
  • [Publications] Hiroya Fujisaki: "A study on the size of the temporal unit for representing the acoustic features in automatic speech recognition" Reports of 1991 Autumn Meeting of the Acoustical Society of Japan. vol. 1. 153-154 (1991)

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      1992 Final Research Report Summary
  • [Publications] Nobuaki Minematsu: "Automatic speech recognition using multiple temporal units and accuracy of representation for the acoustic features" Reports of 1992 Spring Meeting of the Acoustical Society of Japan. vol. 1. 31-32 (1992)

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      1992 Final Research Report Summary
  • [Publications] Sumio Ohno: "Utilization of lexical information at multiple levels in template matching of words and phrases in continuous speech" Reports of 1992 Spring Meeting of the Acoustical Society of Japan. vol. 1. 95-96 (1992)

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      1992 Final Research Report Summary
  • [Publications] Hiroya Fujisaki: "The influence of semantic and syntactic information on spoken sentence recognition" Proceedings of the 1992 International Conference on Spoken Language Processing. vol. 1. 153-156 (1992)

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      1992 Final Research Report Summary
  • [Publications] Nobuaki Minematsu: "The influence of higher-level linguistic information on continuous speech perception" Transactions of Committee on Hearing Research, Acoustical Society of Japan. vol. H-92, no. 56. (1992)

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      1992 Final Research Report Summary
  • [Publications] Hiroya Fujisaki: "A scheme for automatic recognition of continuous speech in a large context based on human processes of spoken language processing" Proceedings of EUROSPEECH 93, Berlin. (1993)

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      1992 Final Research Report Summary
  • [Publications] Fujisaki, Hiroya: "The influence of semantic and syntactic information on spoken sentence recognition" Proceedings of the 1992 International Conference on Spoken Language Processing. 1. 153-156 (1992)

    • Related Report
      1992 Annual Research Report
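  • [Publications] Minematsu, Nobuaki: "The influence of higher-level linguistic information on continuous speech perception" Transactions of Committee on Hearing Research, Acoustical Society of Japan. H-92-56. 1-6 (1992)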

    • Related Report
      1992 Annual Research Report
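  • [Publications] Minematsu, Nobuaki: "An experimental study on the influence of semantic content on the speech perception process" Reports of Autumn Meeting of the Acoustical Society of Japan. 1. (1992)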

    • Related Report
      1992 Annual Research Report
  • [Publications] Fujisaki, Hiroya: "A scheme for automatic recognition of continuous speech in a large context based on human processes of spoken language recognition" Proceedings of EUROSPEECH 93. (1993)

    • Related Report
      1992 Annual Research Report
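  • [Publications] Fujisaki Hiroya: "A study on the size of the temporal unit for representing the acoustic features in automatic speech recognition" Reports of 1991 Autumn Meeting of the Acoustical Society of Japan. 1. 153-154 (1991)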

    • Related Report
      1991 Annual Research Report
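  • [Publications] Minematsu Nobuaki: "Automatic speech recognition using multiple temporal units and accuracy of representation for the acoustic features" Reports of 1992 Spring Meeting of the Acoustical Society of Japan. 1. 31-32 (1992)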

    • Related Report
      1991 Annual Research Report
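  • [Publications] Ohno Sumio: "Utilization of lexical information at multiple levels in template matching of words and phrases in continuous speech" Reports of 1992 Spring Meeting of the Acoustical Society of Japan. 1. 95-96 (1992)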

    • Related Report
      1991 Annual Research Report
  • [Publications] Fujisaki, H.: "A method for automatic speech recognition based on findings of the human process of speech perception" Proceedings of the 1992 International Conference on Spoken Language Processing (Banff, Canada). (1992)

    • Related Report
      1991 Annual Research Report


Published: 1991-04-01   Modified: 2016-04-21  
