Exploiting Speech Understanding in Intelligent Interfaces

Research Project

Project/Area Number 06044055
Research Category Grant-in-Aid for International Scientific Research
Allocation Type Single-year Grants
Section Joint Research
Research Institution The University of TOKYO

Principal Investigator

WARD Nigel  The University of TOKYO, Department of Mechano-Informatics, Graduate School of Engineering, Associate Professor (00242008)

WARD Nigel G. (1995)  The University of Tokyo, Faculty of Engineering, Associate Professor

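Co-Investigator (Kenkyū-buntansha) TAJCHMAN Gary  International Computer Science Institute, Speech Group, Researcher
MORGAN Nelson  International Computer Science Institute, Speech Group; University of California, Faculty of Engineering, Researcher, Professor
JURAFSKY Dan  International Computer Science Institute, Speech Group; University of California, Faculty of Engineering, Researcher, Associate Professor
TERADA Minoru  The University of TOKYO, Department of Mechano-Informatics, Graduate School of Engineering, Associate Professor (80163921)
INOUE Hirochika  The University of TOKYO, Department of Mechano-Informatics, Graduate School of Engineering, Professor (50111464)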
DAN Jurafsky  International Computer Science Institute
NELSON Morgan  International Computer Science Institute
GARY Tajchman  International Computer Science Institute
Project Period (FY) 1994 – 1995
Project Status Completed (Fiscal Year 1995)
Budget Amount
¥4,200,000 (Direct Cost: ¥4,200,000)
Fiscal Year 1995: ¥1,400,000 (Direct Cost: ¥1,400,000)
Fiscal Year 1994: ¥2,800,000 (Direct Cost: ¥2,800,000)
Keywords User Interface / Speech Understanding / Speech Input / Natural Language / Understanding / Aizuchi (back-channel utterances) / Multimodal / Speech / Noise / English / Japanese / Grammar
Research Abstract

We are interested in the use of spoken language in human-computer interaction. The inspiration is that, in human-human interaction, meaningful exchanges can take place even without accurate recognition of the words the other person is saying. This is possible because of shared knowledge and complementary communication channels, especially gesture and prosody. We want to exploit this fact for man-machine interfaces.
Therefore we are doing three things:
1. Using simple speech recognition to augment graphical user interfaces, well integrated with the other input modalities: keyboard, mouse, and touch screen.
2. Building systems able to engage in simple conversations, using mostly prosodic clues. To sketch out our latest success:
We conjectured that, in Japanese, it would be possible to decide when to produce many back-channel utterances (aizuchi) based on prosodic clues alone, without reference to meaning.
We found that vowel lengthening, volume changes, and energy level (used to detect when the other speaker had finished) were not, by themselves, good predictors of when to produce an aizuchi. The best predictor was a low pitch level.
Specifically, upon detecting the end of a region of pitch less than 0.9 times the local median pitch and continuing for 150 ms, coming after at least 600 ms of speech, the system predicted an aizuchi 200 ms to 300 ms later, provided it had not done so within the preceding 1 second.
We also built a real-time system based on the above decision rule. A human stooge steered the conversation to a suitable topic and then switched on the system. After switch-on, the stooge's utterances and the system's outputs, mixed together, produced one side of the conversation. We found that none of the 5 subjects realized that their conversation partner had become partially automated.
3. Building tools and collecting data to support 1 and 2.
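
The decision rule described in item 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual system: the 10 ms frame step, the 2-second window for the "local" median, the use of voiced frames as a stand-in for "speech", the fixed 250 ms response delay (the midpoint of 200 ms to 300 ms), and every name in the code are hypothetical. The thresholds 0.9, 150 ms, 600 ms, and 1 second come from the abstract, with 150 ms read as a minimum duration.

```python
from statistics import median
from typing import List, Optional, Sequence

FRAME_MS = 10            # assumed pitch-tracker frame step
LOW_RATIO = 0.9          # from the abstract: pitch < 0.9 x local median
MIN_LOW_MS = 150         # from the abstract, read as a minimum duration
MIN_SPEECH_MS = 600      # from the abstract: speech preceding the region
RESPONSE_DELAY_MS = 250  # assumed midpoint of the 200-300 ms range
REFRACTORY_MS = 1000     # from the abstract: no aizuchi in the last 1 s
MEDIAN_FRAMES = 200      # assumed: "local" median over the last 2 s voiced


def predict_aizuchi_times(pitch_hz: Sequence[Optional[float]]) -> List[int]:
    """Return output times (ms) at which an aizuchi would be produced.

    pitch_hz holds one pitch estimate per frame; None marks an unvoiced frame.
    """
    predictions: List[int] = []
    history: List[float] = []   # voiced pitch values seen so far
    speech_ms = 0               # length of the current run of voiced frames
    low_ms = 0                  # length of the current low-pitch region
    last_out_ms: Optional[int] = None

    for i, f0 in enumerate(pitch_hz):
        now_ms = i * FRAME_MS
        voiced = f0 is not None
        recent = history[-MEDIAN_FRAMES:]
        is_low = voiced and bool(recent) and f0 < LOW_RATIO * median(recent)

        if is_low:
            low_ms += FRAME_MS
        else:
            # Any low-pitch region ended on the previous frame.
            speech_before_region = speech_ms - low_ms
            if low_ms >= MIN_LOW_MS and speech_before_region >= MIN_SPEECH_MS:
                out_ms = now_ms + RESPONSE_DELAY_MS
                if last_out_ms is None or out_ms - last_out_ms >= REFRACTORY_MS:
                    predictions.append(out_ms)
                    last_out_ms = out_ms
            low_ms = 0

        if voiced:
            speech_ms += FRAME_MS
            history.append(f0)
        else:
            speech_ms = 0   # crude: any unvoiced frame ends the speech run

    return predictions


# 600 ms at 200 Hz, then 200 ms at 150 Hz, then silence.
track = [200.0] * 60 + [150.0] * 20 + [None] * 10
print(predict_aizuchi_times(track))  # [1050]
```

Resetting the speech counter on every unvoiced frame is a simplification; real speech contains unvoiced consonants, so a practical version would need a more tolerant definition of continuous speech.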

Report

(2 results)
  • 1995 Final Research Report Summary
  • 1994 Annual Research Report
Research Products

(19 results)

All Publications (19 results)

  • [Publications] WARD, Nigel: "Using Prosodic Clues to Decide When to Produce Back-Channel Utterances" CSLP.

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      1995 Final Research Report Summary
  • [Publications] Jurafsky,Daniel: "A Probabilistic Model of Lexical and Syntactic Access and Disambiguation" Cognitive Science.

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      1995 Final Research Report Summary
  • [Publications] Jurafsky, Daniel: "Universal Tendencies in the Semantics of the Diminutives : Structured Polysemy and the Semantic Shift from Children to Second-Order Predicates" Language.

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      1995 Final Research Report Summary
  • [Publications] Gildea, Daniel and Jurafsky, Daniel: "Learning Bias and Phonological Rules Induction" Computational Linguistics.

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      1995 Final Research Report Summary
  • [Publications] Gildea, Daniel and Jurafsky, Daniel: "Automatic Induction of Finite State Transducers for Simple Phonological Rules" In Proceedings of ACL95. 9-15 (1995)

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      1995 Final Research Report Summary
  • [Publications] Tajchman, Gary and Jurafsky, Dan and Fosler, Eric: "Learning Phonological Rule Probabilities from Speech Corpora with Exploratory Computational Phonology" In Proceedings of ACL95. 9-15 (1995)

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      1995 Final Research Report Summary
  • [Publications] Tajchman, Gary and Fosler, Eric and Jurafsky, Dan: "Building Multiple Pronunciation Models for Novel Words using Exploratory Computational Phonology" In Proceedings of EUROSPEECH-95.

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      1995 Final Research Report Summary
  • [Publications] Jurafsky, Daniel; Wooters, Chuck; Tajchman, Gary; Segal, Jonathan; Stolcke, Andreas; Fosler, Eric; and Morgan, Nelson: "Using a Stochastic Context-Free Grammar as a Language Model for Speech Recognition" In Proceedings of ICASSP-95. 189-192 (1995)

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      1995 Final Research Report Summary
  • [Publications] WARD, Nigel: "Using Prosodic Clues to Decide When to Produce Back-Channel Utterances" CSLP.

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      1995 Final Research Report Summary
  • [Publications] Jurafsky, Daniel: "A Probabilistic Model of Lexical and Syntactic Access and Disambiguation" Cognitive Science.

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      1995 Final Research Report Summary
  • [Publications] Jurafsky, Daniel: "Universal Tendencies in the Semantics of the Diminutives : Structured Polysemy and the Semantic Shift from Children to Second-Order Predicates" Language.

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      1995 Final Research Report Summary
  • [Publications] Gildea, Daniel and Jurafsky, Daniel: "Learning Bias and Phonological Rules Induction" Computational Linguistics. (1995)

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      1995 Final Research Report Summary
  • [Publications] Gildea, Daniel and Jurafsky, Daniel: "Automatic Induction of Finite State Transducers for Simple Phonological Rules" In Proceedings of ACL95. 9-15 (1995)

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      1995 Final Research Report Summary
  • [Publications] Tajchman, Gary and Jurafsky, Dan and Fosler, Eric: "Learning Phonological Rule Probabilities from Speech Corpora with Exploratory Computational Phonology" In Proceedings of ACL95. 9-15 (1995)

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      1995 Final Research Report Summary
  • [Publications] Tajchman, Gary and Fosler, Eric and Jurafsky, Dan: "Building Multiple Pronunciation Models for Novel Words using Exploratory Computational Phonology" In Proceedings of EUROSPEECH-95.

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      1995 Final Research Report Summary
  • [Publications] Jurafsky, Daniel; Wooters, Chuck; Tajchman, Gary; Segal, Jonathan; Stolcke, Andreas; Fosler, Eric; and Morgan, Nelson: "Using a Stochastic Context-Free Grammar as a Language Model for Speech Recognition" In Proceedings of ICASSP-95. 189-192 (1995)

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      1995 Final Research Report Summary
  • [Publications] Nigel Ward: "An Approach to Tightly-Coupled Syntactic/Semantic Processing for Speech Understanding" Proceedings of the AAAI Workshop on the Integration of Natural Language and Speech Processing. 50-57 (1994)

    • Related Report
      1994 Annual Research Report
  • [Publications] Jurafsky, Daniel: "Using a stochastic context-free grammar as a language model for speech recognition" IEEE ICASSP-95. (1995)

    • Related Report
      1994 Annual Research Report
  • [Publications] Morgan, Nelson: "Modeling Dynamics in Connectionist Speech Recognition - The Time Index Model" International Conference on Spoken Language Processing. 3. 1523-1526 (1994)

    • Related Report
      1994 Annual Research Report

Published: 1994-04-01   Modified: 2016-04-21  
