Development of a speech understanding system

Research Project

Project/Area Number	04044108
Research Category	Grant-in-Aid for International Scientific Research
Allocation Type	Single-year Grants
Section	Joint Research
Research Institution	Osaka University
Principal Investigator	MIZOGUCHI Riichiro The Institute of Scientific and Industrial Research, Osaka University, Professor (20116106)
Co-Investigator(Kenkyū-buntansha)	OH Yung hwan Department of Computer Science, Korea Advanced Institute of Scientific and Techn, Professor; KITAMURA Yoshinobu The Institute of Scientific and Industrial Research, Osaka University, Research Associate (20252710); YAMASHITA Yoichi The Institute of Scientific and Industrial Research, Osaka University, Research Associate (80174689); IKEDA Mitsuru The Institute of Scientific and Industrial Research, Osaka University, Research Associate (80212786); YUNG-HWAN Oh Korea Advanced Institute of Science and Technology, Department of Computer Science, Associate Professor
Project Period (FY)	1992 – 1993
Project Status	Completed (Fiscal Year 1993)
Budget Amount	¥8,200,000 (Direct Cost: ¥8,200,000) Fiscal Year 1993: ¥2,000,000 (Direct Cost: ¥2,000,000) Fiscal Year 1992: ¥6,200,000 (Direct Cost: ¥6,200,000)
Keywords	Speech Recognition / Speech Understanding / Korean Language / Fuzzy / Dialog Model / ATMS
Research Abstract	The objective of this research is to develop the fundamental techniques needed to understand spoken dialogue: knowledge-based speech recognition, non-monotonic reasoning in natural language processing, and dialogue modeling. The results are summarized below. 1) We verified the effectiveness of the knowledge-based approach for Korean speech recognition and proposed several new ideas to improve recognition. To avoid the difficulties of segmentation, a non-uniform recognition unit is introduced. Each unit has a stationary point at each end and a transient part in the middle. The parameter trajectory is described by a symbolic representation and fuzzy linguistic variables. A post-processor exploits the redundancy of speech data to improve recognition performance. The prototype system was tested on continuous Korean digit speech of unknown length and achieved a recognition rate of 97%. 2) Understanding continuous speech is generally difficult because acoustic information is unreliable. An efficient search mechanism is indispensable because the number of combinations of ambiguous hypotheses is very large. We therefore developed a framework for a speech understanding system based on the ATMS (assumption-based truth maintenance system), a method of non-monotonic reasoning. Introducing the ATMS reduced the elapsed time of natural language processing from 64 sec to 45 sec when understanding 8 spoken Japanese sentences. 3) Two kinds of dialogue models that characterize the structure of dialogue were proposed for understanding spoken dialogue. One is the SR-plan model, which describes utterance pairs composed of a stimulus and a response. The other is the Topic Packet Network (TPN), which corresponds to discourse segments. A mechanism for predicting the next utterance was also developed based on these dialogue models and evaluated on sample dialogues.
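
As an illustration of what describing a parameter trajectory with fuzzy linguistic variables can look like, the following is a minimal sketch and not the project's actual system. It assumes a single acoustic parameter normalized to [0, 1] and three triangular fuzzy sets; the term names, the set boundaries, and the functions `triangular`, `fuzzify`, and `describe_trajectory` are all hypothetical.

```python
def triangular(x, left, peak, right):
    """Degree of membership of x in a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical linguistic terms for a parameter normalized to [0, 1].
# Each tuple holds the (left, peak, right) corners of one triangle.
TERMS = {
    "low": (-0.5, 0.0, 0.5),
    "mid": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}


def fuzzify(x):
    """Map a numeric value to a membership degree for every term."""
    return {name: triangular(x, *corners) for name, corners in TERMS.items()}


def describe_trajectory(values):
    """Label each sample of a trajectory with its best-matching term."""
    labels = []
    for v in values:
        degrees = fuzzify(v)
        labels.append(max(degrees, key=degrees.get))
    return labels


if __name__ == "__main__":
    print(fuzzify(0.25))
    print(describe_trajectory([0.0, 0.5, 1.0]))
```

A symbolic sequence such as the one returned by `describe_trajectory` is the kind of input a rule-based recognizer could match against, while the membership degrees preserve how confidently each label applies.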

Report

(2 results)

1993 Final Research Report Summary
1992 Annual Research Report

Research Products
(18 results)

[Publications] Hajin Yu: "Fuzzy Expert System for Continuous Speech Recognition" Proceedings of '93 Korea/Japan Joint Conference on Expert Systems. 951-965 (1992)
- Description
  From the Summary of Research Results Report (Japanese)
- Related Report
  1993 Final Research Report Summary
[Publications] Shingo Nishioka: "A Powerful Disambiguating Mechanism for Speech Understanding Systems Based on ATMS" Proceedings of 1992 International Conference on Spoken Language. 1641-1644 (1992)
- Description
  From the Summary of Research Results Report (Japanese)
- Related Report
  1993 Final Research Report Summary
[Publications] Yoichi Yamashita: "MASCOTS II: A Dialog Manager in General Interface for Speech Input and Output" IEICE Trans. Inf. & Syst. E76-D. 74-83 (1993)
- Description
  From the Summary of Research Results Report (Japanese)
- Related Report
  1993 Final Research Report Summary
[Publications] Yoichi Yamashita: "Next Utterance Prediction Based on Two Kinds of Dialog Models" Proceedings of EUROSPEECH '93. 1161-1164 (1993)
- Description
  From the Summary of Research Results Report (Japanese)
- Related Report
  1993 Final Research Report Summary
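[Publications] Hajin Yu: "Expert System for Continuous Speech Recognition with Non-Uniform Recognition Unit" Technical Report of IEICE. SP93-56. 57-64 (1993)
- Description
  From the Summary of Research Results Report (Japanese)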
- Related Report
  1993 Final Research Report Summary
[Publications] Masaaki Ohta: "Decision of Topic for Understanding of Spoken Language" Technical Report of IEICE. SP93-129. 9-16 (1994)
- Description
  From the Summary of Research Results Report (Japanese)
- Related Report
  1993 Final Research Report Summary
[Publications] Hajin Yu: "Fuzzy Expert System for Continuous Speech Recognition" Proceedings of '93 Korea/Japan Joint Conference on Expert Systems. 951-965 (1992)
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1993 Final Research Report Summary
[Publications] Shingo Nishioka: "A Powerful Disambiguating Mechanism for Speech Understanding Systems Based on ATMS" Proceedings of 1992 International Conference on Spoken Language. 1641-1644 (1992)
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1993 Final Research Report Summary
[Publications] Yoichi Yamashita: "MASCOTS II: A Dialogue Manager in General Interface for Speech Input and Output" IEICE Trans. Inf. & Syst. E76-D. 74-83 (1993)
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1993 Final Research Report Summary
[Publications] Yoichi Yamashita: "Next Utterance Prediction Based on Two Kinds of Dialog Models" Proceedings of EUROSPEECH '93. 1161-1164 (1993)
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1993 Final Research Report Summary
[Publications] Hajin Yu: "Expert System for Continuous Speech Recognition with Non-Uniform Recognition Unit" Technical Report of IEICE. SP93-56. 57-64 (1993)
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1993 Final Research Report Summary
[Publications] Masaaki Ohta: "Decision of Topic for Understanding of Spoken Language" Technical Report of IEICE. SP93-129. 9-16 (1994)
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1993 Final Research Report Summary
[Publications] Hajin Yu: "Fuzzy Expert System for Continuous Speech Recognition" Proceedings of '93 Korea/Japan Joint Conference on Expert Systems. 951-965 (1993)
- Related Report
  1992 Annual Research Report
[Publications] Yoichi YAMASHITA: "MASCOTS II: A Dialog Manager in General Interface for Speech Input and Output" IEICE Trans. Inf. & Syst. E76-D. 74-83 (1993)
- Related Report
  1992 Annual Research Report
[Publications] Shingo NISHIOKA: "A Powerful Disambiguating Mechanism for Speech Understanding Systems Based on ATMS" Proceedings of 1992 International Conference on Spoken Language. 1641-1644 (1992)
- Related Report
  1992 Annual Research Report
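[Publications] 吉田英昭: "A General-Purpose Dialogue Management System for Speech Input to Machines" Technical Report of the JSAI Special Interest Group on Spoken Language Understanding and Dialogue Processing. SIG-SLUD-9202-9. 77-85 (1992)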
- Related Report
  1992 Annual Research Report
[Publications] 平松敬史: "Use of Topic Knowledge for Understanding Spoken Dialogue" Technical Report of IEICE. SP92-110. 55-62 (1992)
- Related Report
  1992 Annual Research Report
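[Publications] Yoichi Yamashita: "Recording and Analysis of Simulated Dialogues for Spoken Dialogue Processing" Proceedings of the Autumn Meeting of the Acoustical Society of Japan. 23-24 (1992)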
- Related Report
  1992 Annual Research Report
