2001 Fiscal Year Final Research Report Summary
Study on Enhancement of Spoken Language Processing Using Dialogue Corpus Annotated with Discourse Information
Project/Area Number |
11480073
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | The University of Electro-Communications |
Principal Investigator |
KUREMATSU Akira The Univ. of Electro-Communications, Graduate School of Electro-Communications, Professor (90251701)
|
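Co-Investigator (Kenkyū-buntansha) |
DEN Yasuharu Chiba University, Faculty of Letters, Associate Professor (70291458)
YAMASHITA Yoichi Ritsumeikan University, Faculty of Science and Engineering, Professor (80174689)
ARAKI Masahiro Kyoto Institute of Technology, Faculty of Engineering and Design, Associate Professor (50252490)
NAKAZATO Shu Meio University, Faculty of International Studies, Associate Professor (90257197)
ISHIZAKI Masato Japan Advanced Institute of Science and Technology, School of Knowledge Science, Associate Professor (30303340)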
|
Project Period (FY) |
1999 – 2001
|
Keywords | Spoken Dialogue Corpus / Goal Oriented Dialogue / Dialogue Tag / Prosody / Morphological Information / Dialogue Act / Dialogue Segment |
Research Abstract |
Dialogue corpora are indispensable to speech and language research. We developed a Japanese dialogue corpus annotated with multi-level information. The annotation covers speech, transcription delimited by slash units, prosody, part of speech, dialogue acts, and dialogue segmentation. The corpus consists of 40 goal-oriented dialogues collected by different research groups. The tagging scheme was evaluated on an experimental basis. A method was proposed to infer the utterance-unit tag from both the text corpus and its morphological analysis. A GUI-based annotation environment was developed that enables users to predict dialogue acts and relevance information using machine learning techniques and to store the annotated data in XML format. An autonomous model of turn-taking was proposed that predicts the distribution of smooth transitions between speakers. A method was proposed for tagging lubricant words, which include discourse markers, fillers, and acknowledgement tokens. A discourse-level tagging tool using Transformation-based Learning from a training corpus was developed. A rule-based approach to extracting dialogue acts and topics from utterances was investigated. Studies on identifying dialogue acts from prosodic and keyword information were undertaken, and prosodic information was shown to be effective for dialogue tagging. Linguistic cues for dialogue act classification were explored through statistical analysis. A tool was developed to generate dialogue patterns based on VoiceXML automatically generated from the dialogue corpus. Disagreements over topic boundaries caused by different strategies were analyzed. Analysis of the topic segment tags and prosody showed a high correlation between the degree of topic break and prosodic parameters.
|
Research Products
(14 results)
[Publications] Ichikawa, A., Araki, M., Horiuchi, Y., Ishizaki, M., Itabashi, S., Itoh, T., Kashioka, H., Kato, K., Kikuchi, H., Koiso, H., Kumagai, T., Kurematsu, A., Maekawa, K., Nakazato, S., Tamato, M., Tutiya, S., Yamashita, Y. and Yoshimura, T.: "Evaluation of Annotation Schemes for Japanese Discourse" Proceedings of ACL '99 Workshop on Towards Standards and Tools for Discourse Tagging. 26-34 (1999)
Description
From the "Final Research Report Summary (English version)"