2022 Fiscal Year Final Research Report

A turn-taking system linked with dialogue understanding and utterance generation

Project/Area Number	20K19821
Research Category	Grant-in-Aid for Early-Career Scientists
Allocation Type	Multi-year Fund
Review Section	Basic Section 61010: Perceptual information processing-related
Research Institution	Kyoto University
Principal Investigator	Inoue Koji, Kyoto University, Graduate School of Informatics, Assistant Professor (10838684)
Project Period (FY)	2020-04-01 – 2023-03-31
Keywords	spoken dialogue system / conversational robot / turn-taking / taking the floor / speaker change
Outline of Final Research Achievements	This project pioneered a novel turn-taking model that predicts the right to speak in spoken dialogue systems. To mirror human turn-taking, each utterance in a dialogue dataset was annotated for its 'intent' and 'content'. A two-step turn-taking prediction model was then developed: it first determines whether the 'intent' or 'content' is intelligible and then decides whether to take the turn. In addition, to extend the capabilities of spoken dialogue systems, shared laughter generation was realized. The proposed system consists of three modules (laughter detection, shared laughter prediction, and laughter type selection), and its effectiveness was demonstrated.
Free Research Field	Spoken dialogue systems
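Academic Significance and Societal Importance of the Research Achievements	Spoken dialogue systems are deployed in conversational robots and smart speakers. However, interactions with these systems can only be described as mechanical. One contributing factor is turn-taking. When current systems take the turn, unnaturally long pauses and interruptions often occur, which reduces the smoothness of the dialogue. In human-human dialogue, by contrast, smooth turn-taking is achieved without any particular conscious effort. This research presents one constructive approach toward elucidating the mechanism of human-human turn-taking.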
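
Illustrative sketch (not part of the report): the two-step turn-taking decision described in the outline of achievements could be organized roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual model. The class `UtteranceState`, the score fields, the thresholds, and the silence-based rule in the second step are all hypothetical; the report only states that the model first judges whether the 'intent' or 'content' is intelligible and then decides whether to take the turn.

```python
from dataclasses import dataclass


@dataclass
class UtteranceState:
    """Hypothetical features for the user's utterance so far."""
    intent_score: float   # assumed confidence that the 'intent' is intelligible
    content_score: float  # assumed confidence that the 'content' is intelligible
    silence_ms: int       # assumed silence duration since the user stopped speaking


def is_intelligible(state: UtteranceState, threshold: float = 0.5) -> bool:
    """Step 1: judge whether the 'intent' or the 'content' is intelligible."""
    return state.intent_score >= threshold or state.content_score >= threshold


def should_take_turn(
    state: UtteranceState,
    threshold: float = 0.5,
    short_gap_ms: int = 200,   # assumed wait when the utterance is understood
    long_gap_ms: int = 1000,   # assumed wait when it is not yet understood
) -> bool:
    """Step 2: decide whether to take the turn, conditioned on step 1.

    Assumption for illustration only: an intelligible utterance allows the
    system to respond after a short silence; otherwise it waits longer.
    """
    required_gap = short_gap_ms if is_intelligible(state, threshold) else long_gap_ms
    return state.silence_ms >= required_gap


if __name__ == "__main__":
    understood = UtteranceState(intent_score=0.9, content_score=0.1, silence_ms=300)
    unclear = UtteranceState(intent_score=0.1, content_score=0.2, silence_ms=300)
    print(should_take_turn(understood))
    print(should_take_turn(unclear))
```

Separating the intelligibility judgment from the turn decision keeps the two steps independently replaceable, for example by swapping the threshold rules for learned classifiers.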