Summary of Research Achievements
We released version 0 of our helpdesk dialogue dataset (http://waseda.box.com/DCH-0-1), designed the dialogue subtask of the NTCIR-14 Short Text Conversation Task (STC-3) (http://sakailab.com/ntcir14stc3/), and investigated its evaluation measures. The following conference papers were presented:

・Zeng, Z. et al.: Test Collections and Measures for Evaluating Customer-Helpdesk Dialogues, EVIA 2017, pp.1-9, refereed, 2017.
Abstract: We address the problem of evaluating textual, task-oriented dialogues between a customer and a helpdesk, such as those that take the form of online chats. As an initial step towards evaluating automatic helpdesk dialogue systems, we have constructed a test collection comprising 3,700 real Customer-Helpdesk multi-turn dialogues mined from Weibo, a major Chinese social media platform. We have made our test collection, DCH-1, publicly available for research purposes.

・Sakai, T.: Towards Automatic Evaluation of Multi-Turn Dialogues: A Task Design that Leverages Inherently Subjective Annotations, EVIA 2017, pp.24-30, refereed, 2017.
Abstract: This paper proposes the design of a shared task whose ultimate goal is the automatic evaluation of multi-turn, dyadic, textual helpdesk dialogues. The proposed task takes the form of an offline evaluation, where participating systems are given a dialogue as input and output at least one of the following: (1) an estimated distribution of the annotators' quality ratings for that dialogue; and (2) an estimated distribution of the annotators' nugget type labels for each utterance block (i.e., a maximal sequence of consecutive posts by the same utterer) in that dialogue.