2020 Fiscal Year Research-status Report
Search-Oriented Dialog System for Data Science
Project/Area Number |
19K12132
|
Research Institution | Inter-University Research Institute Corporation Research Organization of Information and Systems (Headquarters Facilities, etc.) |
Principal Investigator |
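Kim Jin-Dong, Inter-University Research Institute Corporation Research Organization of Information and Systems (Headquarters Facilities, etc.), Joint Support-Center for Data Science Research, Project Associate Professor (40536893)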
|
Project Period (FY) |
2019-04-01 – 2022-03-31
|
Keywords | dialog agent / natural language query / database search / intelligent interface |
Outline of Annual Research Achievements |
The research schedule in FY2 was seriously affected by the pandemic. We had to cancel a workshop that was planned as the venue for a large-scale crowd-sourcing annotation campaign. In addition, the PI had only limited access to the facilities and resources of his institute (DBCLS) needed to conduct the research. This delayed the development of the dialog corpus, which in turn delayed the development of the intent detection systems. Accordingly, a substantial change to the entire research schedule had to be made. In particular, corpus annotation, which requires the organization of human resources, was seriously delayed. Instead, research efforts in FY2 were concentrated on software development. As a result, the research achievements of FY2 include (1) a corpus of search-oriented dialogs (an early version), (2) annotation guidelines, (3) a rule-based intent detection system (an early version), and (4) a deep learning-based intent detection system (an early version). Despite the change of schedule, we were at least able to develop a working search-oriented dialog system (an early version). Iterative research to improve its performance, together with an evaluation from a user-experience perspective, should be conducted in FY3. A high-quality annotated corpus is necessary both for the performance improvement and for systematic evaluation.
|
Current Status of Research Progress |
3: Progress in research has been slightly delayed.
Reason
Due to the pandemic, we had to cancel a workshop that had been organized to conduct a crowd-sourcing annotation campaign during the LREC conference. The cancellation delayed the development of the corpus: we could develop only an early version without annotation, which in turn delayed the deep learning work and the evaluation of the intent detection system.
|
Strategy for Future Research Activity |
Due to the pandemic, the research environment changed considerably, and we had to revise the research schedule substantially, as follows: (FY1; done) UI for corpus development, baseline system; (FY2; done) corpus development (an early version), annotation guidelines, UI for experiments, intent detection; (FY3) corpus annotation, deep learning, evaluation, publication.
|
Causes of Carryover |
Because the large-scale annotation campaign originally planned for FY2 was cancelled due to the pandemic, we plan to hire part-time annotators in FY3 to annotate the dialog corpus. We also plan to conduct an evaluation of the dialog system at a reasonable scale. The annotation and evaluation require the purchase of various devices and the payment of compensation. As FY3 is the final year of the research project, we plan to publish conference and journal papers and to open a website to share all the outcomes of the research. These activities require article processing charges (APCs) and the outsourcing of website development.
|