Search-Oriented Dialog System for Data Science
Project/Area Number |
19K12132
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Review Section |
Basic Section 61030:Intelligent informatics-related
|
Research Institution | Research Organization of Information and Systems (Headquarters facilities, etc.) |
Principal Investigator |
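Kim Jin-Dong, Research Organization of Information and Systems (Headquarters facilities, etc.), Joint Support-Center for Data Science Research, Project Associate Professor (40536893)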
|
Project Period (FY) |
2019-04-01 – 2024-03-31
|
Project Status |
Granted (Fiscal Year 2022)
|
Budget Amount |
¥4,420,000 (Direct Cost: ¥3,400,000, Indirect Cost: ¥1,020,000)
Fiscal Year 2021: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2020: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2019: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
|
Keywords | dialog / intelligent agent / natural language query / database search / task-oriented dialog / intent detection / dialog agent / intelligent interface / agent / search / question answering / data science |
Outline of Research at the Start |
Data science is becoming a new paradigm of science, and substantial investment has been made in developing scientific data resources. However, scientists are often unaware of how to access these data. Meanwhile, interest has been growing in conversational agent (CA) technology, which lets users converse in natural language to accomplish specific tasks. This research investigates the potential of CA technology for search-oriented dialogs that help scientists access scientific data. We expect it to contribute both to advancing CA technology and to improving the data science environment.
|
Outline of Annual Research Achievements |
In FY2022 (Reiwa 4), we conducted a thorough evaluation of the search-oriented dialog system, which revealed several problems in the intent detection component. We investigated these problems and found that they could be solved effectively by few-shot learning with InstructGPT. During this work, we realized that the state of the art in pre-trained language models was evolving rapidly, and that we needed to be able to take advantage of it continually. We therefore changed the architecture of the search-oriented dialog system so that external language resources, such as pre-trained language models, can be incorporated easily. With the new architecture, we were able to use recently released pre-trained language models, including ChatGPT and GPT-4, which led to much improved performance.
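The following is a minimal sketch of how few-shot intent detection with a swappable language-model backend might look; it is not the project's actual implementation. The intent labels, the example utterances, and the names `build_prompt`, `detect_intent`, and `stub_model` are all hypothetical and chosen for illustration only. The backend is passed in as a plain function from prompt to completion, which is one simple way to keep the dialog system independent of any particular model.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical intent labels; the report does not list the system's actual intents.
INTENTS = ("search_dataset", "refine_search", "show_details", "other")


@dataclass(frozen=True)
class Example:
    utterance: str
    intent: str


# Hypothetical few-shot examples, invented for illustration only.
FEW_SHOT = (
    Example("Find gene expression data for mouse liver", "search_dataset"),
    Example("Only show results from after 2015", "refine_search"),
    Example("Tell me more about the second one", "show_details"),
)


def build_prompt(utterance: str, examples: Sequence[Example] = FEW_SHOT) -> str:
    """Assemble a few-shot classification prompt ending at the slot to be completed."""
    lines = [
        "Classify the user's utterance into one of these intents: "
        + ", ".join(INTENTS) + ".",
        "",
    ]
    for ex in examples:
        lines += [f"Utterance: {ex.utterance}", f"Intent: {ex.intent}", ""]
    lines += [f"Utterance: {utterance}", "Intent:"]
    return "\n".join(lines)


def detect_intent(utterance: str, complete: Callable[[str], str]) -> str:
    """Ask the backend to complete the prompt; fall back to 'other' on unknown labels."""
    raw = complete(build_prompt(utterance)).strip()
    label = raw.split()[0] if raw else ""
    return label if label in INTENTS else "other"


def stub_model(prompt: str) -> str:
    """Stand-in backend that always answers 'search_dataset'; swap in a real model client."""
    return " search_dataset\n"


if __name__ == "__main__":
    print(detect_intent("Are there any datasets on protein folding?", stub_model))
```

Because `detect_intent` only depends on a `Callable[[str], str]`, moving to a newer model would, under these assumptions, mean supplying a different function rather than changing the dialog logic.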
|
Current Status of Research Progress |
2: Research has progressed on the whole more than it was originally planned.
Reason
Due to the pandemic, the original research schedule changed substantially. In addition, problems found during a thorough evaluation of the system required a change to its architecture. However, thanks to the extension of the research period generously granted by JSPS, solutions to these problems were found, and the research is approaching completion.
|
Strategy for Future Research Activity |
In FY2023 (Reiwa 5), the last year of the project, (1) another round of human evaluation will be conducted, (2) a thorough analysis of the evaluation results will be performed, and (3) papers will be published on the results. We also plan to make the search-oriented dialog system available as a plugin for ChatGPT, as a way of disseminating the results of the research.
|
Report
(4 results)
Research Products
(2 results)