Search-Oriented Dialog System for Data Science

Research Project

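Project/Area Number	19K12132
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Review Section	Basic Section 61030: Intelligent informatics-related
Research Institution	Research Organization of Information and Systems (Headquarters Facilities, etc.)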
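Principal Investigator	Jin-Dong Kim (金進東), Research Organization of Information and Systems (Headquarters Facilities, etc.), Joint Support-Center for Data Science Research, Project Associate Professor (40536893)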
Project Period (FY)	2019-04-01 – 2024-03-31
Project Status	Completed (Fiscal Year 2023)
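Budget Amount	Total: 4,420 thousand yen (Direct Cost: 3,400 thousand yen, Indirect Cost: 1,020 thousand yen); Fiscal Year 2021: 1,430 thousand yen (Direct Cost: 1,100 thousand yen, Indirect Cost: 330 thousand yen); Fiscal Year 2020: 1,170 thousand yen (Direct Cost: 900 thousand yen, Indirect Cost: 270 thousand yen); Fiscal Year 2019: 1,820 thousand yen (Direct Cost: 1,400 thousand yen, Indirect Cost: 420 thousand yen)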
Keywords	Natural Language / Dialog Interface / User Interface / Database / Data Science / Large Language Models / Customized GPT / Large Language Model / ChatGPT / dialog / intelligent agent / natural language query / database search / task-oriented dialog / intent detection / dialog agent / intelligent interface / agent / search / question answering / data science
Outline of Research at the Start	Data science is becoming a new paradigm of science, and substantial investment has gone into developing scientific data resources. However, scientists are often unaware of how to access these data. Meanwhile, there has been increasing interest in conversational agent (CA) technology, which lets a system talk with users in human language and help them accomplish specific tasks. This research investigates the potential of CA technology for search-oriented dialogs that help scientists access scientific data. We expect it to contribute both to advancing CA technology and to improving the environment for data science.
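Outline of Final Research Achievements	The main outcomes of this research are (1) Anatomy3DExplorer, a customized GPT developed for searching 3D models of human anatomy (currently available in the GPT Store); (2) a dialog interface developed as an extension of LODQA, a search system for RDF data; and (3) a dialog annotation feature developed as an extension of TextAE, a web interface for text annotation. A paper describing the research results has been submitted to the journal Genomics & Informatics and is currently under review. Overall, the project showed that search-oriented dialog systems can be developed effectively by leveraging LLMs.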
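Academic Significance and Societal Importance of the Research Achievements	The results of this research show that natural language dialog interfaces can be developed effectively by leveraging LLMs. This work has already been extended to the development of search interfaces for other databases. Academically, the results show that LLMs provide an effective layer connecting human intelligence with the intelligence embedded in data. Socially, they show that users without database search skills can be given a practical means of accessing databases. This means that the general public can gain easier access to specialized knowledge.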