Hierarchical Reinforcement Learning Based on General-Purpose and Reusable Policies

Research Project

Project/Area Number	23H03450
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	General
Review Section	Basic Section 61030: Intelligent informatics-related; Basic Section 60030: Statistical science-related. Sections That Are Subject to Joint Review: Basic Section 60030: Statistical science-related, Basic Section 61030: Intelligent informatics-related
Research Institution	The University of Tokyo
Principal Investigator	鶴岡 慶雅, The University of Tokyo, Graduate School of Information Science and Technology, Professor (50566362)
Project Period (FY)	2023-04-01 – 2026-03-31
Project Status	Granted (Fiscal Year 2023)
Budget Amount	¥17,680,000 (Direct Cost: ¥13,600,000, Indirect Cost: ¥4,080,000) Fiscal Year 2023: ¥6,240,000 (Direct Cost: ¥4,800,000, Indirect Cost: ¥1,440,000)
Keywords	Reinforcement learning / Hierarchical reinforcement learning / Skills / Language models
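Outline of Research at the Start	In recent years, rapid advances in deep reinforcement learning have produced AI that surpasses human-level performance in Go, shogi, video games, and similar domains. However, applications of deep reinforcement learning to real-world decision-making problems, such as the control of robots, industrial plants, traffic, and infrastructure, remain limited. Many real-world tasks are long-horizon tasks that require many steps to complete, and this research project aims to establish hierarchical reinforcement learning methods that are effective for such problems. Specifically, it aims to improve the agent's learning efficiency and generalization ability through measures such as automatically acquiring diverse and useful skills and making those skills reusable.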