Development of Deep Reinforcement Learning Methods with Structures Suited to Imperfect-Information, Multi-Player Environments

Research Project

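Project/Area Number	18K19832
Research Category	Grant-in-Aid for Challenging Research (Exploratory)
Allocation Type	Multi-year Fund
Review Section	Medium-sized Section 62: Applied informatics and related fields
Research Institution	The University of Tokyo
Principal Investigator	Tomoyuki Kaneko, The University of Tokyo, Graduate School of Arts and Sciences, Associate Professor (00345068)
Project Period (FY)	2018-06-29 – 2022-03-31
Project Status	Completed (FY2021)
Budget Amount	Total: 5,850 thousand yen (direct costs: 4,500 thousand yen, indirect costs: 1,350 thousand yen); FY2020: 2,470 thousand yen (direct costs: 1,900 thousand yen, indirect costs: 570 thousand yen); FY2019: 2,470 thousand yen (direct costs: 1,900 thousand yen, indirect costs: 570 thousand yen); FY2018: 910 thousand yen (direct costs: 700 thousand yen, indirect costs: 210 thousand yen)
Keywords	game programming / deep reinforcement learning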
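Outline of Research Achievements	Using imperfect-information, multi-player games as the subject, we studied model-based deep reinforcement learning. Reinforcement learning has achieved remarkable results in Go and video games, as the success of AlphaGo famously shows. This project extends its scope further and treats imperfect-information, multi-player games as examples of problems whose complexity approaches that of the real world. The more complex the problem, the harder it is for an agent to learn. We therefore studied a learning framework that, in addition to existing deep learning techniques, acquires and refines models suited to handling imperfect information and multiple players.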
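Academic and Social Significance of the Research Achievements	To make computers more broadly useful to society, we studied AI agents that learn. In general, it is nearly impossible for people to prepare a perfect program in advance, so it is desirable for a computer program or AI agent to acquire appropriate behavior by itself. Training an agent appropriately involves various technical difficulties. Just as human education involves a variety of teaching methods and trade-offs, appropriate techniques must be chosen according to the goal, or new ones developed. In this project, we studied learning methods using imperfect-information and multi-agent games as settings that demand complex behavior.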

Reports (5)

Research Products (40)


Journal articles (26; 25 peer-reviewed, 13 open access); conference presentations (14; 3 at international conferences)

[Journal Article] Improving counterfactual regret minimization agents training in card game Cheat using ordered abstraction (2021)
- Author(s)/Presenter(s)
  C. Yi and T. Kaneko
- Journal
  Advances in Computers and Games
  Volume: -
- Related Report
  2021 Annual Research Report
- Peer Reviewed
[Journal Article] Local coordination in multi-agent reinforcement learning (2021)
- Author(s)/Presenter(s)
  F. Xu and T. Kaneko
- Journal
  International Conference on Technologies and Applications of Artificial Intelligence
  Volume: -
- Related Report
  2021 Annual Research Report
- Peer Reviewed
[Journal Article] Hierarchical advantage for reinforcement learning in parameterized action space (2021)
- Author(s)/Presenter(s)
  Z. Hu and T. Kaneko
- Journal
  IEEE International Conference on Games
  Volume: - Pages: 1-8
- DOI
  10.1109/cog52621.2021.9619068
- Related Report
  2021 Annual Research Report
- Peer Reviewed
[Journal Article] Improve counterfactual regret minimization agents training by setting limitations of numbers of steps in games (2021)
- Author(s)/Presenter(s)
  C. Yi and T. Kaneko
- Journal
  26th Game Programming Workshop
  Volume: - Pages: 117-123
- Related Report
  2021 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Prediction of werewolf players by sentiment analysis of game dialogue in Japanese (2021)
- Author(s)/Presenter(s)
  Y. Sun and T. Kaneko
- Journal
  26th Game Programming Workshop
  Volume: - Pages: 186-191
- Related Report
  2021 Annual Research Report
- Peer Reviewed / Open Access
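[Journal Article] Proposal of "Tsuitate Ote Dobutsu Shogi" and learning of strategies with CFR [ついたて王手どうぶつしょうぎの提案とCFRによる戦略の学習] (2021)
- Author(s)/Presenter(s)
  Nakayashiki and Kaneko
- Journal
  26th Game Programming Workshop
  Volume: - Pages: 34-41
- NAID
  170000185756
- Related Report
  2021 Annual Research Report
- Peer Reviewed / Open Access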
[Journal Article] Playing Catan with cross-dimensional neural network (2020)
- Author(s)/Presenter(s)
  Gendre and Kaneko
- Journal
  ICONIP
  Volume: 12533 Pages: 580-592
- DOI
  10.1007/978-3-030-63833-7_49
- ISBN
  9783030638320, 9783030638337
- Related Report
  2020 Research-status Report
- Peer Reviewed
[Journal Article] Evaluation of loss function for stable policy learning in Dobutsu Shogi (2020)
- Author(s)/Presenter(s)
  Nakayashiki and Kaneko
- Journal
  International Conference on Technologies and Applications of Artificial Intelligence
  Volume: N/A Pages: 175-180
- Related Report
  2020 Research-status Report
- Peer Reviewed
[Journal Article] Ceramic: A research environment based on the multi-player strategic board game Azul (2020)
- Author(s)/Presenter(s)
  Gendre and Kaneko
- Journal
  25th Game Programming Workshop
  Volume: 978-4-907626-46-4 C3804 Pages: 155-160
- Related Report
  2020 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Diverse exploration via infomax options (2020)
- Author(s)/Presenter(s)
  Kanagawa and Kaneko
- Journal
  arXiv
  Volume: 978-4-907626-46-4 C3804 Pages: 1-21
- Related Report
  2020 Research-status Report
- Open Access
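[Journal Article] Evaluation of soft actor-critic in discrete action spaces [離散行動空間における soft actor-critic の評価] (2020)
- Author(s)/Presenter(s)
  合田 and Kaneko
- Journal
  25th Game Programming Workshop
  Volume: 978-4-907626-46-4 C3804 Pages: 175-180
- NAID
  170000184494
- Related Report
  2020 Research-status Report
- Peer Reviewed / Open Access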
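[Journal Article] Design of an evaluation function that accounts for the room for a comeback, and its evaluation in Dobutsu Shogi [逆転の余地を考慮した評価関数の設計とどうぶつしょうぎによる評価] (2020)
- Author(s)/Presenter(s)
  Nakayashiki and Kaneko
- Journal
  25th Game Programming Workshop
  Volume: 978-4-907626-46-4 C3804 Pages: 22-29
- NAID
  170000184517
- Related Report
  2020 Research-status Report
- Peer Reviewed / Open Access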
[Journal Article] Computer Shogi Tournaments and Techniques (2019)
- Author(s)/Presenter(s)
  Tomoyuki Kaneko and Takenobu Takizawa
- Journal
  IEEE Transactions on Games
  Volume: 11(3) Issue: 3 Pages: 267-274
- DOI
  10.1109/tg.2019.2939259
- Related Report
  2019 Research-status Report
- Peer Reviewed
[Journal Article] RankNet for evaluation functions of the game of Go (2019)
- Author(s)/Presenter(s)
  Yusaku Mandai and Tomoyuki Kaneko
- Journal
  ICGA Journal
  Volume: 41(2) Issue: 2 Pages: 78-91
- DOI
  10.3233/icg-190108
- Related Report
  2019 Research-status Report
- Peer Reviewed
[Journal Article] Rogue-Gym: A New Challenge for Generalization in Reinforcement Learning (2019)
- Author(s)/Presenter(s)
  Yuji Kanagawa and Tomoyuki Kaneko
- Journal
  IEEE Conference on Games
  Volume: 19013855 Pages: 1-8
- DOI
  10.1109/cig.2019.8848075
- Related Report
  2019 Research-status Report
- Peer Reviewed
[Journal Article] Deep Residual Attention Reinforcement Learning (2019)
- Author(s)/Presenter(s)
  Hanhua Zhu and Tomoyuki Kaneko
- Journal
  International Conference on Technologies and Applications of Artificial Intelligence
  Volume: 19279615 Pages: 1-6
- DOI
  10.1109/taai48200.2019.8959896
- Related Report
  2019 Research-status Report
- Peer Reviewed
[Journal Article] Application of Deep-RL with Sample-Efficient Method in Mini-games of StarCraft II (2019)
- Author(s)/Presenter(s)
  Zhejie Hu and Tomoyuki Kaneko
- Journal
  International Conference on Technologies and Applications of Artificial Intelligence
  Volume: 19279598 Pages: 1-6
- DOI
  10.1109/taai48200.2019.8959866
- Related Report
  2019 Research-status Report
- Peer Reviewed
[Journal Article] Acquiring Strategies for the Board Game Geister by Regret Minimization (2019)
- Author(s)/Presenter(s)
  Chen Chen and Tomoyuki Kaneko
- Journal
  International Conference on Technologies and Applications of Artificial Intelligence
  Volume: 19279608 Pages: 1-6
- DOI
  10.1109/taai48200.2019.8959878
- Related Report
  2019 Research-status Report
- Peer Reviewed
[Journal Article] Deep Recurrent Q-Network with Truncated History (2018)
- Author(s)/Presenter(s)
  Hyunwoo Oh and Tomoyuki Kaneko
- Journal
  IEEE Technologies and Applications of Artificial Intelligence
  Volume: - Pages: 34-39
- DOI
  10.1109/taai.2018.00017
- Related Report
  2018 Research-status Report
- Peer Reviewed
[Journal Article] Application of Deep Reinforcement Learning in Werewolf Game Agents (2018)
- Author(s)/Presenter(s)
  Tianhe Wang and Tomoyuki Kaneko
- Journal
  IEEE Technologies and Applications of Artificial Intelligence
  Volume: - Pages: 28-33
- DOI
  10.1109/taai.2018.00016
- Related Report
  2018 Research-status Report
- Peer Reviewed
[Journal Article] Playing the Flappy Bird with Reinforcement Learning Algorithms (2018)
- Author(s)/Presenter(s)
  Hanhua Zhu and Tomoyuki Kaneko
- Journal
  The 23rd Game Programming Workshop
  Volume: -
- Related Report
  2018 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Counterfactual Regret Minimization for the Board Game Geister (2018)
- Author(s)/Presenter(s)
  Chen Chen and Tomoyuki Kaneko
- Journal
  The 23rd Game Programming Workshop
  Volume: -
- Related Report
  2018 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Reinforcement Learning with Effective Exploitation of Experiences on Mini-Games of StarCraft II (2018)
- Author(s)/Presenter(s)
  ZheJie Hu and Tomoyuki Kaneko
- Journal
  The 23rd Game Programming Workshop
  Volume: -
- Related Report
  2018 Research-status Report
- Peer Reviewed / Open Access
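[Journal Article] Proposal of Rogue-Gym, a reinforcement learning benchmark environment based on roguelike games [ローグライクゲームによる強化学習ベンチマーク環境Rogue-Gymの提案] (2018)
- Author(s)/Presenter(s)
  Yuji Kanagawa and Tomoyuki Kaneko
- Journal
  The 23rd Game Programming Workshop
  Volume: -
- NAID
  170000178478
- Related Report
  2018 Research-status Report
- Peer Reviewed / Open Access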
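[Journal Article] Application of deep Q-networks to Werewolf agents [人狼エージェントにおける深層Qネットワークの応用] (2018)
- Author(s)/Presenter(s)
  Tianhe Wang and Tomoyuki Kaneko
- Journal
  The 23rd Game Programming Workshop
  Volume: -
- NAID
  170000178462
- Related Report
  2018 Research-status Report
- Peer Reviewed / Open Access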
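[Journal Article] Improving DRQN by learning the initial state of the LSTM [LSTM の初期状態の学習による DRQN の改善] (2018)
- Author(s)/Presenter(s)
  Hyunwoo Oh and Tomoyuki Kaneko
- Journal
  The 23rd Game Programming Workshop
  Volume: -
- NAID
  170000178493
- Related Report
  2018 Research-status Report
- Peer Reviewed / Open Access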
[Presentation] Improve counterfactual regret minimization for card game Cheat (2020)
- Author(s)/Presenter(s)
  Yi and Kaneko
- Conference
  25th Game Programming Workshop
- Related Report
  2020 Research-status Report
- International Conference
[Presentation] Application of DREAM to the board game Geister (2020)
- Author(s)/Presenter(s)
  Chen and Kaneko
- Conference
  25th Game Programming Workshop
- Related Report
  2020 Research-status Report
- International Conference
[Presentation] Training Japanese Mahjong agent with two dimension feature representation (2020)
- Author(s)/Presenter(s)
  Honghai and Kaneko
- Conference
  25th Game Programming Workshop
- Related Report
  2020 Research-status Report
- International Conference
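[Presentation] Reinforcement learning for improving generalization performance on the Procgen Benchmark [ProcgenBenchmark における汎化性能を高める強化学習] (2020)
- Author(s)/Presenter(s)
  Xu and Kaneko
- Conference
  25th Game Programming Workshop
- Related Report
  2020 Research-status Report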
[Presentation] Utilizing History Information in Acquiring Strategies for Board Game Geister by Deep Counterfactual Regret Minimization (2019)
- Author(s)/Presenter(s)
  Chen Chen and Tomoyuki Kaneko
- Conference
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
[Presentation] An Extension of Counterfactual Regret Minimization for Multiplayer Card Games (2019)
- Author(s)/Presenter(s)
  Yu Cao and Tomoyuki Kaneko
- Conference
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
[Presentation] Performance of Counterfactual Regret Minimization with Self-Confirming Equilibrium (2019)
- Author(s)/Presenter(s)
  Cheng Yi and Tomoyuki Kaneko
- Conference
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
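[Presentation] An investigation of AlphaZero's methods using Dobutsu Shogi [どうぶつしょうぎを用いた AlphaZero の手法の調査] (2019)
- Author(s)/Presenter(s)
  Taichi Nakayashiki and Tomoyuki Kaneko
- Conference
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report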
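[Presentation] Multi-task reinforcement learning in StarCraft II mini-games [スタークラフト II のミニゲームにおけるマルチタスク強化学習] (2019)
- Author(s)/Presenter(s)
  Fanchao Xu (徐凡超) and Tomoyuki Kaneko
- Conference
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report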
[Presentation] Enhancing Sample Efficiency of Deep Reinforcement Learning to Master the Mini-games of StarCraft II (2019)
- Author(s)/Presenter(s)
  ZheJie Hu and Tomoyuki Kaneko
- Conference
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
[Presentation] Counterfactual Regret Minimisation for playing the multiplayer bluffing dice game Dudo (2019)
- Author(s)/Presenter(s)
  Quentin Gendre and Tomoyuki Kaneko
- Conference
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
[Presentation] Training Agents with Long-range Information in Deep Reinforcement Learning (2019)
- Author(s)/Presenter(s)
  Hanhua Zhu and Tomoyuki Kaneko
- Conference
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
[Presentation] Back Prediction in the Game of Go (2019)
- Author(s)/Presenter(s)
  Tang Jiachen and Tomoyuki Kaneko
- Conference
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
[Presentation] Improving Mahjong Agent by Predicting Types of Yaku (2019)
- Author(s)/Presenter(s)
  Long Honghai and Tomoyuki Kaneko
- Conference
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
