Deep reinforcement learning for imperfect and multi-player environments

Research Project

Project/Area Number	18K19832
Research Category	Grant-in-Aid for Challenging Research (Exploratory)
Allocation Type	Multi-year Fund
Review Section	Medium-sized Section 62: Applied informatics and related fields
Research Institution	The University of Tokyo
Principal Investigator	Kaneko Tomoyuki, The University of Tokyo, Graduate School of Arts and Sciences, Associate Professor (00345068)
Project Period (FY)	2018-06-29 – 2022-03-31
Project Status	Completed (Fiscal Year 2021)
Budget Amount	¥5,850,000 (Direct Cost: ¥4,500,000, Indirect Cost: ¥1,350,000); Fiscal Year 2020: ¥2,470,000 (Direct Cost: ¥1,900,000, Indirect Cost: ¥570,000); Fiscal Year 2019: ¥2,470,000 (Direct Cost: ¥1,900,000, Indirect Cost: ¥570,000); Fiscal Year 2018: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Keywords	Game programming / Deep reinforcement learning
Outline of Final Research Achievements	This study extends deep reinforcement learning techniques to imperfect-information and multi-player games. As AlphaZero demonstrated, AI agents can learn decent strategies in perfect-information games through reinforcement learning. However, learning remains difficult in more complicated situations, e.g., where some information is hidden and/or more than two agents are involved. This study therefore focused on learning in imperfect-information and multi-player games as representative challenging tasks.
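Academic Significance and Societal Importance of the Research Achievements	To make computers more broadly useful to society, we studied AI agents that learn. In general, it is nearly impossible for people to prepare a perfect program in advance, so it is desirable for computer programs or AI agents to acquire appropriate behavior themselves. Training them appropriately involves various technical difficulties; just as human education has many teaching methods with trade-offs among them, suitable techniques must be chosen for each purpose or newly developed. Here, we studied learning methods using imperfect-information and multi-agent games as subject matter for situations that require complex behavior.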

Report

(5 results)

2021 Annual Research Report
Final Research Report (PDF)
2020 Research-status Report
2019 Research-status Report
2018 Research-status Report

Research Products
(40 results)

All Journal Article (26 results) (of which Peer Reviewed: 25 results, Open Access: 13 results) Presentation (14 results) (of which Int'l Joint Research: 3 results)

[Journal Article] Improving counterfactual regret minimization agents training in card game Cheat using ordered abstraction (2021)
- Author(s)
  C. Yi and T. Kaneko
- Journal Title
  
  Advances in computers and games
  
  Volume: -
- Related Report
  2021 Annual Research Report
- Peer Reviewed
[Journal Article] Local coordination in multi-agent reinforcement learning (2021)
- Author(s)
  F. Xu and T. Kaneko
- Journal Title
  
  International conference on technologies and applications of artificial intelligence
  
  Volume: -
- Related Report
  2021 Annual Research Report
- Peer Reviewed
[Journal Article] Hierarchical advantage for reinforcement learning in parameterized action space (2021)
- Author(s)
  Z. Hu and T. Kaneko
- Journal Title
  
  IEEE international conference on games
  
  Volume: - Pages: 1-8
- DOI
  10.1109/cog52621.2021.9619068
- Related Report
  2021 Annual Research Report
- Peer Reviewed
[Journal Article] Improve counterfactual regret minimization agents training by setting limitations of numbers of steps in games (2021)
- Author(s)
  C. Yi and T. Kaneko
- Journal Title
  
  26th game programming workshop
  
  Volume: - Pages: 117-123
- Related Report
  2021 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Prediction of werewolf players by sentiment analysis of game dialogue in Japanese (2021)
- Author(s)
  Y. Sun and T. Kaneko
- Journal Title
  
  26th game programming workshop
  
  Volume: - Pages: 186-191
- Related Report
  2021 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Proposal of Tsuitate Ote Dobutsu Shogi and learning of strategies by CFR (2021)
- Author(s)
  Nakayashiki and Kaneko
- Journal Title
  
  26th Game Programming Workshop
  
  Volume: - Pages: 34-41
- NAID
  170000185756
- Related Report
  2021 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Playing Catan with cross-dimensional neural network (2020)
- Author(s)
  Gendre and Kaneko
- Journal Title
  
  ICONIP
  
  Volume: 12533 Pages: 580-592
- DOI
  10.1007/978-3-030-63833-7_49
- ISBN
  9783030638320, 9783030638337
- Related Report
  2020 Research-status Report
- Peer Reviewed
[Journal Article] Evaluation of loss function for stable policy learning in Dobutsu Shogi (2020)
- Author(s)
  Nakayashiki and Kaneko
- Journal Title
  
  International conference on technologies and applications of artificial intelligence
  
  Volume: N/A Pages: 175-180
- Related Report
  2020 Research-status Report
- Peer Reviewed
[Journal Article] Ceramic: A research environment based on the multi-player strategic board game Azul (2020)
- Author(s)
  Gendre and Kaneko
- Journal Title
  
  25th game programming workshop
  
  Volume: 978-4-907626-46-4 C3804 Pages: 155-160
- Related Report
  2020 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Diverse exploration via infomax options (2020)
- Author(s)
  Kanagawa and Kaneko
- Journal Title
  
  arXiv
  
  Volume: 978-4-907626-46-4 C3804 Pages: 1-21
- Related Report
  2020 Research-status Report
- Open Access
[Journal Article] Evaluation of soft actor-critic in discrete action spaces (2020)
- Author(s)
  合田 and Kaneko
- Journal Title
  
  25th Game Programming Workshop
  
  Volume: 978-4-907626-46-4 C3804 Pages: 175-180
- NAID
  170000184494
- Related Report
  2020 Research-status Report
- Peer Reviewed / Open Access
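[Journal Article] Design of an evaluation function that accounts for the room for a comeback, and its evaluation with Dobutsu Shogi (2020)
- Author(s)
  Nakayashiki and Kaneko
- Journal Title
  
  25th Game Programming Workshop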
  
  Volume: 978-4-907626-46-4 C3804 Pages: 22-29
- NAID
  170000184517
- Related Report
  2020 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Computer Shogi Tournaments and Techniques (2019)
- Author(s)
  Tomoyuki Kaneko and Takenobu Takizawa
- Journal Title
  
  IEEE Transactions on Games
  
  Volume: 11(3) Issue: 3 Pages: 267-274
- DOI
  10.1109/tg.2019.2939259
- Related Report
  2019 Research-status Report
- Peer Reviewed
[Journal Article] RankNet for evaluation functions of the game of Go (2019)
- Author(s)
  Yusaku Mandai and Tomoyuki Kaneko
- Journal Title
  
  ICGA Journal
  
  Volume: 41(2) Issue: 2 Pages: 78-91
- DOI
  10.3233/icg-190108
- Related Report
  2019 Research-status Report
- Peer Reviewed
[Journal Article] Rogue-Gym: A New Challenge for Generalization in Reinforcement Learning (2019)
- Author(s)
  Yuji Kanagawa and Tomoyuki Kaneko
- Journal Title
  
  IEEE Conference on Games
  
  Volume: 19013855 Pages: 1-8
- DOI
  10.1109/cig.2019.8848075
- Related Report
  2019 Research-status Report
- Peer Reviewed
[Journal Article] Deep Residual Attention Reinforcement Learning (2019)
- Author(s)
  Hanhua Zhu and Tomoyuki Kaneko
- Journal Title
  
  International Conference on Technologies and Applications of Artificial Intelligence
  
  Volume: 19279615 Pages: 1-6
- DOI
  10.1109/taai48200.2019.8959896
- Related Report
  2019 Research-status Report
- Peer Reviewed
[Journal Article] Application of Deep-RL with Sample-Efficient Method in Mini-games of StarCraft II (2019)
- Author(s)
  Zhejie Hu and Tomoyuki Kaneko
- Journal Title
  
  International Conference on Technologies and Applications of Artificial Intelligence
  
  Volume: 19279598 Pages: 1-6
- DOI
  10.1109/taai48200.2019.8959866
- Related Report
  2019 Research-status Report
- Peer Reviewed
[Journal Article] Acquiring Strategies for the Board Game Geister by Regret Minimization (2019)
- Author(s)
  Chen Chen and Tomoyuki Kaneko
- Journal Title
  
  International Conference on Technologies and Applications of Artificial Intelligence
  
  Volume: 19279608 Pages: 1-6
- DOI
  10.1109/taai48200.2019.8959878
- Related Report
  2019 Research-status Report
- Peer Reviewed
[Journal Article] Deep Recurrent Q-Network with Truncated History (2018)
- Author(s)
  Hyunwoo Oh and Tomoyuki Kaneko
- Journal Title
  
  IEEE Technologies and Applications of Artificial Intelligence
  
  Volume: - Pages: 34-39
- DOI
  10.1109/taai.2018.00017
- Related Report
  2018 Research-status Report
- Peer Reviewed
[Journal Article] Application of Deep Reinforcement Learning in Werewolf Game Agents (2018)
- Author(s)
  Tianhe Wang and Tomoyuki Kaneko
- Journal Title
  
  IEEE Technologies and Applications of Artificial Intelligence
  
  Volume: - Pages: 28-33
- DOI
  10.1109/taai.2018.00016
- Related Report
  2018 Research-status Report
- Peer Reviewed
[Journal Article] Playing the Flappy Bird with Reinforcement Learning Algorithms (2018)
- Author(s)
  Hanhua Zhu and Tomoyuki Kaneko
- Journal Title
  
  The 23rd Game Programming Workshop
  
  Volume: -
- Related Report
  2018 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Counterfactual Regret Minimization for the Board Game Geister (2018)
- Author(s)
  Chen Chen and Tomoyuki Kaneko
- Journal Title
  
  The 23rd Game Programming Workshop
  
  Volume: -
- Related Report
  2018 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Reinforcement Learning with Effective Exploitation of Experiences on Mini-Games of StarCraft II (2018)
- Author(s)
  ZheJie Hu and Tomoyuki Kaneko
- Journal Title
  
  The 23rd Game Programming Workshop
  
  Volume: -
- Related Report
  2018 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Proposal of Rogue-Gym, a reinforcement learning benchmark environment based on roguelike games (2018)
- Author(s)
  Yuji Kanagawa and Tomoyuki Kaneko
- Journal Title
  
  The 23rd Game Programming Workshop
  
  Volume: -
- NAID
  170000178478
- Related Report
  2018 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Application of deep Q-networks in Werewolf game agents (2018)
- Author(s)
  Tianhe Wang and Tomoyuki Kaneko
- Journal Title
  
  The 23rd Game Programming Workshop
  
  Volume: -
- NAID
  170000178462
- Related Report
  2018 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Improving DRQN by learning the initial state of the LSTM (2018)
- Author(s)
  Hyunwoo Oh and Tomoyuki Kaneko
- Journal Title
  
  The 23rd Game Programming Workshop
  
  Volume: -
- NAID
  170000178493
- Related Report
  2018 Research-status Report
- Peer Reviewed / Open Access
[Presentation] Improve counterfactual regret minimization for card game Cheat (2020)
- Author(s)
  Yi and Kaneko
- Organizer
  25th game programming workshop
- Related Report
  2020 Research-status Report
- Int'l Joint Research
[Presentation] Application of DREAM to the board game Geister (2020)
- Author(s)
  Chen and Kaneko
- Organizer
  25th game programming workshop
- Related Report
  2020 Research-status Report
- Int'l Joint Research
[Presentation] Training Japanese mahjong agent with two dimension feature representation (2020)
- Author(s)
  Honghai and Kaneko
- Organizer
  25th game programming workshop
- Related Report
  2020 Research-status Report
- Int'l Joint Research
[Presentation] Reinforcement learning for improving generalization performance on the Procgen Benchmark (2020)
- Author(s)
  Xu and Kaneko
- Organizer
  25th Game Programming Workshop
- Related Report
  2020 Research-status Report
[Presentation] Utilizing History Information in Acquiring Strategies for Board Game Geister by Deep Counterfactual Regret Minimization (2019)
- Author(s)
  Chen Chen and Tomoyuki Kaneko
- Organizer
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
[Presentation] An Extension of Counterfactual Regret Minimization for Multiplayer Card Games (2019)
- Author(s)
  Yu Cao and Tomoyuki Kaneko
- Organizer
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
[Presentation] Performance of Counterfactual Regret Minimization with Self-Confirming Equilibrium (2019)
- Author(s)
  Cheng Yi and Tomoyuki Kaneko
- Organizer
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
[Presentation] An investigation of AlphaZero's methods using Dobutsu Shogi (2019)
- Author(s)
  Taichi Nakayashiki and Tomoyuki Kaneko
- Organizer
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
[Presentation] Multi-task reinforcement learning in StarCraft II mini-games (2019)
- Author(s)
  Fanchao Xu and Tomoyuki Kaneko
- Organizer
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
[Presentation] Enhancing Sample Efficiency of Deep Reinforcement Learning to Master the Mini-games of StarCraft II (2019)
- Author(s)
  ZheJie Hu and Tomoyuki Kaneko
- Organizer
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
[Presentation] Counterfactual Regret Minimisation for playing the multiplayer bluffing dice game Dudo (2019)
- Author(s)
  Quentin Gendre and Tomoyuki Kaneko
- Organizer
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
[Presentation] Training Agents with Long-range Information in Deep Reinforcement Learning (2019)
- Author(s)
  Hanhua Zhu and Tomoyuki Kaneko
- Organizer
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
[Presentation] Back Prediction in the Game of Go (2019)
- Author(s)
  Tang Jiachen and Tomoyuki Kaneko
- Organizer
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
[Presentation] Improving Mahjong Agent by Predicting Types of Yaku (2019)
- Author(s)
  Long Honghai and Tomoyuki Kaneko
- Organizer
  The 24th Game Programming Workshop
- Related Report
  2019 Research-status Report
