Analysis and Improvement of the Performance of Monte Carlo Tree Search

Research Project

Project/Area Number	16J07455
Research Category	Grant-in-Aid for JSPS Fellows
Allocation Type	Single-year Grants
Section	Domestic
Research Field	Entertainment and game informatics 1
Research Institution	The University of Tokyo
Principal Investigator	Takahisa Imagawa (今川孝久), The University of Tokyo, Graduate School of Arts and Sciences, JSPS Research Fellow (DC2)
Project Period (FY)	2016-04-22 – 2018-03-31
Project Status	Completed (Fiscal Year 2017)
Budget Amount	¥1,300,000 (Direct Cost: ¥1,300,000); Fiscal Year 2017: ¥600,000 (Direct Cost: ¥600,000); Fiscal Year 2016: ¥700,000 (Direct Cost: ¥700,000)
Keywords	Monte Carlo tree search / estimator / search algorithm / subjective probability / proven win/loss information
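Outline of Annual Research Achievements	Monte Carlo tree search (MCTS) is a representative search framework for games. However, the relationship between the properties of a game and the performance of MCTS is not yet fully understood. This fiscal year, we first studied estimators of the maximum expected value in the multi-armed bandit problem (MAB). MAB is the problem of finding a way to play that yields more reward when there are multiple slot machines that give stochastic rewards. MAB is closely related to MCTS: UCT, a representative MCTS algorithm, applies to tree search an algorithm that aims to maximize cumulative reward in MAB. Estimators of the maximum expected value are also important for identifying the best move, because identifying it requires comparing the expected rewards obtained when the best move continues to be chosen thereafter (that is, when moves are chosen so as to maximize the expected value). In this research, we proposed a new method (SWE) that assigns each random variable a weight based on an upper bound on the probability that its expected value is the maximum, and estimates the maximum expected value by the resulting weighted average. We carried out a theoretical analysis and showed, among other results, that the bias of the estimate converges to 0. We also ran experiments and confirmed the effectiveness of the proposed method: across various MAB settings it was not always the best, but it gave good results in many of them. Next, we applied SWE to MCTS. The existing method UCT estimates the value of a child by the average of the results of simulations run from its descendants. We first confirmed experimentally that, in MAB, using SWE instead of the sample average improves the accuracy of the estimate. We then proposed a method that replaces the average-based estimate in UCT with an SWE-based estimate. We ran experiments on two kinds of models, one close to the endgame of a game and one close to the opening, and showed the effectiveness of the proposed method on the latter.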
Research Progress Status	Not entered, because fiscal year 2017 (Heisei 29) is the final year.
Strategy for Future Research Activity	Not entered, because fiscal year 2017 (Heisei 29) is the final year.
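
The outline above describes estimating the maximum expected value by a weighted average of per-arm estimates rather than by the largest sample mean. The Python sketch below illustrates only that general idea on simulated Bernoulli arms. The weight `exp(-2 * n_i * gap_i**2)`, the function names, and the arm probabilities are all assumptions made for illustration; this record does not give the SWE weight formula, so this should not be read as the authors' estimator.

```python
import math
import random


def sample_means(samples):
    """Return the sample mean of each arm's list of rewards."""
    return [sum(s) / len(s) for s in samples]


def max_of_means(samples):
    """Baseline estimate: the largest sample mean (tends to overestimate)."""
    return max(sample_means(samples))


def weighted_estimate(samples):
    """Estimate the maximum expected value by a weighted average of means.

    ASSUMPTION: the weight exp(-2 * n_i * gap_i**2) is a Hoeffding-style
    stand-in chosen for illustration; it is NOT the SWE weight.
    """
    means = sample_means(samples)
    best = max(means)
    weights = [math.exp(-2.0 * len(s) * (best - m) ** 2)
               for s, m in zip(samples, means)]
    total = sum(weights)  # at least 1: the best arm has weight exp(0) = 1
    return sum(w * m for w, m in zip(weights, means)) / total


def average_bias(estimator, probs, pulls, trials, rng):
    """Average of (estimate - true maximum) over simulated Bernoulli arms."""
    true_max = max(probs)
    err = 0.0
    for _ in range(trials):
        samples = [[1.0 if rng.random() < p else 0.0 for _ in range(pulls)]
                   for p in probs]
        err += estimator(samples) - true_max
    return err / trials


if __name__ == "__main__":
    probs = [0.5, 0.5, 0.5, 0.5]  # hypothetical arms with equal expected reward
    for name, est in [("max of means", max_of_means),
                      ("weighted", weighted_estimate)]:
        rng = random.Random(0)  # same seed, so both see the same samples
        print(name, round(average_bias(est, probs, 20, 2000, rng), 4))
```

Because a weighted average of sample means never exceeds the largest of them, the weighted estimate is at most the max-of-means estimate on the same samples. How close either comes to the true maximum depends on the arms and on the number of pulls.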

Report

(2 results)

2017 Annual Research Report
2016 Annual Research Report

Research Products
(8 results)

Journal Article (4 results; of which Peer Reviewed: 3, Open Access: 1), Presentation (4 results; of which Int'l Joint Research: 2)

[Journal Article] Estimating the maximum expected value through upper confidence bound of likelihood (2018)
- Author(s)
  Takahisa Imagawa and Tomoyuki Kaneko
- Journal Title
  
  2017 Conference on Technologies and Applications of Artificial Intelligence (TAAI 2017)
  
  Volume: ー
- Related Report
  2017 Annual Research Report
- Peer Reviewed
[Journal Article] Improving the Estimation of State Values in Monte Carlo Tree Search (モンテカルロ木探索における状態価値の推定方法の改善) (2017)
- Author(s)
  Takahisa Imagawa, Tomoyuki Kaneko
- Journal Title
  
  Proceedings of the Game Programming Workshop (GPW) 2017
  
  Volume: ー Pages: 34-41
- NAID
  170000176046
- Related Report
  2017 Annual Research Report
- Peer Reviewed
[Journal Article] Monte Carlo Tree Search with Robust Exploration (2016)
- Author(s)
  T. Imagawa and T. Kaneko
- Journal Title
  
  LNCS, Computers and Games
  
  Volume: 10068 Pages: 34-46
- DOI
  10.1007/978-3-319-50935-8_4
- ISBN
  9783319509341, 9783319509358
- Related Report
  2016 Annual Research Report
- Peer Reviewed
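[Journal Article] Correcting Playout Results in Monte Carlo Tree Search When a Descendant's Win or Loss Is Proven (モンテカルロ木探索における子孫の勝敗確定時のプレイアウト結果の修正) (2016)
- Author(s)
  Takahisa Imagawa, Tomoyuki Kaneko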
- Journal Title
  
  Proceedings of the Game Programming Workshop 2016
  
  Volume: 2016 Pages: 13-20
- NAID
  170000173623
- Related Report
  2016 Annual Research Report
- Open Access
[Presentation] Estimating the maximum expected value through upper confidence bound of likelihood (2017)
- Author(s)
  Takahisa Imagawa
- Organizer
  2017 Conference on Technologies and Applications of Artificial Intelligence (TAAI 2017)
- Related Report
  2017 Annual Research Report
- Int'l Joint Research
[Presentation] Improving the Estimation of State Values in Monte Carlo Tree Search (モンテカルロ木探索における状態価値の推定方法の改善) (2017)
- Author(s)
  Takahisa Imagawa
- Organizer
  Game Programming Workshop (GPW) 2017
- Related Report
  2017 Annual Research Report
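[Presentation] Correcting Playout Results in Monte Carlo Tree Search When a Descendant's Win or Loss Is Proven (モンテカルロ木探索における子孫の勝敗確定時のプレイアウト結果の修正) (2016)
- Author(s)
  Takahisa Imagawa
- Organizer
  Game Programming Workshop 2016
- Place of Presentation
  Surugadai Gakuen Hakone Seminar House (Hakone, Ashigarashimo-gun, Kanagawa Prefecture)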
- Year and Date
  2016-11-04
- Related Report
  2016 Annual Research Report
[Presentation] Monte Carlo Tree Search with Robust Exploration (2016)
- Author(s)
  Takahisa Imagawa
- Organizer
  Computers and Games: 9th International Conference
- Place of Presentation
  Leiden, Netherlands
- Year and Date
  2016-06-29
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
