
Theoretical research of the policy gradient reinforcement learning without Markov properties and its application to games

Research Project

Project/Area Number 26330419
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Research Field Entertainment and game informatics 1
Research Institution Shibaura Institute of Technology

Principal Investigator

Harukazu Igarashi  Shibaura Institute of Technology, Faculty of Engineering, Professor (80288886)

Co-Investigator (Renkei-kenkyūsha) ISHIHARA Seiji  Tokyo Denki University, Faculty of Science and Engineering, Associate Professor (50351656)
Research Collaborator MORIOKA Yuichi  
YAMAMOTO Kazumasa  
Project Period (FY) 2014-04-01 – 2017-03-31
Project Status Completed (Fiscal Year 2016)
Budget Amount
¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000)
Fiscal Year 2016: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Fiscal Year 2015: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Fiscal Year 2014: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Keywords reinforcement learning / policy gradient method / multi-agent / computer shogi / RoboCup / softmax search / soccer / multi-agent system / fuzzy inference
Outline of Final Research Achievements

In this research project, we conducted theoretical and practical research on representations of policy functions and on learning methods for policy gradient reinforcement learning algorithms. Our final goal is to construct a general methodology that can be applied to computer games and engineering fields. The results of this project are as follows.
(1) Theoretical research on policy gradient reinforcement learning: we proposed new methods for ① hierarchical reinforcement learning that learns higher-level strategies of agents, ② learning in which knowledge of environmental dynamics and action-values are separated in agent policies, and ③ learning with a fuzzy controller used as the policy.
(2) Practical application of policy gradient reinforcement learning: we applied the proposed learning methods to pursuit games, robot soccer games, and computer shogi, and examined the efficiency of our methods.
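
For readers unfamiliar with the term, the following is a minimal sketch of a generic policy gradient update (REINFORCE) with a softmax (Boltzmann) policy. It is not the method proposed in this project: the toy one-step task, the function names `softmax`, `reinforce_step`, and `train`, and all parameter values are assumptions made purely for illustration.

```python
import math
import random


def softmax(prefs):
    """Return softmax (Boltzmann) probabilities for a list of preferences."""
    m = max(prefs)  # subtract the maximum for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(theta, action, reward, lr=0.1):
    """Apply one REINFORCE update to a state-less softmax policy.

    For a softmax policy, d log pi(action) / d theta[k]
    equals (1 if k == action else 0) - pi(k).
    """
    probs = softmax(theta)
    return [
        t + lr * reward * ((1.0 if k == action else 0.0) - probs[k])
        for k, t in enumerate(theta)
    ]


def train(rewards, episodes=2000, lr=0.1, seed=0):
    """Train on a toy task where action k always pays rewards[k] (hypothetical)."""
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)  # one preference parameter per action
    for _ in range(episodes):
        probs = softmax(theta)
        # Sample an action from the current stochastic policy.
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        theta = reinforce_step(theta, action, rewards[action], lr)
    return softmax(theta)


if __name__ == "__main__":
    # The learned policy should come to prefer the rewarded action (index 1).
    print(train([0.0, 1.0]))
```

In this sketch the policy parameters are moved along the gradient of the log-probability of the chosen action, scaled by the received reward, so actions that yield reward become more probable over time.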

Report

(4 results)
  • 2016 Annual Research Report   Final Research Report ( PDF )
  • 2015 Research-status Report
  • 2014 Research-status Report
  • Research Products

    (14 results)


Journal Article (4 results; of which Peer Reviewed: 2 results, Acknowledgement Compliant: 2 results, Open Access: 1 result) / Presentation (10 results)

  • [Journal Article] Hierarchical Policy Gradient Reinforcement Learning: Two-layer Model (2017)

    • Author(s)
      Harukazu Igarashi and Seiji Ishihara
    • Journal Title

      The Research Reports of Shibaura Institute of Technology, Natural Sciences and Engineering

      Volume: 60 Pages: 21-28

    • DOI

      10.13140/RG.2.2.19842.89285

    • Related Report
      2016 Annual Research Report
  • [Journal Article] Policy Gradient Reinforcement Learning with Separated Knowledge: Environmental Dynamics and Action-Values in Policies (2016)

    • Author(s)
      Seiji Ishihara, Harukazu Igarashi
    • Journal Title

      IEEJ Transactions on Electronics, Information and Systems

      Volume: 136 Issue: 3 Pages: 282-289

    • DOI

      10.1541/ieejeiss.136.282

    • NAID

      130005132276

    • ISSN
      0385-4221, 1348-8155
    • Related Report
      2015 Research-status Report
    • Peer Reviewed / Acknowledgement Compliant
  • [Journal Article] Learning Positional Evaluation Functions without Using Databases of Game Records between Professional Shogi Players (2016)

    • Author(s)
      Harukazu Igarashi, Yuichi Morioka, Kazumasa Yamamoto
    • Journal Title

      The Research Reports of Shibaura Institute of Technology, Natural Sciences and Engineering

      Volume: 59 Pages: 39-47

    • DOI

      10.13140/RG.2.1.4797.2242

    • Related Report
      2015 Research-status Report
    • Acknowledgement Compliant
  • [Journal Article] Policy Gradient Reinforcement Learning with a Fuzzy Controller for Policy: Decision Making in RoboCup Soccer Small Size League (2014)

    • Author(s)
      杉本 将也,五十嵐 治一,石原 聖司,田中 一基
    • Journal Title

      Journal of Japan Society for Fuzzy Theory and Intelligent Informatics

      Volume: 26 Issue: 3 Pages: 647-657

    • DOI

      10.3156/jsoft.26.647

    • NAID

      130004491924

    • ISSN
      1347-7986, 1881-7203
    • Related Report
      2014 Research-status Report
    • Peer Reviewed / Open Access
  • [Presentation] Determining the Destination of Soccer Agents Using a Positional Evaluation Function (2016)

    • Author(s)
      大内 斉,五十嵐 治一
    • Organizer
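      Information Processing Society of Japan (IPSJ)
    • Place of Presentation
      Hakone Seminar House (845 Sengokuhara, Hakone-machi, Ashigarashimo-gun, Kanagawa Prefecture)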
    • Year and Date
      2016-11-04
    • Related Report
      2016 Annual Research Report
  • [Presentation] A Simple Game-Tree Search Method Using a Softmax Strategy and Depth Control Based on Realization Probability (2016)

    • Author(s)
      原悠一,五十嵐治一,森岡祐一,山本一将
    • Organizer
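      Information Processing Society of Japan (IPSJ)
    • Place of Presentation
      Hakone Seminar House (845 Sengokuhara, Hakone-machi, Ashigarashimo-gun, Kanagawa Prefecture)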
    • Year and Date
      2016-11-04
    • Related Report
      2016 Annual Research Report
  • [Presentation] Reinforcement Learning of Through Passes in Soccer Agents (2016)

    • Author(s)
      田川 諒,五十嵐治一
    • Organizer
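      Institute of Electronics, Information and Communication Engineers (IEICE) and others
    • Place of Presentation
      University of Toyama (Toyama City, Toyama Prefecture)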
    • Year and Date
      2016-09-07
    • Related Report
      2016 Annual Research Report
  • [Presentation] Reinforcement Learning of Positional Evaluation Functions in Soccer Agents (2015)

    • Author(s)
      田川諒,五十嵐治一
    • Organizer
      IPSJ 20th Game Programming Workshop
    • Place of Presentation
      Karuizawa
    • Year and Date
      2015-11-06
    • Related Report
      2015 Research-status Report
  • [Presentation] Supervised Learning of Positional Evaluation Functions Using Policy Gradients in Computer Shogi (2015)

    • Author(s)
      大串明,山本一将,森岡祐一,五十嵐治一
    • Organizer
      IPSJ 20th Game Programming Workshop
    • Place of Presentation
      Karuizawa
    • Year and Date
      2015-11-06
    • Related Report
      2015 Research-status Report
  • [Presentation] A Study on Learning Positional Evaluation Functions without Using Game-Record Databases of Professional Shogi Players (2015)

    • Author(s)
      Harukazu Igarashi, Yuichi Morioka, Kazumasa Yamamoto
    • Organizer
      IPSJ 34th Game Informatics Research Meeting
    • Place of Presentation
      Fukuoka
    • Year and Date
      2015-07-04
    • Related Report
      2015 Research-status Report
  • [Presentation] Policy Gradient Method Using Fuzzy Controller in Policies and Its Application (2014)

    • Author(s)
      Noor Imanina N.H., Harukazu Igarashi
    • Organizer
      The International Conference on Artificial Intelligence and Pattern Recognition
    • Place of Presentation
      Kuala Lumpur, Malaysia
    • Year and Date
      2014-11-17 – 2014-11-19
    • Related Report
      2014 Research-status Report
  • [Presentation] A Study of Search Control by the Policy Gradient Method (2014)

    • Author(s)
      Harukazu Igarashi, Yuichi Morioka, Kazumasa Yamamoto
    • Organizer
      The 19th Game Programming Workshop 2014
    • Place of Presentation
      Hakone, Kanagawa Prefecture
    • Year and Date
      2014-11-07 – 2014-11-09
    • Related Report
      2014 Research-status Report
  • [Presentation] Weight Tuning of the Evaluation Function for Chain Actions in agent2d (2014)

    • Author(s)
      田川 諒,谷川俊策,五十嵐治一
    • Organizer
      The 13th Forum on Information Technology (FIT2014)
    • Place of Presentation
      Tsukuba, Ibaraki Prefecture
    • Year and Date
      2014-09-03
    • Related Report
      2014 Research-status Report
  • [Presentation] Design and Learning of Positional Evaluation Functions in the RoboCup Soccer Simulation 2D League (2014)

    • Author(s)
      谷川俊策,五十嵐治一,石原聖司
    • Organizer
      Robotics and Mechatronics Conference 2014
    • Place of Presentation
      Toyama, Toyama Prefecture
    • Year and Date
      2014-05-26
    • Related Report
      2014 Research-status Report

Published: 2014-04-04   Modified: 2018-03-22  
