Analysis of reward appraisal evolution processes of reinforcement learning agents in a multiagent environment

Research Project

Project/Area Number 16K00302
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Research Field Intelligent informatics
Research Institution Nagoya Institute of Technology

Principal Investigator

Moriyama Koichi  Nagoya Institute of Technology, Graduate School of Engineering, Associate Professor (10361776)

Project Period (FY) 2016-04-01 – 2019-03-31
Project Status Completed (Fiscal Year 2018)
Budget Amount
¥4,550,000 (Direct Cost: ¥3,500,000, Indirect Cost: ¥1,050,000)
Fiscal Year 2018: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2017: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2016: ¥1,690,000 (Direct Cost: ¥1,300,000, Indirect Cost: ¥390,000)
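Keywords Intelligent agents / Reinforcement learning / Reward design / Evolution / Multiagent systems / Game theory / Reward shaping / Evolutionary computation / Artificial intelligence / Machine learning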
Outline of Final Research Achievements

This research targets the emergence of social behaviors, such as cooperation, among reinforcement learning agents in an environment shared by multiple agents. Such behaviors may emerge if each agent pursues a different goal because it learns not only from a comparable objective evaluation but also from its own subjective appraisal. Based on this idea, the project used computer simulation and mathematical analysis to investigate how each agent's appraisal system evolves from the objective evaluation and what kind of society results. In a dilemma situation, where individually rational defection yields each agent a lower payoff than cooperation would, we found that the appraisal system evolved in a direction that facilitates cooperation. We also analyzed the direction of this evolution.
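
The dilemma structure described above can be illustrated with a small sketch. It is not the project's model: the payoff values are the textbook Prisoner's Dilemma numbers, and the linear appraisal `own + w * other` with weight `w` is a hypothetical stand-in for an agent's subjective appraisal. The sketch only checks which action is dominant once objective payoffs are replaced by appraised ones; it does not simulate learning or evolution.

```python
# Illustrative Prisoner's Dilemma (assumed payoff values, not from the project).
# PAYOFF[(my_action, other_action)] = (my_payoff, other_payoff)
C, D = "C", "D"  # cooperate, defect
PAYOFF = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}


def appraise(my_payoff, other_payoff, w):
    """Hypothetical subjective appraisal: own payoff plus w times the other's."""
    return my_payoff + w * other_payoff


def dominant_action(w):
    """Return the action that is strictly better against every opponent
    action under appraisal weight w, or None if neither action dominates."""
    wins = {C: 0, D: 0}
    for other in (C, D):
        reward_c = appraise(*PAYOFF[(C, other)], w)
        reward_d = appraise(*PAYOFF[(D, other)], w)
        if reward_c > reward_d:
            wins[C] += 1
        elif reward_d > reward_c:
            wins[D] += 1
    for action in (C, D):
        if wins[action] == 2:
            return action
    return None


if __name__ == "__main__":
    for w in (0.0, 0.5, 1.0):
        print(w, dominant_action(w))
```

With `w = 0` the agent sees only the objective payoff and defection dominates, even though mutual defection pays less than mutual cooperation. Under the assumed appraisal, a sufficiently large `w` makes cooperation the dominant action, which is the kind of shift an evolving appraisal system could produce.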

Academic Significance and Societal Importance of the Research Achievements

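Realizing reinforcement learning requires designing states, actions, and rewards. However, designing rewards in an open environment where multiple agents exist is very difficult. Humans, on the other hand, are able to learn appropriate behavior in an open society of many people from subjective evaluations grounded in their values (feeling happy, feeling embarrassed, and so on). This research attempts to automate reward design in open environments by considering how agents' "values" arise and evolve. At the same time, it examines, through the process by which agents' "values" form, why irrational aspects such as human values exist.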

Report

(4 results)
  • 2018 Annual Research Report, Final Research Report (PDF)
  • 2017 Research-status Report
  • 2016 Research-status Report

Research Products

(7 results)

Journal Article (3 results, of which Peer Reviewed: 3 results), Presentation (4 results)

  • [Journal Article] Evolution Direction of Reward Appraisal in Reinforcement Learning Agents (2018)

    • Author(s)
      Masaya Miyawaki, Koichi Moriyama, Atsuko Mutoh, Tohgoroh Matsui, and Nobuhiro Inuzuka
    • Journal Title

      Proceedings of the 12th KES International Conference on Agent and Multi-agent Systems: Technologies and Applications

      Volume: - Pages: 13-22

    • DOI

      10.1007/978-3-319-92031-3_2

    • ISBN
      9783319920306, 9783319920313
    • Related Report
      2018 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Accelerating Deep Q Network by Weighting Experiences (2018)

    • Author(s)
      Kazuhiro Murakami, Koichi Moriyama, Atsuko Mutoh, Tohgoroh Matsui, and
    • Journal Title

      Proceedings of the 25th International Conference on Neural Information

      Volume: - Pages: 204-213

    • DOI

      10.1007/978-3-030-04167-0_19

    • NAID

      130007423935

    • ISBN
      9783030041663, 9783030041670
    • Related Report
      2018 Annual Research Report
    • Peer Reviewed
  • [Journal Article] The Resilience of Cooperation in a Dilemma Game Played by Reinforcement Learning Agents (2017)

    • Author(s)
      Koichi Moriyama, Kaori Nakase, Atsuko Mutoh, and Nobuhiro Inuzuka
    • Journal Title

      Proceedings of the 2nd IEEE International Conference on Agents

      Volume: - Pages: 33-39

    • DOI

      10.1109/agents.2017.8015297

    • Related Report
      2017 Research-status Report
    • Peer Reviewed
  • [Presentation] Parallel Evolutionary Simulation of Reinforcement Learning Agents Using GPGPU (2018)

    • Author(s)
      千賀喜貴, 森山甲一, 武藤敦子, 松井藤五郎, 犬塚信博
    • Organizer
      The 32nd Annual Conference of the Japanese Society for Artificial Intelligence
    • Related Report
      2018 Annual Research Report
  • [Presentation] Accelerating Deep Q Network by Weighting Experience Data (2018)

    • Author(s)
      村上知優, 森山甲一, 武藤敦子, 松井藤五郎, 犬塚信博
    • Organizer
      The 32nd Annual Conference of the Japanese Society for Artificial Intelligence
    • Related Report
      2018 Annual Research Report
  • [Presentation] Accelerating Reinforcement Learning in Two-Player Games Using GPGPU (2017)

    • Author(s)
      黒木是冶, 森山甲一, 武藤敦子, 犬塚信博
    • Organizer
      The 79th National Convention of the Information Processing Society of Japan (IPSJ)
    • Place of Presentation
      Nagoya University (Nagoya City)
    • Year and Date
      2017-03-16
    • Related Report
      2016 Research-status Report
  • [Presentation] Analysis of the Evolution Process of Subjective Utility in Multiagent Reinforcement Learning (2017)

    • Author(s)
      宮脇昌哉, 森山甲一, 武藤敦子, 松井藤五郎, 犬塚信博
    • Organizer
      The 31st Annual Conference of the Japanese Society for Artificial Intelligence
    • Related Report
      2017 Research-status Report

Published: 2016-04-21   Modified: 2020-03-30  