Research on the innovative evolution of deep reinforcement learning based on the profit sharing principle and its application to real problems

Research Project

Project/Area Number 21K12024
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type: Multi-year Fund
Section: General
Review Section: Basic Section 61030: Intelligent informatics-related
Research Institution: National Institution for Academic Degrees and Quality Enhancement of Higher Education

Principal Investigator

Miyazaki Kazuteru  National Institution for Academic Degrees and Quality Enhancement of Higher Education, Research Department, Professor (20282866)

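Co-Investigator (Kenkyū-buntansha)

Yamaguchi Shu  National Institution for Academic Degrees and Quality Enhancement of Higher Education, Research Department, Specially Appointed Professor (10182437)
Harada Taku  Tokyo University of Science, Faculty of Science and Technology, Department of Industrial and Systems Engineering, Associate Professor (70256668)
Kodama Naoki  Meiji University, School of Science and Technology, Assistant Professor (60908747)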
Project Period (FY) 2021-04-01 – 2024-03-31
Project Status Completed (Fiscal Year 2023)
Budget Amount
¥4,290,000 (Direct Cost: ¥3,300,000, Indirect Cost: ¥990,000)
Fiscal Year 2023: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2022: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2021: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
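Keywords: deep reinforcement learning / reinforcement learning / deep learning / profit sharing principle / exploitation-oriented learning / deep exploitation-oriented learning / smart energy systems / road traffic signal control / tweet data / conscious decision-making system / traffic signal control / robot control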
Outline of Research at the Start

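Deep reinforcement learning has attracted attention in recent years, but it requires a large amount of trial and error to learn. To address this, the principal investigator and colleagues have proposed methods based on the profit sharing principle in exploitation-oriented learning, an approach that strongly reinforces experience, and have reduced the number of trial-and-error steps needed. However, the learning results often vary widely, and a solution has been needed. The main goal of this research is therefore to propose a deep exploitation-oriented learning method with reduced variability. As sub-goals, it aims to clarify the behavior of the method in problem classes beyond Markov decision processes and in multi-agent environments, and to demonstrate the effectiveness of the proposed method through applications to real problems. As a result, we expect to establish a method that can serve as a new alternative and to dramatically raise the level of applicability to real problems.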

Outline of Final Research Achievements

In this study, we first completed the basic design of the Deep Profit Sharing method, a "deep exploitation-oriented learning method with reduced variability of learning results based on the profit sharing principle," which was the original goal of the study. We then expanded its applications to real problems while considering two sub-goals related to target problem classes. Specifically, we achieved the originally planned "application to smart energy systems" and also obtained some results for "application to curriculum analysis support systems." As applications not initially envisioned, after achieving some success in road traffic signal control, we began applying the method to the suppression of negative tweets and to the Angry Birds AI Competition.

Academic Significance and Societal Importance of the Research Achievements

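This research demonstrated the effectiveness of the Deep Profit Sharing method (DeePS), a "deep exploitation-oriented learning method with reduced variability of learning results based on the profit sharing principle." This challenges the field of deep reinforcement learning, where methods based on dynamic programming or direct policy search dominate, and therefore has considerable academic significance. Those methods usually require a large amount of trial and error to learn, whereas DeePS focuses on how to learn from less experience and is expected to be especially powerful in applications to real problems. Indeed, we applied DeePS to several real problems and showed its effectiveness, so the societal significance of the results is considerable.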
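
As a rough illustration of the profit sharing principle named above, the following Python sketch shows a classic tabular form of the idea: when an episode ends with a reward, credit is handed back along the visited state-action pairs in geometrically shrinking shares. The function name `profit_sharing_update`, the `decay` value of 0.25, and the toy episode are assumptions made for illustration. This is not the DeePS method itself, which is a deep learning method whose details are not given in this record.

```python
from collections import defaultdict


def profit_sharing_update(weights, episode, reward, decay=0.25):
    """Distribute `reward` backward over one episode (hypothetical helper).

    weights: dict mapping (state, action) to an accumulated weight
    episode: list of (state, action) pairs, oldest first
    decay:   assumed geometric ratio applied per step going backward
    """
    credit = reward
    # Walk from the pair closest to the reward back to the oldest pair.
    for state, action in reversed(episode):
        weights[(state, action)] += credit
        credit *= decay
    return weights


weights = defaultdict(float)
episode = [("s0", "right"), ("s1", "right"), ("s2", "up")]
profit_sharing_update(weights, episode, reward=1.0, decay=0.25)

for pair in episode:
    print(pair, weights[pair])
# ('s0', 'right') 0.0625
# ('s1', 'right') 0.25
# ('s2', 'up') 1.0
```

Because only experienced, rewarded episodes change the weights, a sketch like this reinforces successful experience directly rather than estimating values by bootstrapping, which is the sense in which such methods aim to learn from fewer trials.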

Report

(4 results)
  • 2023 Annual Research Report
  • Final Research Report (PDF)
  • 2022 Research-status Report
  • 2021 Research-status Report

Research Products

(39 results)

Journal Article (11 results; Peer Reviewed: 11, Open Access: 4), Presentation (27 results; Int'l Joint Research: 11), Book (1 result)

  • [Journal Article] Proposal of a Course-Classification Support System Using Deep Learning and its Evaluation When Combined with Reinforcement Learning (2024)

    • Author(s)
      Miyazaki Kazuteru, Yamaguchi Shu, Mori Rie, Yoshikawa Yumiko, Saito Takanori, Suzuki Toshiya
    • Journal Title

      Journal of Advanced Computational Intelligence and Intelligent Informatics

      Volume: 28 Issue: 2 Pages: 454-467

    • DOI

      10.20965/jaciii.2024.p0454

    • ISSN
      1343-0130, 1883-8014
    • Year and Date
      2024-03-20
    • Related Report
      2023 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Suppression of negative tweets using reinforcement learning systems (2024)

    • Author(s)
      Miyazaki Kazuteru, Miyazaki Hitomi
    • Journal Title

      Cognitive Systems Research

      Volume: 84 Pages: 101207-101207

    • DOI

      10.1016/j.cogsys.2023.101207

    • Related Report
      2023 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Performance evaluation of character-level CNNs using tweet data and analysis for weight perturbations (2024)

    • Author(s)
      Miyazaki Kazuteru, Ida Masaaki
    • Journal Title

      Artificial Life and Robotics

      Volume: 29 Issue: 2 Pages: 266-273

    • DOI

      10.1007/s10015-024-00944-9

    • Related Report
      2023 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Enhanced Naive Agent in Angry Birds AI Competition via Exploitation-oriented Learning (2024)

    • Author(s)
      Miyazaki Kazuteru
    • Journal Title

      Journal of Robotics and Mechatronics

      Volume: In press

    • Related Report
      2023 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Proposal and Evaluation of a Course-Classification-Support System Emphasizing Communication with the Sub-committees Within the Committee of Validation and Examination for Degrees (2023)

    • Author(s)
      Miyazaki Kazuteru, Yamaguchi Syu, Mori Rie, Yoshikawa Yumiko, Saito Takanori, Suzuki Toshiya
    • Journal Title

      Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

      Volume: 477 Pages: 123-130

    • DOI

      10.1007/978-3-031-29126-5_10

    • ISBN
      9783031291258, 9783031291265
    • Related Report
      2022 Research-status Report
    • Peer Reviewed
  • [Journal Article] Surface Hydroxyl-Ion Diffusion and Hierarchical Structure of Adsorbed Water on Hydrated Layered Double Hydroxides (2023)

    • Author(s)
      Yamasaki Tomoyuki, Iimura Soshi, Hosono Hideo, Yamaguchi Shu
    • Journal Title

      The Journal of Physical Chemistry C

      Volume: 127 Issue: 12 Pages: 6045-6053

    • DOI

      10.1021/acs.jpcc.3c00275

    • Related Report
      2022 Research-status Report
    • Peer Reviewed
  • [Journal Article] Research on the Consistency of Diploma Policies and the Nomenclature of Major Fields of Academic Degrees (2022)

    • Author(s)
      宮崎和光、高橋望、森利枝
    • Journal Title

      IEEJ Transactions on Electronics, Information and Systems

      Volume: 142 Issue: 2 Pages: 117-128

    • DOI

      10.1541/ieejeiss.142.117

    • NAID

      130008150248

    • ISSN
      0385-4221, 1348-8155
    • Year and Date
      2022-02-01
    • Related Report
      2021 Research-status Report
    • Peer Reviewed
  • [Journal Article] Traffic Signal Control System Using Deep Reinforcement Learning With Emphasis on Reinforcing Successful Experiences (2022)

    • Author(s)
      Kodama Naoki, Harada Taku, Miyazaki Kazuteru
    • Journal Title

      IEEE Access

      Volume: 10 Pages: 128943-128950

    • DOI

      10.1109/access.2022.3225431

    • Related Report
      2022 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Modeling of placebo effect in stochastic reward tasks by reinforcement learning (2022)

    • Author(s)
      Miyazaki Kazuteru
    • Journal Title

      Procedia Computer Science

      Volume: 213 Pages: 255-262

    • DOI

      10.1016/j.procs.2022.11.064

    • Related Report
      2022 Research-status Report
    • Peer Reviewed
  • [Journal Article] Home Energy Management Algorithm Based on Deep Reinforcement Learning Using Multistep Prediction (2021)

    • Author(s)
      Kodama Naoki, Harada Taku, Miyazaki Kazuteru
    • Journal Title

      IEEE Access

      Volume: 9 Pages: 153108-153115

    • DOI

      10.1109/access.2021.3126365

    • Related Report
      2021 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Proposal and evaluation of deep exploitation-oriented learning under multiple reward environment (2021)

    • Author(s)
      Miyazaki Kazuteru
    • Journal Title

      Cognitive Systems Research

      Volume: 70 Pages: 29-39

    • DOI

      10.1016/j.cogsys.2021.07.002

    • Related Report
      2021 Research-status Report
    • Peer Reviewed
  • [Presentation] Application of Deep Reinforcement Learning to Decentralized Control of Traffic Signals Considering Fairness in a Road Traffic Network Including Intersections Without Traffic Signals (2024)

    • Author(s)
      Shirasaka Shogo, Kodama Naoki, Harada Taku
    • Organizer
      The 10th IEEJ International Workshop on Sensing, Actuation, Motion Control, and Optimization
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Suppression of Negative Tweets using Reinforcement Learning Systems in a Multi-Agent Environment (2023)

    • Author(s)
      Miyazaki Kazuteru, Miyazaki Hitomi
    • Organizer
      2023 Annual International Conference on Brain-Inspired Cognitive Architectures for Artificial Intelligence, the 14th Annual Meeting of the BICA Society (BICA*AI 2023)
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Competencies to Be Cultivated in Higher Education and Their Evaluation in the Era of Generative AI: Through the Experiences With Self-Study Degree-Awarding Program in NIAD-QE (2023)

    • Author(s)
      Yamada Nodoka, Sakaguchi Kikue, Nakamura Yu, Miyazaki Kazuteru, Yamaguchi Shu
    • Organizer
      The 15th Higher Education International Conference, ARTIFICIAL INTELLIGENCE AND PEDAGOGICAL TRANSFORMATION: IMPLICATIONS FOR HIGHER EDUCATION QUALITY ASSURANCE
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Rule-based generation of synthetic genetic circuits (2023)

    • Author(s)
      Yamamura Masayuki, Sekine Ryoji, Miyazaki Kazuteru, Okuda Sota, Kodama Naoki, Kiga Daisuke
    • Organizer
      15th International Workshop on Bio-Design Automation (IWBDA 2023)
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
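  • [Presentation] Extension of the Conscious Decision-Making System to Multi-Agent Environments (2023)

    • Author(s)
      宮崎和光
    • Organizer
      SICE (Society of Instrument and Control Engineers) Systems and Information Division Annual Conference 2023 (SSI2023)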
    • Related Report
      2023 Annual Research Report
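  • [Presentation] Application of Deep Reinforcement Learning to Hybrid Vehicle Driving Control Considering Fuel Consumption and Travel Time (2023)

    • Author(s)
      LI ZHAOXI、原田拓
    • Organizer
      SICE (Society of Instrument and Control Engineers) Systems and Information Division Annual Conference 2023 (SSI2023)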
    • Related Report
      2023 Annual Research Report
  • [Presentation] Discovery of Papers Useful for BioDOS Using Machine Learning Methods (2023)

    • Author(s)
      宮崎和光、木賀大介、安田翔也、濱田立輝、小玉直樹、山村雅幸
    • Organizer
      IEEJ Joint Technical Meeting on Systems / Control
    • Related Report
      2023 Annual Research Report
  • [Presentation] Suppression of Negative Tweets Using Reinforcement Learning in a Multi-Agent Environment (2023)

    • Author(s)
      宮崎和光
    • Organizer
      The 50th Intelligent Systems Symposium
    • Related Report
      2022 Research-status Report
  • [Presentation] Effectiveness of Character-level CNN and its Examination of Perturbation for Weights (2023)

    • Author(s)
      Miyazaki Kazuteru, Ida Masaaki
    • Organizer
      28th International Symposium on Artificial Life and Robotics (AROB 28th 2023)
    • Related Report
      2022 Research-status Report
    • Int'l Joint Research
  • [Presentation] Learning Thresholds to Select Cooperative Partners by Applying Deep Reinforcement Learning in Distributed Traffic Signal Control (2023)

    • Author(s)
      Matsuta Shinya, Kodama Naoki, Harada Taku
    • Organizer
      38th International Conference on Computers and Their Applications
    • Related Report
      2022 Research-status Report
    • Int'l Joint Research
  • [Presentation] Distributed Traffic Signal Control with Fairness Using Deep Reinforcement Learning (2023)

    • Author(s)
      Shirasaka Shogo, Kodama Naoki, Harada Taku
    • Organizer
      SICE International Symposium on Control Systems 2023
    • Related Report
      2022 Research-status Report
    • Int'l Joint Research
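  • [Presentation] Suppression of Negative Tweets Using Reinforcement Learning (2022)

    • Author(s)
      宮崎和光
    • Organizer
      SICE (Society of Instrument and Control Engineers) Systems and Information Division Annual Conference 2022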
    • Related Report
      2022 Research-status Report
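  • [Presentation] Atari 2600 Simulation Using Exploitation-Oriented Deep Reinforcement Learning (2022)

    • Author(s)
      小玉直樹、原田拓、宮崎和光
    • Organizer
      SICE (Society of Instrument and Control Engineers) Systems and Information Division Annual Conference 2022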
    • Related Report
      2022 Research-status Report
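  • [Presentation] Proposal of an Explainable Deep Reinforcement Learning Method (2022)

    • Author(s)
      小玉直樹、原田拓、宮崎和光
    • Organizer
      IEEJ Electronics, Information and Systems Society (Division C) Annual Conference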
    • Related Report
      2022 Research-status Report
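  • [Presentation] Discovery of Papers Useful for BioDOS Using Deep Learning (2022)

    • Author(s)
      宮崎和光、木賀大介、安田翔也、濱田立輝、小玉直樹、山村雅幸
    • Organizer
      IEEJ Electronics, Information and Systems Society (Division C) Annual Conference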
    • Related Report
      2022 Research-status Report
  • [Presentation] Rule-based generation of synthetic genetic circuits (2022)

    • Author(s)
      Kiga Daisuke, Miyazaki Kazuteru, Yasuda Shoya, Hamada Ritsuki, Okuda Sota, Sekine Ryoji, Kodama Naoki, Yamamura Masayuki
    • Organizer
      14th International Workshop on Bio-Design Automation (IWBDA 2022)
    • Related Report
      2022 Research-status Report
    • Int'l Joint Research
  • [Presentation] Proposal of a Direct Policy Reinforcement Method Using Profit Sharing (2022)

    • Author(s)
      小玉直樹、宮崎和光、原田拓
    • Organizer
      The 49th Intelligent Systems Symposium
    • Related Report
      2021 Research-status Report
  • [Presentation] Proposal and Evaluation of Deep Profit Sharing Method in a Mixed Reward and Penalty Environment (2021)

    • Author(s)
      Kazuteru Miyazaki
    • Organizer
      2021 Annual International Conference on Brain-Inspired Cognitive Architectures for Artificial Intelligence
    • Related Report
      2021 Research-status Report
    • Int'l Joint Research
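  • [Presentation] Proposal of a State-Transition-Prediction Deep Q-Network (2021)

    • Author(s)
      小玉直樹、宮崎和光、原田拓
    • Organizer
      SICE (Society of Instrument and Control Engineers) Systems and Information Division Annual Conference 2021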
    • Related Report
      2021 Research-status Report
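  • [Presentation] Modeling of the Placebo Effect in Stochastic Reward Tasks by Reinforcement Learning (2021)

    • Author(s)
      宮崎和光
    • Organizer
      SICE (Society of Instrument and Control Engineers) Systems and Information Division Annual Conference 2021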
    • Related Report
      2021 Research-status Report
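  • [Presentation] Proposal of a State-Transition-Prediction Reinforcement Learning Method (2021)

    • Author(s)
      小玉直樹、宮崎和光、原田拓
    • Organizer
      IEEJ Electronics, Information and Systems Society (Division C) Annual Conference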
    • Related Report
      2021 Research-status Report
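  • [Presentation] A Study of Deep Exploitation-Oriented Learning in Environments with Mixed Rewards and Penalties (2021)

    • Author(s)
      宮崎和光
    • Organizer
      IEEJ Electronics, Information and Systems Society (Division C) Annual Conference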
    • Related Report
      2021 Research-status Report
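  • [Presentation] Performance Improvement of a System Supporting Consistency Judgment Between Diploma Policies and the Names of Major Fields Appended to Academic Degrees (2021)

    • Author(s)
      宮崎和光、吉田望、森利枝
    • Organizer
      IEEJ Joint Technical Meeting on Systems / Control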
    • Related Report
      2021 Research-status Report
  • [Presentation] Evaluation of Character-Level CNNs using the NTCIR-13 MedWeb Task (2021)

    • Author(s)
      Kazuteru Miyazaki, Masaaki Ida
    • Organizer
      The 22nd International Symposium on Advanced Intelligent Systems
    • Related Report
      2021 Research-status Report
    • Int'l Joint Research
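  • [Presentation] A Study of Weight Perturbation in Character-Level CNNs: Using the NTCIR-13 MedWeb Task as a Case Study (2021)

    • Author(s)
      宮崎和光、井田正明
    • Organizer
      SICE (Society of Instrument and Control Engineers) Systems and Information Division Annual Conference 2021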
    • Related Report
      2021 Research-status Report
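  • [Presentation] Performance Evaluation of Character-Level CNNs Using the NTCIR-13 MedWeb Task (2021)

    • Author(s)
      宮崎和光、井田正明
    • Organizer
      IEEJ Electronics, Information and Systems Society (Division C) Annual Conference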
    • Related Report
      2021 Research-status Report
  • [Presentation] Proposal for selecting a cooperation partner in distributed control of traffic signals using deep reinforcement learning (2021)

    • Author(s)
      Shinya Matsuta, Naoki Kodama, Taku Harada
    • Organizer
      Proceedings of the 8th IIAE International Conference on Intelligent Systems and Image Processing 2021
    • Related Report
      2021 Research-status Report
    • Int'l Joint Research
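  • [Book] Crisis Is the Very Opportunity for Management Reform (Part 3, Chapter 3, "Graduate Education as Researcher Training," written by Yamaguchi Shu) (2022)

    • Author(s)
      川口昭彦、栗田佳代子、山口周、吉田塁、長谷川壽一 (editorial cooperation)、福田秀樹 (editorial cooperation)
    • Total Pages
      172
    • Publisher
      Gyosei Corporation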
    • Related Report
      2021 Research-status Report

Published: 2021-04-28   Modified: 2025-01-30  
