Research on the innovative evolution of deep reinforcement learning based on the profit sharing principle and its application to real problems

Research Project

Project/Area Number	21K12024
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Review Section	Basic Section 61030: Intelligent informatics-related
Research Institution	National Institution for Academic Degrees and Quality Enhancement of Higher Education
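Principal Investigator	Miyazaki Kazuteru, National Institution for Academic Degrees and Quality Enhancement of Higher Education, Research Department, Professor (20282866)
Co-Investigator(Kenkyū-buntansha)	Yamaguchi Shu, National Institution for Academic Degrees and Quality Enhancement of Higher Education, Research Department, Specially Appointed Professor (10182437); Harada Taku, Tokyo University of Science, Faculty of Science and Technology, Department of Industrial and Systems Engineering, Associate Professor (70256668); Kodama Naoki, Meiji University, School of Science and Technology, Assistant Professor (60908747)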
Project Period (FY)	2021-04-01 – 2024-03-31
Project Status	Completed (Fiscal Year 2023)
Budget Amount	¥4,290,000 (Direct Cost: ¥3,300,000, Indirect Cost: ¥990,000); Fiscal Year 2023: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000); Fiscal Year 2022: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000); Fiscal Year 2021: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
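Keywords	Deep reinforcement learning / Reinforcement learning / Deep learning / Profit sharing principle / Exploitation-oriented learning / Deep exploitation-oriented learning / Smart energy systems / Road traffic signal control / Tweet data / Conscious decision-making system / Traffic signal control / Robot control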
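Outline of Research at the Start	Deep reinforcement learning has attracted attention in recent years, but it requires a large amount of trial and error to learn. To address this, the principal investigator and colleagues have proposed methods based on the profit sharing principle within exploitation-oriented learning, an approach that strongly reinforces experience, and have reduced the number of trial-and-error episodes required. However, the learning results often vary widely, and a remedy has been needed. The main goal of this research is therefore to propose a deep exploitation-oriented learning method with reduced variability. As sub-goals, the project aims to clarify the method's behavior in problem classes beyond Markov decision processes and in multi-agent environments, and to demonstrate the effectiveness of the proposed method through applications to real problems. As a result, we expect to establish a method that can serve as a new option and to dramatically raise the level of applicability to real problems.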
Outline of Final Research Achievements	In this study, we first completed the basic design of the Deep Profit Sharing method (DeePS), a deep exploitation-oriented learning method based on the profit sharing principle that reduces the variability of learning results, which was the original goal of the project. We then expanded its applications to real problems while considering two sub-goals related to the target problem classes. Specifically, we achieved the originally planned application to smart energy systems and also obtained some results in applying the method to curriculum analysis support systems. In addition, as applications not initially envisioned, we achieved a degree of success in road traffic signal control and then began applying the method to the suppression of negative tweets and to the Angry Birds AI Competition.
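Academic Significance and Societal Importance of the Research Achievements	This research demonstrated the effectiveness of the Deep Profit Sharing method (DeePS), a deep exploitation-oriented learning method based on the profit sharing principle that reduces the variability of learning results. This challenges the field of deep reinforcement learning, where methods based on dynamic programming or direct policy search are mainstream, and is therefore of considerable academic significance. Those methods usually require much trial and error to learn, whereas DeePS focuses on how to learn from less experience, and we believe it is particularly powerful in applications to real problems. Indeed, in this research we applied DeePS to several real problems and showed its effectiveness, so the societal significance of the results is also considerable.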

Report

(4 results)

2023 Annual Research Report
Final Research Report (PDF)
2022 Research-status Report
2021 Research-status Report

Research Products
(39 results)

Journal Article (11 results) (of which Peer Reviewed: 11 results, Open Access: 4 results) Presentation (27 results) (of which Int'l Joint Research: 11 results) Book (1 result)

[Journal Article] Proposal of a Course-Classification Support System Using Deep Learning and its Evaluation When Combined with Reinforcement Learning (2024)
- Author(s)
  Miyazaki Kazuteru、Yamaguchi Shu、Mori Rie、Yoshikawa Yumiko、Saito Takanori、Suzuki Toshiya
- Journal Title
  
  Journal of Advanced Computational Intelligence and Intelligent Informatics
  
  Volume: 28 Issue: 2 Pages: 454-467
- DOI
  10.20965/jaciii.2024.p0454
- ISSN
  1343-0130, 1883-8014
- Year and Date
  2024-03-20
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Suppression of negative tweets using reinforcement learning systems (2024)
- Author(s)
  Miyazaki Kazuteru、Miyazaki Hitomi
- Journal Title
  
  Cognitive Systems Research
  
  Volume: 84 Pages: 101207-101207
- DOI
  10.1016/j.cogsys.2023.101207
- Related Report
  2023 Annual Research Report
- Peer Reviewed
[Journal Article] Performance evaluation of character-level CNNs using tweet data and analysis for weight perturbations (2024)
- Author(s)
  Miyazaki Kazuteru、Ida Masaaki
- Journal Title
  
  Artificial Life and Robotics
  
  Volume: 29 Issue: 2 Pages: 266-273
- DOI
  10.1007/s10015-024-00944-9
- Related Report
  2023 Annual Research Report
- Peer Reviewed
[Journal Article] Enhanced Naive Agent in Angry Birds AI Competition via Exploitation-oriented Learning (2024)
- Author(s)
  Miyazaki Kazuteru
- Journal Title
  
  Journal of Robotics and Mechatronics
  
  Volume: In press
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Proposal and Evaluation of a Course-Classification-Support System Emphasizing Communication with the Sub-committees Within the Committee of Validation and Examination for Degrees (2023)
- Author(s)
  Miyazaki Kazuteru、Yamaguchi Syu、Mori Rie、Yoshikawa Yumiko、Saito Takanori、Suzuki Toshiya
- Journal Title
  
  Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
  
  Volume: 477 Pages: 123-130
- DOI
  10.1007/978-3-031-29126-5_10
- ISBN
  9783031291258, 9783031291265
- Related Report
  2022 Research-status Report
- Peer Reviewed
[Journal Article] Surface Hydroxyl-Ion Diffusion and Hierarchical Structure of Adsorbed Water on Hydrated Layered Double Hydroxides (2023)
- Author(s)
  Yamasaki Tomoyuki、Iimura Soshi、Hosono Hideo、Yamaguchi Shu
- Journal Title
  
  The Journal of Physical Chemistry C
  
  Volume: 127 Issue: 12 Pages: 6045-6053
- DOI
  10.1021/acs.jpcc.3c00275
- Related Report
  2022 Research-status Report
- Peer Reviewed
[Journal Article] Research on the Consistency of Diploma Policies and the Nomenclature of Major Fields of Academic Degrees (2022)
- Author(s)
  Miyazaki Kazuteru、高橋望、Mori Rie
- Journal Title
  
  IEEJ Transactions on Electronics, Information and Systems
  
  Volume: 142 Issue: 2 Pages: 117-128
- DOI
  10.1541/ieejeiss.142.117
- NAID
  130008150248
- ISSN
  0385-4221, 1348-8155
- Year and Date
  2022-02-01
- Related Report
  2021 Research-status Report
- Peer Reviewed
[Journal Article] Traffic Signal Control System Using Deep Reinforcement Learning With Emphasis on Reinforcing Successful Experiences (2022)
- Author(s)
  Kodama Naoki、Harada Taku、Miyazaki Kazuteru
- Journal Title
  
  IEEE Access
  
  Volume: 10 Pages: 128943-128950
- DOI
  10.1109/access.2022.3225431
- Related Report
  2022 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Modeling of placebo effect in stochastic reward tasks by reinforcement learning (2022)
- Author(s)
  Miyazaki Kazuteru
- Journal Title
  
  Procedia Computer Science
  
  Volume: 213 Pages: 255-262
- DOI
  10.1016/j.procs.2022.11.064
- Related Report
  2022 Research-status Report
- Peer Reviewed
[Journal Article] Home Energy Management Algorithm Based on Deep Reinforcement Learning Using Multistep Prediction (2021)
- Author(s)
  Kodama Naoki、Harada Taku、Miyazaki Kazuteru
- Journal Title
  
  IEEE Access
  
  Volume: 9 Pages: 153108-153115
- DOI
  10.1109/access.2021.3126365
- Related Report
  2021 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Proposal and evaluation of deep exploitation-oriented learning under multiple reward environment (2021)
- Author(s)
  Miyazaki Kazuteru
- Journal Title
  
  Cognitive Systems Research
  
  Volume: 70 Pages: 29-39
- DOI
  10.1016/j.cogsys.2021.07.002
- Related Report
  2021 Research-status Report
- Peer Reviewed
[Presentation] Application of Deep Reinforcement Learning to Decentralized Control of Traffic Signals Considering Fairness in a Road Traffic Network Including Intersections Without Traffic Signals (2024)
- Author(s)
  Shirasaka Shogo、Kodama Naoki、Harada Taku
- Organizer
  The 10th IEEJ International Workshop on Sensing, Actuation, Motion Control, and Optimization
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Suppression of Negative Tweets using Reinforcement Learning Systems in a Multi-Agent Environment (2023)
- Author(s)
  Miyazaki Kazuteru、Miyazaki Hitomi
- Organizer
  2023 Annual International Conference on Brain-Inspired Cognitive Architectures for Artificial Intelligence, the 14th Annual Meeting of the BICA Society (BICA*AI 2023)
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Competencies to Be Cultivated in Higher Education and Their Evaluation in the Era of Generative AI: Through the Experiences With Self-Study Degree-Awarding Program in NIAD-QE (2023)
- Author(s)
  Yamada Nodoka、Sakaguchi Kikue、Nakamura Yu、Miyazaki Kazuteru、Yamaguchi Shu
- Organizer
  The 15th Higher Education International Conference, ARTIFICIAL INTELLIGENCE AND PEDAGOGICAL TRANSFORMATION: IMPLICATIONS FOR HIGHER EDUCATION QUALITY ASSURANCE
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Rule-based generation of synthetic genetic circuits (2023)
- Author(s)
  Yamamura Masayuki、Sekine Ryoji、Miyazaki Kazuteru、Okuda Sota、Kodama Naoki、Kiga Daisuke
- Organizer
  15th International Workshop on Bio-Design Automation (IWBDA 2023)
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Extension of a Conscious Decision-Making System to Multi-Agent Environments (2023)
- Author(s)
  Miyazaki Kazuteru
- Organizer
  SICE Symposium on Systems and Information 2023 (SSI2023)
- Related Report
  2023 Annual Research Report
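[Presentation] Application of Deep Reinforcement Learning to Hybrid Vehicle Driving Control Considering Fuel Consumption and Travel Time (2023)
- Author(s)
  LI ZHAOXI、Harada Taku
- Organizer
  SICE Symposium on Systems and Information 2023 (SSI2023)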
- Related Report
  2023 Annual Research Report
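[Presentation] Discovering Papers Useful for BioDOS Using Machine Learning Methods (2023)
- Author(s)
  Miyazaki Kazuteru、Kiga Daisuke、Yasuda Shoya、Hamada Ritsuki、Kodama Naoki、Yamamura Masayuki
- Organizer
  IEEJ Joint Technical Meeting on Systems and Control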
- Related Report
  2023 Annual Research Report
[Presentation] Suppression of Negative Tweets Using Reinforcement Learning in a Multi-Agent Environment (2023)
- Author(s)
  Miyazaki Kazuteru
- Organizer
  The 50th Symposium on Intelligent Systems
- Related Report
  2022 Research-status Report
[Presentation] Effectiveness of Character-level CNN and its Examination of Perturbation for Weights (2023)
- Author(s)
  Miyazaki Kazuteru、Ida Masaaki
- Organizer
  28th International Symposium on Artificial Life and Robotics (AROB 28th 2023)
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] Learning Thresholds to Select Cooperative Partners by Applying Deep Reinforcement Learning in Distributed Traffic Signal Control (2023)
- Author(s)
  Matsuta Shinya、Kodama Naoki、Harada Taku
- Organizer
  38th International Conference on Computers and Their Applications
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] Distributed Traffic Signal Control with Fairness Using Deep Reinforcement Learning (2023)
- Author(s)
  Shirasaka Shogo、Kodama Naoki、Harada Taku
- Organizer
  SICE International Symposium on Control Systems 2023
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] Suppression of Negative Tweets Using Reinforcement Learning (2022)
- Author(s)
  Miyazaki Kazuteru
- Organizer
  SICE Symposium on Systems and Information 2022
- Related Report
  2022 Research-status Report
[Presentation] Atari 2600 Simulation Using Exploitation-Oriented Deep Reinforcement Learning (2022)
- Author(s)
  Kodama Naoki、Harada Taku、Miyazaki Kazuteru
- Organizer
  SICE Symposium on Systems and Information 2022
- Related Report
  2022 Research-status Report
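[Presentation] Proposal of an Explainable Deep Reinforcement Learning Method (2022)
- Author(s)
  Kodama Naoki、Harada Taku、Miyazaki Kazuteru
- Organizer
  Annual Conference of the IEEJ Electronics, Information and Systems Society (Society C)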
- Related Report
  2022 Research-status Report
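[Presentation] Discovering Papers Useful for BioDOS Using Deep Learning (2022)
- Author(s)
  Miyazaki Kazuteru、Kiga Daisuke、Yasuda Shoya、Hamada Ritsuki、Kodama Naoki、Yamamura Masayuki
- Organizer
  Annual Conference of the IEEJ Electronics, Information and Systems Society (Society C)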
- Related Report
  2022 Research-status Report
[Presentation] Rule-based generation of synthetic genetic circuits (2022)
- Author(s)
  Kiga Daisuke、Miyazaki Kazuteru、Yasuda Shoya、Hamada Ritsuki、Okuda Sota、Sekine Ryoji、Kodama Naoki、Yamamura Masayuki
- Organizer
  14th International Workshop on Bio-Design Automation (IWBDA 2022)
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] Proposal of a Direct Policy Reinforcement Method Based on Profit Sharing (2022)
- Author(s)
  Kodama Naoki、Miyazaki Kazuteru、Harada Taku
- Organizer
  The 49th Symposium on Intelligent Systems
- Related Report
  2021 Research-status Report
[Presentation] Proposal and Evaluation of Deep Profit Sharing Method in a Mixed Reward and Penalty Environment (2021)
- Author(s)
  Kazuteru Miyazaki
- Organizer
  2021 Annual International Conference on Brain-Inspired Cognitive Architectures for Artificial Intelligence
- Related Report
  2021 Research-status Report
- Int'l Joint Research
[Presentation] Proposal of a State-Transition-Predicting Deep Q-Network (2021)
- Author(s)
  Kodama Naoki、Miyazaki Kazuteru、Harada Taku
- Organizer
  SICE Symposium on Systems and Information 2021
- Related Report
  2021 Research-status Report
[Presentation] Modeling of the Placebo Effect in Stochastic Reward Tasks by Reinforcement Learning (2021)
- Author(s)
  Miyazaki Kazuteru
- Organizer
  SICE Symposium on Systems and Information 2021
- Related Report
  2021 Research-status Report
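[Presentation] Proposal of a State-Transition-Predicting Reinforcement Learning Method (2021)
- Author(s)
  Kodama Naoki、Miyazaki Kazuteru、Harada Taku
- Organizer
  Annual Conference of the IEEJ Electronics, Information and Systems Society (Society C)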
- Related Report
  2021 Research-status Report
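[Presentation] A Study on Deep Exploitation-Oriented Learning in Environments with Mixed Rewards and Penalties (2021)
- Author(s)
  Miyazaki Kazuteru
- Organizer
  Annual Conference of the IEEJ Electronics, Information and Systems Society (Society C)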
- Related Report
  2021 Research-status Report
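[Presentation] Performance Improvement of a Support System for Judging the Consistency Between Diploma Policies and the Names of Major Fields Appended to Academic Degrees (2021)
- Author(s)
  Miyazaki Kazuteru、吉田望、Mori Rie
- Organizer
  IEEJ Joint Technical Meeting on Systems and Control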
- Related Report
  2021 Research-status Report
[Presentation] Evaluation of Character-Level CNNs using the NTCIR-13 MedWeb Task (2021)
- Author(s)
  Kazuteru Miyazaki、Masaaki Ida
- Organizer
  The 22nd International Symposium on Advanced Intelligent Systems
- Related Report
  2021 Research-status Report
- Int'l Joint Research
[Presentation] A Study on Weight Perturbation in Character-Level CNNs: Using the NTCIR-13 MedWeb Task as a Case Study (2021)
- Author(s)
  Miyazaki Kazuteru、Ida Masaaki
- Organizer
  SICE Symposium on Systems and Information 2021
- Related Report
  2021 Research-status Report
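[Presentation] Performance Evaluation of Character-Level CNNs Using the NTCIR-13 MedWeb Task (2021)
- Author(s)
  Miyazaki Kazuteru、Ida Masaaki
- Organizer
  Annual Conference of the IEEJ Electronics, Information and Systems Society (Society C)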
- Related Report
  2021 Research-status Report
[Presentation] Proposal for selecting a cooperation partner in distributed control of traffic signals using deep reinforcement learning (2021)
- Author(s)
  Shinya Matsuta、Naoki Kodama、Taku Harada
- Organizer
  Proceedings of the 8th IIAE International Conference on Intelligent Systems and Image Processing 2021
- Related Report
  2021 Research-status Report
- Int'l Joint Research
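[Book] Crisis as the Opportunity for Management Reform (Part 3, Chapter 3, "Graduate Education as Researcher Training," written by Yamaguchi Shu) (2022)
- Author(s)
  川口昭彦、栗田佳代子、Yamaguchi Shu、吉田塁、長谷川壽一 (editorial cooperation)、福田秀樹 (editorial cooperation)
- Total Pages
  172
- Publisher
  Gyosei Corporation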
- Related Report
  2021 Research-status Report
