Designing the autonomous learning system by the continuous reinforcement learning agent with the coach

Research Project

Project/Area Number	16K00317
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Research Field	Intelligent informatics
Research Institution	Nara National College of Technology
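Principal Investigator	Yamaguchi Tomohiro, Nara National College of Technology, Department of Information Engineering, Professor (00240838)
Co-Investigator (Kenkyū-buntansha)	Takadama Keiki, The University of Electro-Communications, Graduate School of Informatics and Engineering, Professor (20345367)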
Project Period (FY)	2016-04-01 – 2019-03-31
Project Status	Completed (Fiscal Year 2018)
Budget Amount	¥4,550,000 (Direct Cost: ¥3,500,000, Indirect Cost: ¥1,050,000) Fiscal Year 2018: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000) Fiscal Year 2017: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000) Fiscal Year 2016: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
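Keywords	machine learning / learning process / autonomous learning / inverse reinforcement learning / continuous learning / multi-objective reinforcement learning / goal generation / reward occurrence probability / reinforcement learning / continuous learning support / learning agent / reflection / learning goal space / learning goal generation / visualization of the improvement process / redundant solutions / derived problem generation / gap area of learning goals / awareness support / continuous reinforcement learning / reward-acquiring solutions / occurrence probability vector space / convex hull / all-at-once reinforcement learning / improvement process / artificial intelligence / autonomous learning system / coaching function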
Outline of Final Research Achievements	This research proposed an autonomous continuous learning system that visualizes learning processes in a form that is easy for humans to understand. Its main objectives are the continuous improvement of human learning skill and the visualization of the improvement process. We investigated a method for indirectly visualizing the gap area of undiscovered goals by automatically visualizing the positional relations among derived goals. A comparative experiment with human subjects suggested that, as learning feedback during the improvement process, it is important to use a display condition that indicates the positional relation to the learning gap (the distance between a new goal found by the learner and the area of known goals). In other words, this condition facilitates the learner's awareness of their own previously unrecognized sense of values.
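Academic Significance and Societal Importance of the Research Achievements	The main weaknesses of deep learning, which has attracted attention in recent years, are (1) learning methods that humans cannot carry out themselves and (2) the difficulty of understanding the internal learning process. To compensate for these weaknesses, this research devised (1) a function that lets people learn how to learn by generating and providing a variety of problems, and (2) a function that interprets learning results and visualizes the learning and improvement processes so that people can understand them easily. This research makes it possible to apply reinforcement learning methods, for which designing the reward that serves as the learning goal has been difficult, to a wide range of fields. In addition, because the autonomous learning system iteratively generates a variety of derived problems and their solutions once an initial problem is given for each problem domain, it can be applied to tasks that require large numbers of problem and solution variations.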

Report

(4 results)

2018 Annual Research Report / Final Research Report (PDF)
2017 Research-status Report
2016 Research-status Report

Research Products
(31 results)

Journal Article (13 results) (of which Int'l Joint Research: 4 results, Peer Reviewed: 13 results, Open Access: 4 results, Acknowledgement Compliant: 1 result) Presentation (16 results) (of which Int'l Joint Research: 7 results) Book (2 results)

[Journal Article] Model-based Multi-Objective Reinforcement Learning with Unknown Weights (2019)
- Author(s)
  Yamaguchi, T., Nagahama, S., Ichikawa, Y., and Takadama, K.
- Journal Title
  
  Human Interface and the Management of Information, Lecture Notes in Computer Science
  
  Volume: In press
- Related Report
  2018 Annual Research Report
- Peer Reviewed
[Journal Article] Strategy for Learning Cooperative Behavior with Local Information for Multi-agent Systems (2018)
- Author(s)
  Uwano, F. and Takadama, K.
- Journal Title
  
  Principles and Practice of Multi-Agent Systems, Lecture Notes in Computer Science
  
  Volume: 11224 Pages: 663-670
- DOI
  10.1007/978-3-030-03098-8_54
- ISBN
  9783030030971, 9783030030988
- Related Report
  2018 Annual Research Report
- Peer Reviewed
[Journal Article] Awareness Based Recommendation by Passively Interactive Learning: Toward a Probabilistic Event (2018)
- Author(s)
  Yamaguchi, T., Nishimura, T., Nagahama, S., and Takadama, K.
- Journal Title
  
  Novel Design and Applications of Robotics Technologies
  
  Volume: Chapter 9 Pages: 247-275
- DOI
  10.4018/978-1-5225-5276-5.ch009
- Related Report
  2018 Annual Research Report
- Peer Reviewed
[Journal Article] Correcting Wrongly Determined Opinions of Agents in Opinion Sharing Model (2018)
- Author(s)
  Kitajima, E., Zhang, C., Ishii, H., Uwano, F., and Takadama, K.
- Journal Title
  
  Human Interface and the Management of Information
  
  Volume: LNCS 10904 Pages: 658-676
- DOI
  10.1007/978-3-319-92043-6_52
- ISBN
  9783319920429, 9783319920436
- Related Report
  2018 Annual Research Report
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] Generating Learning Environments Derived from Found Solutions by Adding Sub-goals toward the Creative Learning Support (2018)
- Author(s)
  Okudo, T., Yamaguchi, T., and Takadama, K.
- Journal Title
  
  Human Interface and the Management of Information, Lecture Notes in Computer Science
  
  Volume: 10905 Pages: 313-330
- DOI
  10.1007/978-3-319-92046-7_28
- ISBN
  9783319920450, 9783319920467
- Related Report
  2018 Annual Research Report
- Peer Reviewed
[Journal Article] Analyzing the Goal Finding Process of Human's Continuous Learning with the Reflection Subtask (2018)
- Author(s)
  Yamaguchi, T., Tamai, Y., Honma, Y., and Takadama, K.
- Journal Title
  
  SICE Journal of Control, Measurement, and System Integration (JCMSI)
  
  Volume: 11 Issue: 1 Pages: 40-47
- DOI
  10.9746/jcmsi.11.40
- Related Report
  2017 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Supporting the Exploration of the Learning Goals for a Continuous Learner Toward Creative Learning (2017)
- Author(s)
  Okudo, T., Yamaguchi, T., Murata, A., Tatsumi, T., Uwano, F. and Takadama, K.
- Journal Title
  
  Journal of Advanced Computational Intelligence and Intelligent Informatics
  
  Volume: 21 Issue: 5 Pages: 907-916
- DOI
  10.20965/jaciii.2017.p0907
- NAID
  130007520194
- ISSN
  1343-0130, 1883-8014
- Year and Date
  2017-09-20
- Related Report
  2017 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Exemplar-Based Learning Classifier System with Dynamic Matching Range for Imbalanced Data (2017)
- Author(s)
  Matsumoto, K., Tatsumi, T., Sato, H., Kovacs, T. and Takadama, K.
- Journal Title
  
  Journal of Advanced Computational Intelligence and Intelligent Informatics
  
  Volume: 21 Issue: 5 Pages: 868-875
- DOI
  10.20965/jaciii.2017.p0868
- NAID
  130007520189
- ISSN
  1343-0130, 1883-8014
- Year and Date
  2017-09-20
- Related Report
  2017 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Analyzing the goal finding process of human's learning with the reflection subtask (2017)
- Author(s)
  Yamaguchi, T., Tamai, Y., and Takadama, K.
- Journal Title
  
  Handbook of Research on Biomimetics and Biomedical Robotics
  
  Volume: Chapter 19 Pages: 442-459
- Related Report
  2017 Research-status Report
- Peer Reviewed
[Journal Article] Designing the learning goal space for human toward acquiring a creative learning skill (2017)
- Author(s)
  Okudo, T., Yamaguchi, T. and Takadama, K.
- Journal Title
  
  Handbook of Research on Biomimetics and Biomedical Robotics
  
  Volume: Chapter 20 Pages: 460-475
- Related Report
  2017 Research-status Report
- Peer Reviewed
[Journal Article] Multi-Agent Cooperation Based on Reinforcement Learning with Internal Reward in Maze Problem (2017)
- Author(s)
  Uwano, F., Tatebe, N., Nakata, M., Tajima, Y., Kovacs, T., and Takadama, K.
- Journal Title
  
  SICE Journal of Control, Measurement, and System Integration (JCMSI)
  
  Volume: 10
- Related Report
  2016 Research-status Report
- Peer Reviewed / Int'l Joint Research
[Journal Article] Reinforcement Learning with Internal Reward for Multi-Agent Cooperation: A Theoretical Approach (2016)
- Author(s)
  Uwano, F., Tatebe, N., Nakata, M., Takadama, K., and Kovacs, T.
- Journal Title
  
  EAI Endorsed Transactions on Collaborative Computing
  
  Volume: 16
- Related Report
  2016 Research-status Report
- Peer Reviewed / Int'l Joint Research
[Journal Article] Awareness based recommendation - passively interactive learning system (2016)
- Author(s)
  Yamaguchi, T., Nishimura, T., and Takadama, K.
- Journal Title
  
  International Journal of Robotics Applications and Technologies
  
  Volume: 4 Issue: 1 Pages: 83-99
- DOI
  10.4018/ijrat.2016010105
- Related Report
  2016 Research-status Report
- Peer Reviewed / Int'l Joint Research / Acknowledgement Compliant
[Presentation] Complex-Valued-based Learning Classifier System for POMDP Environments (2019)
- Author(s)
  Takadama, K., Yamazaki, D., Nakata, M., and Sato, H.
- Organizer
  2019 IEEE Congress on Evolutionary Computation (CEC2019)
- Related Report
  2018 Annual Research Report
- Int'l Joint Research
[Presentation] Maximum Entropy Inverse Reinforcement Learning with incomplete expert (2019)
- Author(s)
  Hasegawa, S., Uwano, F., and Takadama, K.
- Organizer
  The 24th International Symposium on Artificial Life and Robotics (AROB 2019)
- Related Report
  2018 Annual Research Report
- Int'l Joint Research
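[Presentation] A Fairness-Based Internal Reward Setting Method for Communication-Free Multi-Agent Cooperative Learning That Adapts to Dynamic Changes in Rewards (2018)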
- Author(s)
  上野史，高玉圭樹
- Organizer
  Society of Instrument and Control Engineers (SICE), Systems and Information Division Annual Conference 2018 (SSI2018)
- Related Report
  2018 Annual Research Report
[Presentation] Inverse Reinforcement Learning from an Incomplete Expert Based on Action Sequence Segmentation (2018)
- Author(s)
  長谷川智，上野史，高玉圭樹
- Organizer
  Society of Instrument and Control Engineers (SICE), Systems and Information Division Annual Conference 2018 (SSI2018)
- Related Report
  2018 Annual Research Report
[Presentation] Inverse Reinforcement Learning Adaptable to Environmental Changes by Generating Negative Rewards (2018)
- Author(s)
  長谷川智，梅内祐太，上野史，佐藤寛之，山口智浩，高玉圭樹
- Organizer
  Society of Instrument and Control Engineers (SICE), 45th Symposium on Intelligent Systems
- Related Report
  2017 Research-status Report
[Presentation] Reinforcement Learning for All Situations Based on Reward Occurrence Probability Vectors (2018)
- Author(s)
  長濵将太, 市川嘉裕, 高玉圭樹，山口智浩
- Organizer
  Society of Instrument and Control Engineers (SICE), 45th Symposium on Intelligent Systems
- Related Report
  2017 Research-status Report
[Presentation] Interactive Learning Support That Encourages Learning Goal Generation Based on Difficulty and Skill Deviation (2017)
- Author(s)
  福田千賀，村田暁紀，石井晴之，佐藤寛之，高玉圭樹
- Organizer
  Society of Instrument and Control Engineers (SICE), 44th Symposium on Intelligent Systems
- Place of Presentation
  Tokyo
- Year and Date
  2017-03-14
- Related Report
  2016 Research-status Report
[Presentation] Designing the learning goal space toward acquiring a creative learning skill (2017)
- Author(s)
  Okudo, T., Takadama, K., and Yamaguchi, T.
- Organizer
  The 22nd International Symposium on Artificial Life and Robotics (AROB'17)
- Place of Presentation
  Beppu, Oita
- Year and Date
  2017-01-21
- Related Report
  2016 Research-status Report
- Int'l Joint Research
[Presentation] Designing the learning goal space for human toward acquiring a creative learning skill (2017)
- Author(s)
  Okudo, T., Takadama, K. and Yamaguchi, T.
- Organizer
  HCI International 2017 (HCII2017)
- Related Report
  2017 Research-status Report
- Int'l Joint Research
[Presentation] Potential of Dimension-Compressed Rules Obtained by Deep Learning as Initial Rules in Learning Classifier Systems (2017)
- Author(s)
  松本和馬, 高野諒, 上野史，佐藤寛之, 高玉圭樹
- Organizer
  Japanese Society for Evolutionary Computation, 11th Symposium on Evolutionary Computation 2017
- Related Report
  2017 Research-status Report
[Presentation] All-at-Once Reinforcement Learning of All Optimal Policies Based on Reward Occurrence Probability Vectors and Weight Vectors (2017)
- Author(s)
  長濵将太，山口智，高玉圭樹
- Organizer
  Society of Instrument and Control Engineers (SICE), Systems and Information Division Annual Conference 2017 (SSI2017)
- Related Report
  2017 Research-status Report
[Presentation] A Learning Classifier System That Restores Rules Compressed by Deep Learning and Its Accuracy Improvement (2017)
- Author(s)
  松本和馬, 高野諒, 佐藤寛之, 高玉圭樹
- Organizer
  13th Workshop of the Japanese Society for Evolutionary Computation
- Related Report
  2017 Research-status Report
[Presentation] Continuous Learning Support for Learners Through Reflection on Subgoals (2016)
- Author(s)
  玉井雄貴，山口智浩，高玉圭樹
- Organizer
  Society of Instrument and Control Engineers (SICE), Systems and Information Division Annual Conference 2016 (SSI2016)
- Place of Presentation
  Otsu, Shiga
- Year and Date
  2016-12-07
- Related Report
  2016 Research-status Report
[Presentation] Communication-less Cooperative Q-learning Agents in Maze Problem (2016)
- Author(s)
  Uwano, F. and Takadama, K.
- Organizer
  The 20th International Symposium on Intelligent and Evolutionary Systems (IES 2016)
- Place of Presentation
  Canberra, Australia
- Year and Date
  2016-11-17
- Related Report
  2016 Research-status Report
- Int'l Joint Research
[Presentation] Preventing Incorrect Opinion Sharing with Weighted Relationship among Agents (2016)
- Author(s)
  Saito, R., Nakata, M., Sato, H., Kovacs, T., and Takadama, K.
- Organizer
  The 18th International Conference on Human-Computer Interaction (HCI International 2016)
- Place of Presentation
  Toronto, Canada
- Year and Date
  2016-07-20
- Related Report
  2016 Research-status Report
- Int'l Joint Research
[Presentation] Possibility of Education Project based on Cansat (2016)
- Author(s)
  Saito, R., Murata, A., and Takadama, K.
- Organizer
  The 13th International Symposium on Artificial Intelligence, Robotics and Automation in Space (i-SAIRAS2016)
- Place of Presentation
  Beijing, China
- Year and Date
  2016-06-22
- Related Report
  2016 Research-status Report
- Int'l Joint Research
[Book] Novel Design and Applications of Robotics Technologies, Chapter 9 (2018)
- Author(s)
  Yamaguchi, T., Nishimura, T., Nagahama, S., and Takadama, K.
- Total Pages
  341
- Publisher
  IGI Global
- Related Report
  2018 Annual Research Report
[Book] Handbook of Research on Biomimetics and Biomedical Robotics (2017)
- Author(s)
  Maki Habib
- Total Pages
  532
- Publisher
  IGI Global
- Related Report
  2017 Research-status Report
