Research on new machine learning method combining Exploitation-oriented Learning and Deep Learning

Research Project

Project/Area Number 17K00327
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Research Field Intelligent informatics
Research Institution National Institution for Academic Degrees and Quality Enhancement of Higher Education

Principal Investigator

MIYAZAKI Kazuteru  National Institution for Academic Degrees and Quality Enhancement of Higher Education, Research Department, Associate Professor (20282866)

Project Period (FY) 2017-04-01 – 2020-03-31
Project Status Completed (Fiscal Year 2019)
Budget Amount
¥4,420,000 (Direct Cost: ¥3,400,000, Indirect Cost: ¥1,020,000)
Fiscal Year 2019: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2018: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2017: ¥2,080,000 (Direct Cost: ¥1,600,000, Indirect Cost: ¥480,000)
Keywords Reinforcement learning / Exploitation-oriented learning / Deep learning / Deep reinforcement learning / Intelligent robot / Robot
Outline of Final Research Achievements

We proposed new machine learning methods, including LADQN and DPN, that combine exploitation-oriented learning (XoL) with deep learning. In particular, under certain conditions in the Atari 2600 game environment, DPN can learn with one-tenth as many trials and errors as DQN, a typical deep reinforcement learning method.
In addition, we demonstrated the effectiveness of XoL methods combined with deep learning by applying them to detecting drowsiness in car drivers and to identifying disease symptoms from pseudo-tweets. We believe this work has helped extend methods based on trial-and-error search to domains that require real-time performance, which has been difficult for conventional deep reinforcement learning.

Academic Significance and Societal Importance of the Research Achievements

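Learning based on trial and error, such as reinforcement learning, is an approach well suited to acquiring effective control rules and strategies from vast amounts of data. In general, however, such learning requires an enormous number of trials. In recent years in particular, with the emergence of deep reinforcement learning, which combines reinforcement learning with deep learning, reducing the number of trials has become more important than ever.
To address this problem, this research project proposed methods that substantially reduce the number of trials. We believe this result increases the applicability of deep reinforcement learning to domains where real-time performance matters and where it has so far been difficult to apply, such as robot control, and thereby broadens the range of applications of artificial intelligence technology.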

Report

(4 results)
  • 2019 Annual Research Report   Final Research Report ( PDF )
  • 2018 Research-status Report
  • 2017 Research-status Report
  • Research Products

    (41 results)

Journal Article (7 results; of which Peer Reviewed: 6, Open Access: 6) Presentation (34 results; of which Int'l Joint Research: 11, Invited: 1)

  • [Journal Article] Construction of Consistency Judgment System of Diploma Policy and Curriculum Policy using Character-level CNN (translated version of the Japanese journal article of the same title) (2020)

    • Author(s)
      Kazuteru Miyazaki, Masaaki Ida
    • Journal Title

      Electronics and Communications in Japan

      Volume: 102 Issue: 12 Pages: 30-39

    • DOI

      10.1002/ecj.12223

    • Related Report
      2019 Annual Research Report
    • Open Access
  • [Journal Article] Construction of Consistency Judgment System of Diploma Policy and Curriculum Policy using Character-level CNN (2019)

    • Author(s)
      Kazuteru Miyazaki, Masaaki Ida
    • Journal Title

      IEEJ Transactions on Electronics, Information and Systems

      Volume: 139 Issue: 10 Pages: 1119-1127

    • DOI

      10.1541/ieejeiss.139.1119

    • NAID

      130007722362

    • ISSN
      0385-4221, 1348-8155
    • Year and Date
      2019-10-01
    • Related Report
      2019 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Proposal and Evaluation of Detour Path Suppression Method in PS Reinforcement Learning (2019)

    • Author(s)
      SHIRAISHI Daisuke, MIYAZAKI Kazuteru, KOBAYASHI Hiroaki
    • Journal Title

      SICE Journal of Control, Measurement, and System Integration

      Volume: 12 Issue: 5 Pages: 190-198

    • DOI

      10.9746/jcmsi.12.190

    • Related Report
      2019 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Proposal and Evaluation of Reward Sharing Method Based on Safety Level (2018)

    • Author(s)
      KODAMA Naoki, MIYAZAKI Kazuteru, KOBAYASHI Hiroaki
    • Journal Title

      SICE Journal of Control, Measurement, and System Integration

      Volume: 11 Issue: 3 Pages: 207-213

    • DOI

      10.9746/jcmsi.11.207

    • Related Report
      2018 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Proposal of a Deep Q-network with Profit Sharing (2018)

    • Author(s)
      Miyazaki Kazuteru
    • Journal Title

      Procedia Computer Science

      Volume: 123 Pages: 302-307

    • DOI

      10.1016/j.procs.2018.01.047

    • Related Report
      2018 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Exploitation-Oriented Learning with Deep Learning – Introducing Profit Sharing to a Deep Q-Network – (2017)

    • Author(s)
      Kazuteru Miyazaki
    • Journal Title

      Journal of Advanced Computational Intelligence and Intelligent Informatics

      Volume: 21 Issue: 5 Pages: 849-855

    • DOI

      10.20965/jaciii.2017.p0849

    • NAID

      130007520180

    • ISSN
      1343-0130, 1883-8014
    • Year and Date
      2017-09-20
    • Related Report
      2017 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Proposal of PSwithEFP and its Evaluation in Multi-Agent Reinforcement Learning (2017)

    • Author(s)
      Kazuteru Miyazaki, Koudai Furukawa, and Hiroaki Kobayashi
    • Journal Title

      Journal of Advanced Computational Intelligence and Intelligent Informatics

      Volume: 21 Issue: 5 Pages: 930-938

    • DOI

      10.20965/jaciii.2017.p0930

    • NAID

      130007520101

    • ISSN
      1343-0130, 1883-8014
    • Year and Date
      2017-09-20
    • Related Report
      2017 Research-status Report
    • Peer Reviewed / Open Access
  • [Presentation] Classification of Medical Data using Character-level CNN (2020)

    • Author(s)
      Kazuteru Miyazaki
    • Organizer
      The 3rd International Conference on Information Science and System (ICISS 2020)
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
  • [Presentation] A Study on a Driver Drowsiness Prevention System Using Deep Reinforcement Learning (2020)

    • Author(s)
      Kazuteru Miyazaki
    • Organizer
      第47回 知能システムシンポジウム
    • Related Report
      2019 Annual Research Report
  • [Presentation] Proposal of a Method for Direct Policy Reinforcement by Profit Sharing (2020)

    • Author(s)
      Naoki Kodama, Taku Harada, Kazuteru Miyazaki
    • Organizer
      第47回 知能システムシンポジウム
    • Related Report
      2019 Annual Research Report
  • [Presentation] Deep Reinforcement Learning with Dual Targeting Algorithm (2019)

    • Author(s)
      Naoki Kodama, Taku Harada, Kazuteru Miyazaki
    • Organizer
      2019 International Joint Conference on Neural Networks (IJCNN)
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
  • [Presentation] A Study on the Applicability of Deep Reinforcement Learning to Conscious Decision-Making Systems (2019)

    • Author(s)
      Kazuteru Miyazaki
    • Organizer
      計測自動制御学会 システム・情報部門 学術講演会2019
    • Related Report
      2019 Annual Research Report
  • [Presentation] Taking on the Angry Birds AI Competition with Exploitation-Oriented Learning (2019)

    • Author(s)
      Kazuteru Miyazaki
    • Organizer
      計測自動制御学会 システム・情報部門 学術講演会2019
    • Related Report
      2019 Annual Research Report
  • [Presentation] Comparison of the Diploma Policy Matching Test Using Character-level CNN with the Results of a Large-Scale Survey (2019)

    • Author(s)
      Kazuteru Miyazaki, Nozomi Takahashi, Rie Mori
    • Organizer
      計測自動制御学会 システム・情報部門 学術講演会2019
    • Related Report
      2019 Annual Research Report
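  • [Presentation] Proposal of a Distributed Deep Reinforcement Learning Method Using Exploitation-Oriented Learning (2019)

    • Author(s)
      Naoki Kodama, Taku Harada, Kazuteru Miyazaki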
    • Organizer
      計測自動制御学会 システム・情報部門 学術講演会2019
    • Related Report
      2019 Annual Research Report
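  • [Presentation] Research on Consistency between Diploma Policies and Nomenclature of Major Disciplines: Analysis of Large-Scale Survey Results (2019)

    • Author(s)
      Kazuteru Miyazaki, Nozomi Takahashi, Rie Mori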
    • Organizer
      電気学会C部門大会
    • Related Report
      2019 Annual Research Report
  • [Presentation] Research on Consistency between Diploma Policies and Nomenclature of Major Disciplines: Deep Learning Approach (2019)

    • Author(s)
      Kazuteru Miyazaki, Nozomi Takahashi, Rie Mori
    • Organizer
      7th International Conference on Information and Education Technology (ICIET 2019)
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] Proposal of a Deep Reinforcement Learning Algorithm Using a Non-Bootstrap Method (2019)

    • Author(s)
      Naoki Kodama, Taku Harada, Kazuteru Miyazaki
    • Organizer
      第46回 知能システムシンポジウム
    • Related Report
      2018 Research-status Report
  • [Presentation] A Proposal for Reducing the Number of Trial-and-Error Searches for Deep Q-Networks Combined with Exploitation-Oriented Learning (2018)

    • Author(s)
      Naoki Kodama, Kazuteru Miyazaki, Taku Harada
    • Organizer
      17th IEEE International Conference on Machine Learning and Applications (ICMLA 2018)
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] Consistency Assessment between Diploma Policy and Curriculum Policy using Character-level CNN (2018)

    • Author(s)
      Kazuteru Miyazaki, Masaaki Ida
    • Organizer
      Joint 10th International Conference on Soft Computing and Intelligent Systems and 19th International Symposium on Advanced Intelligent Systems (SCIS&ISIS 2018)
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] Proposal of Detour Path Suppression Method in PS Reinforcement Learning and Its Application to Altruistic Multi-agent Environment (2018)

    • Author(s)
      Daisuke Shiraishi, Kazuteru Miyazaki, Hiroaki Kobayashi
    • Organizer
      International Conference on Principles and Practice of Multi-Agent Systems (PRIMA 2018)
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] On Stable Profit Sharing Reinforcement Learning with Expected Failure Probability (2018)

    • Author(s)
      Daisuke Mizuno, Kazuteru Miyazaki, Hiroaki Kobayashi
    • Organizer
      Biologically Inspired Cognitive Architectures Meeting (BICA 2018)
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] Proposal and Evaluation of an Indirect Reward Assignment Method for Reinforcement Learning by Profit Sharing (2018)

    • Author(s)
      Kazuteru Miyazaki, Naoki Kodama, Hiroaki Kobayashi
    • Organizer
      IntelliSys 2018
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] Diploma Policy Matching Test Using Character-level CNN (2018)

    • Author(s)
      Kazuteru Miyazaki, Nozomi Takahashi, Rie Mori
    • Organizer
      計測自動制御学会 システム・情報部門 学術講演会2018
    • Related Report
      2018 Research-status Report
  • [Presentation] Combining the Deep Reinforcement Learning Algorithm Rainbow with Profit Sharing-Based Learning (2018)

    • Author(s)
      Naoki Kodama, Taku Harada, Kazuteru Miyazaki
    • Organizer
      計測自動制御学会 システム・情報部門 学術講演会2018
    • Related Report
      2018 Research-status Report
  • [Presentation] Recent Developments in Exploitation-Oriented Learning XoL (2018)

    • Author(s)
      Kazuteru Miyazaki
    • Organizer
      計測自動制御学会 システム・情報部門 学術講演会2018
    • Related Report
      2018 Research-status Report
  • [Presentation] Consistency Judgment of Diploma Policy and Curriculum Policy Using Character-level CNN (2018)

    • Author(s)
      Kazuteru Miyazaki, Masaaki Ida
    • Organizer
      システム研究会 インテリジェント・システム (FAN2018)
    • Related Report
      2018 Research-status Report
  • [Presentation] Proposal of an Exploitation-Oriented Deep Reinforcement Learning Method with Two Episodes (2018)

    • Author(s)
      Naoki Kodama, Taku Harada, Kazuteru Miyazaki
    • Organizer
      平成30年 電気学会 電子・情報・システム部門大会
    • Related Report
      2018 Research-status Report
  • [Presentation] Proposal and Evaluation of an Indirect Reward Assignment Method for Reinforcement Learning by Profit Sharing (2018)

    • Author(s)
      Kazuteru Miyazaki, Naoki Kodama and Hiroaki Kobayashi
    • Organizer
      IntelliSys 2018
    • Related Report
      2017 Research-status Report
    • Int'l Joint Research
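  • [Presentation] Research on Reward-Sharing Reinforcement Learning Using Expected Future Success and Failure Probabilities (2018)

    • Author(s)
      Daisuke Mizuno, Hiroaki Kobayashi, Kazuteru Miyazaki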
    • Organizer
      電気学会 システム研究会(ちよだプラットフォームスクウェア 会議室504)
    • Related Report
      2017 Research-status Report
  • [Presentation] A Study on Text Classification Using Character-level CNN (2018)

    • Author(s)
      Kazuteru Miyazaki
    • Organizer
      電気学会 システム研究会(ちよだプラットフォームスクウェア 会議室504)
    • Related Report
      2017 Research-status Report
  • [Presentation] Improving the Performance of the Diploma Policy Matching Test Using a Learning Function (2018)

    • Author(s)
      Kazuteru Miyazaki, Nozomi Takahashi, Rie Mori
    • Organizer
      第45回知能システムシンポジウム
    • Related Report
      2017 Research-status Report
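  • [Presentation] Proposal of a Method for Accelerating Deep Q-network Learning Using Exploitation-Oriented Learning and Verification of Its Effectiveness (2018)

    • Author(s)
      Naoki Kodama, Kazuteru Miyazaki, Hiroaki Kobayashi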
    • Organizer
      第45回知能システムシンポジウム
    • Related Report
      2017 Research-status Report
  • [Presentation] Proposal of reward sharing method based on safety level and verification of its effectiveness in multi-agent environment (2017)

    • Author(s)
      Naoki Kodama, Kazuteru Miyazaki, and Hiroaki Kobayashi
    • Organizer
      SICE Annual Conference 2017
    • Related Report
      2017 Research-status Report
    • Int'l Joint Research
  • [Presentation] Proposal of a Deep Q-network with Profit Sharing (2017)

    • Author(s)
      Kazuteru Miyazaki
    • Organizer
      2017 Annual International Conference on Biologically Inspired Cognitive Architectures (BICA 2017)
    • Related Report
      2017 Research-status Report
    • Int'l Joint Research
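  • [Presentation] Deep Learning and Reinforcement Learning: Evaluation of Deep Reinforcement Learning Incorporating Exploitation-Oriented Learning (2017)

    • Author(s)
      Kazuteru Miyazaki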
    • Organizer
      第61回システム制御情報学会研究発表講演会 (SCI’17)
    • Related Report
      2017 Research-status Report
    • Invited
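  • [Presentation] Proposal of a New Penalty Utilization Method Incorporating Expected Failure Probability and Verification of Its Effectiveness in a Multi-Agent Environment (2017)

    • Author(s)
      Naoki Kodama, Kazuteru Miyazaki, Hiroaki Kobayashi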
    • Organizer
      平成29年電気学会 電子・情報・システム部門大会
    • Related Report
      2017 Research-status Report
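  • [Presentation] Verification of the Effectiveness of the Detour Path Suppression Method in Profit Sharing in a Multi-Agent Environment (2017)

    • Author(s)
      Daisuke Shiraishi, Kazuteru Miyazaki, Hiroaki Kobayashi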
    • Organizer
      計測自動制御学会 システム・情報部門 学術講演会2017
    • Related Report
      2017 Research-status Report
  • [Presentation] Current Status and Issues of Profit Sharing with Penalty Avoidance Using EFP (2017)

    • Author(s)
      Kazuteru Miyazaki, Naoki Kodama, Hiroaki Kobayashi
    • Organizer
      計測自動制御学会 システム・情報部門 学術講演会2017
    • Related Report
      2017 Research-status Report
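  • [Presentation] Improvement of DQNwithPS, a Deep Reinforcement Learning Method Incorporating Exploitation-Oriented Learning, and Verification of Its Effectiveness (2017)

    • Author(s)
      Naoki Kodama, Kazuteru Miyazaki, Hiroaki Kobayashi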
    • Organizer
      計測自動制御学会 システム・情報部門 学術講演会2017
    • Related Report
      2017 Research-status Report
  • [Presentation] Research on Consistency between Diploma Policies and Nomenclature of Major Disciplines: A Deep Learning Approach (2017)

    • Author(s)
      Kazuteru Miyazaki, Rie Mori, Nozomi Takahashi
    • Organizer
      電気学会 システム研究会 機械学習研究の最新動向
    • Related Report
      2017 Research-status Report

Published: 2017-04-28   Modified: 2021-02-19  