Cognitive Economy in Reusing Policy Selection for Reinforcement Learning Robots Based on Prototype Theory

Research Project

Project/Area Number	19K12173
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Review Section	Basic Section 61050:Intelligent robotics-related
Research Institution	Tokyo Denki University
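Principal Investigator	Suzuki Tsuyoshi, Tokyo Denki University, School of Engineering, Professor (00349789)
Co-Investigator(Kenkyū-buntansha)	Fujii Hiromitsu, Chiba Institute of Technology, Faculty of Advanced Engineering, Associate Professor (30781215); Wen Wen, The University of Tokyo, Graduate School of Engineering (Faculty of Engineering), Project Associate Professor (50646601); Kono Hitoshi, Tokyo Polytechnic University, Faculty of Engineering, Associate Professor (70758367)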
Project Period (FY)	2019-04-01 – 2022-03-31
Project Status	Completed (Fiscal Year 2021)
Budget Amount	¥4,420,000 (Direct Cost: ¥3,400,000, Indirect Cost: ¥1,020,000); Fiscal Year 2021: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000); Fiscal Year 2020: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000); Fiscal Year 2019: ¥1,690,000 (Direct Cost: ¥1,300,000, Indirect Cost: ¥390,000)
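Keywords	Transfer learning / Cognitive psychology model / Prototype theory / Machine learning / Reinforcement learning / Multi-agent robot system / Multi-agent reinforcement learning / Multi-robot reinforcement learning / Cognitive economy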
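Outline of Research at the Start	This project aims to realize cognitive economy in reinforcement learning robots that use transfer learning. When the robot selectively reuses previously acquired learned knowledge, it should not search through all stored knowledge to make its choice. Instead, we establish a method that first preselects candidate "knowledge groups" for reuse from sensor input such as vision, and then combines the multiple policies in the selected group and uses them simultaneously. In particular, we categorize the knowledge groups using prototype theory, which has been discussed in cognitive linguistics and psychology as a model of human categorization, and realize on a reinforcement learning robot a method for choosing the partial knowledge group (category) to be reused.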
Outline of Final Research Achievements	To realize cognitive economy in reinforcement learning robots that use transfer learning, we studied the categorization of reused learning policies, the extraction of a prototype for each category, and the speed-up of reused-policy selection. For a shortest-path search problem, we organized the reused policies into a network based on the spreading activation model, categorized the policies with K-means++ following prototype theory, and extracted a prototype for each category by averaging the policies within it. Computer experiments confirmed that this reduces learning time. To speed up the computation required for policy selection, we also parallelized it on a computer cluster and verified its effectiveness by implementing the method on an autonomous mobile robot. For object shape categorization and prototype extraction, primitive shapes were recognized by a learned model, and object manipulation appropriate to each shape was executed.
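Academic Significance and Societal Importance of the Research Achievements	The ultimate goal of this research is to model the unconscious judgment mechanisms of humans, such as intuition and gut feeling, and to implement them in learning robots. When an intelligent robot is introduced to a new task, this is expected to relax the constraints imposed on the environment, allow task-adaptive behavior to be acquired quickly, and enable responses to sudden changes in environmental conditions. As a foundational study toward that goal, this project proposed a method for efficient and fast categorization and selection of knowledge (cognitive economy) in reinforcement learning robots by introducing the spreading activation model and prototype theory, both findings from cognitive psychology, and verified it experimentally. The results are expected to contribute to the practical use and spread of learning robots and learning agents.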
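
The categorization and prototype extraction steps named in the outline of final research achievements (K-means++ categorization, then averaging the policies within a category) can be illustrated with a minimal sketch. This is not the project's implementation. It assumes each stored policy has been flattened into a fixed-length numeric vector (for example, a flattened Q-table), and the function names `categorize_policies` and `select_category` are hypothetical. Selecting a category by the nearest prototype is likewise an assumption made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Element-wise average; used here as the prototype of a category."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeanspp_seeds(points, k, rng):
    """K-means++ seeding: later seeds are drawn with probability
    proportional to the squared distance to the nearest chosen seed."""
    centers = [list(rng.choice(points))]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        total = sum(d2)
        if total == 0:  # all points coincide with existing seeds
            centers.append(list(rng.choice(points)))
            continue
        r = rng.uniform(0, total)
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if w > 0 and acc >= r:
                centers.append(list(p))
                break
    return centers


def categorize_policies(policies, k, iters=50, seed=0):
    """Assumed setup: policies are flat numeric vectors.
    Returns (labels, prototypes); each prototype is the category mean."""
    rng = random.Random(seed)
    prototypes = kmeanspp_seeds(policies, k, rng)
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, prototypes[j]))
            for p in policies
        ]
        for j in range(k):
            members = [p for p, lab in zip(policies, new_labels) if lab == j]
            if members:  # keep the old prototype if a category is empty
                prototypes[j] = mean_vector(members)
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, prototypes


def select_category(query, prototypes):
    """Pick the category whose prototype is nearest to the query vector,
    so only k prototypes are compared instead of every stored policy."""
    return min(range(len(prototypes)), key=lambda j: sq_dist(query, prototypes[j]))


if __name__ == "__main__":
    # Toy data: two well-separated groups of 2-dimensional "policies".
    toy = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5],
           [100.0, 101.0], [101.0, 100.0], [100.5, 100.5]]
    labels, prototypes = categorize_policies(toy, k=2)
    print(labels, prototypes)
    print(select_category([0.2, 0.3], prototypes))
```

In this sketch the saving comes from the final step: a new situation is compared against one prototype per category rather than against every stored policy, which is one way to read the cognitive economy described above.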

Report

(4 results)

2021 Annual Research Report / Final Research Report (PDF)
2020 Research-status Report
2019 Research-status Report

Research Products
(6 results)

Journal Article (2 results; of which Peer Reviewed: 2 results, Open Access: 2 results) / Presentation (4 results)

[Journal Article] Autonomous Reusing Policy Selection using Spreading Activation Model in Deep Reinforcement Learning (2021)
- Author(s)
  Takakuwa Yusaku, Kono Hitoshi, Fujii Hiromitsu, Wen Wen, Suzuki Tsuyoshi
- Journal Title
  International Journal of Advanced Computer Science and Applications
  Volume: 12 Issue: 4 Pages: 8-15
- DOI
  10.14569/ijacsa.2021.0120402
- Related Report
  2021 Annual Research Report, 2020 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Activation and Spreading Sequence for Spreading Activation Policy Selection Method in Transfer Reinforcement Learning (2019)
- Author(s)
  Hitoshi Kono, Ren Katayama, Yusaku Takakuwa, Wen Wen, Tsuyoshi Suzuki
- Journal Title
  International Journal of Advanced Computer Science and Applications
  Volume: 10 Issue: 12 Pages: 7-16
- DOI
  10.14569/ijacsa.2019.0101202
- Related Report
  2019 Research-status Report
- Peer Reviewed / Open Access
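[Presentation] Transfer Learning Using Knowledge Categorized Based on Prototype Theory (2021)
- Author(s)
  秋山智暉, Kono Hitoshi, Wen Wen, Fujii Hiromitsu, Suzuki Tsuyoshi
- Organizer
  22nd Society of Instrument and Control Engineers (SICE) System Integration Division Conference
- Related Report
  2021 Annual Research Report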
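[Presentation] Development of a Reused-Knowledge Selection Algorithm Using a PC Cluster for Transfer Reinforcement Learning (2021)
- Author(s)
  坂本裕都, Kono Hitoshi, Wen Wen, Fujii Hiromitsu, Suzuki Tsuyoshi
- Organizer
  2021 Annual Conference of the Electronics, Information and Systems Society, Institute of Electrical Engineers of Japan (IEEJ)
- Related Report
  2021 Annual Research Report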
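[Presentation] A Study of Cognitive Economy in Policy Selection for Reinforcement Learning Robots Using Transfer Learning (2020)
- Author(s)
  坂本裕都, Kono Hitoshi, Wen Wen, Fujii Hiromitsu, Suzuki Tsuyoshi
- Organizer
  Japan Society of Mechanical Engineers (JSME) Robotics and Mechatronics Conference 2020
- Related Report
  2019 Research-status Report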
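[Presentation] Position and Orientation Estimation of Primitive Shapes by Object Extraction Using Deep Learning and Plane Detection Using Range Information (2020)
- Author(s)
  野口達矢, Fujii Hiromitsu
- Organizer
  Japan Society of Mechanical Engineers (JSME) Robotics and Mechatronics Conference 2020
- Related Report
  2019 Research-status Report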
