Developing a theory of deep reinforcement learning equipped with bounded rationality

Research Project

Project/Area Number	17H04696
Research Category	Grant-in-Aid for Young Scientists (A)
Allocation Type	Single-year Grants
Research Field	Soft computing
Research Institution	Tokyo Denki University
Principal Investigator	Takahashi Tatsuji, Tokyo Denki University, School of Science and Engineering, Associate Professor (00514514)
Project Period (FY)	2017-04-01 – 2020-03-31
Project Status	Completed (Fiscal Year 2019)
Budget Amount	¥23,660,000 (Direct Cost: ¥18,200,000, Indirect Cost: ¥5,460,000); Fiscal Year 2019: ¥6,630,000 (Direct Cost: ¥5,100,000, Indirect Cost: ¥1,530,000); Fiscal Year 2018: ¥6,370,000 (Direct Cost: ¥4,900,000, Indirect Cost: ¥1,470,000); Fiscal Year 2017: ¥10,660,000 (Direct Cost: ¥8,200,000, Indirect Cost: ¥2,460,000)
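Keywords	bounded rationality / reinforcement learning / satisficing / social learning / weakly instructive learning / decision problem / hypothesis testing / trial and error / motivation / instructive feedback / evaluative feedback / competitive imitation / competition / satisficing principle / semi-instructive feedback / meta-information / imitation learning / emulation / social satisficing / prospect theory / decision making / causal inference / machine learning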
Outline of Final Research Achievements	Real-world agents such as humans and other animals learn and act toward their goals in a boundedly rational way, under severe constraints on perception, information processing, and actuation. In this project, we hypothesized that efficient learning and action are made possible by "satisficing," a search and decision-making policy proposed as an alternative to optimization. We gave a new implementation of satisficing (RS), established it as a useful algorithm, and proved its efficiency. We then applied RS to a range of reinforcement learning tasks, from the most basic bandit problems to general tabular and non-tabular MDPs, and showed that it is efficient on them.
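Academic Significance and Societal Importance of the Research Achievements	We clarified an important part of the logic of autonomous, trial-and-error learning as practiced by humans and animals. In particular, we gave a mechanistic explanation of why humans and animals improve their performance efficiently through competition and "competitive imitation." We also proved this efficiency mathematically and demonstrated it in a variety of settings. Finally, from the standpoint of capitalism and markets, we discussed the efficiency of competition and competitive imitation together with the dangers that are inseparable from it.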

Report

(4 results)

2019 Annual Research Report
Final Research Report (PDF)
2018 Annual Research Report
2017 Annual Research Report

Research Products
(16 results)

All Int'l Joint Research (4 results) Journal Article (4 results) (of which Int'l Joint Research: 2 results, Peer Reviewed: 4 results, Open Access: 3 results) Presentation (8 results) (of which Int'l Joint Research: 2 results)

[Int'l Joint Research] Durham University (UK)
- Related Report
  2018 Annual Research Report
[Int'l Joint Research] Université Paris 8 / École Pratique des Hautes Études (EPHE) (France)
- Related Report
  2018 Annual Research Report
[Int'l Joint Research] Université Paris 8 / École Normale Supérieure / École Pratique des Hautes Études (France)
- Related Report
  2017 Annual Research Report
[Int'l Joint Research] Durham University (UK)
- Related Report
  2017 Annual Research Report
[Journal Article] Extended Bayesian inference incorporating symmetry bias (2020)
- Author(s)
  Shinohara Shuji, Manome Nobuhito, Suzuki Kouta, Chung Ung-il, Takahashi Tatsuji, Gunji Pegio-Yukio, Nakajima Yoshihiro, Mitsuyoshi Shunji
- Journal Title
  Biosystems
  Volume: 190 Pages: 104104-104104
- DOI
  10.1016/j.biosystems.2020.104104
- Related Report
  2019 Annual Research Report
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] Flexibility of Emulation Learning from Pioneers in Nonstationary Environments (2020)
- Author(s)
  Shinriki Moto, Wakabayashi Hiroaki, Kono Yu, Takahashi Tatsuji
- Journal Title
  Advances in Artificial Intelligence
  Volume: 1128 Pages: 90-101
- DOI
  10.1007/978-3-030-39878-1_9
- NAID
  130007658419
- ISBN
  9783030398774, 9783030398781
- Related Report
  2019 Annual Research Report
- Peer Reviewed
[Journal Article] Guaranteed satisficing and finite regret: Analysis of a cognitive satisficing value function (2019)
- Author(s)
  Tamatsukuri Akihiro, Takahashi Tatsuji
- Journal Title
  Biosystems
  Volume: 180 Pages: 46-53
- DOI
  10.1016/j.biosystems.2019.02.009
- Related Report
  2019 Annual Research Report, 2018 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] The Psychology of Uncertainty and Three-Valued Truth Tables (2018)
- Author(s)
  Baratgin Jean, Politzer Guy, Over David E., Takahashi Tatsuji
- Journal Title
  Frontiers in Psychology
  Volume: 9 Pages: 1479-1479
- DOI
  10.3389/fpsyg.2018.01479
- Related Report
  2018 Annual Research Report
- Peer Reviewed / Open Access / Int'l Joint Research
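[Presentation] Toward applying the satisficing principle to reinforcement learning in general (2018)
- Author(s)
  佐鳥玖仁朗, 吉田豊, 山岸健太, 牛田有哉, 神谷匠, 高橋達二
- Organizer
  The 32nd Annual Conference of the Japanese Society for Artificial Intelligence, 2018 (JSAI 2018)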
- Related Report
  2017 Annual Research Report
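[Presentation] Analysis of a cognitive satisficing value function (2018)
- Author(s)
  玉造晃弘, 高橋達二
- Organizer
  The 32nd Annual Conference of the Japanese Society for Artificial Intelligence, 2018 (JSAI 2018)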
- Related Report
  2017 Annual Research Report
[Presentation] Optimal autonomous exploration through satisficing (2018)
- Author(s)
  甲野佑, 高橋達二
- Organizer
  The 32nd Annual Conference of the Japanese Society for Artificial Intelligence, 2018 (JSAI 2018)
- Related Report
  2017 Annual Research Report
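[Presentation] Social reinforcement learning using shared satisficing reference levels (2018)
- Author(s)
  其田憲明, 神谷匠, 甲野佑, 高橋達二
- Organizer
  The 32nd Annual Conference of the Japanese Society for Artificial Intelligence, 2018 (JSAI 2018)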
- Related Report
  2017 Annual Research Report
[Presentation] Are word learning biases based on symmetry in cognition? (2018)
- Author(s)
  Kamiya, T., Takahashi, T.
- Organizer
  The Twenty-Third International Symposium on Artificial Life and Robotics 2018 (AROB 23rd 2018)
- Related Report
  2017 Annual Research Report
- Int'l Joint Research
[Presentation] Causal induction under rarity and small data (2018)
- Author(s)
  Yokokawa, J., Oyo, K., Takahashi, T.
- Organizer
  The Twenty-Third International Symposium on Artificial Life and Robotics 2018 (AROB 23rd 2018)
- Related Report
  2017 Annual Research Report
- Int'l Joint Research
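[Presentation] Human observational causal inference as a judgment of non-independence under the rarity assumption (2017)
- Author(s)
  高橋達二, 大用庫智, 玉造晃弘, 横川純貴
- Organizer
  The 31st Annual Conference of the Japanese Society for Artificial Intelligence, 2017 (JSAI 2017)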
- Related Report
  2017 Annual Research Report
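[Presentation] Satisficing reinforcement learning aimed at survival (2017)
- Author(s)
  牛田有哉, 甲野佑, 高橋達二
- Organizer
  The 31st Annual Conference of the Japanese Society for Artificial Intelligence, 2017 (JSAI 2017)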
- Related Report
  2017 Annual Research Report
