On study of temporal difference method in decision process and its application

Research Project

Project/Area Number 19740060
Research Category

Grant-in-Aid for Young Scientists (B)

Allocation Type Single-year Grants
Research Field General mathematics (including Probability theory/Statistical mathematics)
Research Institution Kanagawa University (2008-2009)
Yuge National College of Maritime Technology (2007)

Principal Investigator

HORIGUCHI Masayuki  Kanagawa University, Faculty of Engineering, Associate Professor (90366401)

Project Period (FY) 2007 – 2009
Project Status Completed (Fiscal Year 2009)
Budget Amount
¥2,760,000 (Direct Cost: ¥2,400,000, Indirect Cost: ¥360,000)
Fiscal Year 2009: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Fiscal Year 2008: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Fiscal Year 2007: ¥1,200,000 (Direct Cost: ¥1,200,000)
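Keywords Markov decision process / mathematical programming / adaptive policy / learning theory / Markov set-chain / interval Bayesian estimation method / credible interval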
Research Abstract

In decision processes under uncertainty, optimal solutions are constructed using dynamic programming (DP) algorithms. For practical problems with very large state spaces, however, DP cannot be applied directly, so the amount of computation must be reduced by a learning algorithm that estimates the value function from observed data of the state-action process. We consider the temporal difference method of neuro-dynamic programming (Neuro-DP) and Bayesian interval estimation in Markov decision processes with an unknown transition law. We derive algorithms for constructing optimal solutions theoretically, and we use numerical examples to show the validity of the algorithms and to improve them.
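As a rough illustration of the temporal difference idea mentioned in the abstract, the sketch below estimates a value function from sampled transitions alone, without using the transition law. It is a minimal tabular TD(0) example on a hypothetical five-state random walk, not the project's algorithm; the function names `td0_update` and `simulate_random_walk`, the step size, and the episode count are all assumptions chosen for illustration.

```python
import random


def td0_update(value, reward, next_value, alpha, gamma):
    """One TD(0) step: move `value` toward reward + gamma * next_value."""
    td_error = reward + gamma * next_value - value
    return value + alpha * td_error


def simulate_random_walk(num_states=5, episodes=20000, alpha=0.02,
                         gamma=1.0, seed=0):
    """Estimate state values of a symmetric random walk with TD(0).

    States 1..num_states are non-terminal; states 0 and num_states + 1
    are terminal. Reaching the right terminal state pays reward 1,
    every other transition pays 0. The learner only sees sampled
    transitions, never the transition probabilities.
    """
    rng = random.Random(seed)
    values = [0.0] * (num_states + 2)  # terminal values stay at 0
    for _ in range(episodes):
        state = (num_states + 1) // 2  # start in the middle
        while 0 < state < num_states + 1:
            next_state = state + rng.choice((-1, 1))
            reward = 1.0 if next_state == num_states + 1 else 0.0
            values[state] = td0_update(values[state], reward,
                                       values[next_state], alpha, gamma)
            state = next_state
    return values[1:-1]


if __name__ == "__main__":
    estimates = simulate_random_walk()
    exact = [i / 6 for i in range(1, 6)]  # known values for this walk
    for s, (est, ref) in enumerate(zip(estimates, exact), start=1):
        print(f"state {s}: TD(0) estimate {est:.3f}, exact {ref:.3f}")
```

With a constant step size the estimates keep fluctuating around the exact values; a decreasing step size would be the usual choice when convergence is required.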

Report

(4 results)
  • 2009 Annual Research Report / Final Research Report (PDF)
  • 2008 Annual Research Report
  • 2007 Annual Research Report
  • Research Products

    (28 results)

Journal Article (17 results, of which Peer Reviewed: 6 results) / Presentation (10 results) / Remarks (1 result)

  • [Journal Article] An interval Bayesian method for Markov decision processes under uncertainty (2009)

    • Author(s)
      伊喜哲一郎、堀口正之、安田正實、蔵野正美
    • Journal Title

      RIMS Kôkyûroku (Kyoto University) 1636

      Pages: 1-8

    • Related Report
      2009 Annual Research Report 2009 Final Research Report
  • [Journal Article] Fuzzy Metric Clustering Based on Dynamic Programming (2009)

    • Author(s)
      岩村覚三、堀口正之、堀池真琴
    • Journal Title

      RIMS Kôkyûroku (Kyoto University) 1630

      Pages: 77-88

    • Related Report
      2009 Final Research Report
  • [Journal Article] Fuzzy Metric Clustering Based on Dynamic Programming (2009)

    • Author(s)
      岩村覚三、堀口正之、堀池真琴
    • Journal Title

      RIMS Kôkyûroku (Kyoto University) 1630, "Mathematics and information of non-additivity: non-additivity and convex analysis"

      Pages: 77-88

    • Related Report
      2008 Annual Research Report
  • [Journal Article] A pattern-matrix learning algorithm for adaptive MDPs: The regularly communicating case (2008)

    • Author(s)
      伊喜哲一郎、堀口正之、蔵野正美、安田正實
    • Journal Title

      RIMS Kôkyûroku (Kyoto University) 1589

      Pages: 110-119

    • Related Report
      2009 Final Research Report
  • [Journal Article] Adaptive quality control by interval Bayesian estimation (2008)

    • Author(s)
      佐々木稔、堀口正之、蔵野正美
    • Journal Title

      RIMS Kôkyûroku (Kyoto University) 1589

      Pages: 120-129

    • Related Report
      2009 Final Research Report
  • [Journal Article] Adaptive Algorithms for Markov Decision Processes (2008)

    • Author(s)
      堀口正之
    • Journal Title

      Bulletin of the Research Institute for Engineering, Kanagawa University (神奈川大学工学研究所所報)

      Pages: 22-29

    • Related Report
      2009 Final Research Report
  • [Journal Article] A pattern-matrix learning algorithm for adaptive MDPs: The regularly communicating case (2008)

    • Author(s)
      伊喜哲一郎、堀口正之、蔵野正美、安田正實
    • Journal Title

      RIMS Kôkyûroku (Kyoto University) 1589, "Theory and applications of decision making under uncertainty"

      Pages: 110-119

    • Related Report
      2008 Annual Research Report
  • [Journal Article] Adaptive quality control by interval Bayesian estimation (2008)

    • Author(s)
      佐々木稔、堀口正之、蔵野正美
    • Journal Title

      RIMS Kôkyûroku (Kyoto University) 1589, "Theory and applications of decision making under uncertainty"

      Pages: 120-129

    • Related Report
      2008 Annual Research Report
  • [Journal Article] Adaptive Algorithms for Markov Decision Processes (2008)

    • Author(s)
      堀口正之
    • Journal Title

      Bulletin of the Research Institute for Engineering, Kanagawa University (神奈川大学工学研究所所報) 31

      Pages: 22-29

    • Related Report
      2008 Annual Research Report
  • [Journal Article] A structured pattern matrix algorithm for multichain Markov decision processes (2007)

    • Author(s)
      T. Iki, M. Horiguchi, M. Kurano
    • Journal Title

      Mathematical Methods of Operations Research 66

      Pages: 545-555

    • Related Report
      2009 Final Research Report
    • Peer Reviewed
  • [Journal Article] A learning algorithm for communicating Markov decision processes with unknown transition matrices (2007)

    • Author(s)
      T. Iki, M. Horiguchi, M. Yasuda, M. Kurano
    • Journal Title

      Bulletin of Information and Cybernetics 39

      Pages: 11-24

    • NAID

      120001944229

    • Related Report
      2009 Final Research Report
    • Peer Reviewed
  • [Journal Article] Temporal Difference-Based Adaptive Policies in Neuro Dynamic Programming (2007)

    • Author(s)
      T. Iki, M. Horiguchi, M. Yasuda, M. Kurano
    • Journal Title

      Proceedings of the 4th International Conference on Modeling Decisions for Artificial Intelligence (MDAI), Vicenc Torra, Yasuo Narukawa, Yuji Yoshida (Eds.) (CD-ROM Proceedings)

      Pages: 112-122

    • Related Report
      2009 Final Research Report
    • Peer Reviewed
  • [Journal Article] A learning algorithm of TD method for Markov decision processes (2007)

    • Author(s)
      堀口正之、蔵野正美、安田正實
    • Journal Title

      RIMS Kôkyûroku (Kyoto University) 1559

      Pages: 34-49

    • Related Report
      2009 Final Research Report
  • [Journal Article] A structured pattern matrix algorithm for multichain Markov decision processes (2007)

    • Author(s)
      T. Iki, M. Horiguchi, M. Kurano
    • Journal Title

      Mathematical Methods of Operations Research 66

      Pages: 545-555

    • Related Report
      2007 Annual Research Report
    • Peer Reviewed
  • [Journal Article] A learning algorithm for communicating Markov decision processes with unknown transition matrices (2007)

    • Author(s)
      T. Iki, M. Horiguchi, M. Yasuda, M. Kurano
    • Journal Title

      Bulletin of Information and Cybernetics 39

      Pages: 11-24

    • Related Report
      2007 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Temporal Difference-Based Adaptive Policies in Neuro Dynamic Programming (2007)

    • Author(s)
      T. Iki, M. Horiguchi, M. Yasuda, M. Kurano
    • Journal Title

      Proceedings of the 4th International Conference on Modeling Decisions for Artificial Intelligence (MDAI) 2007, Vicenc Torra, Yasuo Narukawa, Yuji Yoshida (Eds.) (CD-ROM Proceedings), ISBN 978-84-00-08359-1

      Pages: 112-122

    • Related Report
      2007 Annual Research Report
    • Peer Reviewed
  • [Journal Article] A learning algorithm of TD method for Markov decision processes (2007)

    • Author(s)
      堀口正之、蔵野正美、安田正實
    • Journal Title

      RIMS Kôkyûroku (Kyoto University) 1559, "Development and applications of stochastic models in optimization problems"

      Pages: 34-49

    • Related Report
      2007 Annual Research Report
  • [Presentation] Uncertain Markov decision processes and Bayesian intervals (2010)

    • Author(s)
      堀口正之
    • Organizer
      Mathematical Society of Japan, 2010 Annual Meeting, Statistical Mathematics Section
    • Place of Presentation
      Keio University
    • Year and Date
      2010-03-26
    • Related Report
      2009 Final Research Report
  • [Presentation] Uncertain Markov decision processes and Bayesian intervals (2010)

    • Author(s)
      堀口正之
    • Organizer
      Mathematical Society of Japan
    • Place of Presentation
      Keio University, Yagami Campus
    • Year and Date
      2010-03-26
    • Related Report
      2009 Annual Research Report
  • [Presentation] On bounds for Bayes estimate intervals in uncertain MDPs (2009)

    • Author(s)
      堀口正之
    • Organizer
      Mathematical Society of Japan, 2009 Autumn Meeting
    • Place of Presentation
      Osaka University
    • Year and Date
      2009-09-27
    • Related Report
      2009 Final Research Report
  • [Presentation] On bounds for Bayes estimate intervals in uncertain MDPs (2009)

    • Author(s)
      堀口正之、安田正實
    • Organizer
      Mathematical Society of Japan
    • Place of Presentation
      Osaka University, Toyonaka Campus
    • Year and Date
      2009-09-27
    • Related Report
      2009 Annual Research Report
  • [Presentation] Bayesian approach to uncertain MDPs with intervals of prior measures (2009)

    • Author(s)
      堀口正之
    • Organizer
      Mathematical Society of Japan, 2009 Annual Meeting, Statistical Mathematics Section
    • Place of Presentation
      University of Tokyo
    • Year and Date
      2009-03-27
    • Related Report
      2009 Final Research Report 2008 Annual Research Report
  • [Presentation] Adaptive algorithm for MDPs using pattern matrix learning method (2008)

    • Author(s)
      堀口正之
    • Organizer
      Mathematical Society of Japan, 2008 Autumn Meeting, Statistical Mathematics Section
    • Place of Presentation
      Tokyo Institute of Technology
    • Year and Date
      2008-09-27
    • Related Report
      2009 Final Research Report 2008 Annual Research Report
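  • [Presentation] On learning algorithms for Markov decision processes with unknown transition laws (2007)

    • Author(s)
      堀口正之
    • Organizer
      Mathematical Society of Japan, 117th Kyushu Branch Regular Meeting
    • Place of Presentation
      University of Miyazaki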
    • Year and Date
      2007-10-13
    • Related Report
      2009 Final Research Report
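  • [Presentation] On learning algorithms for Markov decision processes with unknown transition laws (2007)

    • Author(s)
      Presenter: 堀口正之; Collaborator: 伊喜哲一郎
    • Organizer
      Mathematical Society of Japan, 117th Kyushu Branch Regular Meeting
    • Place of Presentation
      University of Miyazaki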
    • Year and Date
      2007-10-13
    • Related Report
      2007 Annual Research Report
  • [Presentation] Adaptive Markov decision processes based on temporal difference method (2007)

    • Author(s)
      堀口正之
    • Organizer
      Mathematical Society of Japan, 2007 Autumn Meeting, Statistical Mathematics Section
    • Place of Presentation
      Tohoku University
    • Year and Date
      2007-09-24
    • Related Report
      2009 Final Research Report
  • [Presentation] Adaptive Markov decision processes based on temporal difference method (2007)

    • Author(s)
      Presenter: 堀口正之; Collaborators: 伊喜哲一郎、蔵野正美、安田正實
    • Organizer
      Mathematical Society of Japan, 2007 Autumn Meeting, Statistical Mathematics Section
    • Place of Presentation
      Tohoku University
    • Year and Date
      2007-09-24
    • Related Report
      2007 Annual Research Report
  • [Remarks]

    • URL

      http://www.math.kanagawa-u.ac.jp/~horiguchi

    • Related Report
      2009 Final Research Report

Published: 2007-04-01   Modified: 2016-04-21  
