
Studies on optimal theory and its application in probabilistic decision processes with general utility functions.

Research Project

Project/Area Number 11640118
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type: Single-year Grants
Section: General
Research Field: General mathematics (including Probability theory/Statistical mathematics)
Research Institution: Wakayama Univ.

Principal Investigator

KADOTA Yoshinobu  Wakayama Univ., Faculty of Education, Professor (90116294)

Co-Investigator (Kenkyū-buntansha) YASUDA Masami  Chiba Univ., Faculty of Science, Professor (00041244)
KURANO Masami  Chiba Univ., Faculty of Education, Professor (70029487)
Project Period (FY) 1999 – 2000
Project Status Completed (Fiscal Year 2000)
Budget Amount
¥1,400,000 (Direct Cost: ¥1,400,000)
Fiscal Year 2000: ¥700,000 (Direct Cost: ¥700,000)
Fiscal Year 1999: ¥700,000 (Direct Cost: ¥700,000)
Keywords: Markov / decision / stopping / utility / optimal / concave / risk-averse / non-discounted
Research Abstract

A stopped decision process is a model that combines Markov decision processes (MDPs) with the stopping problem. An MDP is specified by a countable state set $S$, a compact action space $A(i)$ assigned to each state $i \in S$, transition probabilities $q = (q_{ij}(a))$, and a uniformly bounded immediate reward function $r(i, a, j)$, where both $q_{ij}(a)$ and $r(i, a, j)$ are continuous in $a \in A(i)$ for any $i, j \in S$. A policy $\pi$ is a sequence of probabilities on $A(i_t)$ conditioned on each history $(i_0, a_0, i_1, \ldots, i_t)$ for $t = 0, 1, \ldots$. Denote by $\sigma$ a stopping time and by $g$ a utility function.
Let $B(t) = \sum_{k=1}^{t} r(X_{k-1}, \Delta_{k-1}, X_k)$, where $X_t$ and $\Delta_t$ are the state and action at time $t$, respectively. The pair $(\pi, \sigma)$ is called $(i_0, \alpha_0)$-optimal if it maximizes $E^{\pi}_{i_0}[g(\alpha_0 + B(\sigma))]$, where $E^{\pi}_{i_0}$ is the expectation with respect to the probability measure on the sample space $\Omega = (S \times A)^{\infty}$ for an initial state $i_0$.
It is assumed that $g$ is non-decreasing, concave and bounded above, or that $g$ has a bounded derivative on any compact subset of the real line $\mathbb{R}$ and satisfies $E^{\pi}_{i}[\sup_{t \ge 0} g^{+}(\alpha_0 + B(t))] < \infty$ for any $\pi$ and $i$, where $g^{+}$ is the positive part of $g$. Let $v(i, \alpha) = \max_{(\pi, \sigma)} E^{\pi}_{i}[g(\alpha + B(\sigma))]$. Then we have the following results.
1. For any $i \in S$ and $\alpha$, $v(i, \alpha)$ satisfies the optimality equation
$$v(i, \alpha) = \max\Big\{ g(\alpha),\ \max_{a \in A(i)} \sum_{j \in S} q_{ij}(a)\, v(j, \alpha + r(i, a, j)) \Big\} \tag{1}$$
Furthermore, suppose $(\pi, \sigma)$ satisfies $P^{\pi}_{i_0}(\sigma > 1) = 1$.
2. If $(\pi, \sigma)$ is an $(i_0, \alpha_0)$-optimal pair, then $E^{\pi}_{i_0}[g(\alpha_0 + B(\sigma))]$ satisfies (1).
3. If $E^{\pi}_{i_0}[g(\alpha_0 + B(\sigma))]$ satisfies (1), then $(\pi, \sigma)$ is $(i_0, \alpha_0)$-optimal.
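
The optimality equation (1) can be illustrated numerically with a minimal sketch. The abstract does not describe a computational procedure, so everything below is an assumption made for illustration: a hypothetical two-state toy model, a capped utility g(x) = min(x, 4) (non-decreasing, concave and bounded above), and a finite-horizon truncation in which stopping is forced once the horizon runs out. The names `Q`, `ACTIONS`, `reward`, `g` and `value` are hypothetical.

```python
from functools import lru_cache

# Hypothetical toy model (not taken from the report):
# state 0 = "still playing", state 1 = "out" (absorbing).
ACTIONS = {0: ("play",), 1: ("wait",)}

# Q[(i, a)] maps each next state j to the transition probability q_ij(a).
Q = {
    (0, "play"): {0: 0.5, 1: 0.5},
    (1, "wait"): {1: 1.0},
}


def reward(i, a, j):
    """Immediate reward r(i, a, j): win 2 and keep playing, or lose 1 and drop out."""
    if i == 0 and a == "play":
        return 2 if j == 0 else -1
    return 0


def g(x):
    """Assumed utility: non-decreasing, concave, bounded above by 4."""
    return min(x, 4)


@lru_cache(maxsize=None)
def value(i, alpha, horizon):
    """Finite-horizon version of equation (1).

    With horizon 0 the process must stop, so the value is g(alpha).
    Otherwise compare stopping now with the best one-step continuation.
    """
    stop_value = g(alpha)
    if horizon == 0:
        return stop_value
    continue_value = max(
        sum(p * value(j, alpha + reward(i, a, j), horizon - 1)
            for j, p in Q[(i, a)].items())
        for a in ACTIONS[i]
    )
    return max(stop_value, continue_value)


if __name__ == "__main__":
    for n in (0, 1, 2, 10):
        print(n, value(0, 0, n))
```

In this toy model the values starting from state 0 with accumulated reward 0 stop changing once the horizon reaches 2, and stopping is preferred as soon as the accumulated reward hits the cap of the utility. This only shows the shape of the recursion; it is not a substitute for the existence and optimality results stated in the abstract.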

Report

(3 results)
  • 2000 Annual Research Report   Final Research Report Summary
  • 1999 Annual Research Report
  • Research Products

    (10 results)

All Publications (10 results)

  • [Publications] Y. Kadota, M. Kurano and M. Yasuda: "Stopped decision processes in conjunction with general utility." To appear in J. Information and Optimization Science. (2001)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Y. Kadota, M. Kurano and M. Yasuda: "Risk-averse stopped Markov decision processes." Proceedings of the 4th Symposium on Information and Statistical Sciences (BIC). (1999)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Y. Kadota: "Deviation matrix, Laurent series and Blackwell optimality in countable state Markov decision processes." RIMS Kokyuroku "Problems and Prospects of Dynamic Programming Theory under Uncertain Models" (to appear). (2001)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Y. Kadota, M. Kurano and M. Yasuda: "Stopped decision processes in conjunction with general utility." To appear in J. Inform. & Optim. Sci. (2001)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Y. Kadota, M. Kurano and M. Yasuda: "Risk-averse stopped Markov decision processes." The 4th BIC (Bull. Inform. & Cybernet.) symposium. (1999)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Y. Kadota: "Deviation matrix, Laurent series and Blackwell optimality in countable state Markov decision processes." To appear. Lecture notes, Institute of Math. Anal., Kyoto Univ. (2001)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Y. Kadota, M. Kurano and M. Yasuda: "Stopped decision processes in conjunction with general utility." To appear in J. Information and Optimization Science. (2001)

    • Related Report
      2000 Annual Research Report
  • [Publications] Y. Kadota, M. Kurano and M. Yasuda: "Risk-averse stopped Markov decision processes." Proceedings of the 4th Symposium on Information and Statistical Sciences (BIC). (1999)

    • Related Report
      2000 Annual Research Report
  • [Publications] Y. Kadota: "Deviation matrix, Laurent series and Blackwell optimality in countable state Markov decision processes." RIMS Kokyuroku "Problems and Prospects of Dynamic Programming Theory under Uncertain Models" (to appear). (2001)

    • Related Report
      2000 Annual Research Report
  • [Publications] Y. Kadota, M. Kurano, M. Yasuda: "Stopped Decision Processes in conjunction with General Utility." Accepted to Journal of Information & Optimization Sciences.

    • Related Report
      1999 Annual Research Report

Published: 1999-04-01   Modified: 2016-04-21  
