Grant-in-Aid for Scientific Research (C)
|Allocation Type||Single-year Grants|
General mathematics (including Probability theory/Statistical mathematics)
|Research Institution||Chiba University|
KURANO Masami Chiba Univ., Faculty of Education, Professor (70029487)
KENMOCHI Nobuyuki Chiba Univ., Faculty of Education, Professor (00033887)
YASUDA Masami Chiba Univ., Faculty of Science, Professor (00041244)
NAKAGAMI Jun-ichi Chiba Univ., Faculty of Science, Professor (30092076)
KADOTA Yoshinobu Wakayama Univ., Faculty of Education, Professor (90116294)
YOSHIDA Yuji Univ. of Kitakyushu, Faculty of Economics and Business Administration, Professor (90192426)
|Project Period (FY)
2003 – 2005
Completed (Fiscal Year 2005)
|Budget Amount
¥3,400,000 (Direct Cost : ¥3,400,000)
Fiscal Year 2005 : ¥1,000,000 (Direct Cost : ¥1,000,000)
Fiscal Year 2004 : ¥1,100,000 (Direct Cost : ¥1,100,000)
Fiscal Year 2003 : ¥1,300,000 (Direct Cost : ¥1,300,000)
|Keywords||Flexibly structured model / Markov decision process / Fuzzy dynamic programming / Adaptive Markov model / General utility model / Fuzzy stopping model / Perceptive information / Structured algorithm / Fuzzy optimality equation / Reinforcement learning / Adaptive policy / Perceptive Markov decision model / Perceptive value / Optimal stopping problem with the average criterion / Multichain Markov model / Adaptive Markov decision process / Fuzzy group decision / General utility function / Semi-Markov model / Fuzzy stopping / Fuzzy clustering|
In this project, our objective is to establish a mathematical theory of decision-making processes with a more flexible and soft structure, applying the ideas of synthesis and integration. To this end, we have dealt with various flexibly structured models.
The main research results are as follows.
1. Uncertain Markov decision processes (MDPs) with perception-based information
By formulating a fuzzy perceptive model for MDPs in which the perception of the transition matrices is described by fuzzy sets, we have succeeded in deriving a fuzzy optimality relation for estimating the optimal fuzzy reward, which makes a soft computing algorithm possible.
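As a rough illustration of how a fuzzy set of transition matrices induces a fuzzy optimal reward, the sketch below applies the extension principle level by level: for each grade alpha, it solves the crisp MDP for every transition matrix whose membership grade is at least alpha and reports the range of optimal rewards. Everything in it is an assumption made for illustration (a finite state and action space, a discounted criterion with factor `beta`, a finite list of candidate matrices standing in for the fuzzy set, and all function names and data). It is not the fuzzy optimality relation derived in the project.

```python
def optimal_values(P, r, beta=0.9, tol=1e-10):
    """Value iteration for a finite discounted MDP (assumed setting).

    P[a][s][t] is the probability of moving from state s to t under action a,
    r[s][a] is the immediate reward, beta in (0, 1) is the discount factor.
    """
    n_states, n_actions = len(r), len(P)
    v = [0.0] * n_states
    while True:
        new_v = [
            max(
                r[s][a] + beta * sum(P[a][s][t] * v[t] for t in range(n_states))
                for a in range(n_actions)
            )
            for s in range(n_states)
        ]
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v


def alpha_cut_of_optimal_reward(candidates, r, alpha, state=0, beta=0.9):
    """Interval hull of optimal rewards over matrices with grade >= alpha.

    candidates is a list of (P, grade) pairs, a hypothetical finite stand-in
    for a fuzzy set of transition matrices. At least one grade must be >= alpha.
    """
    values = [
        optimal_values(P, r, beta)[state]
        for P, grade in candidates
        if grade >= alpha
    ]
    return min(values), max(values)


# Hypothetical data: 2 states, 2 actions, three candidate transition laws.
REWARDS = [[1.0, 0.0], [0.0, 2.0]]
CANDIDATES = [
    ([[[0.9, 0.1], [0.5, 0.5]], [[0.2, 0.8], [0.1, 0.9]]], 1.0),
    ([[[0.8, 0.2], [0.6, 0.4]], [[0.3, 0.7], [0.2, 0.8]]], 0.6),
    ([[[0.7, 0.3], [0.7, 0.3]], [[0.4, 0.6], [0.4, 0.6]]], 0.3),
]

if __name__ == "__main__":
    for alpha in (1.0, 0.6, 0.3):
        print(alpha, alpha_cut_of_optimal_reward(CANDIDATES, REWARDS, alpha))
```

Lowering alpha admits more candidate matrices, so the reported intervals are nested: the interval at grade 1.0 lies inside the interval at grade 0.3.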
2. Adaptive Markov decision models
We have developed a learning algorithm of the reward-penalty type for the communicating case of multichain MDPs, by which an adaptively optimal policy is constructed. A numerical experiment has also been carried out successfully.
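For readers unfamiliar with reward-penalty learning, the sketch below shows the classical linear reward-penalty update for a probability vector over actions, in a single-state setting. It is a swapped-in textbook scheme, not the project's algorithm for multichain MDPs: the step sizes `a` and `b`, the Bernoulli reward model, and all function names are assumptions chosen for illustration.

```python
import random


def reward_penalty_update(probs, chosen, rewarded, a=0.1, b=0.05):
    """One linear reward-penalty step on a list of action probabilities.

    On reward, probability mass moves toward the chosen action (rate a).
    On penalty, mass moves away from it and is spread evenly over the
    other actions (rate b). Requires at least two actions.
    """
    k = len(probs)
    if rewarded:
        return [
            p + a * (1.0 - p) if i == chosen else (1.0 - a) * p
            for i, p in enumerate(probs)
        ]
    return [
        (1.0 - b) * p if i == chosen else b / (k - 1) + (1.0 - b) * p
        for i, p in enumerate(probs)
    ]


def simulate(success_probs, steps=5000, seed=0):
    """Run the update against hypothetical Bernoulli reward probabilities."""
    rng = random.Random(seed)
    probs = [1.0 / len(success_probs)] * len(success_probs)
    for _ in range(steps):
        action = rng.choices(range(len(probs)), weights=probs)[0]
        rewarded = rng.random() < success_probs[action]
        probs = reward_penalty_update(probs, action, rewarded)
    return probs


if __name__ == "__main__":
    # Action 0 is rewarded more often in this made-up environment.
    print(simulate([0.9, 0.2]))
```

Both branches of the update preserve the total probability, so the vector remains a valid randomized choice over actions after every step.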
3. Fuzzy stopping models and their applications to group decision processes
A perceptive fuzzy model for stopping problems has been formulated, and a method for computing the fuzzy perceptive reward under optimal stopping has been obtained. These ideas are applied to develop a perception-based theory for a multivariate stopping problem with a monotone rule. Moreover, we have had some success in applying our analytic results to financial engineering (American put options and so on).
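The American put mentioned above is a standard example of an optimal stopping problem. As a crisp (non-fuzzy) point of reference, the sketch below prices one by backward induction on a Cox-Ross-Rubinstein binomial tree: at each node the holder compares the payoff from stopping now with the discounted expected value of continuing. The tree model, the parameter values, and the function name are assumptions for illustration and are not taken from the project's results.

```python
import math


def american_put_crr(s0, strike, rate, sigma, maturity, steps):
    """Price an American put by backward induction on a binomial tree.

    s0: initial asset price, strike: exercise price, rate: risk-free rate,
    sigma: volatility (> 0), maturity: time to expiry, steps: tree depth.
    """
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    discount = math.exp(-rate * dt)
    # Risk-neutral probability of an up move.
    q = (math.exp(rate * dt) - down) / (up - down)

    # Payoff at expiry; index j counts the number of up moves.
    values = [
        max(strike - s0 * up**j * down ** (steps - j), 0.0)
        for j in range(steps + 1)
    ]
    # Step backward: value = max(stop now, continue).
    for n in range(steps - 1, -1, -1):
        values = [
            max(
                strike - s0 * up**j * down ** (n - j),
                discount * (q * values[j + 1] + (1.0 - q) * values[j]),
            )
            for j in range(n + 1)
        ]
    return values[0]


if __name__ == "__main__":
    print(american_put_crr(100.0, 110.0, 0.05, 0.2, 1.0, 50))
```

The `max` inside the backward loop is the stopping rule: wherever the first argument is the larger one, exercising immediately is optimal at that node.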
4. Decision making models with general regret utility functions
The optimization problem with a general regret utility is considered for countable-state semi-MDPs with an absorbing set. We have succeeded in deriving the optimality equations that determine the optimal regret policy. These results have been extended to the case of multiple constraints.