Title:
|
First passage risk probability optimality for continuous time Markov decision processes (English) |
Author:
|
Huo, Haifeng |
Author:
|
Wen, Xian |
Language:
|
English |
Journal:
|
Kybernetika |
ISSN:
|
0023-5954 (print) |
ISSN:
|
1805-949X (online) |
Volume:
|
55 |
Issue:
|
1 |
Year:
|
2019 |
Pages:
|
114-133 |
Summary lang:
|
English |
Category:
|
math |
Summary:
|
In this paper, we study continuous time Markov decision processes (CTMDPs) with a denumerable state space, a Borel action space, unbounded transition rates, and a nonnegative reward function. The optimality criterion considered is the first passage risk probability criterion. To ensure non-explosion of the state process, we first introduce a so-called drift condition, which is weaker than the well-known regular condition for semi-Markov decision processes (SMDPs). Furthermore, under suitable conditions, we use a value iteration (recursive approximation) technique to establish the optimality equation and to obtain the uniqueness of the value function and the existence of optimal policies. Finally, two examples illustrate our results. (English) |
Keyword:
|
continuous time Markov decision processes |
Keyword:
|
first passage time |
Keyword:
|
risk probability criterion |
Keyword:
|
optimal policy |
MSC:
|
60E20 |
MSC:
|
90C40 |
idZBL:
|
Zbl 07088881 |
idMR:
|
MR3935417 |
DOI:
|
10.14736/kyb-2019-1-0114 |
Date available:
|
2019-05-07T11:11:46Z |
Last updated:
|
2020-02-27 |
Stable URL:
|
http://hdl.handle.net/10338.dmlcz/147708 |
Reference:
|
[1] Bertsekas, D., Shreve, S.: Stochastic Optimal Control: The Discrete-Time Case. Academic Press Inc 1996. MR 0511544 |
Reference:
|
[2] Bäuerle, N., Rieder, U.: Markov Decision Processes with Applications to Finance. Springer, Heidelberg 2011. MR 2808878 |
Reference:
|
[3] Feinberg, E.: Continuous time discounted jump Markov decision processes: a discrete-event approach. Math. Operat. Res. 29 (2004), 492-524. MR 2082616, 10.1287/moor.1040.0089 |
Reference:
|
[4] Guo, X. P., Hernández-Lerma, O.: Continuous-Time Markov Decision Processes: Theory and Applications. Springer-Verlag, Berlin 2009. MR 2554588 |
Reference:
|
[5] Guo, X. P., Hernández-Del-Valle, A., Hernández-Lerma, O.: First passage problems for nonstationary discrete-time stochastic control systems. Europ. J. Control 18 (2012), 528-538. MR 3086896, 10.3166/ejc.18.528-538 |
Reference:
|
[6] Guo, X. P., Song, X. Y., Zhang, Y.: First passage optimality for continuous time Markov decision processes with varying discount factors and history-dependent policies. IEEE Trans. Automat. Control 59 (2014), 163-174. MR 3163332, 10.1109/tac.2013.2281475 |
Reference:
|
[7] Guo, X. P., Huang, X. X., Huang, Y. H.: Finite-horizon optimality for continuous-time Markov decision processes with unbounded transition rates. Adv. Appl. Prob. 47 (2015), 1064-1087. MR 3433296, 10.1017/s0001867800049016 |
Reference:
|
[8] Hernández-Lerma, O., Lasserre, J. B.: Discrete-Time Markov Control Processes: Basic Optimality Criteria. Springer-Verlag, New York 1996. MR 1363487, 10.1007/978-1-4612-0729-0 |
Reference:
|
[9] Hernández-Lerma, O., Lasserre, J. B.: Further Topics on Discrete-Time Markov Control Processes. Springer-Verlag, New York 1999. MR 1697198, 10.1007/978-1-4612-0561-6 |
Reference:
|
[10] Huang, Y. H., Guo, X. P.: Optimal risk probability for first passage models in Semi-Markov processes. J. Math. Anal. Appl. 359 (2009), 404-420. MR 2542184, 10.1016/j.jmaa.2009.05.058 |
Reference:
|
[11] Huang, Y. H., Guo, X. P.: First passage models for denumerable Semi-Markov processes with nonnegative discounted cost. Acta Math. Appl. Sinica 27 (2011), 177-190. MR 2784052, 10.1007/s10255-011-0061-2 |
Reference:
|
[12] Huang, Y. H., Wei, Q. D., Guo, X. P.: Constrained Markov decision processes with first passage criteria. Ann. Oper. Res. 206 (2013), 197-219. MR 3073845, 10.1007/s10479-012-1292-1 |
Reference:
|
[13] Huang, Y. H., Guo, X. P., Li, Z. F.: Minimum risk probability for finite horizon semi-Markov decision process. J. Math. Anal. Appl. 402 (2013), 378-391. MR 3023265, 10.1016/j.jmaa.2013.01.021 |
Reference:
|
[14] Huang, X. X., Zou, X. L., Guo, X. P.: A minimization problem of the risk probability in first passage semi-Markov decision processes with loss rates. Sci. China Math. 58 (2015), 1923-1938. MR 3383991, 10.1007/s11425-015-5029-x |
Reference:
|
[15] Huang, X. X., Huang, Y. H.: Mean-variance optimality for semi-Markov decision processes under first passage. Kybernetika 53 (2017), 59-81. MR 3638556, 10.14736/kyb-2017-1-0059 |
Reference:
|
[16] Huo, H. F., Zou, X. L., Guo, X. P.: The risk probability criterion for discounted continuous-time Markov decision processes. Discrete Event Dynamic Systems: Theory Appl. 27 (2017), 675-699. MR 3712415, 10.1007/s10626-017-0257-6 |
Reference:
|
[17] Janssen, J., Manca, R.: Semi-Markov Risk Models For Finance, Insurance, and Reliability. Springer, New York 2006. MR 2301626 |
Reference:
|
[18] Lin, Y. L., Tomkins, R. J., Wang, C. L.: Optimal models for the first arrival time distribution function in continuous time with a special case. Acta Math. Appl. Sinica 10 (1994), 194-212. MR 1289720, 10.1007/bf02006119 |
Reference:
|
[19] Liu, J. Y., Liu, K.: Markov decision programming - the first passage model with denumerable state space. Systems Sci. Math. Sci. 5 (1992), 340-351. MR 1196196 |
Reference:
|
[20] Liu, J. Y., Huang, S. M.: Markov decision processes with distribution function criterion of first-passage time. Appl. Math. Optim. 43 (2001), 187-201. MR 1885696, 10.1007/s00245-001-0007-9 |
Reference:
|
[21] Ohtsubo, Y.: Optimal threshold probability in undiscounted Markov decision processes with a target set. Appl. Math. Comput. 149 (2004), 519-532. MR 2033087, 10.1016/s0096-3003(03)00158-9 |
Reference:
|
[22] Puterman, M. L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. MR 1270015 |
Reference:
|
[23] Piunovskiy, A., Zhang, Y.: Discounted continuous-time Markov decision processes with unbounded rates: the convex analytic approach. SIAM J. Control Optim. 49 (2011), 2032-2061. MR 2837510, 10.1137/10081366x |
Reference:
|
[24] Schäl, M.: Control of ruin probabilities by discrete-time investments. Math. Meth. Oper. Res. 70 (2005), 141-158. MR 2226972, 10.1007/s00186-005-0445-2 |
Reference:
|
[25] Wu, C. B., Lin, Y. L.: Minimizing risk models in Markov decision processes with policies depending on target values. J. Math. Anal. Appl. 231 (1999), 47-57. MR 1676741, 10.1006/jmaa.1998.6203 |
Reference:
|
[26] Wu, X., Guo, X. P.: First passage optimality and variance minimization of Markov decision processes with varying discount factors. J. Appl. Prob. 52 (2015), 441-456. MR 3372085, 10.1017/s0021900200012560 |
Reference:
|
[27] Yu, S. X., Lin, Y. L., Yan, P. F.: Optimization models for the first arrival target distribution function in discrete time. J. Math. Anal. Appl. 225 (1998), 193-223. MR 1639236, 10.1006/jmaa.1998.6015 |
Reference:
|
[28] Zou, X. L., Guo, X. P.: Another set of verifiable conditions for average Markov decision processes with Borel spaces. Kybernetika 51 (2015), 276-292. MR 3350562, 10.14736/kyb-2015-2-0276 |