Article

Title: Strong average optimality criterion for continuous-time Markov decision processes (English)
Author: Wei, Qingda
Author: Chen, Xian
Language: English
Journal: Kybernetika
ISSN: 0023-5954 (print)
ISSN: 1805-949X (online)
Volume: 50
Issue: 6
Year: 2014
Pages: 950-977
Summary lang: English
Category: math
Summary: This paper deals with continuous-time Markov decision processes with unbounded transition rates under the strong average cost criterion. The state and action spaces are Borel spaces, and the costs are allowed to be unbounded from above and from below. Under mild conditions, we first prove that the finite-horizon optimal value function is a solution to the optimality equation for the case of uncountable state spaces and unbounded transition rates, and that there exists an optimal deterministic Markov policy. Then, using the two average optimality inequalities, we show that the set of all strong average optimal policies coincides with the set of all average optimal policies, and thus obtain the existence of strong average optimal policies. Furthermore, employing the technique of skeleton chains of controlled continuous-time Markov chains and the Chapman-Kolmogorov equation, we give a new set of sufficient conditions imposed on the primitive data of the model for the verification of the uniform exponential ergodicity of continuous-time Markov chains governed by stationary policies. Finally, we illustrate our main results with an example. (English)
Keyword: continuous-time Markov decision processes
Keyword: strong average optimality criterion
Keyword: finite-horizon expected total cost criterion
Keyword: unbounded transition rates
Keyword: optimal policy
Keyword: optimal value function
MSC: 49K45
MSC: 90C40
MSC: 93E20
idZBL: Zbl 1307.93467
idMR: MR3301781
DOI: 10.14736/kyb-2014-6-0950
Date available: 2015-01-13T09:57:31Z
Last updated: 2016-01-03
Stable URL: http://hdl.handle.net/10338.dmlcz/144118
Reference: [1] Bäuerle, N., Rieder, U.: Markov Decision Processes with Applications to Finance. Springer, Berlin 2011. Zbl 1236.90004, MR 2808878
Reference: [2] Bertsekas, D. P., Shreve, S. E.: Stochastic Optimal Control: The Discrete-time Case. Academic Press, New York 1978. Zbl 0633.93001, MR 0511544
Reference: [3] Cavazos-Cadena, R., Fernández-Gaucherand, E.: Denumerable controlled Markov chains with strong average optimality criterion: bounded and unbounded costs. Math. Methods Oper. Res. 43 (1996), 281-300. Zbl 0851.90135, MR 1398350, 10.1007/BF01194549
Reference: [4] Dijk, N. M. van: On the finite horizon Bellman equation for controlled Markov jump models with unbounded characteristics: existence and approximation. Stochastic Process. Appl. 28 (1988), 141-157. MR 0936380
Reference: [5] Dynkin, E. B., Yushkevich, A. A.: Controlled Markov Processes. Springer, New York 1979. MR 0554083
Reference: [6] Feller, W.: On the integro-differential equations of purely discontinuous Markoff processes. Trans. Amer. Math. Soc. 48 (1940), 488-515. Zbl 0025.34704, MR 0002697, 10.1090/S0002-9947-1940-0002697-3
Reference: [7] Flynn, J.: On optimality criteria for dynamic programs with long finite horizons. J. Math. Anal. Appl. 76 (1980), 202-208. Zbl 0438.90100, MR 0586657, 10.1016/0022-247X(80)90072-4
Reference: [8] Ghosh, M. K., Marcus, S. I.: On strong average optimality of Markov decision processes with unbounded costs. Oper. Res. Lett. 11 (1992), 99-104. Zbl 0768.90085, MR 1167429, 10.1016/0167-6377(92)90040-A
Reference: [9] Ghosh, M. K., Saha, S.: Continuous-time controlled jump Markov processes on the finite horizon. In: Optimization, Control, and Applications of Stochastic Systems (D. Hernández-Hernández and J. A. Minjárez-Sosa, eds.), Springer, New York 2012, pp. 99-109. MR 2961381
Reference: [10] Gihman, I. I., Skorohod, A. V.: Controlled Stochastic Processes. Springer, Berlin 1979. MR 0544839
Reference: [11] Guo, X. P., Rieder, U.: Average optimality for continuous-time Markov decision processes in Polish spaces. Ann. Appl. Probab. 16 (2006), 730-756. Zbl 1160.90010, MR 2244431, 10.1214/105051606000000105
Reference: [12] Guo, X. P.: Continuous-time Markov decision processes with discounted rewards: the case of Polish spaces. Math. Oper. Res. 32 (2007), 73-87. Zbl 1278.90426, MR 2292498, 10.1287/moor.1060.0210
Reference: [13] Guo, X. P., Hernández-Lerma, O.: Continuous-Time Markov Decision Processes: Theory and Applications. Springer, Berlin 2009. Zbl 1209.90002
Reference: [14] Guo, X. P., Ye, L. E.: New discount and average optimality conditions for continuous-time Markov decision processes. Adv. in Appl. Probab. 42 (2010), 953-985. MR 2796672, 10.1239/aap/1293113146
Reference: [15] Hernández-Lerma, O., Lasserre, J. B.: Discrete-Time Markov Control Processes: Basic Optimality Criteria. Springer, New York 1996. Zbl 0840.93001, MR 1363487
Reference: [16] Hernández-Lerma, O., Lasserre, J. B.: Further Topics on Discrete-Time Markov Control Processes. Springer, New York 1999. Zbl 0928.93002, MR 1697198
Reference: [17] Meyn, S. P., Tweedie, R. L.: Computable bounds for geometric convergence rates of Markov chains. Ann. Appl. Probab. 4 (1994), 981-1011. Zbl 0812.60059, MR 1304770, 10.1214/aoap/1177004900
Reference: [18] Miller, B. L.: Finite state continuous time Markov decision processes with finite planning horizon. SIAM J. Control 6 (1968), 266-280. MR 0241153, 10.1137/0306020
Reference: [19] Pliska, S. R.: Controlled jump processes. Stochastic Process. Appl. 3 (1975), 259-282. Zbl 0313.60055, MR 0406531
Reference: [20] Puterman, M. L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York 1994. Zbl 1184.90170, MR 1270015
Reference: [21] Ye, L. E., Guo, X. P.: New sufficient conditions for average optimality in continuous-time Markov decision processes. Math. Methods Oper. Res. 72 (2010), 75-94. Zbl 1203.90176, MR 2678707, 10.1007/s00186-010-0307-4
Reference: [22] Yushkevich, A. A.: Controlled jump Markov models. Theory Probab. Appl. 25 (1980), 244-266. Zbl 0458.90078, 10.1137/1125034
Reference: [23] Zhu, Q. X.: Average optimality inequality for continuous-time Markov decision processes in Polish spaces. Math. Methods Oper. Res. 66 (2007), 299-313. Zbl 1138.90038, MR 2342216, 10.1007/s00186-007-0157-x
Reference: [24] Zhu, Q. X.: Average optimality for continuous-time Markov decision processes with a policy iteration approach. J. Math. Anal. Appl. 339 (2008), 691-704. Zbl 1156.90023, MR 2370686, 10.1016/j.jmaa.2007.06.071