
Article

Title: Constrained optimality problem of Markov decision processes with Borel spaces and varying discount factors (English)
Author: Wu, Xiao
Author: Tang, Yanqiu
Language: English
Journal: Kybernetika
ISSN: 0023-5954 (print)
ISSN: 1805-949X (online)
Volume: 57
Issue: 2
Year: 2021
Pages: 295-311
Summary lang: English
Category: math
Summary: This paper studies the constrained optimality problem for discrete-time Markov decision processes (DTMDPs) with state-dependent discount factors, Borel state spaces, compact Borel action spaces, and possibly unbounded costs. Using properties of the occupation measures of policies and a transformation of the original constrained optimality problem into a convex program, we prove the existence of an optimal randomized stationary policy under reasonable conditions. (English)
Keyword: constrained optimality problem
Keyword: discrete-time Markov decision processes
Keyword: Borel state and action spaces
Keyword: varying discount factors
Keyword: unbounded costs
MSC: 60J27
MSC: 90C40
idZBL: Zbl 07396268
idMR: MR4273577
DOI: 10.14736/kyb-2021-2-0295
Date available: 2021-07-30T13:09:16Z
Last updated: 2021-11-01
Stable URL: http://hdl.handle.net/10338.dmlcz/149040
Reference: [1] Altman, E.: Denumerable constrained Markov decision processes and finite approximations. Math. Meth. Oper. Res. 19 (1994), 169-191. MR 1290018
Reference: [2] Altman, E.: Constrained Markov decision processes. Chapman and Hall/CRC, Boca Raton 1999. MR 1703380
Reference: [3] Alvarez-Mena, J., Hernández-Lerma, O.: Convergence of the optimal values of constrained Markov control processes. Math. Meth. Oper. Res. 55 (2002), 461-484. MR 1913577, DOI 10.1007/s001860200209
Reference: [4] Borkar, V.: A convex analytic approach to Markov decision processes. Probab. Theory Relat. Fields 78 (1988), 583-602. MR 0950347, DOI 10.1007/BF00353877
Reference: [5] González-Hernández, J., Hernández-Lerma, O.: Extreme points of sets of randomized strategies in constrained optimization and control problems. SIAM J. Optim. 15 (2005), 1085-1104. MR 2178489
Reference: [6] Guo, X. P., Hernández-del-Valle, A., Hernández-Lerma, O.: First passage problems for nonstationary discrete-time stochastic control systems. Europ. J. Control 18 (2012), 528-538. Zbl 1291.93328, MR 3086896
Reference: [7] Guo, X. P., Zhang, W. Z.: Convergence of controlled models and finite-state approximation for discounted continuous-time Markov decision processes with constraints. Europ. J. Oper. Res. 238 (2014), 486-496. MR 3210941
Reference: [8] Guo, X. P., Song, X. Y., Zhang, Y.: First passage criteria for continuous-time Markov decision processes with varying discount factors and history-dependent policies. IEEE Trans. Automat. Control 59 (2014), 163-174. MR 3163332
Reference: [9] Hernández-Lerma, O., González-Hernández, J.: Constrained Markov Decision Processes in Borel spaces: the discounted case. Math. Meth. Oper. Res. 52 (2000), 271-285. MR 1797253
Reference: [10] Hernández-Lerma, O., Lasserre, J. B.: Discrete-Time Markov Control Processes. Springer-Verlag, New York 1996. Zbl 0928.93002, MR 1363487
Reference: [11] Hernández-Lerma, O., Lasserre, J. B.: Discrete-Time Markov Control Processes. Springer-Verlag, New York 1999. Zbl 0928.93002, MR 1363487
Reference: [12] Hernández-Lerma, O., Lasserre, J. B.: Fatou's lemma and Lebesgue's convergence theorem for measures. J. Appl. Math. Stoch. Anal. 13(2) (2000), 137-146. MR 1768500
Reference: [13] Huang, Y. H., Guo, X. P.: First passage models for denumerable semi-Markov decision processes with nonnegative discounted costs. Acta. Math. Appl. Sin-E. 27(2) (2011), 177-190. Zbl 1235.90177, MR 2784052
Reference: [14] Huang, Y. H., Wei, Q. D., Guo, X. P.: Constrained Markov decision processes with first passage criteria. Ann. Oper. Res. 206 (2013), 197-219. MR 3073845
Reference: [15] Mao, X., Piunovskiy, A.: Strategic measures in optimal control problems for stochastic sequences. Stoch. Anal. Appl. 18 (2000), 755-776. MR 1780169
Reference: [16] Piunovskiy, A.: Optimal Control of Random Sequences in Problems with Constraints. Kluwer Academic, Dordrecht 1997. MR 1472738
Reference: [17] Piunovskiy, A.: Controlled random sequences: the convex analytic approach and constrained problems. Russ. Math. Surv. 53 (2000), 1233-1293. MR 1702690
Reference: [18] Prokhorov, Y.: Convergence of random processes and limit theorems in probability theory. Theory Probab. Appl. 1 (1956), 157-214. MR 0084896
Reference: [19] Wei, Q. D., Guo, X. P.: Markov decision processes with state-dependent discount factors and unbounded rewards/costs. Oper. Res. Lett. 39 (2011), 369-374. MR 2835530
Reference: [20] Wu, X., Guo, X. P.: First passage optimality and variance minimization of Markov decision processes with varying discount factors. J. Appl. Probab. 52(2) (2015), 441-456. MR 3372085
Reference: [21] Zhang, Y.: Convex analytic approach to constrained discounted Markov decision processes with non-constant discount factors. TOP 21 (2013), 378-408. Zbl 1273.90235, MR 3068494

Files

Files Size Format View
Kybernetika_57-2021-2_6.pdf 424.4Kb application/pdf View/Open