Another set of verifiable conditions for average Markov decision processes with Borel spaces

Zou, Xiaolong; Guo, Xianping

Article

Title:	Another set of verifiable conditions for average Markov decision processes with Borel spaces (English)
Author:	Zou, Xiaolong
Author:	Guo, Xianping
Language:	English
Journal:	Kybernetika
ISSN:	0023-5954 (print)
ISSN:	1805-949X (online)
Volume:	51
Issue:	2
Year:	2015
Pages:	276-292
Summary lang:	English
Category:	math
Summary:	In this paper we give a new set of verifiable conditions for the existence of average optimal stationary policies in discrete-time Markov decision processes with Borel spaces and unbounded reward/cost functions. More precisely, we provide another set of conditions, which consists only of a Lyapunov-type condition and the common continuity-compactness conditions. These conditions are imposed on the primitive data of the Markov decision process model and are thus easy to verify. We also give two examples for which all our conditions are satisfied, but some of the conditions in the related literature fail to hold. (English)
Keyword:	discrete-time Markov decision processes
Keyword:	average reward criterion
Keyword:	optimal stationary policy
Keyword:	Lyapunov-type condition
Keyword:	unbounded reward/cost function
MSC:	90C40
MSC:	93E20
idZBL:	Zbl 06487079
idMR:	MR3350562
DOI:	10.14736/kyb-2015-2-0276
Date available:	2015-06-19T15:22:38Z
Last updated:	2016-01-03
Stable URL:	http://hdl.handle.net/10338.dmlcz/144298
