Partially observable Markov decision processes with partially observable random discount factors

Martinez-Garcia, E. Everardo; Minjárez-Sosa, J. Adolfo; Vega-Amaya, Oscar

Article

Title:	Partially observable Markov decision processes with partially observable random discount factors (English)
Author:	Martinez-Garcia, E. Everardo
Author:	Minjárez-Sosa, J. Adolfo
Author:	Vega-Amaya, Oscar
Language:	English
Journal:	Kybernetika
ISSN:	0023-5954 (print)
ISSN:	1805-949X (online)
Volume:	58
Issue:	6
Year:	2022
Pages:	960-983
Summary lang:	English
Category:	math
Summary:	This paper deals with a class of partially observable discounted Markov decision processes defined on Borel state and action spaces, under unbounded one-stage cost. The discount rate is a stochastic process evolving according to a difference equation, which is also assumed to be partially observable. Introducing a suitable control model and filtering processes, we prove the existence of optimal control policies. In addition, we illustrate our results in a class of GI/GI/1 queueing systems where we obtain explicitly the corresponding optimality equation and the filtering process. (English)
Keyword:	partially observable systems
Keyword:	discounted criterion
Keyword:	random discount factors
Keyword:	queueing models
Keyword:	optimal policies
MSC:	90B22
MSC:	90C39
idZBL:	Zbl 07655866
idMR:	MR4548223
DOI:	10.14736/kyb-2022-6-0960
Date available:	2023-02-10T13:51:20Z
Last updated:	2023-03-13
Stable URL:	http://hdl.handle.net/10338.dmlcz/151538
Reference:	[1] Bensoussan, A., Cakanyildirim, M., Sethi, S. P.: Partially observed inventory systems: the case of zero-balance walk. SIAM J. Control Optim. 46 (2007), 176-209.
Reference:	[2] Bertsekas, D. P., Shreve, S. E.: Stochastic Optimal Control: The Discrete Time Case. Academic Press, New York 1978. Zbl 0633.93001, MR 0511544
Reference:	[3] Carmon, Y., Shwartz, A.: Markov decision processes with exponentially representable discounting. Oper. Res. Lett. 37 (2009), 51-55. Zbl 1154.90610, MR 2488083
Reference:	[4] Cruz-Suárez, H., Montes-de-Oca, R.: Discounted Markov control processes induced by deterministic systems. Kybernetika 42 (2006), 647-664. MR 2296506
Reference:	[5] Dynkin, E. B., Yushkevich, A. A.: Controlled Markov Processes. Springer-Verlag, New York 1979. MR 0554083
Reference:	[6] Elliott, R. J., Aggoun, L., Moore, J. B.: Hidden Markov Models: Estimation and Control. Springer-Verlag, New York 1994. MR 1323178
Reference:	[7] Feinberg, E. A., Shwartz, A.: Constrained dynamic programming with two discount factors: applications and an algorithm. IEEE Trans. Automat. Control 44 (1999), 628-631. Zbl 0957.90127, MR 1680195
Reference:	[8] González-Hernández, J., López-Martínez, R. R., Minjárez-Sosa, J. A.: Approximation, estimation and control of stochastic systems under a randomized discounted cost criterion. Kybernetika 45 (2009), 737-754. MR 2599109
Reference:	[9] González-Hernández, J., López-Martínez, R. R., Minjárez-Sosa, J. A., Gabriel-Arguelles, J. R.: Constrained Markov control processes with randomized discounted rate: infinite linear programming approach. Optim. Control Appl. Meth. 35 (2014), 575-591. MR 3262763
Reference:	[10] García, Y. H., Diaz-Infante, S., Minjárez-Sosa, J. A.: Partially observable queueing systems with controlled service rates under a discounted optimality criterion. Kybernetika 57 (2021), 493-512. MR 4299460
Reference:	[11] Gordienko, E. I., Salem, F. S.: Robustness inequality for Markov control processes with unbounded costs. Syst. Control Lett. 33 (1998), 125-130. MR 1607814
Reference:	[12] Gordienko, E., Lemus-Rodríguez, E., Montes-de-Oca, R.: Discounted cost optimality problem: stability with respect to weak metrics. Math. Methods Oper. Res. 68 (2008), 77-96. MR 2429561
Reference:	[13] Gordienko, E., Minjárez-Sosa, J. A.: Adaptive control for discrete-time Markov processes with unbounded costs: discounted criterion. Kybernetika 34 (1998), 217-234. MR 1621512
Reference:	[14] Hernández-Lerma, O.: Adaptive Markov Control Processes. Springer-Verlag, New York 1989. MR 0995463
Reference:	[15] Hernández-Lerma, O., Runggaldier, W.: Monotone approximations for convex stochastic control problems. J. Math. Syst. Estim. Control 4 (1994), 99-140. MR 1298550
Reference:	[16] Hernández-Lerma, O., Munoz-de-Ozak, M.: Discrete-time Markov control processes with discounted unbounded costs: optimality criteria. Kybernetika 28 (1992), 191-221. MR 1174656
Reference:	[17] Hernández-Lerma, O., Lasserre, J. B.: Discrete-Time Markov Control Processes: Basic Optimality Criteria. Springer-Verlag, New York 1996. Zbl 0840.93001, MR 1363487
Reference:	[18] Hilgert, N., Minjárez-Sosa, J. A.: Adaptive policies for time-varying stochastic systems under discounted criterion. Math. Methods Oper. Res. 54 (2001), 491-505. MR 1890916
Reference:	[19] Hinderer, K.: Foundations of Non-stationary Dynamic Programming with Discrete Time Parameter. In: Lecture Notes Oper. Res. 33, Springer, New York 1979. MR 0267890
Reference:	[20] Jasso-Fuentes, H., Menaldi, J. L., Prieto-Rumeau, T.: Discrete-time control with non-constant discount factor. Math. Methods Oper. Res. 92 (2020), 377-399. MR 4182024
Reference:	[21] Minjárez-Sosa, J. A.: Approximation and estimation in Markov control processes under discounted criterion. Kybernetika 40 (2004), 681-690. MR 2120390
Reference:	[22] Minjárez-Sosa, J. A.: Markov control models with unknown random state-action-dependent discount factors. TOP 23 (2015), 743-772. MR 3407674
Reference:	[23] Rieder, U.: Measurable selection theorems for optimization problems. Manuscripta Math. 24 (1978), 115-131. Zbl 0385.28005, MR 0493590
Reference:	[24] Runggaldier, W. J., Stettner, L.: Approximations of Discrete Time Partially Observed Control Problems. Applied Mathematics Monographs CNR 6, Giardini, Pisa 1994.
Reference:	[25] Striebel, C.: Optimal Control of Discrete Time Stochastic Systems. Lecture Notes Econ. Math. Syst. 110, Springer-Verlag, Berlin 1975. MR 0414212, DOI 10.1007/978-3-642-45470-7
Reference:	[26] Wei, Q., Guo, X.: Markov decision processes with state-dependent discount factors and unbounded rewards/costs. Oper. Res. Lett. 39 (2011), 368-274. MR 2835530

Files

File	Size	Format
Kybernetika_58-2022-6_5.pdf	540.3 KB	application/pdf
