Title:
|
Partially observable Markov decision processes with partially observable random discount factors (English) |
Author:
|
Martinez-Garcia, E. Everardo |
Author:
|
Minjárez-Sosa, J. Adolfo |
Author:
|
Vega-Amaya, Oscar |
Language:
|
English |
Journal:
|
Kybernetika |
ISSN:
|
0023-5954 (print) |
ISSN:
|
1805-949X (online) |
Volume:
|
58 |
Issue:
|
6 |
Year:
|
2022 |
Pages:
|
960-983 |
Summary lang:
|
English |
Category:
|
math |
Summary:
|
This paper deals with a class of partially observable discounted Markov decision processes defined on Borel state and action spaces, under an unbounded one-stage cost. The discount rate is a stochastic process that evolves according to a difference equation and is itself assumed to be partially observable. By introducing a suitable control model and filtering processes, we prove the existence of optimal control policies. In addition, we illustrate our results with a class of GI/GI/1 queueing systems, for which we obtain the corresponding optimality equation and the filtering process explicitly. (English) |
Keyword:
|
partially observable systems |
Keyword:
|
discounted criterion |
Keyword:
|
random discount factors |
Keyword:
|
queueing models |
Keyword:
|
optimal policies |
MSC:
|
90B22 |
MSC:
|
90C39 |
idZBL:
|
Zbl 07655866 |
idMR:
|
MR4548223 |
DOI:
|
10.14736/kyb-2022-6-0960 |
Date available:
|
2023-02-10T13:51:20Z |
Last updated:
|
2023-03-13 |
Stable URL:
|
http://hdl.handle.net/10338.dmlcz/151538 |
Reference:
|
[1] Bensoussan, A., Cakanyildirim, M., Sethi, S. P.: Partially observed inventory systems: the case of zero-balance walk. SIAM J. Control Optim. 46 (2007), 176-209. |
Reference:
|
[2] Bertsekas, D. P., Shreve, S. E.: Stochastic Optimal Control: The Discrete Time Case. Academic Press, New York 1978. Zbl 0633.93001, MR 0511544 |
Reference:
|
[3] Carmon, Y., Shwartz, A.: Markov decision processes with exponentially representable discounting. Oper. Res. Lett. 37 (2009), 51-55. Zbl 1154.90610, MR 2488083 |
Reference:
|
[4] Cruz-Suárez, H., Montes-de-Oca, R.: Discounted Markov control processes induced by deterministic systems. Kybernetika 42 (2006), 647-664. MR 2296506 |
Reference:
|
[5] Dynkin, E. B., Yushkevich, A. A.: Controlled Markov Processes. Springer-Verlag, New York 1979. MR 0554083 |
Reference:
|
[6] Elliott, R. J., Aggoun, L., Moore, J. B.: Hidden Markov Models: Estimation and Control. Springer-Verlag, New York 1994. MR 1323178 |
Reference:
|
[7] Feinberg, E. A., Shwartz, A.: Constrained dynamic programming with two discount factors: applications and an algorithm. IEEE Trans. Automat. Control 44 (1999), 628-631. Zbl 0957.90127, MR 1680195 |
Reference:
|
[8] González-Hernández, J., López-Martínez, R. R., Minjárez-Sosa, J. A.: Approximation, estimation and control of stochastic systems under a randomized discounted cost criterion. Kybernetika 45 (2009), 737-754. MR 2599109 |
Reference:
|
[9] González-Hernández, J., López-Martínez, R. R., Minjárez-Sosa, J. A., Gabriel-Arguelles, J. R.: Constrained Markov control processes with randomized discounted rate: infinite linear programming approach. Optim. Control Appl. Meth. 35 (2014), 575-591. MR 3262763 |
Reference:
|
[10] García, Y. H., Diaz-Infante, S., Minjarez-Sosa, J. A.: Partially observable queueing systems with controlled service rates under a discounted optimality criterion. Kybernetika 57 (2021), 493-512. MR 4299460 |
Reference:
|
[11] Gordienko, E. I., Salem, F. S.: Robustness inequality for Markov control processes with unbounded costs. Syst. Control Lett. 33 (1998), 125-130. MR 1607814 |
Reference:
|
[12] Gordienko, E., Lemus-Rodríguez, E., Montes-de-Oca, R.: Discounted cost optimality problem: stability with respect to weak metrics. Math. Methods Oper. Res. 68 (2008), 77-96. MR 2429561 |
Reference:
|
[13] Gordienko, E., Minjarez-Sosa, J. A.: Adaptive control for discrete-time Markov processes with unbounded costs: discounted criterion. Kybernetika 34 (1998), 217-234. MR 1621512 |
Reference:
|
[14] Hernandez-Lerma, O.: Adaptive Markov Control Processes. Springer-Verlag, New York 1989. MR 0995463 |
Reference:
|
[15] Hernandez-Lerma, O., Runggaldier, W.: Monotone approximations for convex stochastic control problems. J. Math. Syst. Estim. Control 4 (1994), 99-140. MR 1298550 |
Reference:
|
[16] Hernandez-Lerma, O., Munoz-de-Ozak, M.: Discrete-time Markov control processes with discounted unbounded costs: optimality criteria. Kybernetika 28 (1992), 191-221. MR 1174656 |
Reference:
|
[17] Hernández-Lerma, O., Lasserre, J. B.: Discrete-Time Markov Control Processes: Basic Optimality Criteria. Springer-Verlag, New York 1996. Zbl 0840.93001, MR 1363487 |
Reference:
|
[18] Hilgert, N., Minjarez-Sosa, J. A.: Adaptive policies for time-varying stochastic systems under discounted criterion. Math. Methods Oper. Res. 54 (2001), 491-505. MR 1890916 |
Reference:
|
[19] Hinderer, K.: Foundations of Non-stationary Dynamic Programming with Discrete Time Parameter. In: Lecture Notes Oper. Res. 33, Springer, New York 1979. MR 0267890 |
Reference:
|
[20] Jasso-Fuentes, H., Menaldi, J. L., Prieto-Rumeau, T.: Discrete-time control with non-constant discount factor. Math. Methods Oper. Res. 92 (2020), 377-399. MR 4182024 |
Reference:
|
[21] Minjarez-Sosa, J. A.: Approximation and estimation in Markov control processes under discounted criterion. Kybernetika 40 (2004), 681-690. MR 2120390 |
Reference:
|
[22] Minjarez-Sosa, J. A.: Markov control models with unknown random state-action-dependent discount factors. TOP 23 (2015), 743-772. MR 3407674 |
Reference:
|
[23] Rieder, U.: Measurable selection theorems for optimization problems. Manuscripta Math. 24 (1978), 115-131. Zbl 0385.28005, MR 0493590 |
Reference:
|
[24] Runggaldier, W. J., Stettner, L.: Approximations of Discrete Time Partially Observed Control Problems. Applied Mathematics Monographs CNR 6, Giardini, Pisa 1994. |
Reference:
|
[25] Striebel, C.: Optimal Control of Discrete Time Stochastic Systems. Lecture Notes Econ. Math. Syst. 110, Springer-Verlag, Berlin 1975. MR 0414212, DOI 10.1007/978-3-642-45470-7 |
Reference:
|
[26] Wei, Q., Guo, X.: Markov decision processes with state-dependent discount factors and unbounded rewards/costs. Oper. Res. Lett. 39 (2011), 369-374. MR 2835530 |