Article

Keywords:
partially observable systems; discounted criterion; random discount factors; queueing models; optimal policies
Summary:
This paper deals with a class of partially observable discounted Markov decision processes defined on Borel state and action spaces, with an unbounded one-stage cost. The discount rate is a stochastic process that evolves according to a difference equation and is itself assumed to be partially observable. By introducing a suitable control model and filtering processes, we prove the existence of optimal control policies. In addition, we illustrate our results with a class of GI/GI/1 queueing systems, for which we explicitly obtain the corresponding optimality equation and the filtering process.
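
For orientation, a discounted criterion with random discount factors of the kind described above is often written as follows. This is only a sketch in generic notation: the symbols $x_t$, $a_t$, $\alpha_t$, $\eta_t$, $G$, $c$ and $\Gamma_t$ are assumptions introduced here for illustration and are not taken from the paper, whose exact model (including the observation and filtering structure) may differ.

```latex
% Assumed notation, not the paper's: x_t state, a_t action, c one-stage cost,
% alpha_t random discount factor, eta_t random disturbance, G a given function.
\begin{align*}
  \alpha_{t+1} &= G(\alpha_t, \eta_t), \qquad t = 0, 1, 2, \dots
      && \text{(difference equation for the discount process)} \\
  \Gamma_0 &= 1, \qquad \Gamma_t = \prod_{k=0}^{t-1} \alpha_k, \quad t \ge 1
      && \text{(accumulated discount up to stage } t\text{)} \\
  V(\pi) &= \mathbb{E}^{\pi}\!\left[ \sum_{t=0}^{\infty} \Gamma_t \, c(x_t, a_t) \right]
      && \text{(expected total discounted cost under policy } \pi\text{)}
\end{align*}
```

In this reading, an optimal policy is one that minimizes $V(\pi)$ over the admissible policies, which in the partially observable setting may depend only on the observed history.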