Article

Title: Estimates of stability of Markov control processes with unbounded costs (English)
Author: Gordienko, Evgueni I.
Author: Salem, Francisco
Language: English
Journal: Kybernetika
ISSN: 0023-5954
Volume: 36
Issue: 2
Year: 2000
Pages: [195]-210
Summary lang: English
Category: math
Summary: For a discrete-time Markov control process with transition probability $p$, we compare the total discounted costs $V_\beta(\pi_\beta)$ and $V_\beta(\tilde{\pi}_\beta)$ incurred under the optimal control policy $\pi_\beta$ and under its approximation $\tilde{\pi}_\beta$. The policy $\tilde{\pi}_\beta$ is optimal for an approximating process with transition probability $\tilde{p}$. The cost per stage of the processes considered may be unbounded. Under certain ergodicity assumptions we establish an upper bound for the relative stability index $[V_\beta(\tilde{\pi}_\beta)-V_\beta(\pi_\beta)]/V_\beta(\pi_\beta)$. This bound does not depend on the discount factor $\beta \in (0,1)$ and is given in terms of the total variation distance between $p$ and $\tilde{p}$. (English)
Keyword: discrete-time Markov control process
Keyword: unbounded cost
MSC: 60J99
MSC: 90C40
MSC: 93C55
MSC: 93E20
idZBL: Zbl 1249.93176
idMR: MR1760024
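Illustration (not part of the article): the relative stability index from the Summary can be computed numerically for a toy model. The sketch below is a minimal example under strong assumptions that the paper does not make: a finite state and action space, bounded positive costs, and made-up numbers for the costs and for the kernels `p` and `p_tilde`. All function names are hypothetical. It finds the optimal policy for each kernel by value iteration, evaluates both policies under the true kernel `p`, and reports the index per initial state together with a total variation distance between the kernels.

```python
from typing import List

# kernel[x][a][y] = probability of moving from state x to state y under action a
Kernel = List[List[List[float]]]
# cost[x][a] = one-stage cost in state x under action a (assumed positive here)
Costs = List[List[float]]


def q_value(cost: Costs, kernel: Kernel, beta: float,
            v: List[float], x: int, a: int) -> float:
    """One-stage cost plus discounted expected continuation cost."""
    return cost[x][a] + beta * sum(prob * v[y] for y, prob in enumerate(kernel[x][a]))


def optimal_policy(cost: Costs, kernel: Kernel, beta: float,
                   sweeps: int = 2000) -> List[int]:
    """Value iteration, then the greedy stationary policy for the given kernel."""
    n = len(cost)
    v = [0.0] * n
    for _ in range(sweeps):
        v = [min(q_value(cost, kernel, beta, v, x, a) for a in range(len(cost[x])))
             for x in range(n)]
    return [min(range(len(cost[x])), key=lambda a: q_value(cost, kernel, beta, v, x, a))
            for x in range(n)]


def policy_cost(cost: Costs, kernel: Kernel, beta: float,
                policy: List[int], sweeps: int = 2000) -> List[float]:
    """Total discounted cost of a stationary policy, by fixed-point iteration."""
    n = len(cost)
    v = [0.0] * n
    for _ in range(sweeps):
        v = [q_value(cost, kernel, beta, v, x, policy[x]) for x in range(n)]
    return v


def total_variation(p: Kernel, p_tilde: Kernel) -> float:
    """Largest L1 distance between rows (one common convention, values in [0, 2])."""
    return max(sum(abs(u - w) for u, w in zip(p[x][a], p_tilde[x][a]))
               for x in range(len(p)) for a in range(len(p[x])))


def relative_stability_index(cost: Costs, p: Kernel, p_tilde: Kernel,
                             beta: float) -> List[float]:
    """[V(pi_tilde) - V(pi)] / V(pi) per initial state, both costs under the true p."""
    pi = optimal_policy(cost, p, beta)
    pi_tilde = optimal_policy(cost, p_tilde, beta)  # optimal for the approximation
    v_opt = policy_cost(cost, p, beta, pi)
    v_approx = policy_cost(cost, p, beta, pi_tilde)
    return [(va - vo) / vo for va, vo in zip(v_approx, v_opt)]


# Made-up toy data: 2 states, 2 actions.
cost = [[1.0, 2.0], [4.0, 3.0]]
p = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.7, 0.3]]]
p_tilde = [[[0.8, 0.2], [0.3, 0.7]], [[0.4, 0.6], [0.6, 0.4]]]

index = relative_stability_index(cost, p, p_tilde, beta=0.9)
tv = total_variation(p, p_tilde)
print("relative stability index per initial state:", index)
print("total variation distance between kernels:", tv)
```

The index is nonnegative by construction, because the optimal policy for `p` cannot cost more under `p` than any other policy. Rerunning the sketch for several values of `beta` gives a rough feel for the behaviour the Summary describes, although a finite toy model says nothing about the unbounded-cost setting the article treats.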
Date available: 2009-09-24T19:32:10Z
Last updated: 2015-03-26
Stable URL: http://hdl.handle.net/10338.dmlcz/135344
Reference: [1] Dynkin E. B., Yushkevich A. A.: Controlled Markov Processes. Springer-Verlag, New York 1979 MR 0554083
Reference: [4] Gordienko E., Hernández-Lerma O.: Average cost Markov control processes with weighted norms: existence of canonical policies. Appl. Math. 23 (1995), 199–218 MR 1341223
Reference: [5] Gordienko E., Hernández-Lerma O.: Average cost Markov control processes with weighted norms: value iteration. Appl. Math. 23 (1995), 219–237 Zbl 0829.93068, MR 1341224
Reference: [6] Gordienko E. I., Isauro-Martínez M. E., Carrillo R. M. Marcos: Estimation of stability in controlled storage systems. Research Report No. 04.0405.I.01.001.97, Dep. de Matemáticas, Universidad Autónoma Metropolitana, México 1997
Reference: [7] Gordienko E. I., Salem F. S.: Robustness inequality for Markov control processes with unbounded costs. Systems Control Lett. 33 (1998), 125–130 Zbl 0902.93068, MR 1607814, 10.1016/S0167-6911(97)00077-7
Reference: [8] Hernández-Lerma O., Lasserre J. B.: Average cost optimal policies for Markov control processes with Borel state space and unbounded costs. Systems Control Lett. 15 (1990), 349–356 MR 1078813, 10.1016/0167-6911(90)90108-7
Reference: [9] Hernández-Lerma O., Lasserre J. B.: Discrete-Time Markov Control Processes. Springer-Verlag, New York 1995
Reference: [10] Hinderer K.: Foundations of Non-Stationary Dynamic Programming with Discrete Time Parameter. (Lecture Notes in Operations Research 33.) Springer-Verlag, New York 1970 Zbl 0202.18401, MR 0267890
Reference: [11] Kartashov N. V.: Inequalities in theorems of ergodicity and stability for Markov chains with common phase space. II. Theory Probab. Appl. 30 (1985), 507–515 10.1137/1130063
Reference: [12] Kumar P. R., Varaiya P.: Stochastic Systems: Estimation, Identification and Adaptive Control. Prentice-Hall, Englewood Cliffs, N. J. 1986 Zbl 0706.93057
Reference: [13] Meyn S. P., Tweedie R. L.: Markov Chains and Stochastic Stability. Springer-Verlag, Berlin 1993 Zbl 1165.60001, MR 1287609
Reference: [14] Nummelin E.: General Irreducible Markov Chains and Non-Negative Operators. Cambridge University Press, Cambridge 1984 Zbl 0551.60066, MR 0776608
Reference: [15] Rachev S. T.: Probability Metrics and the Stability of Stochastic Models. Wiley, New York 1991 Zbl 0744.60004, MR 1105086
Reference: [16] Scott D. J., Tweedie R. L.: Explicit rates of convergence of stochastically ordered Markov chains. In: Proc. Athens Conference of Applied Probability and Time Series Analysis: Papers in Honour of J. M. Gani and E. J. Hannan (C. C. Heyde, Yu. V. Prohorov, R. Pyke and S. T. Rachev, eds.). Springer-Verlag, New York 1995, pp. 176–191 MR 1466715
Reference: [17] Van Dijk N. M.: Perturbation theory for unbounded Markov reward processes with applications to queueing. Adv. in Appl. Probab. 20 (1988), 99–111 MR 0932536, 10.2307/1427272
Reference: [18] Van Dijk N. M., Puterman M. L.: Perturbation theory for Markov reward processes with applications to queueing systems. Adv. in Appl. Probab. 20 (1988), 79–98 MR 0932535, 10.2307/1427271
Reference: [19] Weber R. R., Stidham S., Jr.: Optimal control of service rates in networks of queues. Adv. in Appl. Probab. 19 (1987), 202–218 Zbl 0617.60090, MR 0876537, 10.2307/1427380
Reference: [20] Whitt W.: Approximations of dynamic programs I. Math. Oper. Res. 3 (1978), 231–243 Zbl 0393.90094, MR 0506661, 10.1287/moor.3.3.231
Reference: [21] Zolotarev V. M.: On stochastic continuity of queueing systems of type $G\vert G\vert 1$. Theory Probab. Appl. 21 (1976), 250–269 MR 0420920

File: Kybernetika_36-2000-2_4.pdf (1.793Mb, application/pdf)