Estimates of stability of Markov control processes with unbounded costs

Gordienko, Evgueni I.; Salem, Francisco

Estimates of stability of Markov control processes with unbounded costs. (English). Kybernetika, vol. 36 (2000), issue 2, pp. [195]-210

MSC: 60J99, 90C40, 93C55, 93E20 | MR 1760024 | Zbl 1249.93176

Keywords:
discrete-time Markov control process; unbounded cost

Summary:
For a discrete-time Markov control process with transition probability $p$, we compare the total discounted costs $V_\beta(\pi_\beta)$ and $V_\beta(\tilde{\pi}_\beta)$ obtained by applying the optimal control policy $\pi_\beta$ and its approximation $\tilde{\pi}_\beta$. The policy $\tilde{\pi}_\beta$ is optimal for an approximating process with transition probability $\tilde{p}$. The cost per stage of the processes considered may be unbounded. Under certain ergodicity assumptions we establish an upper bound for the relative stability index $[V_\beta(\tilde{\pi}_\beta)-V_\beta(\pi_\beta)]/V_\beta(\pi_\beta)$. This bound does not depend on the discount factor $\beta\in(0,1)$ and is given in terms of the total variation distance between $p$ and $\tilde{p}$.
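
To make the quantities in the summary concrete, the following is a minimal numerical sketch on a toy finite model. It is not the method of the paper, which treats general state spaces and unbounded costs. The two-state, two-action data, the function names, and the convention for the total variation distance (sum of absolute differences, maximized over state-action pairs) are all assumptions made for illustration. Costs are taken strictly positive so that the ratio is well defined.

```python
# Toy illustration (hypothetical data): relative stability index on a finite model.
# p[s][a][t] is the probability of moving from state s to state t under action a.

def q_value(p, cost, V, s, a, beta):
    """One-step cost plus discounted expected cost-to-go."""
    return cost[s][a] + beta * sum(p[s][a][t] * V[t] for t in range(len(V)))

def value_iteration(p, cost, beta, tol=1e-10):
    """Return the optimal discounted cost and a minimizing stationary policy."""
    n = len(cost)
    V = [0.0] * n
    while True:
        V_new = [min(q_value(p, cost, V, s, a, beta) for a in range(len(cost[s])))
                 for s in range(n)]
        done = max(abs(x - y) for x, y in zip(V, V_new)) < tol
        V = V_new
        if done:
            break
    policy = [min(range(len(cost[s])), key=lambda a: q_value(p, cost, V, s, a, beta))
              for s in range(n)]
    return V, policy

def evaluate_policy(p, cost, policy, beta, tol=1e-10):
    """Discounted cost of a fixed stationary policy under kernel p."""
    n = len(cost)
    V = [0.0] * n
    while True:
        V_new = [q_value(p, cost, V, s, policy[s], beta) for s in range(n)]
        done = max(abs(x - y) for x, y in zip(V, V_new)) < tol
        V = V_new
        if done:
            return V

def total_variation(p, q):
    """Assumed convention: max over (s, a) of the sum of absolute differences."""
    return max(sum(abs(x - y) for x, y in zip(p[s][a], q[s][a]))
               for s in range(len(p)) for a in range(len(p[s])))

def stability_index(p, p_tilde, cost, beta):
    """[V(pi_tilde) - V(pi)] / V(pi) per initial state, both costs under the true p."""
    V_opt, _ = value_iteration(p, cost, beta)
    _, policy_tilde = value_iteration(p_tilde, cost, beta)  # optimal for p_tilde
    V_tilde = evaluate_policy(p, cost, policy_tilde, beta)  # applied to true process
    return [(vt - vo) / vo for vt, vo in zip(V_tilde, V_opt)]

# Hypothetical model: 2 states, 2 actions, strictly positive bounded costs.
p = [[[0.9, 0.1], [0.2, 0.8]], [[0.7, 0.3], [0.1, 0.9]]]
p_tilde = [[[0.8, 0.2], [0.2, 0.8]], [[0.7, 0.3], [0.3, 0.7]]]
cost = [[1.0, 2.0], [4.0, 3.0]]
beta = 0.9

index = stability_index(p, p_tilde, cost, beta)
print("stability index per initial state:", index)
print("total variation distance:", total_variation(p, p_tilde))
```

Since $\pi_\beta$ is optimal for the true process, the index is nonnegative up to numerical tolerance, and it vanishes when $\tilde{p} = p$. Rerunning the sketch with different values of `beta` shows how the index varies in this toy model; the sketch itself does not verify the bound established in the paper.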
