Title:
|
Estimates of stability of Markov control processes with unbounded costs (English) |
Author:
|
Gordienko, Evgueni I. |
Author:
|
Salem, Francisco |
Language:
|
English |
Journal:
|
Kybernetika |
ISSN:
|
0023-5954 |
Volume:
|
36 |
Issue:
|
2 |
Year:
|
2000 |
Pages:
|
[195]-210 |
Summary lang:
|
English |
Category:
|
math |
Summary:
|
For a discrete-time Markov control process with transition probability $p$, we compare the total discounted costs $V_\beta(\pi_\beta)$ and $V_\beta(\tilde{\pi}_\beta)$ obtained by applying the optimal control policy $\pi_\beta$ and its approximation $\tilde{\pi}_\beta$, respectively. The policy $\tilde{\pi}_\beta$ is optimal for an approximating process with transition probability $\tilde{p}$. The cost per stage of the processes considered may be unbounded. Under certain ergodicity assumptions we establish an upper bound for the relative stability index $[V_\beta(\tilde{\pi}_\beta)-V_\beta(\pi_\beta)]/V_\beta(\pi_\beta)$. This bound does not depend on the discount factor $\beta \in (0,1)$ and is expressed in terms of the total variation distance between $p$ and $\tilde{p}$. (English) |
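The relative stability index in the summary can be illustrated numerically. The following is a minimal sketch, not the authors' method: it assumes a hypothetical finite model with two states and two actions (so the costs are bounded, unlike the setting of the paper), finds both policies by plain value iteration, and evaluates them under the true transition probability $p$. All function names, the example data, and the choice of the unhalved $L_1$ form of the total variation distance are assumptions made for illustration.

```python
def optimal_policy(p, c, beta, iters=2000):
    """Greedy policy from value iteration on a finite model (illustrative)."""
    n = len(c)
    v = [0.0] * n
    for _ in range(iters):
        v = [min(c[x][a] + beta * sum(p[x][a][y] * v[y] for y in range(n))
                 for a in range(len(c[x])))
             for x in range(n)]
    return [min(range(len(c[x])),
                key=lambda a: c[x][a] + beta * sum(p[x][a][y] * v[y] for y in range(n)))
            for x in range(n)]


def policy_cost(p, c, beta, policy, iters=2000):
    """Total discounted cost of a stationary policy under transition law p."""
    n = len(c)
    v = [0.0] * n
    for _ in range(iters):
        v = [c[x][policy[x]] + beta * sum(p[x][policy[x]][y] * v[y] for y in range(n))
             for x in range(n)]
    return v


def total_variation(p, q):
    """Largest L1 distance between p(.|x,a) and q(.|x,a); normalization is an assumption."""
    return max(sum(abs(pi - qi) for pi, qi in zip(p[x][a], q[x][a]))
               for x in range(len(p)) for a in range(len(p[x])))


def stability_index(p, p_tilde, c, beta, x0=0):
    """[V(pi_tilde) - V(pi)] / V(pi) at initial state x0, both costs computed under p."""
    pi = optimal_policy(p, c, beta)
    pi_tilde = optimal_policy(p_tilde, c, beta)
    v_opt = policy_cost(p, c, beta, pi)[x0]
    v_apx = policy_cost(p, c, beta, pi_tilde)[x0]
    return (v_apx - v_opt) / v_opt


# Hypothetical data: c[x][a] is the cost per stage, p[x][a][y] the transition law.
c = [[1.0, 2.0], [3.0, 0.5]]
p = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.1, 0.9]]]
p_tilde = [[[0.8, 0.2], [0.2, 0.8]], [[0.5, 0.5], [0.2, 0.8]]]

for beta in (0.5, 0.9, 0.99):
    print(beta, total_variation(p, p_tilde), stability_index(p, p_tilde, c, beta))
```

Running the loop for several values of $\beta$ lets one compare the index with the distance between $p$ and $\tilde{p}$; the index is nonnegative because $\pi_\beta$ is optimal under $p$.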
Keyword:
|
discrete-time Markov control process |
Keyword:
|
unbounded cost |
MSC:
|
60J99 |
MSC:
|
90C40 |
MSC:
|
93C55 |
MSC:
|
93E20 |
idZBL:
|
Zbl 1249.93176 |
idMR:
|
MR1760024 |
Date available:
|
2009-09-24T19:32:10Z |
Last updated:
|
2015-03-26 |
Stable URL:
|
http://hdl.handle.net/10338.dmlcz/135344 |
Reference:
|
[1] Dynkin E. B., Yushkevich A. A.: Controlled Markov Processes. Springer–Verlag, New York 1979 MR 0554083 |
Reference:
|
[4] Gordienko E., Hernández–Lerma O.: Average cost Markov control processes with weighted norms: existence of canonical policies. Appl. Math. 23 (1995), 199–218 MR 1341223 |
Reference:
|
[5] Gordienko E., Hernández–Lerma O.: Average cost Markov control processes with weighted norms: value iteration. Appl. Math. 23 (1995), 219–237 Zbl 0829.93068, MR 1341224 |
Reference:
|
[6] Gordienko E. I., Isauro-Martínez M. E., Carrillo R. M. Marcos: Estimation of stability in controlled storage systems. Research Report No. 04.0405.I.01.001.97, Dep. de Matemáticas, Universidad Autónoma Metropolitana, México 1997 |
Reference:
|
[7] Gordienko E. I., Salem F. S.: Robustness inequality for Markov control processes with unbounded costs. Systems Control Lett. 33 (1998), 125–130 Zbl 0902.93068, MR 1607814, 10.1016/S0167-6911(97)00077-7 |
Reference:
|
[8] Hernández-Lerma O., Lasserre J. B.: Average cost optimal policies for Markov control processes with Borel state space and unbounded costs. Systems Control Lett. 15 (1990), 349–356 MR 1078813, 10.1016/0167-6911(90)90108-7 |
Reference:
|
[9] Hernández-Lerma O., Lasserre J. B.: Discrete–time Markov Control Processes. Springer–Verlag, New York 1995 |
Reference:
|
[10] Hinderer H.: Foundations of Non–Stationary Dynamic Programming with Discrete Time Parameter. (Lecture Notes in Operations Research 33.) Springer–Verlag, New York 1970 Zbl 0202.18401, MR 0267890 |
Reference:
|
[11] Kartashov N. V.: Inequalities in theorems of ergodicity and stability for Markov chains with common phase space. II. Theory Probab. Appl. 30 (1985), 507–515 10.1137/1130063 |
Reference:
|
[12] Kumar P. R., Varaiya P.: Stochastic Systems: Estimation, Identification and Adaptive Control. Prentice–Hall, Englewood Cliffs, N. J. 1986 Zbl 0706.93057 |
Reference:
|
[13] Meyn S. P., Tweedie R. L.: Markov Chains and Stochastic Stability. Springer–Verlag, Berlin 1993 Zbl 1165.60001, MR 1287609 |
Reference:
|
[14] Nummelin E.: General Irreducible Markov Chains and Non–Negative Operators. Cambridge University Press, Cambridge 1984 Zbl 0551.60066, MR 0776608 |
Reference:
|
[15] Rachev S. T.: Probability Metrics and the Stability of Stochastic Models. Wiley, New York 1991 Zbl 0744.60004, MR 1105086 |
Reference:
|
[16] Scott D. J., Tweedie R. L.: Explicit rates of convergence of stochastically ordered Markov chains. In: Proc. Athens Conference of Applied Probability and Time Series Analysis: Papers in Honour of J. M. Gani and E. J. Hannan (C. C. Heyde, Yu. V. Prohorov, R. Pyke and S. T. Rachev, eds.). Springer–Verlag, New York 1995, pp. 176–191 MR 1466715 |
Reference:
|
[17] Van Dijk N. M.: Perturbation theory for unbounded Markov reward processes with applications to queueing. Adv. in Appl. Probab. 20 (1988), 99–111 MR 0932536, 10.2307/1427272 |
Reference:
|
[18] Van Dijk N. M., Puterman M. L.: Perturbation theory for Markov reward processes with applications to queueing systems. Adv. in Appl. Probab. 20 (1988), 79–98 MR 0932535, 10.2307/1427271 |
Reference:
|
[19] Weber R. R., Stidham S., Jr.: Optimal control of service rates in networks of queues. Adv. in Appl. Probab. 19 (1987), 202–218 Zbl 0617.60090, MR 0876537, 10.2307/1427380 |
Reference:
|
[20] Whitt W.: Approximations of dynamic programs I. Math. Oper. Res. 3 (1978), 231–243 Zbl 0393.90094, MR 0506661, 10.1287/moor.3.3.231 |
Reference:
|
[21] Zolotarev V. M.: On stochastic continuity of queueing systems of type $G|G|1$. Theory Probab. Appl. 21 (1976), 250–269 MR 0420920 |