Approximation and adaptive control of Markov processes: Average reward criterion

Hernández-Lerma, Onésimo

About DML-CZ | FAQ | Conditions of Use | Math Archives | Contact Us

Previous | Up | Next

Article

Title:	Approximation and adaptive control of Markov processes: Average reward criterion (English)
Author:	Hernández-Lerma, Onésimo
Language:	English
Journal:	Kybernetika
ISSN:	0023-5954
Volume:	23
Issue:	4
Year:	1987
Pages:	265-288
.
Category:	math
.
MSC:	60J25
MSC:	90C40
MSC:	93C40
MSC:	93E20
idZBL:	Zbl 0633.90091
idMR:	MR912012
.
Date available:	2009-09-24T18:00:26Z
Last updated:	2012-06-05
Stable URL:	http://hdl.handle.net/10338.dmlcz/125648
.
Reference:	[1] R. S. Acosta Abreu: Control of Markov chains with unknown parameters and metric state space.Submitted for publication. In Spanish.
Reference:	[2] R. S. Acosta Abreu, O. Hernandez-Lerma: Iterative adaptive control of denumerable state average-cost Markov systems.Control. Cyber. 14 (1985), 313 - 322. MR 0842780
Reference:	[3] V. V. Baranov: Recursive algorithms of adaptive control in stochastic systems.Cybernetics 17 (1981), 815-824. MR 0689427
Reference:	[4] V. V. Baranov: A recursive algorithm in markovian decision processes.Cybernetics 18 (1982), 499-506. Zbl 0517.90089, MR 0712079
Reference:	[5] D. P. Bertsekas, S. E. Shreve: Stochastic Optimal Control- The Discrete Time Case.Academic Press, New York 1978. Zbl 0471.93002, MR 0511544
Reference:	[6] A. Federgruen, P. J. Schweitzer: Nonstationary Markov decision problems with converging parameters.J. Optim. Theory Appl. 34 (1981), 207-241. Zbl 0426.90091, MR 0625228
Reference:	[7] A. Federgruen, H. C. Tijms: The optimality equation in average cost denumerable state semi-Markov decision problems, recurrency conditions and algorithms.J. Appl. Probab. 15 (1978), 356-373. Zbl 0386.90060, MR 0475896
Reference:	[8] P. J. Georgin: Contröle de chaines de Markov sur des espaces arbitraires.Ann. Inst. H. Poincare B 14 (1978), 255-277. MR 0508929
Reference:	[9] J. P. Georgin: Estimation et controle de chaines de Markov sur des espaces arbitraires.In: Lecture Notes Mathematics 636. Springer-Verlag, Berlin-Heidelberg-New York-Tokyo 1978, pp. 71-113. MR 0498945
Reference:	[10] E. I. Gordienko: Adaptive strategies for certain classes of controlled Markov processes.Theory Probab. Appl. 29 (1985), 504-518. Zbl 0577.93067
Reference:	[11] L. G. Gubenko, E. S. Statland: On controlled, discrete-time Markov decision processes.Theory Probab. Math. Statist. 7 (1975), 47-61.
Reference:	[12] O. Hernández-Lerma: Approximation and adaptive policies in discounted dynamic programming.Bol. Soc. Mat. Mexicana 30 (1985). In press. MR 0886123
Reference:	[13] O. Hernández-Lerma: Nonstationary value-iteration and adaptive control of discounted semi-Markov processes.J. Math. Anal. Appl. 112 (1985), 435-445. MR 0813610
Reference:	[14] O. Hernandez-Lerma, S. I. Marcus: Adaptive control of service in queueing systems.Syst. Control Lett. 3 (1983), 283-289. Zbl 0534.90037, MR 0722958
Reference:	[15] O. Hernández-Lerma, S. I. Marcus: Optimal adaptive control of priority assignment in queueing systems.Syst. Control Lett. 4 (1984), 65 - 75. MR 0740208
Reference:	[16] O. Hernández-Lerma, S. I. Marcus: Adaptive policies for discrete-time stochastic control systems with unknown disturbance distribution.Submitted for publication, 1986. MR 0912683
Reference:	[17] O. Hernández-Lerma, S. I. Marcus: Nonparametric adaptive control of discrete-time partially observable stochastic systems.Submitted for publication, 1986.
Reference:	[18] C. J. Himmelberg T. Parthasarathy, F. S. Van Vleck: Optimal plans for dynamic programming problems.Math. Oper. Res. 1 (1976), 390-394. MR 0444043
Reference:	[19] K. Hinderer: Foundations of Non-stationary Dynamic Programming with Discrete Time Parameter.(Lecture Notes in Operations Research and Mathematical Systems 33.) Springer-Verlag, Berlin-Heidelberg-New York 1970. Zbl 0202.18401, MR 0267890
Reference:	[20] A. Hordijk P. J. Schweitzer, H. Tijms: The asymptotic behaviour of the minimal total expected cost for the denumerable state Markov decision model.J. Appl. Probab. 12 (1975), 298-305. MR 0378838
Reference:	[21] P. R. Kumar: A survey of some results in stochastic adaptive control.SIAM J. Control Optim. 23 (1985), 329-380. Zbl 0571.93038, MR 0784574
Reference:	[22] M. Kurano: Discrete-time markovian decision processes with an unknown parameter - average return criterion.J. Oper. Res. Soc. Japan 15 (1972), 67-76. Zbl 0238.90006, MR 0343942
Reference:	[23] M. Kurano: Average-optimal adaptive policies in semi-Markov decision processes including an unknown parameter.J. Oper. Res. Soc. Japan 28 (1985), 252-366. Zbl 0579.90098, MR 0812416
Reference:	[24] P. Mandl: Estimation and control in Markov chains.Adv. Appl. Probab. 6 (1974), 40-60. Zbl 0281.60070, MR 0339876
Reference:	[25] P. Mandl: On the adaptive control of countable Markov chains.In: Probability Theory, Banach Centre Publications 5, PWB-Polish Scientific Publishers, Warsaw 1979, pp. 159- 173. Zbl 0439.60069, MR 0561478
Reference:	[26] H. L. Royden: Real Analysis.Macmillan, New York 1968. MR 0151555
Reference:	[27] M. Schäl: Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal.Z. Wahrsch. verw. Gebiete 32 (1975), 179-196. MR 0378841
Reference:	[28] M. Schäl: Estimation and control in discounted stochastic dynamic programming.Preprint No. 428, Institute for Applied Math., University of Bonn, Bonn 1981. MR 0875814
Reference:	[29] H. C. Tijms: On dynamic programming with arbitrary state space, compact action space and the average reward as criterion.Report BW 55/75, Mathematisch Centrum, Amsterdam 1975.
Reference:	[30] T. Ueno: Some limit theorems for temporally discrete Markov processes.J. Fac. Science, University of Tokyo 7 (1957), 449-462. Zbl 0077.33201, MR 0090921
Reference:	[31] D. J. White: Dynamic programming, Markov chains, and the method of successive approximations.J. Math. Anal. Appl. 6 (1963), 373-376. MR 0148480
Reference:	[32] P. Mandl, G. Hiibner: Transient phenomena and self-optimizing control of Markov chains.Acta Universitatis Carolinae - Math, et Phys. 26 (1985), 1, 35-51. MR 0830264
Reference:	[33] A. Hordijk, H. Tijms: A modified form of the iterative method of dynamic programming.Ann. Statist. 3 (1975), 1, 203-208. Zbl 0304.90115, MR 0378837
.

Files

Files	Size	Format	View
Kybernetika_23-1987-4_1.pdf	968.2Kb	application/pdf	View/Open

Back to standard record

Browse
- Collections
- Titles
- Authors
- MSC

About DML-CZ

Partner of

Article

Files

Search

Browse