Title:
|
An unbounded Berge's minimum theorem with applications to discounted Markov decision processes (English) |
Author:
|
Montes-de-Oca, Raúl |
Author:
|
Lemus-Rodríguez, Enrique |
Language:
|
English |
Journal:
|
Kybernetika |
ISSN:
|
0023-5954 |
Volume:
|
48 |
Issue:
|
2 |
Year:
|
2012 |
Pages:
|
268-286 |
Summary lang:
|
English |
Category:
|
math |
Summary:
|
This paper deals with a class of unbounded optimization problems that depend on a parameter. First, conditions are established that guarantee the continuity, with respect to the parameter, of the minimum of these problems, and the upper semicontinuity of the multifunction that maps each parameter to its set of minimizers. Under the additional condition that the minimizer is unique, the continuity of the minimizer is also obtained. Some examples of nonconvex optimization problems that satisfy the conditions of the article are supplied. Second, the theory is applied to discounted Markov decision processes with unbounded cost functions and possibly noncompact action sets in order to obtain continuous optimal policies. This part of the paper is illustrated with two examples of the controlled Lindley's random walk, one of which has nonconstant action sets. (English) |
Keyword:
|
Berge's minimum theorem |
Keyword:
|
moment function |
Keyword:
|
discounted Markov decision process |
Keyword:
|
uniqueness of the optimal policy |
Keyword:
|
continuous optimal policy |
MSC:
|
90A16 |
MSC:
|
90C40 |
MSC:
|
93E20 |
idMR:
|
MR2954325 |
Date available:
|
2012-05-15T16:16:49Z |
Last updated:
|
2013-09-22 |
Stable URL:
|
http://hdl.handle.net/10338.dmlcz/142813 |
Reference:
|
[1] C. D. Aliprantis, K. C. Border: Infinite Dimensional Analysis. Third Edition. Springer-Verlag, Berlin 2006. Zbl 1156.46001, MR 2378491 |
Reference:
|
[2] R. B. Ash: Real Variables with Basic Metric Space Topology. IEEE Press, New York 1993. Zbl 0920.26002, MR 1193687 |
Reference:
|
[3] J. P. Aubin, I. Ekeland: Applied Nonlinear Analysis. John Wiley, New York 1984. Zbl 1115.47049, MR 0749753 |
Reference:
|
[4] L. M. Ausubel, R. J. Deneckere: A generalized theorem of the maximum. Econom. Theory 3 (1993), 99-107. Zbl 1002.49500, MR 1211955, 10.1007/BF01213694 |
Reference:
|
[5] C. Berge: Topological Spaces. Oliver and Boyd, Edinburgh and London 1963 (reprinted by Dover Publications, Inc., Mineola, New York 1997). Zbl 0114.38602, MR 1464690 |
Reference:
|
[6] D. Cruz-Suárez, R. Montes-de-Oca, F. Salem-Silva: Conditions for the uniqueness of optimal policies of discounted Markov decision processes. Math. Methods Oper. Res. 60 (2004), 415-436. Zbl 1104.90053, MR 2106092, 10.1007/s001860400372 |
Reference:
|
[7] J. Dugundji: Topology. Allyn and Bacon, Inc., Boston 1966. Zbl 0397.54003, MR 0193606 |
Reference:
|
[8] P. K. Dutta, M. K. Majumdar, R. K. Sundaram: Parametric continuity in dynamic programming problems. J. Econom. Dynam. Control 18 (1994), 1069-1092. Zbl 0875.90096, MR 1298092, 10.1016/0165-1889(94)90048-5 |
Reference:
|
[9] P. K. Dutta, T. Mitra: Maximum theorems for convex structures with an application to the theory of optimal intertemporal allocations. J. Math. Econom. 18 (1989), 77-86. MR 0985949, 10.1016/0304-4068(89)90006-2 |
Reference:
|
[10] O. Hernández-Lerma, J. B. Lasserre: Discrete-Time Markov Control Processes: Basic Optimality Criteria. Springer-Verlag, New York 1996. Zbl 0840.93001, MR 1363487 |
Reference:
|
[11] O. Hernández-Lerma, W. J. Runggaldier: Monotone approximations for convex stochastic control problems. J. Math. Systems Estim. Control 4 (1994), 99-140. Zbl 0812.93078, MR 1298550 |
Reference:
|
[12] K. Hinderer: Lipschitz continuity of value functions in Markovian decision processes. Math. Methods Oper. Res. 60 (2005), 3-22. Zbl 1093.90075, MR 2226965 |
Reference:
|
[13] K. Hinderer, M. Stieglitz: Increasing and Lipschitz continuous minimizers in one-dimensional linear-convex systems without constraints: the continuous and the discrete case. Math. Methods Oper. Res. 44 (1996), 189-204. Zbl 0860.90126, MR 1409065, 10.1007/BF01194330 |
Reference:
|
[14] A. Horsley, A. J. Wrobel, T. Van Zandt: Berge's maximum theorem with two topologies on the action set. Econom. Lett. 61 (1998), 285-291. Zbl 0913.90079, MR 1676329, 10.1016/S0165-1765(98)00177-3 |
Reference:
|
[15] J. S. Jordan: The continuity of optimal dynamic decision rules. Econometrica 45 (1977), 1365-1376. Zbl 0363.90035, MR 0456573, 10.2307/1912305 |
Reference:
|
[16] T. Kamihigashi: Stochastic optimal growth with bounded or unbounded utility and with bounded or unbounded shocks. J. Math. Econom. 43 (2007), 477-500. Zbl 1154.91032, MR 2317118, 10.1016/j.jmateco.2006.05.007 |
Reference:
|
[17] T. Kamihigashi, S. Roy: A nonsmooth, nonconvex model of optimal growth. J. Econom. Theory 132 (2007), 435-460. Zbl 1142.91667, MR 2285614, 10.1016/j.jet.2005.06.007 |
Reference:
|
[18] R. B. King: Beyond the Quartic Equation. Birkhäuser, Boston 1996. MR 1401346 |
Reference:
|
[19] M. Kitayev: Semi-Markov and jump Markov control models: average cost criterion. Theory Probab. Appl. 30 (1985), 272-288. MR 0792619 |
Reference:
|
[20] D. V. Lindley: The theory of queues with a single server. Proc. Cambridge Philos. Soc. 48 (1952), 277-289. Zbl 0046.35501, MR 0046597 |
Reference:
|
[21] M. Majumdar, R. Radner: Stationary optimal policies with discounting in a stochastic activity analysis model. Econometrica 51 (1983), 1821-1837. MR 0720089, 10.2307/1912118 |
Reference:
|
[22] S. P. Meyn: Ergodic theorems for discrete time stochastic systems using a stochastic Lyapunov function. SIAM J. Control Optim. 27 (1989), 1409-1439. MR 1022436, 10.1137/0327073 |
Reference:
|
[23] E. A. Ok: Real Analysis with Economic Applications. Princeton University Press, Princeton 2007. Zbl 1119.26001, MR 2275400 |
Reference:
|
[24] A. L. Peressini, F. E. Sullivan, J. J. Uhl: The Mathematics of Nonlinear Programming. Springer-Verlag, New York 1988. Zbl 0663.90054, MR 0932726 |
Reference:
|
[25] M. L. Puterman: Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley, New York 1994. Zbl 1184.90170, MR 1270015 |
Reference:
|
[26] U. Rieder: Measurable selection theorems for optimization problems. Manuscripta Math. 24 (1978), 115-131. Zbl 0385.28005, MR 0493590, 10.1007/BF01168566 |
Reference:
|
[27] H. L. Royden: Real Analysis. Third Edition. Macmillan, New York 1988. Zbl 1191.26002, MR 1013117 |
Reference:
|
[28] R. H. Stockbridge: Time-average control of martingale problems: a linear programming formulation. Ann. Probab. 18 (1990), 291-314. Zbl 0699.49019, MR 1043944 |
Reference:
|
[29] R. Sundaram: A First Course in Optimization Theory. Cambridge University Press, Cambridge 1996. Zbl 0885.90106, MR 1402910 |
Reference:
|
[30] G. Tian, J. Zhou: The maximum theorem and the existence of Nash equilibrium of (generalized) games without lower semicontinuities. J. Math. Anal. Appl. 166 (1992), 351-364. Zbl 0761.90110, MR 1160931, 10.1016/0022-247X(92)90302-T |
Reference:
|
[31] G. Tian, J. Zhou: Transfer continuities, generalizations of the Weierstrass and maximum theorem: a full characterization. J. Math. Econom. 24 (1995), 281-303. MR 1320200, 10.1016/0304-4068(94)00687-6 |
Reference:
|
[32] M. Walker: A generalization of the maximum theorem. Internat. Econom. Rev. 20 (1979), 267-272. Zbl 0406.90001, MR 0525439, 10.2307/2526431 |
Reference:
|
[33] A. Yushkevich: Blackwell optimality in Borelian continuous-in-action Markov decision processes. SIAM J. Control Optim. 35 (1997), 2157-2182. Zbl 0892.93059, MR 1478659, 10.1137/S0363012995292469 |