[1] Bertsekas, D. P.: 
Dynamic Programming: Deterministic and Stochastic Models. Prentice-Hall, NJ 1987. 
MR 0896902 | 
Zbl 0649.93001 
[3] Borwein, J. M., Zhu, Q. J.: 
Techniques of Variational Analysis. Springer, New York 2005. 
MR 2144010 | 
Zbl 1076.49001 
[4] Cruz-Suárez, D., Montes-de-Oca, R., Salem-Silva, F.: 
Conditions for the uniqueness of optimal policies of discounted Markov decision processes. Math. Methods Oper. Res. 60 (2004), 415-436. 
DOI 10.1007/s001860400372 | 
MR 2106092 | 
Zbl 1104.90053 
[5] Cruz-Suárez, D., Montes-de-Oca, R.: 
Uniform convergence of the value iteration policies for discounted Markov decision processes. Bol. Soc. Mat. Mexicana 12 (2006), 133-152. 
MR 2301750 
[9] Montes-de-Oca, R., Lemus-Rodríguez, E.: 
An unbounded Berge's minimum theorem with applications to discounted Markov decision processes. Kybernetika 48 (2012), 268-286. 
MR 2954325 | 
Zbl 1275.90124 
[10] Montes-de-Oca, R., Lemus-Rodríguez, E., Salem-Silva, F.: 
Nonuniqueness versus uniqueness of optimal policies in convex discounted Markov decision processes. J. Appl. Math. 2013 (2013), 1-5. 
DOI 10.1155/2013/271279 | 
MR 3039713 | 
Zbl 1266.90113 
[12] Tanaka, K., Hosino, M., Kuroiwa, D.: 
On an $\varepsilon$-optimal policy of discrete time stochastic control processes. Bull. Inform. Cybernet. 27 (1995), 107-119.
MR 1335274