Article

Title: Mean-variance optimality for semi-Markov decision processes under first passage criteria (English)
Author: Huang, Xiangxiang
Author: Huang, Yonghui
Language: English
Journal: Kybernetika
ISSN: 0023-5954 (print)
ISSN: 1805-949X (online)
Volume: 53
Issue: 1
Year: 2017
Pages: 59-81
Summary lang: English
Category: math
Summary: This paper deals with a first passage mean-variance problem for semi-Markov decision processes in Borel spaces. The goal is to minimize the variance of the total discounted reward accumulated up to the system's first entry into a target set, where the optimization is over the class of policies with a prescribed expected first passage reward. The reward rates may be unbounded, and the discount factor may vary with the state of the system and the control. We first give suitable conditions for the existence of first passage mean-variance optimal policies and provide a policy improvement algorithm for computing an optimal policy. Two examples then illustrate the results. Finally, we show how the results reduce to the cases of discrete-time and continuous-time Markov decision processes. (English)
Keyword: semi-Markov decision processes
Keyword: first passage time
Keyword: unbounded reward rate
Keyword: minimal variance
Keyword: mean-variance optimal policy
MSC: 60J27
MSC: 90C40
idZBL: Zbl 06738594
idMR: MR3638556
DOI: 10.14736/kyb-2017-1-0059
Date available: 2017-04-03T10:47:18Z
Last updated: 2018-01-10
Stable URL: http://hdl.handle.net/10338.dmlcz/146708