The exponential cost optimality for finite horizon semi-Markov decision processes

Huo, Haifeng; Wen, Xian

Article

Title:	The exponential cost optimality for finite horizon semi-Markov decision processes (English)
Author:	Huo, Haifeng
Author:	Wen, Xian
Language:	English
Journal:	Kybernetika
ISSN:	0023-5954 (print)
ISSN:	1805-949X (online)
Volume:	58
Issue:	3
Year:	2022
Pages:	301-319
Summary lang:	English
Category:	math
Summary:	This paper considers an exponential cost optimality problem for finite horizon semi-Markov decision processes (SMDPs). The objective is to find an optimal policy that minimizes the exponential cost over the full set of policies on a finite horizon. First, under the standard regular and compact-continuity conditions, we establish the optimality equation, prove that the value function is its unique solution, and prove the existence of an optimal policy, using the minimum nonnegative solution approach. Second, we establish a new value iteration algorithm to calculate both the value function and an $\epsilon$-optimal policy. Finally, we give a computable machine maintenance system to illustrate the convergence of the algorithm. (English)
Keyword:	semi-Markov decision processes
Keyword:	exponential cost
Keyword:	finite horizon
Keyword:	optimality equation
Keyword:	optimal policy
MSC:	60E20
MSC:	90C40
idZBL:	Zbl 07613047
idMR:	MR4494093
DOI:	10.14736/kyb-2022-3-0301
Date available:	2022-10-06T14:43:11Z
Last updated:	2023-03-13
Stable URL:	http://hdl.handle.net/10338.dmlcz/151031
Reference:	[1] Bertsekas, D. P., Shreve, S. E.: Stochastic Optimal Control: The Discrete-Time Case. Academic Press, Inc. 1978. MR 0511544
Reference:	[2] Bäuerle, N., Rieder, U.: Markov Decision Processes with Applications to Finance. Springer, Heidelberg 2011. MR 2808878
Reference:	[3] Bäuerle, N., Rieder, U.: More risk-sensitive Markov decision processes. Math. Oper. Res. 39 (2014), 105-120. MR 3173005
Reference:	[4] Cao, X. R.: Semi-Markov decision problems and performance sensitivity analysis. IEEE Trans. Automat. Control 48 (2003), 758-769. MR 1980580
Reference:	[5] Cavazos-Cadena, R., Montes-De-Oca, R.: Optimal stationary policies in risk-sensitive dynamic programs with finite state space and nonnegative rewards. Appl. Math. 27 (2000), 167-185. MR 1768711
Reference:	[6] Cavazos-Cadena, R., Montes-De-Oca, R.: Nearly optimal policies in risk-sensitive positive dynamic programming on discrete spaces. Math. Methods Oper. Res. 52 (2000), 133-167. MR 1782381
Reference:	[7] Chávez-Rodríguez, S., Cavazos-Cadena, R., Cruz-Suárez, H.: Controlled Semi-Markov chains with risk-sensitive average cost criterion. J. Optim. Theory Appl. 170 (2016), 670-686. MR 3527716
Reference:	[8] Chung, K. J., Sobel, M. J.: Discounted MDP's: distribution functions and exponential utility maximization. SIAM J. Control Optim. 25 (1987), 49-62. MR 0872450
Reference:	[9] Ghosh, M. K., Saha, S.: Risk-sensitive control of continuous time Markov chains. Stoch. Int. J. Probab. Stoch. Process. 86 (2014), 655-675. MR 3230073
Reference:	[10] Guo, X. P., Hernández-Lerma, O.: Continuous-Time Markov Decision Process: Theory and Applications. Springer-Verlag, Berlin 2009. MR 2554588
Reference:	[11] Hernández-Lerma, O., Lasserre, J. B.: Discrete-Time Markov control process: Basic Optimality Criteria. Springer-Verlag, New York 1996. MR 1363487
Reference:	[12] Howard, R. A., Matheson, J. E.: Risk-sensitive Markov decision processes. Management Sci. 18 (1972), 356-369. MR 0292497
Reference:	[13] Huang, Y. H., Lian, Z. T., Guo, X. P.: Risk-sensitive semi-Markov decision processes with general utilities and multiple criteria. Adv. Appl. Probab. 50 (2018), 783-804. MR 3877254
Reference:	[14] Huang, Y. H., Guo, X. P.: Finite horizon semi-Markov decision processes with application to maintenance systems. Europ. J. Oper. Res. 212 (2011), 131-140. MR 2783603
Reference:	[15] Huang, X. X., Zou, X. L., Guo, X. P.: A minimization problem of the risk probability in first passage semi-Markov decision processes with loss rates. Sci. China Math. 58 (2015), 1923-1938. MR 3383991
Reference:	[16] Huo, H. F., Wen, X.: First passage risk probability optimality for continuous time Markov decision processes. Kybernetika 55 (2019), 114-133. MR 3935417
Reference:	[17] Jaśkiewicz, A.: A note on negative dynamic programming for risk-sensitive control. Oper. Res. Lett. 36 (2008), 531-534. MR 2459494
Reference:	[18] Janssen, J., Manca, R.: Semi-Markov Risk Models For Finance, Insurance, and Reliability. Springer, New York 2006. MR 2301626
Reference:	[19] Jaśkiewicz, A.: On the equivalence of two expected average cost criteria for semi Markov control processes. Math. Oper. Res. 29 (2013), 326-338. MR 2065981
Reference:	[20] Jaquette, S. C.: A utility criterion for Markov decision processes. Manag. Sci. 23 (1976), 43-49. MR 0439037
Reference:	[21] Luque-Vasquez, F., Minjarez-Sosa, J. A.: Semi-Markov control processes with unknown holding times distribution under a discounted criterion. Math. Methods Oper. Res. 61 (2005), 455-468. MR 2225824
Reference:	[22] Mamer, J. W.: Successive approximations for finite horizon semi-Markov decision processes with application to asset liquidation. Oper. Res. 34 (1986), 638-644. MR 0874303
Reference:	[23] Nollau, V.: Solution of a discounted semi-markovian decision problem by successive overrelaxation. Optimization 39 (1997), 85-97. MR 1482757
Reference:	[24] Puterman, M. L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. MR 1270015
Reference:	[25] Wei, Q.: Continuous-time Markov decision processes with risk-sensitive finite-horizon cost criterion. Math. Oper. Res. 84 (2016), 461-487. MR 3591347
Reference:	[26] Wu, X., Guo, X. P.: First passage optimality and variance minimization of Markov decision processes with varying discount factors. J. Appl. Prob. 52 (2015), 441-456. MR 3372085
Reference:	[27] Yushkevich, A. A.: On semi-Markov controlled models with average reward criterion. Theory Probab. Appl. 26 (1982), 808-815. MR 0636774
Reference:	[28] Zhang, Y.: Continuous-time Markov decision processes with exponential utility. SIAM J. Control Optim. 55 (2017), 2636-2666. MR 3691210

Files

Files	Size	Format
Kybernetika_58-2022-3_1.pdf	529.4Kb	application/pdf
