discounted cost; random rate; stochastic systems; approximation algorithms; density estimation
This paper deals with a class of discrete-time stochastic control processes under a discounted optimality criterion with a random discount rate and possibly unbounded costs. The state process $\{x_{t}\}$ and the discount process $\{\alpha_{t}\}$ evolve according to the coupled difference equations $x_{t+1}=F(x_{t},\alpha_{t},a_{t},\xi_{t})$, $\alpha_{t+1}=G(\alpha_{t},\eta_{t})$, where the state and discount disturbance processes $\{\xi_{t}\}$ and $\{\eta_{t}\}$ are sequences of i.i.d. random variables with densities $\rho^{\xi}$ and $\rho^{\eta}$, respectively. The main objective is to introduce approximation algorithms for the optimal cost function that lead to the construction of optimal or nearly optimal policies when the densities $\rho^{\xi}$ and $\rho^{\eta}$ are either known or unknown. In the latter case, we combine suitable estimation methods with control procedures to construct an asymptotically discounted optimal policy.
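
To make the coupled dynamics concrete, the following minimal Python sketch simulates one trajectory of the pair $(x_{t},\alpha_{t})$ and accumulates a discounted cost. It is an illustration only, not the algorithm of the paper. Several ingredients are assumptions that do not appear in the text above: $a_{t}$ is read as the control action chosen by a policy, the discount factor is taken to be $\Gamma_{t}=\exp(-\sum_{k=0}^{t-1}\alpha_{k})$ with $\Gamma_{0}=1$, the horizon is truncated to finitely many steps, and the particular functions `F`, `G`, `cost`, `policy` and the Gaussian disturbances in the usage example are hypothetical.

```python
import math
import random


def simulate_discounted_cost(F, G, cost, policy, x0, alpha0,
                             sample_xi, sample_eta, horizon):
    """Simulate x_{t+1} = F(x_t, alpha_t, a_t, xi_t), alpha_{t+1} = G(alpha_t, eta_t)
    and return sum_{t < horizon} Gamma_t * cost(x_t, a_t).

    Assumed discount form (not taken from the source):
    Gamma_0 = 1 and Gamma_t = exp(-(alpha_0 + ... + alpha_{t-1})).
    """
    x, alpha = x0, alpha0
    total = 0.0
    cum_rate = 0.0  # running sum alpha_0 + ... + alpha_{t-1}
    for _ in range(horizon):
        a = policy(x, alpha)                       # control chosen from the current state
        total += math.exp(-cum_rate) * cost(x, a)  # discounted stage cost
        cum_rate += alpha                          # update the accumulated discount rate
        x = F(x, alpha, a, sample_xi())            # state transition
        alpha = G(alpha, sample_eta())             # discount-rate transition
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical model choices, for illustration only.
    F = lambda x, alpha, a, xi: 0.9 * x + a + xi
    G = lambda alpha, eta: max(0.01, 0.5 * alpha + 0.05 + eta)  # keeps the rate positive
    cost = lambda x, a: x * x + a * a        # unbounded quadratic cost
    policy = lambda x, alpha: -0.5 * x       # simple linear feedback rule
    estimate = simulate_discounted_cost(
        F, G, cost, policy, x0=1.0, alpha0=0.1,
        sample_xi=lambda: rng.gauss(0.0, 0.1),
        sample_eta=lambda: rng.gauss(0.0, 0.01),
        horizon=200,
    )
    print(estimate)
```

Averaging the returned value over many independent trajectories would give a Monte Carlo estimate of the truncated expected discounted cost of the chosen policy. When $\rho^{\xi}$ and $\rho^{\eta}$ are unknown, the two samplers would have to be replaced by draws from estimated densities.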
