We consider a discrete-time Markov control process with the expected total discounted reward criterion. The distribution of the underlying random vectors is assumed to be unknown and is approximated by a suitable known distribution. We derive upper bounds on the decrease in reward incurred when the policy that is optimal for the approximating process is applied to control the original process.
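The setting above can be illustrated numerically: a minimal sketch, assuming a finite state-action MDP with hypothetical transition matrices, in which the policy computed as optimal under a perturbed (approximating) transition law is evaluated on the true process, and the resulting componentwise reward loss is nonnegative. The matrices, rewards, and discount factor below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def value_iteration(P, r, beta, tol=1e-10):
    """Optimal value and policy for a finite discounted MDP.
    P[a] is the transition matrix under action a; r[s, a] is the reward."""
    n_states, n_actions = r.shape
    V = np.zeros(n_states)
    while True:
        # Q[s, a] = r(s, a) + beta * E[V(next state)]
        Q = np.array([r[:, a] + beta * P[a] @ V for a in range(n_actions)]).T
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new

def evaluate(policy, P, r, beta):
    """Exact policy evaluation: V_pi = (I - beta * P_pi)^{-1} r_pi."""
    n = len(policy)
    P_pi = np.array([P[policy[s]][s] for s in range(n)])
    r_pi = np.array([r[s, policy[s]] for s in range(n)])
    return np.linalg.solve(np.eye(n) - beta * P_pi, r_pi)

beta = 0.9
r = np.array([[1.0, 0.0],
              [0.0, 2.0]])
# True (unknown) transition law: one matrix per action.
P = [np.array([[0.8, 0.2], [0.3, 0.7]]),
     np.array([[0.1, 0.9], [0.6, 0.4]])]
# Known approximating transition law (a perturbation of P).
P_tilde = [np.array([[0.6, 0.4], [0.5, 0.5]]),
           np.array([[0.3, 0.7], [0.4, 0.6]])]

V_true, pi_true = value_iteration(P, r, beta)          # optimum for the original process
_, pi_approx = value_iteration(P_tilde, r, beta)       # optimum for the approximating process
V_mis = evaluate(pi_approx, P, r, beta)                # that policy run on the true process
loss = V_true - V_mis                                  # reward decrease, nonnegative componentwise
```

The quantity `loss` is exactly the reward decrease the bounds in the paper control; it vanishes when the approximating and original processes share an optimal policy.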