We develop a framework for obtaining (deterministic) Fully Polynomial Time Approximation Schemes (FPTASs) for stochastic univariate dynamic programs with either convex or monotone single-period cost functions. Using our framework, we give the first FPTASs for several NP-hard problems arising in various fields of research, including knapsack-related problems, logistics, operations management, economics, and mathematical finance.
1. Introduction

Dynamic Programming (DP). Dynamic Programming is an algorithmic technique for solving sequential, or multi-stage, decision problems and is a fundamental tool in combinatorial optimization (e.g., [17], Section 2.5 in [3], and Chapter 8 in [30]). A discrete-time, finite-horizon dynamic program unfolds as follows. At the beginning of each time period, the state of the system is observed and an action is taken. Based on exogenous stochastic information, the state, and the action, the system incurs a single-period cost and transitions into a new state. The goal is to find a policy that achieves the minimal total expected cost over the entire time horizon.

We can model this formally by means of Bellman's optimality equation. Let $z_t(I_t)$ denote the cost-to-go (also known as the value function): $z_t(I_t)$ is the cost of an optimal policy from time period $t$ to the end of the horizon, given that the state at the beginning of period $t$ is $I_t$. The equation reads

$$z_t(I_t) = \min_{x_t \in A_t(I_t)} \mathbb{E}_{D_t}\bigl[\, g_t(I_t, x_t, D_t) + z_{t+1}\bigl(f_t(I_t, x_t, D_t)\bigr) \,\bigr], \tag{1.1}$$

where $A_t(I_t)$ is the set of actions feasible in state $I_t$, $D_t$ is the exogenous random variable of period $t$, $g_t$ is the single-period cost function, and $f_t$ is the state transition function.
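As a point of reference, the following is a minimal sketch of exact backward induction for recursion (1.1), assuming finite state and action sets and a finitely supported distribution for $D_t$. All names here (solve_dp, actions, demand_dist, terminal_cost, and so on) are illustrative placeholders, not notation or an algorithm from the paper.

```python
def solve_dp(T, states, actions, demand_dist, g, f, terminal_cost):
    """Exact backward induction for the Bellman recursion (1.1).

    T             : number of periods t = 0, ..., T-1
    states        : finite list of states I_t (assumed time-invariant here)
    actions       : actions(t, I) -> feasible action set A_t(I)
    demand_dist   : list of (d, prob) pairs for the exogenous variable D_t
                    (assumed identical across periods in this sketch)
    g             : g(t, I, x, d) -> single-period cost g_t(I_t, x_t, D_t)
    f             : f(t, I, x, d) -> next state f_t(I_t, x_t, D_t),
                    assumed to land back in `states`
    terminal_cost : terminal_cost(I) -> z_T(I), cost-to-go at the horizon end
    """
    z = {I: terminal_cost(I) for I in states}   # z_T
    policy = []
    for t in reversed(range(T)):                # t = T-1, ..., 0
        z_new, pi_t = {}, {}
        for I in states:
            best_cost, best_x = float("inf"), None
            for x in actions(t, I):
                # E_{D_t}[ g_t(I, x, D_t) + z_{t+1}(f_t(I, x, D_t)) ]
                exp_cost = sum(p * (g(t, I, x, d) + z[f(t, I, x, d)])
                               for d, p in demand_dist)
                if exp_cost < best_cost:
                    best_cost, best_x = exp_cost, x
            z_new[I], pi_t[I] = best_cost, best_x
        z, policy = z_new, [pi_t] + policy
    return z, policy  # z = z_0; policy[t][I] is an optimal action
```

Note that this exact recursion spends time proportional to the number of states, actions, and demand values in every period, which is only pseudo-polynomial when states and actions encode numeric quantities; this is the gap the FPTAS framework of this paper addresses by working with succinct approximations of the value functions $z_t$.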