This paper studies a discrete-time total-reward Markov decision process (MDP) with a given initial state distribution. A (randomized) stationary policy can be split on a given set of states if the occupancy measure of this policy can be expressed as a convex combination of the occupancy measures of stationary policies, each selecting deterministic actions on the given set and coinciding with the original stationary policy outside of this set. For a stationary policy, necessary and sufficient conditions are provided for splitting it at a single state, as well as sufficient conditions for splitting it on the whole state space. These results are applied to constrained MDPs. The results are refined for absorbing (including discounted) MDPs with finite state and action spaces. In particular, this paper provides an efficient algorithm that represents the occupancy measure of a given policy as a convex combination of the occupancy measures of finitely many (stationary) deterministic policies. This algorithm generates the splitting policies in such a way that each pair of consecutive policies differs at exactly one state. The results are applied to constrained problems to efficiently compute an optimal policy by computing and splitting a stationary optimal policy.

Key words: Markov decision processes; occupancy measures; splitting occupancy measures; constrained Markov decision processes

MSC2000 subject classification: Primary: 90C40; secondary: 97K60, 60J20, 60J22

OR/MS subject classification: Primary: dynamic programming/optimal control; secondary: deterministic Markov, finite state, infinite state

History: Received April 25, 2008; revised December 11, 2010, and October 16, 2011. Published online in Articles in Advance January 9, 2012.

1. Introduction. This paper is concerned with a discrete-time Markov decision process (MDP) with a given distribution of the initial state and with total-reward criteria. It investigates whether and how a stationary policy can be replaced by another policy that is defined as a random selection among policies that are deterministic on a prescribed set of states and coincide with the original stationary policy outside of that set. Contributions are presented for MDPs with finite state and action sets, for MDPs with countable state sets, and for MDPs with Borel state and action spaces.

An MDP is said to be absorbing if its expected lifetime is finite under every policy. In particular, a discounted MDP can be represented as an absorbing MDP. For an absorbing MDP with a fixed initial state distribution, the occupancy measure of a policy specifies the expected number of visits to each measurable set of state-action pairs. The expected total reward for the policy can be expressed as the integral of the one-step reward function with respect to the policy's occupancy measure. Thus, optimizing the expected total reward is reduced to optimizing a linear function over the set of occupancy measures. This is the basic idea of the convex-analytic approach, which provides useful methods for solving MDPs w...
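To make the preceding description concrete, the following displays use notation consistent with the convex-analytic literature on MDPs; the symbols X, A, q, r, and W below are illustrative assumptions rather than the paper's own definitions, which appear only later in the text. For an initial distribution \mu and a policy \pi in an absorbing MDP, the occupancy measure assigns to each measurable set of state-action pairs B \subseteq X \times A the expected number of visits,
\[
q_{\mu,\pi}(B) \;=\; \sum_{t=0}^{\infty} \mathbf{P}_{\mu}^{\pi}\bigl\{(x_t, a_t) \in B\bigr\},
\]
and the expected total reward is the integral of the one-step reward function r against this measure,
\[
W(\mu,\pi) \;=\; \int_{X \times A} r(x,a)\, q_{\mu,\pi}(dx\, da).
\]
Maximizing W(\mu,\pi) over policies therefore amounts to maximizing a linear functional over the set of occupancy measures, which is the reduction described above.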
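In the same illustrative notation, the splitting property defined in the abstract can be written as follows: a stationary policy \pi splits on a set of states Z if there exist stationary policies \varphi_1, \ldots, \varphi_n, each selecting deterministic actions on Z and coinciding with \pi outside of Z, and weights \alpha_1, \ldots, \alpha_n \ge 0 with \sum_{i} \alpha_i = 1, such that
\[
q_{\mu,\pi} \;=\; \sum_{i=1}^{n} \alpha_i\, q_{\mu,\varphi_i}.
\]
The finite sum shown here corresponds to the case of finite state and action spaces treated by the algorithm mentioned in the abstract; for general Borel models the convex combination need not be finite.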