The conventional Radon transform loses resolution when the kinematics and amplitudes of the data differ from those of the Radon basis functions. In addition, limited data aperture, missing traces, aliasing, a finite number of scanned ray parameters, noise, residual statics, and amplitude variations with offset (AVO) reduce the de-correlation power of the Radon basis functions. Posing Radon transform estimation as an inverse problem, in which one searches for a sparse model that fits the data, improves the performance of the algorithm. However, because it averages along the offset axis, the conventional Radon transform cannot preserve AVO. Accordingly, we modify the Radon basis functions by extending the model domain along the offset direction. Extending the model space helps in fitting the data; however, computing the offset-extended Radon transform is an under-determined and ill-posed problem. To alleviate this shortcoming, we add model-domain sparsity and smoothing constraints to obtain a stable solution. We develop an algorithm that uses offset-extended Radon basis functions, promotes sparsity in the offset-stacked Radon images, and imposes a smoothness constraint along the offset axis. Because the inverted model is sparse and fits the data, muting common-offset Radon panels based on ray parameter or curvature is sufficient to separate primaries from multiples. We successfully apply the algorithm to suppress multiples in the presence of strong AVO on synthetic data and on a real data example from Mississippi Canyon, Gulf of Mexico. The results show that extending the Radon model space is necessary for improving the separation and suppression of multiples in the presence of strong AVO.
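
To make the idea of an offset-extended model concrete, the following is a minimal sketch, not the algorithm of this paper. It assumes a parabolic moveout t = tau + q h^2, nearest-sample time shifts, and a toy geometry. All names (`forward_extended`, `forward_conventional`, `q_cut`) and all numerical values are hypothetical. The sparse, offset-smooth inversion itself is not shown. The extended model is built by hand to illustrate why a model of the form m(tau, q, h) can carry AVO and why muting in q then separates primaries from multiples, whereas a single amplitude per (tau, q) cannot.

```python
import numpy as np

def forward_extended(m, qs, hs, dt):
    """Map an offset-extended model m[tau, q, h] to data d[t, h].

    Each model sample is placed at t = tau + q * h**2 (parabolic moveout,
    nearest-sample shift; q >= 0 is assumed in this sketch).
    """
    nt, nq, nh = m.shape
    d = np.zeros((nt, nh))
    for iq, q in enumerate(qs):
        for ih, h in enumerate(hs):
            s = int(round(q * h**2 / dt))  # moveout in samples
            if s < nt:
                d[s:, ih] += m[:nt - s, iq, ih]
    return d

def forward_conventional(m, qs, hs, dt):
    """Conventional model m[tau, q]: one amplitude shared by all offsets."""
    m_ext = np.repeat(m[:, :, None], len(hs), axis=2)
    return forward_extended(m_ext, qs, hs, dt)

# Toy geometry (all values are illustrative assumptions).
nt, dt = 64, 0.004                      # samples, sampling interval (s)
hs = np.linspace(0.0, 1000.0, 11)       # offsets (m)
qs = np.array([0.0, 4e-8, 8e-8])        # curvatures (s/m^2)
x = hs / hs.max()                       # normalised offset

# Strong AVO: the primary reverses polarity, the multiple brightens.
avo_primary = 1.0 - 1.5 * x
avo_multiple = 0.8 + 0.5 * x

# Hand-built sparse extended model: one (tau, q) location per event,
# with the AVO stored along the offset axis of the model.
m_ext = np.zeros((nt, len(qs), len(hs)))
m_ext[20, 0, :] = avo_primary           # flat primary (q = 0)
m_ext[25, 2, :] = avo_multiple          # multiple with residual moveout
data = forward_extended(m_ext, qs, hs, dt)

# Mute every common-offset panel above a curvature threshold.
q_cut = 2e-8
m_prim = m_ext.copy()
m_prim[:, qs > q_cut, :] = 0.0
primaries = forward_extended(m_prim, qs, hs, dt)
multiples = data - primaries

# Conventional counterpart for the isolated primary: the best single
# amplitude in the least-squares sense is the mean over offsets.
m_conv = np.zeros((nt, len(qs)))
m_conv[20, 0] = avo_primary.mean()
conv_primaries = forward_conventional(m_conv, qs, hs, dt)
conv_misfit = np.linalg.norm(primaries - conv_primaries)
```

In this toy case the muted extended model reproduces the primary with its AVO intact, while the conventional model leaves a nonzero misfit because it replaces the offset-dependent amplitude with its average. In practice the extended model has to be estimated from the data, which is where the sparsity and smoothing constraints described above come in.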