In recent years, extensive research has shown that deep learning-based compressed image reconstruction algorithms can achieve faster and better high-quality reconstruction for single-pixel imaging, and that reconstruction quality can be further improved by joint optimization of sampling and reconstruction. However, these network-based models mostly adopt end-to-end learning, and their structures are not interpretable. In this paper, we propose SRMU-Net, a sampling and reconstruction jointly optimized model unfolding network. A fully connected layer or a large convolutional layer that simulates compressed reconstruction is added to the compressed reconstruction network, which is composed of multiple cascaded iterative shrinkage thresholding algorithm (ISTA) unfolding iteration blocks. To achieve joint optimization of sampling and reconstruction, a specially designed network structure is proposed so that the sampling matrix can be input into ISTA unfolding iteration blocks as a learnable parameter. We have shown that the proposed network outperforms the existing algorithms by extensive simulations and experiments.