With the development of multispectral imaging technology, it has become possible to retrieve soil heavy metal content from multispectral remote sensing images. However, factors such as soil pH and spectral resolution reduce the accuracy of model inversion. In this study, 242 soil samples were collected from a typical area of the Pearl River Delta, and their Cu content was measured in the laboratory. Sentinel-2 remote sensing image data were acquired for the same area, and two-dimensional and three-dimensional spectral indices were established. Independent decision trees were constructed based on soil pH values, the Successive Projections Algorithm (SPA) combined with the Boruta algorithm was used to select the characteristic bands for soil Cu content, and ensemble learning models tuned with Optuna automatic hyperparameter optimization were used to establish a model for estimating soil Cu content. The results indicated that, with the combined SPA and Boruta feature selection, the characteristic spectral indices were mainly concentrated in the spectral transformation forms of TBI2 and TBI4. Full-sample modeling lacked predictive ability, but after the samples were classified by soil pH, the RF and XGBoost models built from samples with pH values between 5.85 and 7.75 achieved R² values of 0.54 and 0.76, respectively, with corresponding RMSE values of 22.48 and 16.12 and RPD values of 1.51 and 2.11. This study shows that the inversion of soil Cu content differs significantly under different pH conditions, and that determining the optimal pH range can effectively improve inversion accuracy. This research provides a reference for the efficient and accurate remote sensing of heavy metal pollution in agricultural soil.
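The three accuracy metrics reported above (R², RMSE, and RPD) can be sketched in a few lines of Python. This is a minimal illustration rather than the study's own code: the function name `regression_metrics` and the sample values are hypothetical, and RPD is assumed to follow the common definition of the sample standard deviation of the observed values divided by the RMSE. The reported figures appear consistent with that definition, since 22.48 × 1.51 ≈ 33.9 and 16.12 × 2.11 ≈ 34.0 imply nearly the same standard deviation for both models.

```python
import math
from statistics import mean, stdev


def regression_metrics(observed, predicted):
    """Return R2, RMSE and RPD for paired observed/predicted values.

    Assumption: RPD = sample standard deviation of observed / RMSE.
    Other conventions (e.g. population SD) give slightly different values.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    n = len(observed)
    obs_mean = mean(observed)

    # Residual and total sums of squares
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    if ss_tot == 0 or ss_res == 0:
        raise ValueError("metrics are undefined for constant or perfectly fitted data")

    rmse = math.sqrt(ss_res / n)        # root mean squared error
    r2 = 1.0 - ss_res / ss_tot          # coefficient of determination
    rpd = stdev(observed) / rmse        # ratio of performance to deviation

    return {"r2": r2, "rmse": rmse, "rpd": rpd}


# Toy values for illustration only; these are not data from the study.
observed_cu = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted_cu = [12.0, 18.0, 33.0, 39.0, 48.0]
metrics = regression_metrics(observed_cu, predicted_cu)
print(metrics)
```

Applying the same function separately to each pH subset would give one set of metrics per subset, which is how per-range figures such as those for pH 5.85 to 7.75 could be compared against the full-sample result.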