Accurate power load forecasting provides crucial insights for power system scheduling and energy planning. To address the low accuracy of power load prediction, this paper proposes a method that combines secondary data cleaning, adaptive variational mode decomposition (VMD), convolutional neural networks (CNN), bi-directional long short-term memory (BILSTM), and an attention mechanism (AM). The Inner Mongolia electricity load data were first cleaned using the K-means algorithm and then further refined with the density-based spatial clustering of applications with noise (DBSCAN) algorithm. Subsequently, the parameters of the VMD algorithm were optimized using a multi-strategy Cubic-T dung beetle optimization algorithm (CTDBO), after which VMD was employed to decompose the twice-cleaned load sequences into a number of intrinsic mode functions (IMFs) with different frequencies. These IMFs were then used as inputs to the CNN-BILSTM-Attention model, in which the CNN performs feature extraction, the BILSTM extracts information from the load sequence, and the AM assigns different weights to different features to optimize the prediction results. Experimental results show that the proposed model achieves the highest prediction accuracy and robustness among the compared models and exhibits high stability across different time periods.
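To make the weighting role of the AM concrete, the following is a minimal sketch of softmax attention pooling in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the function names `softmax` and `attention_pool`, the `query` vector, and the sample values are all hypothetical, and the sketch assumes the weights are applied across time steps of the BILSTM output, which the text above does not specify.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, query):
    """Weight each time step's feature vector by its similarity to `query`.

    features: list of T vectors of length D (stand-ins for BILSTM outputs).
    query:    hypothetical vector of length D; in a trained model it would
              be learned, here it is fixed for illustration only.
    Returns (weights, context), where context is the weighted sum of vectors.
    """
    scale = math.sqrt(len(query))  # scaled dot-product scoring (an assumption)
    scores = [
        sum(f_i * q_i for f_i, q_i in zip(f, query)) / scale
        for f in features
    ]
    weights = softmax(scores)
    dim = len(features[0])
    context = [
        sum(w * f[d] for w, f in zip(weights, features))
        for d in range(dim)
    ]
    return weights, context


# Three time steps with two features each; the numbers are made up.
features = [[0.1, 0.2], [0.9, 0.8], [0.4, 0.5]]
query = [1.0, 1.0]
weights, context = attention_pool(features, query)
```

In this toy input the second time step has the largest dot product with `query`, so it receives the largest weight; the weights always sum to one, and `context` is the resulting weighted combination that a downstream prediction layer would consume.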