Introduction: Precise fault diagnosis is crucial for enhancing the reliability and lifespan of flexible converter valve equipment. To this end, a fault diagnosis approach is proposed that combines depthwise separable convolution (DSC), a bidirectional gated recurrent unit (BiGRU), and a multi-head attention module (MAM), referred to as DSC-BiGRU-MAM.

Methods: Through the DSC and BiGRU operations, the model captures the correlation between local features and temporal information when processing sequence data, which improves its representation ability and predictive performance on complex sequential data. In addition, the multi-head attention module lets the proposed method dynamically learn important information from different time intervals and channels. During training, the MAM continuously emphasizes fault features in both the time and channel dimensions while suppressing fault-independent features, and thereby contributes substantially to the performance of the fault diagnosis model.

Results and Discussion: Experimental results demonstrate that the proposed method achieves higher accuracy than existing methods, with an average accuracy of 95.45%, an average precision of 88.67%, and an average recall of 89.03%. The proposed method also has a moderate number of model parameters (17,626) and a moderate training time (935 s). The results indicate that the proposed method accurately diagnoses faults in flexible converter valve equipment, especially in real-world situations where noise overlaps the signals.
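
To illustrate the depthwise separable convolution mentioned in the Methods, the following is a minimal pure-Python sketch of a 1-D DSC layer and its parameter count relative to a standard convolution. It is not the authors' implementation: the function names, the channel and kernel sizes, and the choices of no padding, stride 1, and no bias in the forward pass are all assumptions made for illustration.

```python
def standard_conv1d_params(c_in, c_out, k, bias=True):
    """Parameter count of a standard 1-D convolution:
    every output channel has one length-k kernel per input channel."""
    return c_in * c_out * k + (c_out if bias else 0)


def dsc1d_params(c_in, c_out, k, bias=True):
    """Parameter count of a depthwise separable 1-D convolution:
    one length-k kernel per input channel (depthwise), followed by
    a 1x1 convolution that mixes channels (pointwise)."""
    depthwise = c_in * k + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


def depthwise_separable_conv1d(x, dw_kernels, pw_weights):
    """Forward pass, assuming no padding, stride 1, and no bias.

    x          : c_in channels, each a list of T samples
    dw_kernels : c_in kernels, each of length k (one per input channel)
    pw_weights : c_out rows, each holding c_in mixing weights
    Returns c_out channels of length T - k + 1.
    """
    k = len(dw_kernels[0])
    t_out = len(x[0]) - k + 1
    # Depthwise step: each channel is filtered with its own kernel only
    # (cross-correlation, as is conventional in deep learning).
    dw = [
        [sum(ch[t + j] * ker[j] for j in range(k)) for t in range(t_out)]
        for ch, ker in zip(x, dw_kernels)
    ]
    # Pointwise step: a 1x1 convolution mixes channels at each time step.
    return [
        [sum(w * dw[c][t] for c, w in enumerate(row)) for t in range(t_out)]
        for row in pw_weights
    ]


if __name__ == "__main__":
    # Hypothetical layer sizes, not taken from the paper.
    print(standard_conv1d_params(16, 32, 3))  # 1568
    print(dsc1d_params(16, 32, 3))            # 608

    x = [[1, 2, 3, 4], [0, 1, 0, 1]]          # 2 channels, 4 time steps
    dw_kernels = [[1, 1], [1, -1]]            # one kernel per channel
    pw_weights = [[1, 1], [2, 0], [0, 1]]     # mix 2 channels into 3
    print(depthwise_separable_conv1d(x, dw_kernels, pw_weights))
    # [[2, 6, 6], [6, 10, 14], [-1, 1, -1]]
```

For these assumed sizes the separable layer needs 608 parameters against 1568 for the standard layer, which shows how DSC can help keep a model's parameter count moderate.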