Accurate identification of the rail surface state is crucial for improving train traction and braking performance and for ensuring safe operation and maintenance. Few-shot learning is commonly used for rail surface state recognition because it mitigates the overfitting caused by limited sample data. However, when applied to real rail surface state data, few-shot learning extracts key feature information insufficiently and tends to lose discriminative information. To address these issues, this paper proposes a rail surface state recognition model based on improved metric learning. The proposed method incorporates a pyramid-splitting attention mechanism into the feature extraction network, which extracts multi-scale spatial information from the feature map and facilitates cross-dimensional channel attention and interaction between spatial attention features. This addresses the inadequate extraction of key feature information caused by the limited number of rail surface state samples. A deep local descriptor concatenator splices the local features of the query set with those of each support set feature map in pairs, replacing the global feature splicing used in traditional metric learning. This filters out interfering information such as background while retaining more of the highly discriminative feature information. The proposed method was evaluated on a self-built small-sample rail surface state dataset. The experimental results show that the proposed method surpasses existing methods in recognition accuracy, precision, and recall.
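
For illustration only, the pairwise splicing of local features can be sketched as follows. This is a minimal sketch under stated assumptions, not the model proposed in this paper: it assumes that a local feature is the channel vector at one spatial position of a feature map, and that every query local feature is concatenated with every local feature of each support feature map. The names `local_descriptors` and `pairwise_concat` are hypothetical, and plain nested lists stand in for the tensors a real network would use.

```python
from typing import List

# A feature map indexed as [channel][row][col] (an assumed layout).
FeatureMap = List[List[List[float]]]


def local_descriptors(fmap: FeatureMap) -> List[List[float]]:
    """Turn a C x H x W feature map into H*W local features of length C.

    Each local feature is the channel vector at one spatial position,
    visited in row-major order.
    """
    channels = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    return [
        [fmap[c][y][x] for c in range(channels)]
        for y in range(height)
        for x in range(width)
    ]


def pairwise_concat(
    query: FeatureMap, supports: List[FeatureMap]
) -> List[List[List[List[float]]]]:
    """Concatenate every query local feature with every support local feature.

    Returns a nested list indexed as
    [support index][query position][support position], where each entry
    is a vector of length 2C (query feature followed by support feature).
    """
    query_desc = local_descriptors(query)
    spliced = []
    for support in supports:
        support_desc = local_descriptors(support)
        spliced.append(
            [[q + s for s in support_desc] for q in query_desc]
        )
    return spliced


# Usage: one query map and two support maps, each with C=2, H=1, W=2.
query_map = [[[1.0, 2.0]], [[3.0, 4.0]]]
support_maps = [
    [[[5.0, 6.0]], [[7.0, 8.0]]],
    [[[9.0, 10.0]], [[11.0, 12.0]]],
]
pairs = pairwise_concat(query_map, support_maps)
```

In a metric-learning pipeline, spliced vectors of this kind would be passed to a learned similarity module. Working at the level of local features, rather than one global vector per image, is what allows pairs dominated by background to be down-weighted.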