Following the intuitive idea of detecting changes by directly measuring dissimilarities between pairs of features, change detection methods based on feature similarity learning have emerged as a crucial field. However, large variances in the scale and location of the required contextual information and a heavy imbalance between easy and hard samples remain challenging issues. To address the first issue, we propose the Local-Specificity and Wide-View Attention Network (LSWVANet), which features a series of attention modules named Local-Specificity and Wide-View Attention Modules (LSWVAMs). Each LSWVAM consists of two contextual feature units: (1) the Local-Specificity Feature Pyramid unit, which extracts part-specific contexts at the fine-grained level to focus on subtle changes within local discriminative parts, and (2) the Wide-View Feature Pyramid unit, which extracts wide-view contexts at the long-range level to highlight significant changes in large-scale regions. To tackle the second issue, we introduce a novel sample-specific loss function called Hard Sample-Aware Contrastive Loss (HSACL), which is designed to downweight easy samples from both the changed and unchanged categories, thereby rapidly shifting the training focus towards the informative hard samples. Experiments on three challenging datasets, VL-CMU-CD, PCD2015, and PSCD, show that our approach achieves state-of-the-art accuracy.
INDEX TERMS Change detection, hard sample, feature similarity learning, attention mechanism

I. INTRODUCTION

Street-view scene change detection (SCD) is a crucial computer vision task with a wide range of applications, including urban planning [1] [2] [3], traffic surveillance [4] [5], abandoned object detection [6] [7], disaster evaluation [8], action recognition [9] [10] and self-driving [11] [12]. With the emergence of self-driving cars and robotic patrols, accurate navigation and planning based on map information have become increasingly important. Many researchers [1] [11] [12] use street-view change detection algorithms to update map information. Therefore, improving the accuracy of change detection models is a critical challenge in SCD. With the powerful feature representation of convolutional neural networks (CNNs) [13] [14] [15], fully convolutional networks (FCNs) [14] [16] have been widely used in the field of change detection. FCN-based methods can be broadly