Traditional visual loop closure detection (LCD) solutions either compute global features (Dalal & Triggs, 2005; Jégou et al., 2010; Siagian & Itti, 2009; Ulrich & Nourbakhsh, 2000) to embed the whole image into a single matrix or vector, or extract hand-crafted local features (Bay et al., 2008; Lowe, 2004; Rublee et al., 2011) that detect numerous salient keypoints and describe the surrounding image patches with compact descriptors, such as SURF (Bay et al., 2008) in FAB-MAP (Cummins & Newman, 2008) and FAB-MAP 2.0 (Cummins & Newman, 2011), and ORB (Rublee et al., 2011) in DLoopDetector (Gálvez-López & Tardós, 2012). Some recent works (Company-Corcoles et al., 2020; Han et al., 2021) also exploit line features to enhance LCD methods. In recent years, motivated by the success of deep convolutional neural networks (CNNs) in other computer vision tasks, many learned features (Arandjelović et al., 2018; DeTone et al., 2018; Dusmanu et al., 2019; Noh et al., 2017; Sarlin et al., 2019) have been proposed and have shown greater robustness to varying illumination and viewpoints than their traditional counterparts.
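To make the local-feature paradigm above concrete, the following is a minimal BRIEF-style binary descriptor sketch in plain NumPy. It is a toy stand-in for the kind of patch descriptor ORB builds on, not the actual ORB implementation; the image, keypoint, and random sampling pattern are all synthetic assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic grayscale image standing in for a camera frame (assumption).
img = rng.integers(0, 256, size=(64, 64)).astype(np.float32)

# BRIEF-style descriptor: compare pairs of pixel intensities sampled
# around a keypoint; each comparison contributes one bit.
N_PAIRS = 256
PATCH = 15  # half-width of the sampling patch around the keypoint
# Each row holds (dr1, dc1, dr2, dc2) offsets for one intensity test.
pairs = rng.integers(-PATCH, PATCH + 1, size=(N_PAIRS, 4))

def describe(image, kp):
    """Binary descriptor for keypoint kp = (row, col)."""
    r, c = kp
    bits = np.empty(N_PAIRS, dtype=np.uint8)
    for i, (dr1, dc1, dr2, dc2) in enumerate(pairs):
        bits[i] = image[r + dr1, c + dc1] < image[r + dr2, c + dc2]
    return bits

def hamming(a, b):
    """Hamming distance between two binary descriptors."""
    return int(np.count_nonzero(a != b))

kp = (32, 32)
d_ref = describe(img, kp)
# Same view with mild intensity noise: distance should stay small.
d_noisy = describe(img + rng.normal(0, 2, img.shape), kp)
# Unrelated image: distance should be near N_PAIRS / 2.
d_other = describe(rng.integers(0, 256, (64, 64)).astype(np.float32), kp)

print(hamming(d_ref, d_noisy), hamming(d_ref, d_other))
```

Matching such binary descriptors reduces to cheap Hamming-distance comparisons, which is one reason binary features like ORB became popular for real-time LCD pipelines.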