2014
DOI: 10.1109/tip.2014.2315959

Derivative-Based Scale Invariant Image Feature Detector With Error Resilience

Abstract: We present a novel scale-invariant image feature detection algorithm (D-SIFER) using a newly proposed scale-space optimal 10th-order Gaussian derivative (GDO-10) filter, which reaches the jointly optimal Heisenberg's uncertainty of its impulse response in scale and space simultaneously (i.e., we minimize the maximum of the two moments). The D-SIFER algorithm using this filter leads to an outstanding quality of image feature detection, with a factor of three quality improvement over state-of-the-art scale-invar…
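
The abstract's central design criterion is the joint space-scale Heisenberg uncertainty of the filter's impulse response. As a hedged illustration only (a plain 10th-order Gaussian derivative, not the paper's actual GDO-10 design; all function and parameter names are made up for this sketch), the code below builds such a kernel in 1-D and measures the spatial and frequency spreads whose product is the uncertainty being balanced.

```python
import numpy as np

def gaussian_derivative_kernel(sigma, order=10, radius=None):
    """Sampled n-th order derivative of a Gaussian (illustrative GDO-n stand-in).

    Uses d^n/dx^n exp(-x^2/(2 sigma^2))
        = (-1/(sigma*sqrt(2)))^n * H_n(x/(sigma*sqrt(2))) * exp(-x^2/(2 sigma^2)),
    with H_n the physicists' Hermite polynomial.
    """
    if radius is None:
        radius = int(np.ceil(6 * sigma))          # generous support for a high-order kernel
    x = np.arange(-radius, radius + 1, dtype=float)
    u = x / (sigma * np.sqrt(2.0))
    hermite_coeffs = np.zeros(order + 1)
    hermite_coeffs[order] = 1.0                   # select H_order in the Hermite basis
    h_n = np.polynomial.hermite.hermval(u, hermite_coeffs)
    kernel = (-1.0 / (sigma * np.sqrt(2.0)))**order * h_n * np.exp(-x**2 / (2.0 * sigma**2))
    return x, kernel

def space_frequency_spreads(x, kernel):
    """Second-moment spreads of |h|^2 in space and of |H|^2 in frequency (discrete approximation)."""
    energy = np.sum(np.abs(kernel)**2)
    delta_x = np.sqrt(np.sum(x**2 * np.abs(kernel)**2) / energy)
    spectrum = np.fft.fftshift(np.fft.fft(kernel))
    omega = np.fft.fftshift(np.fft.fftfreq(len(kernel))) * 2.0 * np.pi   # rad/sample
    spec_energy = np.sum(np.abs(spectrum)**2)
    delta_w = np.sqrt(np.sum(omega**2 * np.abs(spectrum)**2) / spec_energy)
    return delta_x, delta_w

x, k = gaussian_derivative_kernel(sigma=3.0, order=10)
dx, dw = space_frequency_spreads(x, k)
print(f"spatial spread = {dx:.3f}, frequency spread = {dw:.3f}, product = {dx * dw:.3f}")
```
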


Cited by 16 publications (17 citation statements). References 17 publications.
“…In addition to this operator, KAZE adds the novel concept of using a non-linear diffusion scale space instead of the more traditional Gaussian pyramid, making it the de facto current state of the art among handcrafted keypoint detectors. More recently, SIFER [14] and D-SIFER [13] proposed an advanced Cosine Modulated Gaussian filter instead of traditional derivative-based ones, with promising results.…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
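
The excerpt above mentions a Cosine Modulated Gaussian filter in connection with SIFER and D-SIFER. The sketch below is only a generic illustration of that filter family, not the published design: the kernel form cos(omega0*x)*exp(-x^2/(2 sigma^2)) is standard, but the specific coupling of omega0 and sigma used by those detectors is not reproduced, and the helper names are invented.

```python
import numpy as np

def cosine_modulated_gaussian(sigma, omega0, radius=None):
    """1-D cosine-modulated Gaussian kernel: cos(omega0 * x) * exp(-x^2 / (2 sigma^2))."""
    if radius is None:
        radius = int(np.ceil(4 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.cos(omega0 * x) * np.exp(-x**2 / (2.0 * sigma**2))
    kernel -= kernel.mean()          # remove DC so flat image regions give zero response
    return kernel

def filter_image(image, kernel):
    """Separable row-wise then column-wise convolution with the 1-D kernel (illustrative use)."""
    pad = len(kernel) // 2
    padded = np.pad(image.astype(float), pad, mode="reflect")
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, padded)
    both = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, rows)
    return both[pad:-pad, pad:-pad]
```
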
“…Applications of keypoint detection include tracking and 3D reconstruction, which often have extremely low latency and power efficiency requirements, such as in the case of autonomous driving (latency) and AR/VR pose estimation (latency and power consumption). The majority of state of the art keypoint detectors [3,12,5,14,13] are based on combinations of derivative operations, such as determinant of the Hessian [3] or difference of Gaussians [12], and their implementations are based on conventional image filtering and processing approaches. Similarly to keypoint detectors, the early layers of Convolutional Neural Networks (CNNs) are also characterized by combinations of filtering operations, hinting that keypoint detectors could be implemented as CNNs.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
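
The excerpt above attributes most hand-crafted detectors to combinations of derivative operations such as the determinant of the Hessian or the difference of Gaussians. A minimal, hedged sketch of the difference-of-Gaussians idea follows (single octave, generic parameters, assuming SciPy; not any particular detector's implementation):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def dog_keypoints(image, sigma=1.6, k=np.sqrt(2.0), threshold=0.03):
    """Single-octave difference-of-Gaussians response with local-maximum picking.

    A toy version of the DoG stage used by SIFT-like detectors; no octave pyramid,
    sub-pixel refinement, or edge suppression is performed.
    """
    image = image.astype(float)
    dog = gaussian_filter(image, k * sigma) - gaussian_filter(image, sigma)
    response = np.abs(dog)
    local_max = maximum_filter(response, size=3) == response
    keypoints = np.argwhere(local_max & (response > threshold * response.max()))
    return keypoints, dog
```

Implementing the same band-pass response as a fixed convolution layer is what makes the CNN analogy in the excerpt plausible.
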
“…Some typical examples of the detectors of this type are Harris corners [4] for corner detection, SIFT [5], SURF [6], MSER [7] for blob detection, and SFOP [8] for junction detection. Besides the list of the aforementioned detectors, there are a vast number of detectors such as SIFER [11], D-SIFER [12], WADE [13], Edge Foci [2] targeting detection of different structures with various customizations. Although current detectors rely on some more or less different pre-designed structures, the structures share a common factor in that they have some levels of complexity.…”
Section: Hand-crafted Feature Detector
Citation type: mentioning (confidence: 99%)
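
The Harris corner detector cited in the excerpt above scores each pixel with det(M) - k*trace(M)^2 of a Gaussian-weighted structure tensor M of image gradients. A minimal sketch of that score, assuming SciPy and the commonly used (but not mandated) k = 0.04:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def harris_response(image, sigma=1.5, k=0.04):
    """Harris corner response R = det(M) - k * trace(M)^2 from the structure tensor M."""
    image = image.astype(float)
    ix = sobel(image, axis=1)          # horizontal gradient
    iy = sobel(image, axis=0)          # vertical gradient
    # Gaussian-weighted second-moment (structure tensor) entries.
    sxx = gaussian_filter(ix * ix, sigma)
    syy = gaussian_filter(iy * iy, sigma)
    sxy = gaussian_filter(ix * iy, sigma)
    det_m = sxx * syy - sxy * sxy
    trace_m = sxx + syy
    return det_m - k * trace_m**2
```
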
“…Given the spatial Gaussian scale-space concept [24,34,44,46,47,59,60,67,70,106,111,120,123], a general methodology for spatial scale selection has been developed based on local extrema over spatial scales of scale-normalized differential entities [62,64,65,72,73]. This general methodology has in turn been successfully applied to develop robust methods for image-based matching and recognition [5,41,52,68,74,84,86,87,89,90,112-114] that are able to handle large variations of the size of the objects in the image domain and with numerous applications regarding object recognition, object categorization, multi-view geometry, construction of 3-D models from visual input,…”
[Interleaved figure caption: the spatial Laplacian of the first- and second-order temporal derivatives, ∇²(x,y)L_t and ∇²(x,y)L_tt, and the spatio-temporal Laplacian ∇²(x,y,t)L of a UCF-101 video sequence (Kayaking_g01_c01.avi), shown at 3 × 3 combinations of spatial scales σ_s ∈ {2, 4.6, 10.6} pixels and temporal scales σ_τ ∈ {40, 160, 640} ms, computed with a time-causal spatio-temporal scale-space representation with logarithmically distributed temporal scale levels (c = 2).]
Section: Fig.
Citation type: mentioning (confidence: 99%)
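
The passage above refers to scale selection from local extrema, over scales, of scale-normalized differential entities. The classic instance is the scale-normalized Laplacian sigma^2 * Laplacian(L); the sketch below, assuming SciPy and an arbitrary set of sample scales, picks the per-pixel scale at which that response is extremal. It illustrates the idea only, not the cited methodology in full.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def scale_selection_laplacian(image, sigmas=(1.0, 1.6, 2.6, 4.1, 6.6, 10.6)):
    """Scale-normalized Laplacian responses sigma^2 * LoG(image, sigma).

    Returns the per-pixel scale at which |sigma^2 * LoG| attains its maximum over
    the sampled scales, i.e. a crude per-pixel scale-selection map, plus the
    full response stack for inspection.
    """
    image = image.astype(float)
    stack = np.stack([s**2 * gaussian_laplace(image, s) for s in sigmas])
    best = np.abs(stack).argmax(axis=0)            # index of extremal scale per pixel
    return np.asarray(sigmas)[best], stack
```
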