“…Yang et al [18] call it "Spatial co-occurrence Kernel" and consider it a count of the times that two visual features satisfy a spatial condition. Shih et al [10] present a new idea behind the co-occurrence representation, recording the spatial correlation c between a pair of feature maps k and w, seeking the maximal correlation response for a set of spatial offsets…”
Section: Deep Co-occurrence Tensor Of Deep Convolutional Features
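The offset-based co-occurrence described in the snippet above can be sketched as follows. This is an illustrative reading of [10], not the authors' code: for a pair of feature maps k and w of an H × W × N activation tensor A, it returns the maximal correlation response over a set of spatial offsets.

```python
import numpy as np

def cooccurrence_max_correlation(A, k, w, offsets):
    """Illustrative sketch of the co-occurrence in [10]: for feature maps
    k and w of an H x W x N activation tensor A, return the maximal
    correlation response over a set of spatial offsets (dy, dx)."""
    H, W, _ = A.shape
    fk, fw = A[:, :, k], A[:, :, w]
    best = -np.inf
    for dy, dx in offsets:
        # overlapping region of fk with fw shifted by (dy, dx)
        fk_part = fk[max(0, -dy):min(H, H - dy), max(0, -dx):min(W, W - dx)]
        fw_part = fw[max(0, dy):min(H, H + dy), max(0, dx):min(W, W + dx)]
        best = max(best, float(np.sum(fk_part * fw_part)))
    return best
```

For a single offset (0, 0) this reduces to a plain correlation of the two maps; the maximum over offsets is what makes the response spatially tolerant.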
“…Moreover, this high dimensionality makes it unaffordable in deep tensors like VGG with 512 channels. For this reason, Shih et al [10] add a 1 × 1 × N convolution filter to reduce the number of channels before the co-occurrence layer. As a side effect this channel reduction causes a reduction of performance as demonstrated in [10]; this representation was also used in [19].…”
Section: Deep Co-occurrence Tensor Of Deep Convolutional Features
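The 1 × 1 × N channel reduction mentioned in the snippet above can be sketched as a per-pixel linear map over channels; the weight matrix P here is a stand-in for learned filter weights, not the paper's trained values.

```python
import numpy as np

def reduce_channels_1x1(A, P):
    """Sketch of a 1 x 1 x N convolution: projecting an H x W x N
    activation tensor A with a weight matrix P of shape (N, M) yields an
    H x W x M tensor, since a 1x1 convolution is a per-pixel linear map."""
    return A @ P  # (H, W, N) @ (N, M) -> (H, W, M)
```

With VGG's 512 channels reduced to, say, M = 32, the number of channel pairs shrinks from 512² = 262,144 to 32² = 1,024, at the cost of the performance drop reported in [10].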
“…Shih et al [10] define co-occurrences as the maximal correlation between a pair of feature maps over a set of spatial offsets, whilst we define co-occurrences as the sum of the activations inside a region, considering only activation values above a threshold (Section 3.1).…”
“…In order to compare these two different interpretations, we have modified Shih et al [10] method to return a co-occurrence tensor instead of a co-occurrence vector. This modification consists in picking the summed maximal responses, producing a tensor with the same size as the original activation tensor A, as in our method.…”
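The alternative interpretation quoted above, summing supra-threshold activations inside a region, can be sketched as follows; the region bounds and the threshold tau are illustrative parameters, not the paper's exact formulation.

```python
import numpy as np

def thresholded_region_sum(A, y0, y1, x0, x1, tau):
    """Per-channel sum of activations above threshold tau inside the
    spatial region [y0:y1, x0:x1] of an H x W x N activation tensor A."""
    region = A[y0:y1, x0:x1, :]
    # zero out sub-threshold activations, then sum over the region
    return np.sum(np.where(region > tau, region, 0.0), axis=(0, 1))
```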
“…A co-occurrence matrix [9] is defined over an image as the distribution of co-occurring pixel values at a given spatial offset. Recently, this concept of co-occurrence has been extended to co-occurrences of feature activations in convolutional layers [10]. In this approach, the co-occurrence layer calculates the correlation between each pair of feature maps as the maximum product of the activations over a set of spatial offsets.…”
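The classic pixel-level co-occurrence matrix [9] mentioned above can be sketched as a count of value pairs at a fixed offset; the number of grey levels is an assumed parameter of this illustration.

```python
import numpy as np

def glcm(img, dy, dx, levels):
    """Sketch of a grey-level co-occurrence matrix: M[i, j] counts how
    often pixel value i co-occurs with value j at spatial offset (dy, dx)
    in an integer-valued image with the given number of grey levels."""
    H, W = img.shape
    M = np.zeros((levels, levels), dtype=int)
    for y in range(H):
        for x in range(W):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < H and 0 <= x2 < W:
                M[img[y, x], img[y2, x2]] += 1
    return M
```

The deep extension in [10] replaces discrete pixel values with continuous feature-map activations, and counting with a maximum product over offsets.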
Aggregated second-order features extracted from deep convolutional networks have been shown to be effective for texture generation, fine-grained recognition, material classification, and scene understanding. In this paper, we study a class of orderless aggregation functions designed to minimize interference or equalize contributions in the context of second-order features, and we show that they can be computed just as efficiently as their first-order counterparts and have favorable properties over aggregation by summation. Another line of work has shown that matrix power normalization after aggregation can significantly improve the generalization of second-order representations. We show that matrix power normalization implicitly equalizes contributions during aggregation, thus establishing a connection between matrix normalization techniques and prior work on minimizing interference. Based on this analysis we present γ-democratic aggregators that interpolate between sum (γ=1) and democratic pooling (γ=0), outperforming both on several classification tasks. Moreover, unlike power normalization, the γ-democratic aggregations can be computed in a low-dimensional space by sketching, which allows the use of very high-dimensional second-order features. This results in state-of-the-art performance on several datasets.
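The γ-democratic aggregation described in the abstract can be sketched as follows. This is a simplified illustration, not the authors' implementation: per-feature weights α are sought so that each feature's contribution α_i·(Kα)_i equals ((K1)_i)^γ with K = XXᵀ, so γ = 1 recovers sum pooling (α = 1) and γ = 0 equalizes all contributions (democratic pooling). The damped fixed-point solver below is an assumption for illustration.

```python
import numpy as np

def gamma_democratic_weights(X, gamma, n_iter=50):
    """Find weights alpha such that alpha_i * (K alpha)_i = ((K 1)_i)**gamma,
    where K = X X^T. gamma=1 gives alpha=1 (sum pooling); gamma=0 equalizes
    contributions (democratic pooling). Solved by a damped Sinkhorn-style
    fixed-point iteration (an illustrative choice of solver)."""
    K = X @ X.T
    target = np.maximum(K.sum(axis=1), 1e-12) ** gamma
    alpha = np.ones(len(X))
    for _ in range(n_iter):
        contrib = np.maximum(alpha * (K @ alpha), 1e-12)
        alpha *= np.sqrt(target / contrib)  # damped multiplicative update
    return alpha

def gamma_democratic_pool(X, gamma):
    """Aggregate second-order features: sum_i alpha_i * x_i x_i^T."""
    alpha = gamma_democratic_weights(X, gamma)
    return (alpha[:, None] * X).T @ X
```

For γ = 1 the target equals the unweighted contributions, so α stays at 1 and the pooled matrix is exactly XᵀX; for γ = 0 the iteration rescales features until every contribution is equal.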