2010
DOI: 10.1109/tpami.2009.132
Visual Word Ambiguity

Abstract: This paper studies automatic image classification by modeling soft assignment in the popular codebook model. The codebook model describes an image as a bag of discrete visual words selected from a vocabulary, where the frequency distributions of visual words in an image allow classification. One inherent component of the codebook model is the assignment of discrete visual words to continuous image features. Despite the clear mismatch of this hard assignment with the nature of continuous features, the approach …

Cited by 673 publications (423 citation statements). References 38 publications (89 reference statements).
“…The BoW model treats an image as a distribution of local descriptors, wherein each descriptor is labeled as a discrete visual prototype. The set containing these prototypes, or visual words, is the so-called visual vocabulary (or dictionary [16], codebook [22]), which is typically obtained by clustering in the feature space of local descriptors. Given a visual vocabulary, an image is represented as a histogram of visual word occurrences over the image patches sampled from the image.…”
Section: Bag-of-Visual-Words Representation of Lesions
confidence: 99%
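The pipeline described in this excerpt (cluster local descriptors into a vocabulary, then hard-assign each descriptor to its nearest visual word and histogram the counts) can be sketched as follows. This is a minimal illustration on randomly generated stand-in descriptors, not the authors' implementation; the vocabulary size and descriptor dimensionality are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for local descriptors (e.g. 128-D SIFT vectors).
train_descriptors = rng.normal(size=(500, 128))
image_descriptors = rng.normal(size=(60, 128))

def kmeans(X, k, iters=20):
    # Plain Lloyd's algorithm; a real pipeline would use a library implementation.
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

# Vocabulary learning: cluster training descriptors in feature space.
vocab = kmeans(train_descriptors, k=32)

# Hard assignment: each descriptor votes for its single nearest visual word.
words = np.argmin(((image_descriptors[:, None, :] - vocab[None]) ** 2).sum(-1), axis=1)
bow_hist = np.bincount(words, minlength=len(vocab)).astype(float)
bow_hist /= bow_hist.sum()  # normalized word-frequency histogram for the image
```

The resulting `bow_hist` is the fixed-length image representation that a classifier would consume.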
See 1 more Smart Citation
“…The BoW model treats an image as a distribution of local descriptors, wherein each descriptor is labeled as a discrete visual prototype. The set containing these prototypes, or visual words, is the so-called visual vocabulary (or dictionary [16], codebook [22]), which is typically obtained by clustering in the feature space of local descriptor. Given a visual vocabulary, an image is represented as a histogram of visual word occurrences on the sampled image patches from the image.…”
Section: Bag-of-visual-words Representation Of Lesionsmentioning
confidence: 99%
“…Unlike hard assignment, assigning a degree of similarity to an image patch (soft assignment) can help model the inherent uncertainty of the image patch while respecting the continuous nature of image patches [22]. Soft assignment can be easily incorporated in the BoW model by…”
Section: Bag-of-Visual-Words Representation of Lesions
confidence: 99%
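A common way to realize the soft assignment this excerpt refers to is kernel codebook weighting: each descriptor contributes to every visual word, weighted by a Gaussian of its distance to that word. The sketch below assumes a pre-learned vocabulary (random here for illustration) and an arbitrary kernel bandwidth `sigma`.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab = rng.normal(size=(32, 128))        # hypothetical pre-learned vocabulary
descriptors = rng.normal(size=(60, 128))  # local descriptors of one image

def soft_assign_histogram(desc, vocab, sigma=10.0):
    # Each descriptor spreads a unit vote over all words, weighted by
    # exp(-||d - w||^2 / (2 sigma^2)), instead of a single hard vote.
    d2 = ((desc[:, None, :] - vocab[None]) ** 2).sum(-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    w /= w.sum(axis=1, keepdims=True)     # per-descriptor weights sum to 1
    hist = w.sum(axis=0)
    return hist / hist.sum()

hist = soft_assign_histogram(descriptors, vocab)
```

As `sigma` shrinks, the weights concentrate on the nearest word and the representation approaches the hard-assignment histogram.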
“…The idea is that we want to find a representation Y ∈ R^d, which can map each clip of different modalities from raw signal space to a semantic space, where clips with similar concepts are near each other. Considering the high diversity of our data, instead of pooling the low-level TICA features spatio-temporally, we use a bag-of-words (BoW) histogram Q ∈ R^D obtained by vector quantization (VQ) with K-means soft assignment [23].…”
Section: Data-Driven Concept Discovery
confidence: 99%
“…Dictionary (also called vocabulary) learning is the key step here. One standard version of vocabulary learning is K-means clustering on image patches combined with hard- or soft-assignment vector quantization (VQ) [7]. Spatial pyramid matching (SPM) is typically incorporated in the pipeline to compensate for the loss of spatial information [12].…”
Section: Introduction
confidence: 99%
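The spatial pyramid matching mentioned here recovers coarse spatial layout by concatenating BoW histograms computed over successively finer grids (1x1, 2x2, 4x4, ...). A minimal sketch, assuming descriptor positions normalized to [0, 1) and precomputed hard word assignments (both random here for illustration):

```python
import numpy as np

def spatial_pyramid_histogram(positions, words, k, levels=2):
    # Concatenate per-cell BoW histograms over a pyramid of grids.
    # positions: (n, 2) array of (x, y) in [0, 1); words: (n,) word indices.
    feats = []
    for level in range(levels + 1):
        g = 2 ** level  # g x g grid at this level
        cell = np.minimum((positions * g).astype(int), g - 1)
        idx = cell[:, 1] * g + cell[:, 0]  # flat cell index per descriptor
        for c in range(g * g):
            feats.append(np.bincount(words[idx == c], minlength=k))
    return np.concatenate(feats).astype(float)

rng = np.random.default_rng(3)
pos = rng.random((60, 2))                 # hypothetical patch locations
words = rng.integers(0, 32, size=60)      # hypothetical hard assignments
f = spatial_pyramid_histogram(pos, words, k=32)  # length 32 * (1 + 4 + 16)
```

Published SPM pipelines additionally weight each level's histograms before matching; that weighting is omitted here for brevity.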
“…With the introduction of kernel techniques, the learned dictionary becomes versatile. For the K-means based scheme, [38] learned a dictionary in the histogram intersection kernel (HIK) space, while [8] learned it in the Gaussian radial basis function (RBF) kernel space. For the sparse representation based scheme, [23] proposed kernel K-SVD and kernel MOD methods.…”
Section: Introduction
confidence: 99%
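The histogram intersection kernel named in this excerpt is simple to state: for two histograms h and h', K(h, h') = sum_i min(h_i, h'_i). A small sketch computing the kernel matrix for a batch of L1-normalized BoW histograms (random here for illustration):

```python
import numpy as np

def histogram_intersection_kernel(H1, H2):
    # K[i, j] = sum_d min(H1[i, d], H2[j, d]) for all histogram pairs.
    return np.minimum(H1[:, None, :], H2[None, :, :]).sum(-1)

rng = np.random.default_rng(2)
H = rng.random((5, 32))
H /= H.sum(axis=1, keepdims=True)  # L1-normalized BoW histograms
K = histogram_intersection_kernel(H, H)
```

For L1-normalized inputs the self-similarity K[i, i] is exactly 1, and the matrix is symmetric, which is what kernel-space dictionary learning methods like those cited above rely on.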