Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems 2010
DOI: 10.1145/1869790.1869829

Bag-of-visual-words and spatial extensions for land-use classification

Abstract: We investigate bag-of-visual-words (BOVW) approaches to land-use classification in high-resolution overhead imagery. We consider a standard non-spatial representation in which the frequencies but not the locations of quantized image features are used to discriminate between classes, analogous to how words are used for text document classification without regard to their order of occurrence. We also consider two spatial extensions, the established spatial pyramid match kernel which considers the absolute spatial…
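The non-spatial representation described in the abstract can be illustrated with a minimal sketch. It assumes local descriptors have already been extracted from an image and that a codebook of visual words is available (for example from k-means clustering); neither step is specified in the abstract, so the random arrays, the array sizes, and the function name `bovw_histogram` below are hypothetical placeholders rather than the authors' method.

```python
import numpy as np

def bovw_histogram(descriptors, codebook):
    """Map each local descriptor to its nearest visual word and
    return the normalized word-frequency histogram."""
    # Squared Euclidean distance between every descriptor and every codeword.
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    # Quantization: index of the closest codeword for each descriptor.
    words = np.argmin(dists, axis=1)
    # Count word occurrences; where each descriptor came from is discarded.
    counts = np.bincount(words, minlength=codebook.shape[0]).astype(float)
    return counts / counts.sum()

# Hypothetical data: 200 descriptors of dimension 16, codebook of 8 words.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 16))
descriptors = rng.normal(size=(200, 16))

hist = bovw_histogram(descriptors, codebook)
# Reordering the descriptors leaves the histogram unchanged.
shuffled = bovw_histogram(descriptors[::-1], codebook)
```

Because only word frequencies are kept, any reordering of the descriptors yields the same histogram, which mirrors the text-document analogy in the abstract.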

Cited by 1,918 publications (1,314 citation statements)
References 23 publications
“…Previous studies reported the superiority of fine-tuning relative to full-training for classification of very high resolution aerial imagery, although full-training was found to be more accurate relative to fine-tuning for classification of multi-spectral satellite data [26,45]. In particular, Nogueira et al. (2017) evaluated the efficiency of fine-tuning and full-training strategies of some well-known deep CNNs (e.g., AlexNet and GoogLeNet) for classification of three well-known datasets, including UCMerced land-use [46], RS19 dataset [47], and Brazilian Coffee Scenes [48]. The fine-tuning strategy yielded a higher accuracy for the first two datasets, likely due to their similarity with the ImageNet dataset, which was originally used for training deep CNNs.…”
Section: Figure 10 (mentioning)
confidence: 99%
“…The UC Merced land-use dataset [1] is investigated, which is a set of aerial ortho-imagery with a 0.3048 m pixel resolution extracted from United States Geological Survey (USGS) national maps. The UC Merced dataset has been used as a benchmark for land-use classifier evaluation in numerous publications [1]-[7].…”
Section: Dataset (mentioning)
confidence: 99%
“…After defining rather simple, and usually task-specific feature representations, some kind of classifiers are learned as, e.g., in bag of words approaches (see, e.g., [11,63,87]) or in many industrial applications (see, e.g., [86]). Such systems can lead to rather impressive results for specific tasks, but, as discussed above, face the inherent limitations of flat architectures.…”
Section: Flat Vs Deep Architectures (mentioning)
confidence: 99%