We investigate bag-of-visual-words (BOVW) approaches to land-use classification in high-resolution overhead imagery. We consider a standard non-spatial representation in which the frequencies but not the locations of quantized image features are used to discriminate between classes, analogous to how words are used for text document classification without regard to their order of occurrence. We also consider two spatial extensions: the established spatial pyramid match kernel, which considers the absolute spatial arrangement of the image features, and a novel method, which we term the spatial co-occurrence kernel, that considers their relative arrangement. These extensions are motivated by the importance of spatial structure in geographic data. The methods are evaluated using a large ground truth image dataset of 21 land-use classes. In addition to comparisons with standard approaches, we perform extensive evaluation of different configurations such as the size of the visual dictionaries used to derive the BOVW representations and the scale at which the spatial relationships are considered. We show that even though BOVW approaches do not necessarily perform better than the best standard approaches overall, they represent a robust alternative that is more effective for certain land-use classes. We also show that extending the BOVW approach with our proposed spatial co-occurrence kernel consistently improves performance.
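The non-spatial representation described above can be illustrated with a minimal sketch. It assumes local descriptors have already been extracted and that a visual dictionary (codebook) is available; the function name `bovw_histogram`, the toy 2-D descriptors, and the three-word codebook are hypothetical and chosen only for illustration, not taken from the paper.

```python
import numpy as np


def bovw_histogram(descriptors, codebook):
    """Return a normalized visual-word frequency histogram (hypothetical helper).

    descriptors: (n, d) array of local image features.
    codebook:    (k, d) array of visual words (e.g. cluster centers).
    """
    # Squared Euclidean distance from every descriptor to every codeword.
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    # Quantize: each descriptor is assigned its nearest visual word.
    words = np.argmin(dists, axis=1)
    # Count occurrences only; feature locations are deliberately discarded.
    hist = np.bincount(words, minlength=codebook.shape[0]).astype(float)
    return hist / hist.sum()


# Toy data (assumed): three visual words and four descriptors in 2-D.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
descriptors = np.array([[0.1, 0.0], [0.9, 1.2], [1.1, 0.8], [4.8, 5.1]])
hist = bovw_histogram(descriptors, codebook)
print(hist)
```

Because only counts are kept, shuffling the descriptors leaves the histogram unchanged, which is the property the spatial extensions are meant to relax.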
Nuclear pore complexes (NPCs) gate the only conduits for nucleocytoplasmic transport in eukaryotes. Their gate is formed by nucleoporins containing large intrinsically disordered domains with multiple phenylalanine-glycine repeats (FG domains). In combination, these are hypothesized to form a structurally and chemically homogeneous network of random coils at the NPC center, which sorts macromolecules by size and hydrophobicity. Instead, we found that FG domains are structurally and chemically heterogeneous. They adopt distinct categories of intrinsically disordered structures in non-random distributions. Some adopt globular, collapsed coil configurations and are characterized by a low charge content. Others are highly charged and adopt more dynamic, extended coil conformations. Interestingly, several FG nucleoporins feature both types of structures in a bimodal distribution along their polypeptide chain. This distribution functionally correlates with the attractive or repulsive character of their interactions with collapsed coil FG domains displaying cohesion toward one another and extended coil FG domains displaying repulsion. Topologically, these bipartite FG domains may resemble sticky molten globules connected to the tip of relaxed or extended coils. Within the NPC, the crowding of FG nucleoporins and the segregation of their disordered structures based on their topology, dimensions, and cohesive character could force the FG domains to form a tubular gate structure or transporter at the NPC center featuring two
Semantic segmentation requires large amounts of pixelwise annotations to learn accurate models. In this paper, we present a video prediction-based methodology to scale up training sets by synthesizing new training samples in order to improve the accuracy of semantic segmentation networks. We exploit video prediction models' ability to predict future frames in order to also predict future labels. A joint propagation strategy is also proposed to alleviate mis-alignments in synthesized samples. We demonstrate that training segmentation models on datasets augmented by the synthesized samples leads to significant improvements in accuracy. Furthermore, we introduce a novel boundary label relaxation technique that makes training robust to annotation noise and propagation artifacts along object boundaries. Our proposed methods achieve state-of-the-art mIoUs of 83.5% on Cityscapes and 82.9% on CamVid. Our single model, without model ensembles, achieves 72.8% mIoU on the KITTI semantic segmentation test set, which surpasses the winning entry of the ROB challenge 2018. Our code and videos can be found at https://nv-adlr.github.io/publication/2018-Segmentation.
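The mIoU figures quoted above are mean intersection-over-union scores. As a rough sketch of how such a score is computed, the following averages the per-class IoU over a single pair of label maps. The function name `mean_iou` and the toy label maps are assumptions for illustration; benchmark evaluations typically accumulate counts over an entire test set and handle ignore labels, which this sketch does not.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean of per-class intersection-over-union for two label maps."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            # Class absent from both maps: skip it rather than count it as 0 or 1.
            continue
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious))


# Toy 2x3 label maps with three classes (assumed data).
target = np.array([[0, 0, 1],
                   [1, 2, 2]])
pred = np.array([[0, 1, 1],
                 [1, 2, 0]])
score = mean_iou(pred, target, num_classes=3)
print(score)
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.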
This paper investigates local invariant features for geographic (overhead) image retrieval. Local features are particularly well suited for the newer generations of aerial and satellite imagery whose increased spatial resolution, often just tens of centimeters per pixel, allows a greater range of objects and spatial patterns to be recognized than ever before. Local invariant features have been successfully applied to a broad range of computer vision problems and, as such, are receiving increased attention from the remote sensing community, particularly for challenging tasks such as detection and classification. We perform an extensive evaluation of local invariant features for image retrieval of land-use/land-cover (LULC) classes in high-resolution aerial imagery. We report on the effects of a number of design parameters on a bag-of-visual-words (BOVW) representation, including saliency- versus grid-based local feature extraction, the size of the visual codebook, the clustering algorithm used to create the codebook, and the dissimilarity measure used to compare the BOVW representations. We also perform comparisons with standard features such as color and texture. The performance is quantitatively evaluated using a first-of-its-kind LULC ground truth data set which will be made publicly available to other researchers. In addition to reporting on the effects of the core design parameters, we also describe interesting findings such as the performance-efficiency tradeoffs that are possible through the appropriate pairings of different-sized codebooks and dissimilarity measures. While the focus is on image retrieval, we expect our insights to be informative for other applications such as detection and classification.
Index Terms: Bag of visual words, content-based image retrieval, high-resolution overhead image analysis, land cover, land use, local invariant features, remote sensing.
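The abstract above mentions that the choice of dissimilarity measure affects retrieval with BOVW representations. The sketch below compares three commonly used histogram dissimilarities and ranks a toy database against a query. Which measures the paper actually evaluates is not stated here, so the three functions, their names, and the toy histograms are assumptions for illustration only.

```python
import numpy as np


def l1_distance(p, q):
    """Sum of absolute bin differences."""
    return float(np.abs(p - q).sum())


def intersection_dissimilarity(p, q):
    """One minus histogram intersection (for histograms that sum to 1)."""
    return 1.0 - float(np.minimum(p, q).sum())


def chi2_distance(p, q, eps=1e-12):
    """Symmetric chi-squared distance; eps guards against empty bins."""
    return 0.5 * float(np.sum((p - q) ** 2 / (p + q + eps)))


# Toy normalized BOVW histograms (assumed data).
query = np.array([0.5, 0.5, 0.0])
database = np.array([
    [0.25, 0.25, 0.5],
    [0.5, 0.4, 0.1],
    [0.0, 0.0, 1.0],
])

# Retrieval: sort database images by increasing dissimilarity to the query.
scores = [chi2_distance(query, h) for h in database]
ranking = [int(i) for i in np.argsort(scores)]
print(ranking)
```

Swapping `chi2_distance` for one of the other functions changes only the scoring line, which makes it easy to compare how each measure orders the same database.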