Analysis-by-synthesis has been a successful approach for many tasks in computer vision, such as 6D pose estimation of an object in an RGB-D image which is the topic of this work. The idea is to compare the observation with the output of a forward process, such as a rendered image of the object of interest in a particular pose. Due to occlusion or complicated sensor noise, it can be difficult to perform this comparison in a meaningful way. We propose an approach that "learns to compare", while taking these difficulties into account. This is done by describing the posterior density of a particular object pose with a convolutional neural network (CNN) that compares an observed and rendered image. The network is trained with the maximum likelihood paradigm. We observe empirically that the CNN does not specialize to the geometry or appearance of specific objects, and it can be used with objects of vastly different shapes and appearances, and in different backgrounds. Compared to state-of-the-art, we demonstrate a significant improvement on two different datasets which include a total of eleven objects, cluttered background, and heavy occlusion.
Unmanned Aerial Vehicles (UAVs) have emerged as a rapid, low-cost and flexible acquisition system that appears feasible for application in cadastral mapping: high-resolution imagery, acquired using UAVs, enables a new approach for defining property boundaries. However, UAV-derived data are arguably not exploited to its full potential: based on UAV data, cadastral boundaries are visually detected and manually digitized. A workflow that automatically extracts boundary features from UAV data could increase the pace of current mapping procedures. This review introduces a workflow considered applicable for automated boundary delineation from UAV data. This is done by reviewing approaches for feature extraction from various application fields and synthesizing these into a hypothetical generalized cadastral workflow. The workflow consists of preprocessing, image segmentation, line extraction, contour generation and postprocessing. The review lists example methods per workflow step-including a description, trialed implementation, and a list of case studies applying individual methods. Furthermore, accuracy assessment methods are outlined. Advantages and drawbacks of each approach are discussed in terms of their applicability on UAV data. This review can serve as a basis for future work on the implementation of most suitable methods in a UAV-based cadastral mapping workflow.
Dynamic scene graph generation aims at generating a scene graph of the given video. Compared to the task of scene graph generation from images, it is more challenging because of the dynamic relationships between objects and the temporal dependencies between frames allowing for a richer semantic interpretation. In this paper, we propose Spatial-temporal Transformer (STTran), a neural network that consists of two core modules: (1) a spatial encoder that takes an input frame to extract spatial context and reason about the visual relationships within a frame, and (2) a temporal decoder which takes the output of the spatial encoder as input in order to capture the temporal dependencies between frames and infer the dynamic relationships. Furthermore, STTran is flexible to take varying lengths of videos as input without clipping, which is especially important for long videos. Our method is validated on the benchmark dataset Action Genome (AG). The experimental results demonstrate the superior performance of our method in terms of dynamic scene graphs. Moreover, a set of ablative studies is conducted and the effect of each proposed module is justified. Code available at: https://github.com/yrcong/STTran.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.