We propose Deep Estimators of Features (DEFs), a learning-based framework for predicting sharp geometric features in sampled 3D shapes. Unlike existing data-driven methods, which reduce this problem to feature classification, we regress a scalar field representing the distance from point samples to the closest feature line on local patches.
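As a hedged illustration (not the authors' released code), the sketch below computes such a distance-to-feature field for a single patch, given point samples and sharp feature curves approximated by line segments; the function names and the truncation radius are assumptions made for this example.

```python
import numpy as np

def point_to_segment_distance(points, a, b):
    """Distance from each point in `points` (N, 3) to the segment [a, b]."""
    ab = b - a
    t = np.clip((points - a) @ ab / max(ab @ ab, 1e-12), 0.0, 1.0)
    closest = a + t[:, None] * ab            # closest point on the segment
    return np.linalg.norm(points - closest, axis=1)

def distance_to_feature_field(points, segments, truncation=0.1):
    """Scalar field: distance from every sample to the closest feature line,
    truncated at `truncation` (far-away samples carry little information)."""
    d = np.full(len(points), np.inf)
    for a, b in segments:
        d = np.minimum(d, point_to_segment_distance(points, np.asarray(a), np.asarray(b)))
    return np.minimum(d, truncation)

# Toy usage: a planar patch with one sharp edge along x = 0.5.
pts = np.random.rand(1024, 3) * [1.0, 1.0, 0.0]
field = distance_to_feature_field(pts, [((0.5, 0.0, 0.0), (0.5, 1.0, 0.0))])
```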
Our approach is the first to scale to massive point clouds by fusing distance-to-feature estimates obtained on individual patches.
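One simple way to fuse overlapping per-patch predictions into a single field over the full cloud is sketched below; plain averaging and the patch representation (global point indices plus per-point predictions) are illustrative assumptions, not necessarily the fusion scheme used in the paper.

```python
import numpy as np

def fuse_patch_predictions(num_points, patches):
    """patches: iterable of (indices, predictions) pairs of 1-D arrays.
    Returns an averaged per-point distance estimate over the whole cloud."""
    total = np.zeros(num_points)
    counts = np.zeros(num_points)
    for idx, pred in patches:
        np.add.at(total, idx, pred)       # accumulate overlapping estimates
        np.add.at(counts, idx, 1.0)
    counts[counts == 0] = 1.0             # uncovered points keep a zero estimate
    return total / counts
```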
We extensively evaluate our approach against related state-of-the-art methods on newly proposed synthetic and real-world 3D CAD model benchmarks. Our approach not only outperforms these methods (improving both recall and false positive rates), but also generalizes to real-world scans after being trained on synthetic data and fine-tuned on a small dataset of scanned data.
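For concreteness, here is a minimal sketch of the two reported metrics, assuming per-point binary labels (sharp-feature vs. not) obtained, for example, by thresholding the predicted distance field; this labeling scheme is an assumption of the example.

```python
import numpy as np

def recall_and_fpr(pred, gt):
    """Recall = TP / (TP + FN); false positive rate = FP / (FP + TN)."""
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    tp = np.sum(pred & gt)
    fp = np.sum(pred & ~gt)
    recall = tp / max(int(gt.sum()), 1)
    fpr = fp / max(int((~gt).sum()), 1)
    return recall, fpr
```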
We demonstrate a downstream application, where we reconstruct an explicit representation of straight and curved sharp feature lines from range scan data.
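A plausible post-processing sketch for this application (hypothetical, not the authors' method): keep points whose predicted distance falls below a threshold, cluster them into candidate feature curves, and fit each cluster with its principal direction. The threshold, clustering parameters, and the straight-line fit (curved features would need a polyline or spline fit instead) are all assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def extract_straight_feature_segments(points, distances, threshold=0.02, eps=0.05):
    """Return endpoint pairs of straight segments fitted to near-feature points."""
    near = points[distances < threshold]               # near-feature samples
    labels = DBSCAN(eps=eps, min_samples=5).fit_predict(near)
    segments = []
    for lbl in set(labels) - {-1}:                      # skip DBSCAN noise
        cluster = near[labels == lbl]
        center = cluster.mean(axis=0)
        direction = np.linalg.svd(cluster - center)[2][0]   # principal axis
        t = (cluster - center) @ direction
        segments.append((center + t.min() * direction, center + t.max() * direction))
    return segments
```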
We make code, pre-trained models, and our training and evaluation datasets available at https://github.com/artonson/def.
The DensePose estimation task is a significant step toward enhancing user experience in computer vision applications ranging from augmented reality to cloth fitting. Existing neural network models capable of solving this task are heavily parameterized and far from being transferable to embedded or mobile devices. To enable DensePose inference on an end device with current models, one needs to maintain an expensive server-side infrastructure and a stable internet connection. To make things worse, mobile and embedded devices do not always have a powerful GPU. In this work, we target the problem of redesigning the DensePose R-CNN model's architecture so that the final network retains most of its accuracy while becoming more lightweight and fast. To achieve that, we tested and incorporated many recent deep learning innovations, specifically performing an ablation study on 23 efficient backbone architectures, multiple two-stage detection pipeline modifications, and custom model quantization methods. As a result, we achieved a 17× model size reduction and a 2× latency improvement compared to the baseline model.
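As a hedged illustration of the kind of compression step mentioned above, the sketch below applies standard post-training dynamic quantization in PyTorch to a small stand-in head; it does not reproduce the paper's custom quantization methods, and the model itself is an assumption chosen because dynamic quantization in core PyTorch targets Linear/LSTM layers.

```python
import io
import torch
import torch.nn as nn

# Stand-in prediction head, used purely for illustration.
model = nn.Sequential(
    nn.Linear(1024, 512), nn.ReLU(),
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 17),
).eval()

# Post-training dynamic quantization: int8 weights for Linear layers.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

def serialized_size_mb(m):
    """Serialized state_dict size in MB, a rough proxy for on-device footprint."""
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes / 1e6

print(f"fp32: {serialized_size_mb(model):.2f} MB, int8: {serialized_size_mb(quantized):.2f} MB")
```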
Figure 1: From an input RGB-D scan (left), we propose to detect objects in the scan and predict their complete part decompositions as semantic part completion; that is, we predict the part masks for the complete object, inferring the part geometry of any missing or unobserved regions in the scan. To achieve this, we predict the part structure of each detected object to guide a geometric prior-based prediction of the complete part masks.