We present a novel architecture for 3D object detection, M3DETR, which combines different point cloud representations (raw, voxels, bird's-eye view) with different feature scales based on multi-scale feature pyramids. M3DETR is the first approach that simultaneously unifies multiple point cloud representations and feature scales and models mutual relationships between point clouds using transformers. We perform extensive ablation experiments that highlight the benefits of fusing representations and scales and of modeling these relationships. Our method achieves state-of-the-art performance on the KITTI 3D object detection dataset and the Waymo Open Dataset. Results show that M3DETR significantly improves on the baseline, by 1.48% mAP across all classes on the Waymo Open Dataset. In particular, our approach ranks 1st on the well-known KITTI 3D Detection Benchmark for both the car and cyclist classes, and ranks 1st on the Waymo Open Dataset with single-frame point cloud input.
Augmenting pretrained language models (LMs) with a vision encoder (e.g., Flamingo) has obtained state-of-the-art results in image-to-text generation. However, these models store all their knowledge within their parameters, and thus often require an enormous number of parameters to model the abundant visual concepts and very rich textual descriptions. Additionally, they are inefficient at incorporating new data, requiring a computationally expensive fine-tuning process. In this work, we introduce a Retrieval-augmented Visual Language Model, Re-ViLM, built upon Flamingo, that supports retrieving relevant knowledge from an external database for zero-shot and in-context few-shot image-to-text generation. By storing certain knowledge explicitly in the external database, our approach reduces the number of model parameters and can easily accommodate new data during evaluation by simply updating the database. We also construct interleaved image and text data that facilitates in-context few-shot learning capabilities. We demonstrate that Re-ViLM significantly boosts performance for image-to-text generation tasks, especially for zero-shot and few-shot generation in out-of-domain settings, with 4× fewer parameters compared with baseline methods.
* Equal contribution. ‡ Work done during an internship at NVIDIA. † Equal advising. 1 NVIDIA, 2 UIUC, 3 UT Austin, 4 ASU, 5 Caltech.
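The retrieval idea in the abstract above can be illustrated with a minimal sketch. This is not Re-ViLM's actual retriever: the `CaptionStore` class, the toy three-dimensional embeddings, and the cosine-similarity ranking are all assumptions made for illustration. It only shows how knowledge kept in an external database can be queried and then extended without retraining any model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class CaptionStore:
    """Hypothetical external database of (embedding, caption) pairs."""

    def __init__(self):
        self.entries = []

    def add(self, embedding, caption):
        # Accommodating new data is just an append; no model weights change.
        self.entries.append((list(embedding), caption))

    def retrieve(self, query, k=2):
        # Rank stored captions by similarity to the query embedding, keep the top k.
        ranked = sorted(self.entries, key=lambda e: cosine(query, e[0]), reverse=True)
        return [caption for _, caption in ranked[:k]]


store = CaptionStore()
store.add([1.0, 0.0, 0.0], "a dog running on grass")
store.add([0.0, 1.0, 0.0], "a plate of pasta")
store.add([0.9, 0.1, 0.0], "a puppy in a park")

# Toy stand-in for an image embedding produced by a vision encoder.
retrieved = store.retrieve([0.95, 0.0, 0.05], k=2)

# Updating the database at evaluation time makes new knowledge retrievable.
store.add([0.0, 0.0, 1.0], "a red sports car")
after_update = store.retrieve([0.0, 0.0, 1.0], k=1)
```

In a retrieval-augmented model, the retrieved captions would be passed to the language model as additional context; how Re-ViLM does this is not specified in the abstract.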
We use circle packing techniques to construct approximate solutions to the generalized Beltrami equations on simply and multiply connected regions in the plane. We show convergence of the approximate solutions. This gives a constructive proof of the existence of quasiconformal mappings with a given pair of complex dilatations.
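For orientation, the generalized Beltrami equation is usually written in the following form. The abstract does not state it, so the symbols $\mu$, $\nu$, $k$ and the domain $\Omega$ are assumptions taken from the standard literature, with $(\mu, \nu)$ playing the role of the given pair of coefficients.

```latex
\[
  \frac{\partial f}{\partial \bar z}
    = \mu(z)\,\frac{\partial f}{\partial z}
    + \nu(z)\,\overline{\frac{\partial f}{\partial z}},
  \qquad
  |\mu(z)| + |\nu(z)| \le k < 1 \quad \text{for almost every } z \in \Omega .
\]
```

Setting $\nu \equiv 0$ recovers the classical Beltrami equation $f_{\bar z} = \mu f_z$.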
Abstract. Given a smooth minimal surface F : Ω → ℝ³ defined on a simply connected region Ω in the complex plane ℂ, there is a regular SG circle pattern Q_Ω^ε. By the Weierstrass representation of F and the existence theorem of SG circle patterns, there exists an associated SG circle pattern P_Ω^ε in ℂ with the combinatorics of Q_Ω^ε. Based on the relationship between the circle pattern P_Ω^ε and the corresponding discrete minimal surface
The theory of discrete differential geometry is presently emerging on the border of differential and discrete geometry. It studies geometric shapes with a finite number of elements (polyhedra) and aims to develop discrete equivalents of the geometric notions and methods of surface theory. A smooth geometric shape (such as a surface) then appears as a limit of the refinement of the discretization. One of the central problems of discrete differential geometry is to find proper discrete analogues of special classes of surfaces, such as minimal, constant mean curvature, and isothermic surfaces. In [2], a new discrete model was introduced to investigate conformal discretizations of minimal surfaces, i.e., the analogous discrete minimal surfaces consisting of touching spheres, and of circles which intersect the spheres orthogonally at their points of touch. It is proved that the discrete minimal surfaces converge to the smooth ones. The advantages of these discretizations are that they respect conformal properties of surfaces, possess a maximum principle, etc. Here, we are concerned with the C^∞-convergence of discrete minimal surfaces given in terms of circles
Thurston conjectured that hexagonal circle packings can be used to approximate the Riemann mapping. The corresponding convergence was proven by Rodin and Sullivan. He and Schramm showed that for hexagonal circle packings the convergence is C^∞. Here the C^∞-convergence is generalized to the case of non-hexagonal circle packings with bounded degree. Furthermore, estimates of the convergence rate are obtained for derivatives of arbitrary order.