Most existing 3D object recognition algorithms focus on leveraging the strong discriminative power of deep learning models with softmax loss for the classification of 3D data, while learning discriminative features with deep metric learning for 3D object retrieval is more or less neglected. In this paper, we study variants of deep metric learning losses for 3D object retrieval, which have not received enough attention in this area. First, two representative losses, triplet loss and center loss, are introduced, which can learn more discriminative features than the traditional classification loss. Then, we propose a novel loss named triplet-center loss, which can further enhance the discriminative power of the features. The proposed triplet-center loss learns a center for each class and requires that the distance between a sample and the center of its own class be smaller than its distances to the centers of other classes. Extensive experimental results on two popular 3D object retrieval benchmarks and two widely adopted sketch-based 3D shape retrieval benchmarks consistently demonstrate the effectiveness of our proposed loss, and significant improvements have been achieved compared with the state of the art.
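The constraint described above (a sample should lie closer to its own class center than to any other class center) can be illustrated with a small NumPy sketch. The hinge form, the `margin` parameter, the use of the nearest other-class center, and the function name `triplet_center_loss` are assumptions made for illustration; the abstract does not give the exact formula, and in training the centers would be learned rather than fixed.

```python
import numpy as np

def triplet_center_loss(features, labels, centers, margin=1.0):
    """Hinge-style sketch: own-center distance + margin <= nearest other-center distance.

    features: (N, D) sample embeddings
    labels:   (N,) integer class ids in [0, C)
    centers:  (C, D) one center per class (fixed here; learned in practice)
    margin:   assumed hyperparameter, not taken from the abstract
    """
    # Euclidean distance from every sample to every class center, shape (N, C).
    diffs = features[:, None, :] - centers[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))

    rows = np.arange(len(labels))
    pos = dists[rows, labels]          # distance to the sample's own center

    masked = dists.copy()
    masked[rows, labels] = np.inf      # exclude the own-class center
    neg = masked.min(axis=1)           # distance to the nearest other center

    # Penalize samples whose own center is not closer by at least `margin`.
    return np.maximum(0.0, pos + margin - neg).mean()

# Toy data: two classes whose centers are 4 units apart.
features = np.array([[0.0, 0.0], [4.0, 0.0]])
centers = np.array([[0.0, 0.0], [4.0, 0.0]])

loss_separated = triplet_center_loss(features, np.array([0, 1]), centers)
loss_swapped = triplet_center_loss(features, np.array([1, 0]), centers)
```

With correct labels each sample sits on its own center, so the hinge is inactive; with swapped labels every sample violates the constraint and the loss becomes positive.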
The real world exhibits an abundance of non-stationary textures. Examples include textures with large scale structures, as well as spatially variant and inhomogeneous textures. While existing example-based texture synthesis methods can cope well with stationary textures, non-stationary textures still pose a considerable challenge, which remains unresolved. In this paper, we propose a new approach for example-based non-stationary texture synthesis. Our approach uses a generative adversarial network (GAN), trained to double the spatial extent of texture blocks extracted from a specific texture exemplar. Once trained, the fully convolutional generator is able to expand the size of the entire exemplar, as well as of any of its sub-blocks. We demonstrate that this conceptually simple approach is highly effective for capturing large scale structures, as well as other non-stationary attributes of the input exemplar. As a result, it can cope with challenging textures, which, to our knowledge, no other existing method can handle.
Posterior parietal cortex (PPC) activity correlates with monkeys' decisions during visual discrimination and categorization tasks. However, recent work has questioned whether decision-correlated PPC activity plays a causal role in such decisions. That study focused on PPC's contribution to motor aspects of decisions (deciding where to move), but not sensory evaluation aspects (deciding what you are looking at). We employed reversible inactivation to compare PPC's contributions to motor and sensory aspects of decisions. Inactivation affected both aspects of behavior, but preferentially impaired decisions when visual stimuli, rather than motor response targets, were in the inactivated visual field. This demonstrates a causal role for PPC in decision-making, with preferential involvement in evaluating attended task-relevant sensory stimuli compared with motor planning.
Decomposing a complex shape into geometrically simple primitives is a fundamental problem in geometry processing. We are interested in a shape decomposition problem where the simple primitives sought are generalized cylinders, which are ubiquitous in both organic forms and man-made artifacts. We introduce a quantitative measure of cylindricity for a shape part and develop a cylindricity-driven optimization algorithm, with a global objective function, for generalized cylinder decomposition. As a measure of geometric simplicity and following the minimum description length principle, cylindricity is defined as the cost of representing a cylinder through skeletal and cross-section profile curves. Our decomposition algorithm progressively builds local to non-local cylinders, which form over-complete covers of the input shape. The over-completeness of the cylinder covers ensures a conservative buildup of the cylindrical parts, leaving the final decision on decomposition to global optimization. We solve the global optimization by finding an exact cover, which optimizes the global objective function. We demonstrate results of our optimal decomposition algorithm on numerous examples and compare with other alternatives.
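The final step described above, choosing an exact cover from an over-complete set of candidate parts, can be sketched as a minimum-cost exact cover search. This is only an illustrative brute-force version under stated assumptions: the function name `min_cost_exact_cover`, the toy element ids, and the costs are hypothetical, costs are assumed non-negative, and the abstract does not say which solver the authors use. In the paper's setting each candidate would be a cylinder and its cost would come from the cylindricity measure.

```python
def min_cost_exact_cover(universe, candidates):
    """Pick candidates that cover every element exactly once at minimum total cost.

    universe:   iterable of hashable element ids (e.g. surface patches)
    candidates: list of (frozenset_of_elements, cost), costs assumed >= 0
    Returns (best_cost, chosen_indices), or (inf, None) if no exact cover exists.
    """
    best = [float("inf"), None]

    def search(remaining, chosen, cost):
        # Prune branches that cannot beat the best cover found so far.
        if cost >= best[0]:
            return
        if not remaining:
            best[0], best[1] = cost, list(chosen)
            return
        # Branch on one uncovered element; exactly one chosen part must contain it.
        target = min(remaining)
        for i, (part, part_cost) in enumerate(candidates):
            # The part must contain the target and must not overlap covered elements.
            if target in part and part <= remaining:
                chosen.append(i)
                search(remaining - part, chosen, cost + part_cost)
                chosen.pop()

    search(frozenset(universe), [], 0.0)
    return best[0], best[1]

# Toy over-complete candidate set over four elements (costs are made up).
candidates = [
    (frozenset({0, 1}), 1.0),
    (frozenset({2, 3}), 1.0),
    (frozenset({0, 1, 2, 3}), 3.0),
    (frozenset({1, 2}), 0.5),
    (frozenset({0}), 1.0),
    (frozenset({3}), 1.0),
]
best_cost, best_parts = min_cost_exact_cover(range(4), candidates)
```

Here three exact covers exist (parts 0 and 1; part 2 alone; parts 4, 3 and 5), and the search keeps the cheapest one.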
Modeling human-object interactions and manipulating motions lie at the heart of fine-grained action recognition. Previous methods rely heavily on explicit detection of the object being interacted with, which requires intensive human labour for object annotation. To bypass this constraint and achieve better classification performance, we propose a novel fine-grained action recognition pipeline based on interaction part proposal and discriminative mid-level part mining. First, we generate a large number of candidate object regions using an off-the-shelf object proposal tool, e.g., BING. Second, these object regions are matched and tracked across frames to form a large spatio-temporal graph based on appearance matching and the dense motion trajectories passing through them. We then propose an efficient approximate graph segmentation algorithm to partition and filter the graph into consistent local dense sub-graphs. These sub-graphs, which are spatio-temporal sub-volumes, represent our candidate interaction parts. Finally, we mine discriminative mid-level part detectors from the features computed over the candidate interaction parts. Bag-of-detection scores based on a novel Max-N pooling scheme are computed as the action representation for a video sample. We conduct extensive experiments on human-object interaction datasets including MPII Cooking and MSR Daily Activity 3D. The experimental results demonstrate that the proposed framework achieves consistent improvements over state-of-the-art action recognition accuracies on these benchmarks, without using any object annotation.
Numerous techniques have been proposed for reconstructing 3D models of opaque objects in the past decades. However, none of them can be directly applied to transparent objects. This paper presents a fully automatic approach for reconstructing complete 3D shapes of transparent objects. By positioning an object on a turntable, its silhouettes and light refraction paths under different viewing directions are captured. Then, starting from an initial rough model generated by space carving, our algorithm progressively optimizes the model under three constraints: surface and refraction normal consistency, surface projection and silhouette consistency, and surface smoothness. Experimental results on both synthetic and real objects demonstrate that our method can successfully recover the complex shapes of transparent objects and faithfully reproduce their light refraction properties.