Thao Nguyen scite author profile

We develop a system to disambiguate object instances within the same class based on simple physical descriptions. The system takes as input a natural language phrase and a depth image containing a segmented object and predicts how similar the observed object is to the object described by the phrase. Our system is designed to learn from only a small amount of human-labeled language data and generalize to viewpoints not represented in the language-annotated depth image training set. By decoupling 3D shape representation from language representation, this method is able to ground language to novel objects using a small amount of language-annotated depth data and a larger corpus of unlabeled 3D object meshes, even when these objects are partially observed from unusual viewpoints. Our system is able to disambiguate between novel objects, observed via depth images, based on natural language descriptions. Our method also enables viewpoint transfer; trained on human-annotated data on a small set of depth images captured from frontal viewpoints, our system successfully predicted object attributes from rear views despite having no such depth images in its training set. Finally, we demonstrate our approach on a Baxter robot, enabling it to pick specific objects based on human-provided natural language descriptions.

Robot Object Retrieval with Contextual Natural Language Queries

Nguyen, Gopalan, Patel, et al. 2020

Object Detection Using Scale Invariant Feature Transform

Nguyen, Park, Han, et al. 2014

Abstract. This paper proposes an object detection scheme using the Scale Invariant Feature Transform (SIFT). SIFT extracts distinctive invariant features from images and is a useful tool for matching different views of an object. The paper shows how SIFT can be applied to object detection, especially human detection. A Support Vector Machine (SVM) is adopted as the classifier in the proposed scheme. Experiments are performed on the INRIA Pedestrian dataset. Preliminary results show that the proposed SIFT-SVM scheme yields promising performance in terms of detection accuracy.

Thao Nguyen

Lipstick ain’t enough: Beyond Color Matching for In-the-Wild Makeup Transfer

Satellite image classification using convolutional learning

Grounding Language Attributes to Objects using Bayesian Eigenobjects

Robot Object Retrieval with Contextual Natural Language Queries

Object Detection Using Scale Invariant Feature Transform
