Zhiao Huang scite author profile

Associative Embedding: End-to-End Learning for Joint Detection and Grouping

We introduce associative embedding, a novel method for supervising convolutional neural networks for the task of detection and grouping. A number of computer vision problems can be framed in this manner, including multi-person pose estimation, instance segmentation, and multi-object tracking. Usually the grouping of detections is achieved with multi-stage pipelines; instead, we propose an approach that teaches a network to simultaneously output detections and group assignments. This technique can be easily integrated into any state-of-the-art network architecture that produces pixel-wise predictions. We show how to apply this method to both multi-person pose estimation and instance segmentation and report state-of-the-art performance for multi-person pose on the MPII and MS-COCO datasets.

Learning to Group: A Bottom-Up Framework for 3D Part Discovery in Unseen Categories

Luo, Mo, Huang et al. 2020

Preprint

We address the problem of discovering 3D parts for objects in unseen categories. Learning the geometry prior of parts and transferring this prior to unseen categories poses fundamental challenges for data-driven shape segmentation approaches. Formulating the task as a contextual bandit problem, we propose a learning-based agglomerative clustering framework that learns a grouping policy to progressively group small part proposals into bigger ones in a bottom-up fashion. The core of our approach is to restrict the local context used for extracting part-level features, which encourages generalization to unseen categories. On the large-scale fine-grained 3D part dataset PartNet, we demonstrate that our method can transfer knowledge of parts learned from 3 training categories to 21 unseen testing categories without seeing any annotated samples. Quantitative comparisons against four shape segmentation baselines show that our approach achieves state-of-the-art performance.

>5kW Record High Power Narrow Linewidth Laser From Traditional Step-Index Monolithic Fiber Amplifier

Huang, Sheng, Tao et al. 2021

IEEE Photon. Technol. Lett.

RoboCraft: Learning to See, Simulate, and Shape Elasto-Plastic Objects with Graph Networks

Shi, Xu, Huang et al. 2022

Modeling and manipulating elasto-plastic objects are essential capabilities for robots to perform complex industrial and household interaction tasks (e.g., stuffing dumplings, rolling sushi, and making pottery). However, due to the high degree of freedom of elasto-plastic objects, significant challenges exist in virtually every aspect of the robotic manipulation pipeline, e.g., representing the states, modeling the dynamics, and synthesizing the control signals. We propose to tackle these challenges by employing a particle-based representation for elasto-plastic objects in a model-based planning framework. Our system, RoboCraft, only assumes access to raw RGBD visual observations. It transforms the sensing data into particles and learns a particle-based dynamics model using graph neural networks (GNNs) to capture the structure of the underlying system. The learned model can then be coupled with model-predictive control (MPC) algorithms to plan the robot's behavior. We show through experiments that with just 10 minutes of real-world robotic interaction data, our robot can learn a dynamics model that can be used to synthesize control signals to deform elasto-plastic objects into various target shapes, including shapes that the robot has never encountered before. We perform systematic evaluations in both simulation and the real world to demonstrate the robot's manipulation capabilities and ability to generalize to a more complex action space, different tool shapes, and a mixture of motion modes. We also conduct comparisons between RoboCraft and untrained human subjects controlling the gripper to manipulate deformable objects in both simulation and the real world. Our learned model-based planning framework is comparable to and sometimes better than human subjects on the tested tasks.

ManiSkill: Generalizable Manipulation Skill Benchmark with Large-Scale Demonstrations

Mu, Ling, Xiang et al. 2021

Preprint
