Abstract: This paper reviews the state of the art in the field of lock-in ToF cameras: their advantages, their limitations, the existing calibration methods, and the ways they are being used, sometimes in combination with other sensors. Even though lock-in ToF cameras provide neither higher resolution nor a larger ambiguity-free range than other range-map estimation systems, advantages such as registered depth and intensity data at a high frame rate, compact design, low weight and reduced power consumption have motivated their increasing use in several research areas, such as computer graphics, machine vision and robotics.
Markerless 3D human pose detection from a single image is a severely underconstrained problem because different 3D poses can have similar image projections. In order to handle this ambiguity, current approaches rely on prior shape models that can only be correctly adjusted if 2D image features are accurately detected. Unfortunately, although current 2D part detectors have shown promising results, they are not yet accurate enough to guarantee a complete disambiguation of the inferred 3D shape. In this paper, we introduce a novel approach for estimating 3D human pose even when observations are noisy. We propose a stochastic sampling strategy to propagate the noise from the image plane to the shape space. This provides a set of ambiguous 3D shapes, which are virtually indistinguishable from their image projections. Disambiguation is then achieved by imposing kinematic constraints that guarantee the resulting pose resembles a 3D human shape. We validate the method in a variety of situations in which state-of-the-art 2D detectors yield either inaccurate estimations or partly miss some of the body parts.
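The sampling strategy described above can be sketched as follows. This is a minimal toy version, not the authors' implementation: the fixed-depth back-projection, the noise level, and the bone-length check are all illustrative assumptions standing in for the learned shape model and full kinematic constraints.

```python
import numpy as np

rng = np.random.default_rng(0)
n_joints = 14

def lift_to_3d(joints_2d, depth=3.0):
    """Toy back-projection: place 2D joints at a fixed depth (a stand-in
    for fitting a learned shape model to the observations)."""
    return np.column_stack([joints_2d, np.full(len(joints_2d), depth)])

def bone_lengths(pose_3d, bones):
    return np.array([np.linalg.norm(pose_3d[i] - pose_3d[j]) for i, j in bones])

# Noisy 2D detections and a toy kinematic chain.
detected_2d = rng.normal(size=(n_joints, 2))
bones = [(i, i + 1) for i in range(n_joints - 1)]
ref_lengths = bone_lengths(lift_to_3d(detected_2d), bones)

# Propagate detector noise from the image plane to the shape space:
# sample 2D hypotheses, lift each to 3D, and keep only the kinematically
# plausible ones (bone lengths close to the reference skeleton).
hypotheses = []
for _ in range(200):
    noisy_2d = detected_2d + rng.normal(scale=0.05, size=detected_2d.shape)
    pose = lift_to_3d(noisy_2d)
    if np.all(np.abs(bone_lengths(pose, bones) - ref_lengths) < 0.2):
        hypotheses.append(pose)

# Among the surviving ambiguous shapes, pick the one whose projection
# best matches the original detections.
best = min(hypotheses, key=lambda p: np.linalg.norm(p[:, :2] - detected_2d))
```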
Figure 1: GanHand predicts hand shape and pose for grasping multiple objects given a single RGB image. The figure shows sample results on the YCB-Affordance dataset we propose, the largest dataset of human grasp affordances in real scenes.
Detecting grasping points is a key problem in cloth manipulation. Most current approaches follow a multiple-regrasp strategy for this purpose, in which clothes are sequentially grasped from different points until one of them yields a desired configuration. In this paper, by contrast, we circumvent the need for multiple regrasps by building a robust detector that identifies the grasping points, generally in one single step, even when clothes are highly wrinkled. In order to handle the large variability a deformed cloth may exhibit, we build a Bag-of-Features-based detector that combines appearance and 3D geometry features. An image is scanned using a sliding window with a linear classifier, and the candidate windows are refined using a non-linear SVM and a "grasp goodness" criterion to select the best grasping point. We demonstrate our approach by detecting collars in deformed polo shirts, using a Kinect camera. Experimental results show good performance of the proposed method not only in identifying the same trained textile object part under severe deformations and occlusions, but also the corresponding part in other clothes, exhibiting a degree of generalization.
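The two-stage detection scheme (cheap linear sliding-window scan, then a more expensive non-linear refinement of the surviving candidates) can be sketched as follows. The features, the planted target, and the RBF-style scorer are illustrative assumptions, not the paper's trained Bag-of-Features descriptors or SVM.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy feature map: each cell holds a feature histogram (random here, with a
# planted "collar" region that the linear scorer will respond to).
H, W, D = 40, 60, 16
features = rng.random((H, W, D))
features[10:14, 20:24] += 2.0          # planted target region
w_linear = np.ones(D)                  # stand-in linear classifier weights
size = 4                               # sliding-window size (cells)

def window_score(y, x):
    patch = features[y:y + size, x:x + size].mean(axis=(0, 1))
    return w_linear @ patch

# Stage 1: dense sliding-window scan with the cheap linear scorer,
# keeping the top-scoring candidate windows.
scores = [((y, x), window_score(y, x))
          for y in range(H - size) for x in range(W - size)]
candidates = sorted(scores, key=lambda s: -s[1])[:10]

# Stage 2: refine candidates with a more expensive non-linear score
# (an RBF-style kernel against a prototype, standing in for the SVM
# and "grasp goodness" criterion).
prototype = features[10:14, 20:24].mean(axis=(0, 1))

def nonlinear_score(y, x):
    patch = features[y:y + size, x:x + size].mean(axis=(0, 1))
    return np.exp(-np.sum((patch - prototype) ** 2))

best_yx = max(candidates, key=lambda c: nonlinear_score(*c[0]))[0]
```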
In this article we analyze the response of Time-of-Flight cameras (active sensors) for close-range imaging under three different illumination conditions and compare the results with stereo vision (passive) sensors. Time-of-Flight sensors are sensitive to ambient light and have low resolution, but deliver accurate depth data at a high frame rate under suitable conditions. We introduce some metrics for performance evaluation over a small region of interest. Based on these metrics, we analyze and compare depth imaging of a leaf under indoor (room) and outdoor (shadow and sunlight) conditions by varying the exposures of the sensors. The performance of three different Time-of-Flight cameras (PMD CamBoard, PMD CamCube and SwissRanger SR4000) is compared against selected stereo-correspondence algorithms (local correlation and graph cuts). The PMD CamCube has the best cancellation of sunlight, followed by the CamBoard, while the SwissRanger SR4000 performs poorly under sunlight. Stereo vision is more robust to ambient illumination and provides high-resolution depth data, but it is constrained by the texture of the object as well as by computational efficiency. The graph-cut-based stereo-correspondence algorithm retrieves the shape of the leaves better, but is computationally much more expensive than local correlation. Finally, we propose a method to increase the dynamic range of the ToF cameras for a scene involving both shadow and sunlight exposures at the same time, using camera flags (PMD) or the confidence matrix (SwissRanger).
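The idea of combining two exposures via per-pixel confidence to extend a ToF camera's dynamic range can be sketched as follows. The scene, noise levels, and confidence values are synthetic assumptions for illustration; a real system would take them from the PMD flags or the SwissRanger confidence matrix.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic scene depth (m): left half in shadow, right half in sunlight.
truth = np.tile(np.linspace(1.0, 3.0, 8), (8, 1))
sunlit = truth > 2.0

# Short exposure: accurate in sunlight, very noisy in shadow.
short_exp = truth + rng.normal(scale=0.01, size=truth.shape)
short_exp[~sunlit] += rng.normal(scale=0.3, size=truth.shape)[~sunlit]
# Long exposure: accurate in shadow, saturated (invalid) in sunlight.
long_exp = truth + rng.normal(scale=0.01, size=truth.shape)
long_exp[sunlit] = 0.0

# Per-pixel confidence, as a flag map or confidence matrix would provide;
# here set from the known illumination purely for illustration.
conf_short = np.where(sunlit, 0.9, 0.2)
conf_long = np.where(sunlit, 0.0, 0.9)

# Keep, per pixel, the exposure the sensor trusts more, yielding one
# depth map that covers both shadow and sunlight regions.
fused = np.where(conf_short >= conf_long, short_exp, long_exp)
max_err = np.abs(fused - truth).max()
```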
The problem of predicting human motion given a sequence of past observations is at the core of many applications in robotics and computer vision. Current state-of-the-art approaches formulate this problem as a sequence-to-sequence task, in which a history of 3D skeletons feeds a Recurrent Neural Network (RNN) that predicts future movements, typically on the order of 1 to 2 seconds. However, one aspect that has been overlooked so far is the fact that human motion is inherently driven by interactions with objects and/or other humans in the environment. In this paper, we explore this scenario using a novel context-aware motion prediction architecture. We use a semantic-graph model where the nodes parameterize the human and objects in the scene and the edges their mutual interactions. These interactions are iteratively learned through a graph attention layer, fed with the past observations, which now include both object and human body motions. Once this semantic graph is learned, we inject it into a standard RNN to predict future movements of the human(s) and object(s). We consider two variants of our architecture: either freezing the contextual interactions in the future, or updating them. A thorough evaluation on the "Whole-Body Human Motion Database" [29] shows that in both cases our context-aware networks clearly outperform baselines in which the context information is not considered.
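The pipeline described above (a graph attention layer over human and object nodes, whose output feeds an RNN that rolls out future motion) can be sketched as follows. All weights are random and untrained, and the single-layer attention and GRU-like cell are simplified stand-ins for the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Nodes: one human plus two objects, each encoded from its past motion.
n_nodes, d = 3, 8
node_feats = rng.normal(size=(n_nodes, d))

# One graph-attention layer: each node aggregates all nodes, weighted by
# learned pairwise scores (random parameters here).
W = rng.normal(size=(d, d)) * 0.1
a = rng.normal(size=(2 * d,)) * 0.1

h = node_feats @ W
attn = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    scores = np.array([a @ np.concatenate([h[i], h[j]]) for j in range(n_nodes)])
    attn[i] = softmax(scores)                 # interaction weights, row-stochastic
context = attn @ h                            # interaction-aware node features

# Inject the human node's context into a minimal recurrent cell and roll
# out a few future steps (here the context stays frozen, i.e. the first
# of the two variants mentioned above).
Wx = rng.normal(size=(d, d)) * 0.1
Wh = rng.normal(size=(d, d)) * 0.1
state = np.zeros(d)
predictions = []
for _ in range(5):
    state = np.tanh(context[0] @ Wx + state @ Wh)
    predictions.append(state.copy())
predictions = np.stack(predictions)
```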
Supervision of long-lasting, extensive botanic experiments is a promising robotic application that some recent technological advances have made feasible. Plant modelling for this application has strong demands, particularly concerning 3D information gathering and speed. This paper shows that Time-of-Flight (ToF) cameras achieve a good compromise between both demands, providing a suitable complement to color vision. A new method is proposed to segment plant images into their composite surface patches by combining hierarchical color segmentation with quadratic surface fitting using ToF depth data. Experimentation shows that the interpolated depth maps derived from the obtained surfaces fit the original scenes well. Moreover, candidate leaves to be approached by a measuring instrument are ranked, and robot-mounted cameras then move closer to them to validate their suitability for sampling. Some ambiguities arising from leaf overlaps or occlusions are cleared up in this way. The work is a proof of concept that dense color data combined with sparse depth as provided by a ToF camera yields a good enough 3D approximation for automated plant measuring at the high throughput imposed by the application.
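The quadratic surface fitting step can be sketched as a least-squares fit of z = ax² + by² + cxy + dx + ey + f to the depth samples of one color-segmented patch. The paraboloid below is a synthetic stand-in for a leaf's ToF depth data, not data from the paper.

```python
import numpy as np

# Pixel grid of one segmented patch and its (synthetic) ToF depth samples.
ys, xs = np.mgrid[0:10, 0:10].astype(float)
x, y = xs.ravel(), ys.ravel()
z = 0.02 * x**2 + 0.01 * y**2 - 0.05 * x + 1.5   # synthetic leaf depth (m)

# Design matrix for z = a x^2 + b y^2 + c xy + d x + e y + f,
# solved in the least-squares sense.
A = np.column_stack([x**2, y**2, x * y, x, y, np.ones_like(x)])
coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)

# Interpolated depth map from the fitted surface, and its fit residual.
z_fit = A @ coeffs
residual = np.abs(z_fit - z).max()
```

Since the synthetic data is itself quadratic, the fit is exact up to numerical precision; on real, noisy ToF depth the residual measures how well the patch is described by a single surface.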
Figure 1: We introduce SMPLicit, a fully differentiable generative model for clothed bodies, capable of representing garments with different topology. The four figures on the left show the application of the model to the problem of 3D body and cloth reconstruction from an input image. We are able to predict different models per cloth, even for multi-layer cases. Three right-most images: The model can also be used for editing the outfits, removing/adding new garments and re-posing the body.