Scene Understanding through Autonomous Interactive Perception

Bergström, Niklas; Ek, Carl Henrik; Björkman, Mårten; Kragić, Danica

doi:10.1007/978-3-642-23968-7_16

Cited by 17 publications

(26 citation statements)

References 19 publications


“…If 3D-data is available, 3D features such as surface normals or curvature might additionally be exploited [10,11]. However, visual or spatial boundaries need not always correspond to object boundaries [12,13], so not all ambiguities can be resolved [12,14-16]. An alternative is to look at video streams [17]; however, in real-robot setups there may be too much (self-)occlusion for this strategy to be viable.…”

Section: A Non-interactive Visual Segmentation (mentioning)

confidence: 99%

“…In a subsequent stage, movement and object membership can be estimated for each of these parts. Alternative approaches use algorithms such as iterative closest point (ICP) to determine whether the tracked point cloud is a single rigid object [12,31], or estimate the movement of trackable visual features [13,16,25,31-33].…”

Section: B Interactive Perception For Object Segmentation (mentioning)

confidence: 99%

“…Many of the discussed approaches do not handle occlusion and co-movement as they deal with only one object of interest, e.g., [6,7,15]. Noise is often ignored [14,15] or handled by requiring objects to move as rigid bodies during one or multiple actions [12,16,19,33]. In these approaches, it is not clear how to deal with uncertainty, e.g., from occlusions or with pushes resulting in multiple adjacent objects moving as according to the same homogeneous transform.…”

Section: Dealing With Noise and Clutter (mentioning)

confidence: 99%

“…Bergström et al [16] combined rigid motion clues with color- and disparity clues, whereas Schiebener et al [25] validated hypotheses based on proximity and shared parametric surfaces using co-movement. Hausman et al [33] used visual features to generate hypotheses, and to reconstruct a dense model from clustered feature points.…”

Section: Combining Interaction and Visual Clues (mentioning)

confidence: 99%

“…The methods discussed in the previous paragraph used visual features to create hypotheses that were tested by interaction [16,25,33] or to modify binary potentials between parts [13]. Instead, a probabilistic approach offers a principled way to integrate noisy clues from multiple sources.…”

Section: Combining Interaction and Visual Clues (mentioning)

confidence: 99%


Probabilistic Segmentation and Targeted Exploration of Objects in Cluttered Environments

2014


Abstract: Creating robots that can act autonomously in dynamic, unstructured environments requires dealing with novel objects. Thus, an off-line learning phase is not sufficient for recognizing and manipulating such objects. Rather, an autonomous robot needs to acquire knowledge through its own interaction with its environment, without using heuristics encoding human insights about the domain. Interaction also allows information that is not present in static images of a scene to be elicited. Out of a potentially large set of possible interactions, a robot must select actions that are expected to have the most informative outcomes to learn efficiently. In the proposed bottom-up, probabilistic approach, the robot achieves this goal by quantifying the expected informativeness of its own actions in information-theoretic terms. We use this approach to segment a scene into its constituent objects. We retain a probability distribution over segmentations. We show that this approach is robust in the presence of noise and uncertainty in real-world experiments. Evaluations show that the proposed information-theoretic approach allows a robot to efficiently determine the composite structure of its environment. We also show that our probabilistic model allows straightforward integration of multiple modalities, such as movement data and static scene features. Learned static scene features allow for experience from similar environments to speed up learning for new scenes.
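The abstract describes choosing actions by their expected informativeness over a distribution of segmentations. One common way to make that concrete is expected information gain (the expected reduction in entropy of the belief). The sketch below is only an illustration under simplifying assumptions, not the paper's model: the belief is a small discrete list of segmentation hypotheses, each action has a finite set of outcomes, and the names `expected_information_gain`, `select_action`, and the toy likelihood tables are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def expected_information_gain(prior, likelihood):
    """Expected entropy reduction of the belief for one action.

    prior[h]         = P(hypothesis h)
    likelihood[h][o] = P(outcome o | hypothesis h, this action)  (assumed known)
    """
    n_hyp = len(prior)
    n_out = len(likelihood[0])
    gain = entropy(prior)
    for o in range(n_out):
        # Marginal probability of observing outcome o.
        p_o = sum(prior[h] * likelihood[h][o] for h in range(n_hyp))
        if p_o == 0:
            continue
        # Bayes update of the belief given outcome o.
        posterior = [prior[h] * likelihood[h][o] / p_o for h in range(n_hyp)]
        gain -= p_o * entropy(posterior)
    return gain


def select_action(prior, actions):
    """Pick the action name with the largest expected information gain."""
    return max(actions, key=lambda a: expected_information_gain(prior, actions[a]))


# Toy scene: hypothesis 0 = "parts A and B form one object",
#            hypothesis 1 = "A and B are separate objects".
prior = [0.5, 0.5]
actions = {
    # Outcomes: (B moves along, B stays). Informative about the hypotheses.
    "push_A": [[0.9, 0.1], [0.2, 0.8]],
    # Pushing an unrelated part: same outcome distribution under both hypotheses.
    "push_far": [[0.5, 0.5], [0.5, 0.5]],
}
best = select_action(prior, actions)
```

In this toy setup, pushing part A is preferred because its outcome distribution differs between the two hypotheses, whereas the unrelated push yields no expected gain.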


The Importance of Structure

Kragić

2016

Springer Tracts in Advanced Robotics


On-line learning of temporal state models for flexible objects

Bergström, Kragić, et al. 2012

2012 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012)


Abstract: State estimation and control are intimately related processes in robot handling of flexible and articulated objects. While for rigid objects, we can generate a CAD model beforehand and a state estimation boils down to estimation of pose or velocity of the object, in case of flexible and articulated objects, such as a cloth, the representation of the object's state is heavily dependent on the task and execution. For example, when folding a cloth, the representation will mainly depend on the way the folding is executed. In this paper, we address the problem of learning a temporal object model from observations generated during task execution. We use the case of dynamic cloth folding as a proof-of-concept for our methodology. In cloth folding, the most important information is contained in the temporal structure of the data requiring appropriate representation of the observations, fast state estimation and a suitable prediction mechanism. Our approach is realized through efficient implementation of feature extraction and a generative process model, exploiting recent hardware advances in conjunction with principled probabilistic models. The model is capable of representing the temporal structure of the data and it is robust to noise in the observations. We present results exploiting our model to classify the success of a folding action.
