Recognizing manipulations performed by a human, and transferring and executing them on a robot, is a difficult problem. We address this in the current study by introducing a novel representation of the relations between objects at decisive time points during a manipulation. Thereby, we encode the essential changes in a visual scene in a condensed way such that a robot can recognize and learn a manipulation without prior object knowledge. To achieve this, we continuously track image segments in the video and construct a dynamic graph sequence. Topological transitions of these graphs occur whenever a spatial relation between some segments changes in a discontinuous way, and these moments are stored in a transition matrix called the semantic event chain (SEC). We demonstrate that these time points are highly descriptive for distinguishing between different manipulations. Employing simple substring-search algorithms, semantic event chains can be compared, and type-similar manipulations can be recognized with high confidence. As the approach is generic, statistical learning can be used to find the archetypal SEC of a given manipulation class. The performance of the algorithm is demonstrated on a set of real videos showing hands manipulating various objects and performing different actions. In experiments with a robotic arm, we show that an SEC can be learned by observing human manipulations, transferred to a new scenario, and then reproduced by the machine.
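The comparison step can be illustrated with a minimal sketch: each event-chain row is treated as a string of spatial-relation symbols and compared via the longest common substring. The symbol alphabet and the normalization below are illustrative assumptions, not the paper's exact encoding.

```python
from difflib import SequenceMatcher

def row_similarity(row_a, row_b):
    """Similarity of two event-chain rows via their longest common
    substring, normalized by the longer row's length."""
    match = SequenceMatcher(None, row_a, row_b).find_longest_match(
        0, len(row_a), 0, len(row_b))
    return match.size / max(len(row_a), len(row_b))

def sec_similarity(sec_a, sec_b):
    """Greedy row-wise comparison of two SECs, each given as a list of
    strings over an assumed relation alphabet (e.g. 'N' = no contact,
    'T' = touching, 'O' = overlap)."""
    scores = [max(row_similarity(ra, rb) for rb in sec_b) for ra in sec_a]
    return sum(scores) / len(scores)

# Two type-similar chains and one unrelated chain (toy data)
pick_a = ["NTTN", "NNTT"]
pick_b = ["NTTN", "NNTT"]
other  = ["TTTT", "NNNN"]
print(sec_similarity(pick_a, pick_b))  # 1.0
print(sec_similarity(pick_a, other))   # 0.5
```

Normalizing by row length keeps the score in [0, 1], so type-similar manipulations of different durations can still be matched against a common threshold.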
Abstract-In this work we introduce a novel approach for detecting spatiotemporal object-action relations, leading to both action recognition and object categorization. Semantic scene graphs are extracted from image sequences and used to find the characteristic main graphs of the action sequence via an exact graph-matching technique, thus providing an event table of the action scene, which allows extracting object-action relations. The method is applied to several artificial and real action scenes containing limited context. The central novelty of this approach is that it is model-free and needs no a priori representation of either objects or actions. Essentially, actions are recognized without requiring prior object knowledge, and objects are categorized solely based on their exhibited role within an action sequence. Thus, this approach is grounded in the affordance principle, which has recently attracted much attention in robotics and provides a way forward for trial-and-error learning of object-action relations through repeated experimentation. It may therefore be useful for recognition and categorization tasks, for example in imitation learning in developmental and cognitive robotics.
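Because semantic scene graphs contain only a handful of segment nodes, exact matching is tractable even by brute force. The sketch below checks two small undirected graphs for isomorphism by permutation search; it is an illustrative stand-in, not necessarily the matching procedure used in the paper.

```python
from itertools import permutations

def graphs_match(adj_a, adj_b):
    """Exact matching of two small undirected graphs, given as adjacency
    matrices (lists of lists): search for a node permutation mapping one
    graph onto the other. Feasible only for the few nodes typical of a
    semantic scene graph (O(n!) permutations)."""
    n = len(adj_a)
    if n != len(adj_b):
        return False
    for perm in permutations(range(n)):
        if all(adj_a[i][j] == adj_b[perm[i]][perm[j]]
               for i in range(n) for j in range(n)):
            return True
    return False

# A triangle matches a relabeled triangle but not a 3-node path
tri  = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
tri2 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(graphs_match(tri, tri2))  # True
print(graphs_match(tri, path))  # False
```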
Supervision of long-lasting, extensive botanic experiments is a promising robotic application that recent technological advances have made feasible. Plant modelling for this application places strong demands, particularly on 3D information gathering and speed. This paper shows that Time-of-Flight (ToF) cameras achieve a good compromise between both demands, providing a suitable complement to color vision. A new method is proposed to segment plant images into their composite surface patches by combining hierarchical color segmentation with quadratic surface fitting using ToF depth data. Experimentation shows that the interpolated depth maps derived from the obtained surfaces fit the original scenes well. Moreover, candidate leaves to be approached by a measuring instrument are ranked, and robot-mounted cameras then move closer to them to validate their suitability for sampling. Some ambiguities arising from leaf overlap or occlusions are cleared up in this way. The work is a proof of concept that dense color data combined with sparse depth as provided by a ToF camera yields a good enough 3D approximation for automated plant measuring at the high throughput imposed by the application.
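The quadratic surface fitting step amounts to a linear least-squares problem per segment. A minimal sketch, with hypothetical function names and synthetic data standing in for the sparse ToF depth samples:

```python
import numpy as np

def fit_quadratic_surface(x, y, z):
    """Least-squares fit of z = a*x^2 + b*y^2 + c*x*y + d*x + e*y + f
    to sparse depth samples of one segment; linear in the coefficients."""
    A = np.column_stack([x**2, y**2, x * y, x, y, np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    return coeffs

def eval_surface(coeffs, x, y):
    """Interpolated depth at (x, y) from the fitted coefficients."""
    a, b, c, d, e, f = coeffs
    return a * x**2 + b * y**2 + c * x * y + d * x + e * y + f

# Synthetic patch: recover known coefficients from noise-free samples
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = rng.uniform(-1, 1, 200)
z = 0.3 * x**2 - 0.1 * y**2 + 0.05 * x * y + 0.2 * x - 0.4 * y + 1.0
coeffs = fit_quadratic_surface(x, y, z)
print(np.allclose(coeffs, [0.3, -0.1, 0.05, 0.2, -0.4, 1.0]))  # True
```

Evaluating the fitted surface over a segment's full pixel support yields the dense interpolated depth map described in the abstract.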
Abstract-In this paper, we present a novel multi-level procedure for finding and tracking leaves of a rosette plant, in our case tobacco plants up to three weeks old, during early growth from infrared-image sequences. This allows measuring important plant parameters, e.g. leaf growth rates, in an automatic and non-invasive manner. The procedure consists of three main stages: preprocessing, leaf segmentation, and leaf tracking. Leaf-shape models are applied to improve leaf segmentation and are further used for measuring leaf sizes and handling occlusions. Leaves typically grow radially away from the stem, a property that is exploited in our method to reduce the dimensionality of the tracking task. We successfully tested the method on infrared image sequences showing the growth of tobacco-plant seedlings up to an age of about 30 days. By robustly fitting a suitably modified autocatalytic growth model to all growth curves from plants under the same treatment, average plant growth models could be derived. Future applications of the method include plant-growth monitoring for optimizing plant production in greenhouses or plant phenotyping for plant research.
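Growth-model fitting can be sketched as follows, assuming the common logistic (autocatalytic) form; the paper's "suitably modified" model and its parameterization may differ, and the leaf-area data here are synthetic.

```python
import numpy as np
from scipy.optimize import curve_fit

def autocatalytic(t, K, A0, r):
    """Logistic (autocatalytic) growth: leaf area saturating at K,
    starting from A0, with intrinsic growth rate r."""
    return K / (1.0 + (K / A0 - 1.0) * np.exp(-r * t))

# Synthetic leaf-area curve over 30 days (arbitrary units)
t = np.linspace(0, 30, 31)
area = autocatalytic(t, K=12.0, A0=0.5, r=0.35)

# Fit the model to the curve from a rough initial guess
params, _ = curve_fit(autocatalytic, t, area, p0=[10.0, 1.0, 0.1])
K_fit, A0_fit, r_fit = params
print(K_fit, A0_fit, r_fit)
```

Averaging the fitted parameters over all plants under the same treatment would then give a treatment-level growth model, as described above; robust variants would additionally down-weight outlier curves.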
Abstract-Supervision of long-lasting, extensive botanic experiments is a promising robotic application that recent technological advances have made feasible. Plant modelling for this application places strong demands, particularly on 3D information gathering and speed. This paper shows that Time-of-Flight (ToF) cameras achieve a good compromise between both demands. A new method is proposed to segment plant images into their composite surface patches by combining a hierarchical segmentation of the infrared intensity image, provided by the ToF camera, with quadratic surface fitting using ToF depth data. Leaf models are fitted to the segments and used to find candidate leaves for probing. The candidate leaves are ranked, and the robot-mounted camera then moves closer to selected leaves to validate their suitability for sampling. Some ambiguities arising from leaf overlap or occlusions are cleared up in this way. Suitable leaves are then probed using a special cutting tool, also mounted on the robot arm. The work is a proof of concept that dense infrared data combined with sparse depth as provided by a ToF camera yields a good enough 3D approximation for automated cutting of leaf discs for experimentation purposes.
We present a real-time technique for the spatiotemporal segmentation of color/depth movies. Images are segmented using a parallel Metropolis algorithm implemented on a GPU, utilizing both color and depth information acquired with the Microsoft Kinect. Segments represent the equilibrium states of a Potts model, and tracking of segments is achieved by warping the obtained segment labels to the next frame using real-time optical flow, which reduces the number of iterations required for the Metropolis method to reach the new equilibrium state. By including depth information in the framework, true object boundaries can be found more easily, which also improves the temporal coherence of the method. The algorithm has been tested on videos of medium resolution showing human manipulations of objects. The framework provides an inexpensive visual front end for the preprocessing of videos in industrial settings and robot labs and can potentially be used in various applications.
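A minimal, serial CPU sketch of the underlying Potts-model update (the paper's version is GPU-parallel and its exact energy weighting is not specified here, so the bond function below is an assumption):

```python
import math
import random

def metropolis_sweep(labels, color, depth, q=8, beta=2.0, w_c=4.0, w_d=4.0):
    """One Metropolis sweep of a Potts model on an image grid.
    Neighboring pixels prefer equal labels; the bond strength decays
    with color and depth contrast, so depth discontinuities weaken
    bonds across object boundaries. Modifies `labels` in place."""
    h, w = len(labels), len(labels[0])
    for i in range(h):
        for j in range(w):
            new = random.randrange(q)
            d_e = 0.0
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    bond = math.exp(-w_c * abs(color[i][j] - color[ni][nj])
                                    - w_d * abs(depth[i][j] - depth[ni][nj]))
                    # energy change of flipping pixel (i, j) to `new`
                    d_e += bond * ((labels[ni][nj] == labels[i][j])
                                   - (labels[ni][nj] == new))
            if d_e <= 0 or random.random() < math.exp(-beta * d_e):
                labels[i][j] = new
    return labels

# Toy frame: left/right halves differ in both color and depth
random.seed(1)
color = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
depth = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
labels = [[random.randrange(8) for _ in range(8)] for _ in range(8)]
for _ in range(50):
    metropolis_sweep(labels, color, depth)
```

Warping the converged labels into the next frame via optical flow, as the abstract describes, starts the sweep near the new equilibrium, so far fewer iterations are needed per frame.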
The effects of disorder in external forces on the dynamical behavior of coupled nonlinear oscillator networks are studied. When driven synchronously, i.e., when all driving forces have the same phase, the networks display chaotic dynamics. We show that random phases in the driving forces result in regular, periodic network behavior. Intermediate phase disorder can produce network synchrony. Specifically, there is an optimal amount of phase disorder which induces the highest level of synchrony. These results demonstrate that the spatiotemporal structure of external influences can control chaos and lead to synchronization in nonlinear systems [3,4]. In biology, the central nervous system can be described as a complex network of oscillators [5], and cultured networks of heart cells are examples of biological structures with strong nearest-neighbor coupling [6]. In particular, the emergence of synchrony in such networks [7,8] and the control of chaos in nonlinear systems [9,10,11] have received increased attention in recent years. Disorder and noise in physical systems usually tend to destroy spatial and temporal regularity. However, in nonlinear systems the opposite effect is often found, and intrinsically disordered processes, such as thermal fluctuations or mechanically randomized scattering, lead to surprisingly ordered patterns [12]. For instance, in the phenomenon of stochastic resonance, the presence of noise can improve the ability of a system to transfer information reliably [13]. Some time ago, Braiman et al. studied one- (1D) and two-dimensional (2D) coupled arrays of forced, damped, nonlinear pendula [14]. They found that when a certain amount of disorder was introduced by randomizing the lengths of the pendula, the dynamics of the array ceased to be chaotic. Instead, they observed complex, yet regular, spatiotemporal patterns.
Further studies of the same system showed that chaos in the array of oscillators can also be tamed by impurities [15] and that random shortcuts between the pendula lead to synchronization of the array [16]. Here, we introduce disorder by modifying the driving forces of the oscillators through phase differences. We observe the emergence of regular, phase-locked dynamics. Moreover, for intermediate spreads of the phase angles in the driving forces, we find that the oscillations become largely synchronous. We focus our numerical analysis on arrays of forced, damped, nonlinear pendula. The 1D array (chain) is described by the equation of motion

ml² θ̈_n + γ θ̇_n + mgl sin θ_n = τ′ + τ cos(ωt + φ_n) + κ (θ_{n+1} − 2θ_n + θ_{n−1}).

In order to consider a 2D lattice, we introduce an additional index, θ_n → θ_{n,m}, n, m = 1, 2, …, N, and modify the coupling term accordingly, κ (θ_{n+1,m} + θ_{n−1,m} + θ_{n,m+1} + θ_{n,m−1} − 4θ_{n,m}). For both the 1D and 2D case, we choose free boundary conditions, i.e., θ_0 = θ_1 and θ_{N+1} = θ_N. The parameter values used are the same as in previous studies [14,15,16]: the mass of the pendulum bob is m = 1, the length l = 1, the acceleration due to gravity g = 1, the damping γ = 0.75, the d.c. torque τ′ = 0.7155, the a.c. torque τ = 0.4, the angular frequency ω = 0.25, and the coup...

*Electronic address: sbrandt@physics.wustl.edu
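The 1D chain with per-pendulum driving phases can be integrated directly. The sketch below uses the parameter values quoted above; the coupling constant is truncated in the text, so κ = 0.5 here is an assumed illustrative value.

```python
import numpy as np

# Parameters quoted in the text; kappa is an assumed illustrative value
m, l, g = 1.0, 1.0, 1.0
gamma, tau_dc, tau_ac, omega = 0.75, 0.7155, 0.4, 0.25
kappa, N = 0.5, 16

def rhs(t, state, phases):
    """Driven, damped pendulum chain with per-pendulum driving phases
    and free boundary conditions. state = (theta_1..N, thetadot_1..N)."""
    theta, v = state[:N], state[N:]
    # free boundaries: pad with end values so end coupling involves one neighbor
    th = np.concatenate(([theta[0]], theta, [theta[-1]]))
    coupling = kappa * (th[2:] - 2.0 * theta + th[:-2])
    acc = (-gamma * v - m * g * l * np.sin(theta)
           + tau_dc + tau_ac * np.cos(omega * t + phases) + coupling) / (m * l * l)
    return np.concatenate((v, acc))

def integrate(phases, t_end=20.0, dt=0.01):
    """Fixed-step RK4 integration from rest."""
    state = np.zeros(2 * N)
    t = 0.0
    while t < t_end:
        k1 = rhs(t, state, phases)
        k2 = rhs(t + dt / 2, state + dt / 2 * k1, phases)
        k3 = rhs(t + dt / 2, state + dt / 2 * k2, phases)
        k4 = rhs(t + dt, state + dt * k3, phases)
        state = state + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
    return state

# Random driving phases: the disordered case studied in the text
rng = np.random.default_rng(0)
random_phases = rng.uniform(0.0, 2.0 * np.pi, N)
final = integrate(random_phases)
```

Sweeping the spread of the phase distribution from zero (synchronous driving) to 2π and recording a synchrony measure of the resulting trajectories reproduces the comparison described above.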
Abstract-Scene understanding is a necessary prerequisite for robots acting autonomously in complex environments. Low-cost RGB-D cameras such as the Microsoft Kinect have enabled new methods for analyzing indoor scenes and are now ubiquitous in indoor robotics. We investigate strategies for efficient pixelwise object-class labeling of indoor scenes that combine pretrained semantic features, transferred from a large color-image dataset, with geometric features computed relative to the room structure, including a novel distance-from-wall feature, which encodes the proximity of scene points to a detected major wall of the room. We evaluate our approach on the popular NYU v2 dataset. Several deep learning models are tested, designed to exploit different characteristics of the data, including feature learning with two different pooling sizes. Our results indicate that combining semantic and geometric features yields significantly improved results for the task of object-class segmentation.
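Once a major wall plane has been detected, the distance-from-wall feature reduces to a signed point-to-plane distance per scene point. A minimal sketch, assuming the wall is given as a point and unit normal (plane detection itself is not shown):

```python
import numpy as np

def distance_from_wall(points, wall_point, wall_normal):
    """Signed distance of each 3-D scene point to a wall plane given by
    a point on the plane and its normal; a sketch of the
    distance-from-wall feature, computed per pixel's back-projected point."""
    n = np.asarray(wall_normal, dtype=float)
    n = n / np.linalg.norm(n)
    return (np.asarray(points, dtype=float) - wall_point) @ n

# Two points at 1.0 m and 2.5 m in front of a wall through the origin
pts = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 2.5]])
d = distance_from_wall(pts,
                       wall_point=np.array([0.0, 0.0, 0.0]),
                       wall_normal=np.array([0.0, 0.0, 1.0]))
print(d)  # distances 1.0 and 2.5
```

Stacking this scalar per pixel alongside the pretrained semantic feature maps gives the combined semantic-plus-geometric input evaluated above.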