Fully articulated hand tracking promises to enable fundamentally new interactions with virtual and augmented worlds, but the limited accuracy and efficiency of current systems have prevented widespread adoption. Today's dominant paradigm uses machine learning for initialization and recovery, followed by iterative model-fitting optimization to achieve a detailed pose fit. We follow this paradigm but make several changes to the model fitting, namely using: (1) a more discriminative objective function; (2) a smooth-surface model that provides gradients for non-linear optimization; and (3) joint optimization over both the model pose and the correspondences between observed data points and the model surface. While each of these changes may actually increase the cost per fitting iteration, we find a compensating decrease in the number of iterations. Further, the wide basin of convergence means that fewer starting points are needed for successful model fitting. Our system runs in real time on the CPU only, which frees up the commonly over-burdened GPU for experience designers. The hand tracker is efficient enough to run on low-power devices such as tablets. We can track up to several meters from the camera to provide a large working volume for interaction, even using the noisy data from current-generation depth cameras. Quantitative assessments on standard datasets show that the new approach exceeds the state of the art in accuracy. Qualitative results take the form of live recordings of a range of interactive experiences enabled by this new approach.
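To make point (3) concrete, the following is a minimal toy sketch of joint optimization over model parameters and correspondences. It is not the authors' tracker: the "model" here is a 2D circle standing in for a smooth surface, the correspondences are per-point angles on that circle, and the optimizer is plain gradient descent. The function names, step size, iteration count, and synthetic data are all assumptions made for illustration.

```python
import math

def make_circle_points(cx, cy, r, n):
    """Noise-free samples on a circle (synthetic data, an assumption of this sketch)."""
    return [(cx + r * math.cos(2 * math.pi * k / n),
             cy + r * math.sin(2 * math.pi * k / n)) for k in range(n)]

def energy(points, cx, cy, r, us):
    """Sum of squared distances between each data point and its model point."""
    return sum((cx + r * math.cos(u) - px) ** 2 + (cy + r * math.sin(u) - py) ** 2
               for (px, py), u in zip(points, us))

def fit_circle_joint(points, iters=5000, lr=0.01):
    """Gradient descent over the model (cx, cy, r) AND the correspondences us together."""
    cx, cy, r = 0.0, 0.0, 1.0  # deliberately poor starting guess
    # One correspondence per data point: an angle locating its match on the circle.
    us = [math.atan2(py - cy, px - cx) for px, py in points]
    for _ in range(iters):
        g_cx = g_cy = g_r = 0.0
        g_us = []
        for (px, py), u in zip(points, us):
            c, s = math.cos(u), math.sin(u)
            ex = cx + r * c - px  # residual between model point and data point
            ey = cy + r * s - py
            g_cx += 2 * ex
            g_cy += 2 * ey
            g_r += 2 * (ex * c + ey * s)
            # The smooth model gives a gradient with respect to the correspondence too.
            g_us.append(2 * r * (-ex * s + ey * c))
        # Single simultaneous update of model parameters and correspondences.
        cx -= lr * g_cx
        cy -= lr * g_cy
        r -= lr * g_r
        us = [u - lr * g for u, g in zip(us, g_us)]
    return cx, cy, r, us

if __name__ == "__main__":
    pts = make_circle_points(2.0, -1.0, 3.0, 12)
    cx, cy, r, us = fit_circle_joint(pts)
    print(cx, cy, r, energy(pts, cx, cy, r, us))
```

An alternating scheme in the style of iterative closest point would instead hold `us` fixed while updating the model, then recompute `us` by closest-point search. The joint update above is only possible because the model is differentiable in the correspondence variable, which is the role the abstract assigns to the smooth-surface model.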
Figure 1: AutoCollage automatically creates a collage of representative elements from a set of images. Novel and desirable properties include: boundaries between images are appropriately positioned; there is little duplication of material; small and meaningless image fragments are avoided; faces are preserved whole; blends may either cut along natural boundaries or be transparent, decided automatically.
Abstract
The paper defines an automatic procedure for constructing a visually appealing collage from a collection of input images. The aim is that the resulting collage should be representative of the collection, summarising its main themes. It is also assembled largely seamlessly, using graph-cut, Poisson blending of alpha-masks, to hide the joins between input images. This paper makes several new contributions. Firstly, we show how energy terms can be included that encourage the selection of a representative set of images, that are sensitive to particular object classes, and that encourage a spatially efficient and seamless layout. Secondly, the resulting optimization poses a search problem that, on the face of it, is computationally infeasible. Rather than attempt an expensive, integrated optimization procedure, we have developed a sequence of optimization steps: static ranking of images, region-of-interest optimization, optimal packing by constraint satisfaction, and lastly graph-cut alpha-expansion. To illustrate the power of AutoCollage, we have used it to create collages of many home photo sets; we also conducted a user study in which AutoCollage outperformed competitive methods.