This paper proposes a new dataset, Frames, composed of 1369 human-human dialogues with an average of 15 turns per dialogue. This corpus contains goal-oriented dialogues between users who are given some constraints to book a trip and assistants who search a database to find appropriate trips. The users exhibit complex decision-making behaviour that involves comparing trips, exploring different options, and selecting among the trips that were discussed during the dialogue. To drive research on dialogue systems towards handling such behaviour, we have annotated and released the dataset and we propose in this paper a task called frame tracking. This task consists of keeping track of different semantic frames throughout each dialogue. We propose a rule-based baseline and analyse the frame tracking task through this baseline.
Conditional text-to-image generation is an active area of research, with many possible applications. Existing research has primarily focused on generating a single image from available conditioning information in one step. One practical extension beyond one-step generation is a system that generates an image iteratively, conditioned on ongoing linguistic input or feedback. This is significantly more challenging than one-step generation tasks, as such a system must understand the contents of its generated images with respect to the feedback history, the current feedback, and the interactions among concepts present in the feedback history. In this work, we present a recurrent image generation model which takes into account both the generated output up to the current step and all past instructions for generation. We show that our model is able to generate the background, add new objects, and apply simple transformations to existing objects. We believe our approach is an important step toward interactive generation. Code and data are available at: https://www.microsoft.com/en-us/research/project/generative-neural-visual-artist-geneva/.
In this paper, we propose to use deep policy networks which are trained with an advantage actor-critic method for statistically optimised dialogue systems. First, we show that, on summary state and action spaces, deep Reinforcement Learning (RL) outperforms Gaussian Process methods. Summary state and action spaces lead to good performance but require pre-engineering effort, RL knowledge, and domain expertise. In order to remove the need to define such summary spaces, we show that deep RL can also be trained efficiently on the original state and action spaces. Dialogue systems based on partially observable Markov decision processes are known to require many dialogues to train, which makes them unappealing for practical deployment. We show that a deep RL method based on an actor-critic architecture can exploit a small amount of data very efficiently. Indeed, with only a few hundred dialogues collected with a handcrafted policy, the actor-critic deep learner is considerably bootstrapped from a combination of supervised learning and batch RL. In addition, convergence to an optimal policy is significantly sped up compared to other deep RL methods initialized on the data with batch RL. All experiments are performed on a restaurant domain derived from the Dialogue State Tracking Challenge 2 (DSTC2) dataset.
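As a rough illustration of the advantage actor-critic idea named in this abstract, the sketch below performs a single update for one state with a softmax policy. It is a minimal tabular toy under stated assumptions, not the paper's deep policy network: the function names `softmax` and `a2c_step`, the one-step temporal-difference advantage, and the learning rates are all hypothetical choices made for the example.

```python
import math


def softmax(logits):
    """Turn action preferences into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def a2c_step(logits, v_s, v_next, action, reward,
             gamma=0.99, lr_actor=0.1, lr_critic=0.1, done=False):
    """One advantage actor-critic update for a single transition.

    Assumption: the advantage is the one-step TD error
    A = r + gamma * V(s') - V(s); other estimators are possible.
    """
    target = reward + (0.0 if done else gamma * v_next)
    advantage = target - v_s

    # Actor: for a softmax policy, d log pi(a|s) / d logit_i = 1[i == a] - pi_i.
    probs = softmax(logits)
    new_logits = [
        logit + lr_actor * advantage * ((1.0 if i == action else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]

    # Critic: move V(s) toward the bootstrapped target.
    new_v = v_s + lr_critic * advantage
    return new_logits, new_v, advantage
```

A positive advantage raises the preference for the action that was taken and lowers the others; a negative advantage does the opposite.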
An automated method for root system architecture reconstruction from three-dimensional volume data sets obtained from magnetic resonance imaging (MRI) was developed and validated with a three-dimensional semimanual reconstruction using virtual reality and a two-dimensional reconstruction using SmartRoot. It was tested on the basis of an MRI image of a 25-d-old lupin (Lupinus albus L.) grown in natural sand with a resolution of 0.39 by 0.39 by 1.1 mm. The automated reconstruction algorithm was inspired by methods for blood vessel detection in MRI images. It describes the root system by a hierarchical network of nodes, which are connected by segments of defined length and thickness, and also allows the calculation of root parameter profiles such as root length, surface, and apex density. The obtained root system architecture (RSA) varied in number of branches, segments, and connectivity of the segments but did not vary in the average diameter of the segments (0.137 cm for semimanual and 0.143 cm for automatic RSA), total root surface (127 cm² for semimanual and 124 cm² for automatic RSA), total root length (293 cm for semimanual and 282 cm for automatic RSA), and total root volume (4.7 cm³ for semimanual and 4.7 cm³ for automatic RSA). The difference in performance of the automated and semimanual reconstructions was checked by using the root system as input for water uptake modeling with the Doussan model. Both systems worked well and allowed for continuous water flow. Slight differences in the connectivity appeared to be leading to locally different water flow velocities, which were 30% smaller for the semimanual method. Abbreviations: MRI, magnetic resonance imaging; RSA, root system architecture.
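To make the root parameters in this abstract concrete, the sketch below computes total length, lateral surface, and volume from a list of segments. It assumes each segment is a straight circular cylinder, which is a simplification chosen for the example; the `Segment` class and the `root_totals` function are hypothetical names and are not taken from the published algorithm.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    """One root segment, modelled here as a straight cylinder (assumption)."""
    length_cm: float
    diameter_cm: float


def root_totals(segments):
    """Return (total length, lateral surface, volume) in cm, cm^2, cm^3."""
    total_length = sum(s.length_cm for s in segments)
    # Lateral surface of a cylinder: pi * d * L (end caps ignored).
    total_surface = sum(math.pi * s.diameter_cm * s.length_cm for s in segments)
    # Volume of a cylinder: pi * (d / 2)^2 * L.
    total_volume = sum(
        math.pi * (s.diameter_cm / 2.0) ** 2 * s.length_cm for s in segments
    )
    return total_length, total_surface, total_volume
```

Calling `root_totals([Segment(10.0, 0.2), Segment(5.0, 0.1)])` sums the contributions of both segments; profiles over depth would additionally need each segment's position, which this sketch leaves out.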
We propose a real-time approach to learning semantic maps from moving RGB-D cameras. Our method models geometry, appearance, and semantic labeling of surfaces. We recover camera pose using simultaneous localization and mapping while concurrently recognizing and segmenting object classes in the images. Our object-class segmentation approach is based on random decision forests and yields a dense probabilistic labeling of each image. We implemented it on a GPU to achieve a high frame rate. The probabilistic segmentation is fused in octree-based 3D maps within a Bayesian framework. In this way, image segmentations from various viewpoints are integrated within a 3D map, which improves segmentation quality. We evaluate our system on a large benchmark dataset and demonstrate state-of-the-art recognition performance of our object-class segmentation and semantic mapping approaches.
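The Bayesian fusion of per-view class probabilities described in this abstract can be sketched as a recursive update per map cell. The example below makes several assumptions that are not from the paper: views are treated as conditionally independent, each cell starts from a uniform belief, and a plain dictionary keyed by voxel index stands in for the octree. The names `fuse` and `SemanticVoxelMap` are hypothetical.

```python
import math
from collections import defaultdict


def fuse(belief, observation):
    """Recursive Bayesian update: multiply class-wise, then renormalise."""
    product = [b * o for b, o in zip(belief, observation)]
    total = sum(product)
    if total == 0.0:  # contradictory evidence: keep the previous belief
        return list(belief)
    return [p / total for p in product]


class SemanticVoxelMap:
    """Per-voxel class beliefs; a dict replaces the octree for brevity."""

    def __init__(self, num_classes, voxel_size=0.05):
        self.voxel_size = voxel_size
        uniform = [1.0 / num_classes] * num_classes
        self.beliefs = defaultdict(lambda: list(uniform))

    def _key(self, point):
        # Integer voxel index of a 3D point given in metres.
        return tuple(math.floor(c / self.voxel_size) for c in point)

    def integrate(self, point, class_probs):
        """Fuse one pixel's class distribution into the voxel it falls in."""
        key = self._key(point)
        self.beliefs[key] = fuse(self.beliefs[key], class_probs)

    def most_likely_class(self, point):
        belief = self.beliefs[self._key(point)]
        return max(range(len(belief)), key=belief.__getitem__)
```

Two observations of the same voxel that both favour one class yield a belief more confident than either view alone, which is the effect the abstract attributes to integrating segmentations from several viewpoints.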