We propose a novel deep learning framework for bidirectional translation between robot actions and their linguistic descriptions. Our model consists of two recurrent autoencoders (RAEs). One RAE learns to encode action sequences as fixed-dimensional vectors such that its decoder can reproduce the sequences from those vectors. The other RAE learns to encode descriptions in the same way. During training, in addition to the reconstruction losses, we introduce a binding loss that draws the representations of an action and its corresponding description toward each other in the latent vector space. Through this shared representation, the trained model can produce a linguistic description of a given robot action. The model can also generate an appropriate action from a linguistic instruction, conditioned on the current visual input. Visualization of the latent representations shows that, by being learned jointly with descriptions, the robot actions are embedded in the vector space in a semantically compositional way. Index Terms: Deep learning in robotics and automation, AI-based methods, neurorobotics.
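The joint objective described in this abstract (two reconstruction losses plus a term that pulls paired latent vectors together) can be sketched as follows. This is a minimal NumPy illustration under our own assumptions; the function name, the mean-squared reconstruction terms, and the weighting `alpha` are a simplification for exposition, not the paper's exact formulation:

```python
import numpy as np

def combined_loss(action_seq, desc_seq, z_action, z_desc,
                  recon_action, recon_desc, alpha=1.0):
    """Illustrative joint loss for two recurrent autoencoders:
    each RAE reconstructs its own sequence, and a binding term
    pulls the two fixed-dimensional latent vectors together."""
    l_rec_action = np.mean((action_seq - recon_action) ** 2)
    l_rec_desc = np.mean((desc_seq - recon_desc) ** 2)
    # Binding term: squared distance between the action latent
    # and the description latent in the shared vector space.
    l_bind = np.sum((z_action - z_desc) ** 2)
    return l_rec_action + l_rec_desc + alpha * l_bind
```

With perfect reconstructions and identical latents the loss is zero; any mismatch between the two latent vectors adds a penalty that, during training, would push the paired representations toward each other.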
To work cooperatively with humans by using language, robots must not only acquire a mapping between language and their behavior but also autonomously use that mapping in the appropriate contexts of interactive tasks online. To this end, we propose a novel learning method that links language to robot behavior by means of a recurrent neural network. In this method, the network learns from correct examples of the imposed task that are given not as explicitly separated sets of language and behavior but as sequential data constructed from the actual temporal flow of the task. As a result, the internal dynamics of the network model both language–behavior relationships and the temporal patterns of interaction. Here, “internal dynamics” refers to the time development of the system defined on the fixed-dimensional space of the internal states of the context layer. Thus, in the execution phase, by constantly representing where it is in the interaction context as its current state, the network autonomously switches between recognition and generation phases without any explicit cues and uses the acquired mapping in the appropriate contexts. To evaluate our method, we conducted an experiment in which a robot generates appropriate behavior in response to a human’s linguistic instruction. After learning, the network formed an attractor structure in its internal dynamics that represents both the language–behavior relationships and the task’s temporal pattern. In these dynamics, the language–behavior mapping was achieved by a branching structure, the repetition of a human’s instruction and the robot’s behavioral response was represented as a cyclic structure, and waiting for a subsequent instruction was represented as a fixed-point attractor. Thanks to this structure, the robot was able to interact online with a human on the given task by autonomously switching phases.
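The fixed-point attractor mentioned above, where the network settles into a stable waiting state between instructions, can be illustrated with a toy recurrence. The contractive weight matrix and the simple tanh update here are our own stand-ins for exposition; the trained network's context-layer dynamics are learned and far richer:

```python
import numpy as np

def context_update(x, W):
    """One step of a generic context-layer recurrence
    (illustrative; not the paper's trained dynamics)."""
    return np.tanh(W @ x)

# With a contractive weight matrix (spectral radius < 1), repeated
# updates converge to a fixed point -- the kind of attractor that
# can represent "waiting for the next instruction".
rng = np.random.default_rng(0)
W = 0.5 * np.eye(4)          # contractive by construction
x = rng.standard_normal(4)    # arbitrary initial context state
for _ in range(100):
    x = context_update(x, W)
# x has now converged to the fixed point at the origin.
```

In the trained network, by contrast, such attractors coexist with branching and cyclic structures, so the state can leave the waiting region when a new instruction arrives.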
An unstretched α-form specimen of polarised PVDF shows four TSC peaks, designated P1, P2, P3 and P4 in ascending order of temperature. The P1 and P2 peaks are associated with dipolar depolarisation due to the αa and αe relaxations, respectively. The P3 and P4 peaks may both be attributed to interfacial polarisations formed by carriers trapped in the surface regions of PVDF crystals. The P3 peak shows a tendency to saturate in magnitude and to shift in peak temperature with poling field; the latter can be explained by the field-lowering of trap depth (the Poole–Frenkel effect). The P4 peak appears only in specimens polarised at high fields, which are expected to cause a field-induced structural change from the nonpolar α-form to the polar α-form. The polar β-form specimen also shows a large TSC peak corresponding to the P4 peak in the α-form specimen. Therefore, the P3 and P4 peaks may be associated with the nonpolar and polar α-form crystals, respectively.