Maximilian Panzner scite author profile

2016

Abstract. In this paper we are concerned with learning models of actions and compare a purely generative model based on Hidden Markov Models to a discriminatively trained recurrent LSTM network in terms of their properties and their suitability to learn and represent models of actions. Specifically, we compare the performance of the two models regarding the overall classification accuracy, the number of training sequences required and how early in the progression of a sequence they are able to correctly classify the corresponding sequence. We show that, despite the current trend towards (deep) neural networks, traditional graphical model approaches are still beneficial under conditions where only a few data points or limited computing power are available.
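As a rough illustration of the generative side of this comparison, the sketch below classifies a sequence by scoring it under one HMM per action class with the scaled forward algorithm and picking the most likely class; running it on growing prefixes shows how early a decision becomes possible. The two toy models, their parameters and the names `MODELS`, `classify` and `prefix_predictions` are assumptions made for illustration and are not taken from the paper.

```python
import math

def forward_loglik(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under one HMM,
    computed with the scaled forward algorithm."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik

def classify(obs, models):
    """Return the label of the HMM that assigns the highest likelihood."""
    return max(models, key=lambda name: forward_loglik(obs, *models[name]))

def prefix_predictions(obs, models):
    """Predicted label after seeing the first 1, 2, ..., len(obs) symbols."""
    return [classify(obs[:t], models) for t in range(1, len(obs) + 1)]

# Hypothetical 2-state HMMs over the symbols {0, 1}: (start, trans, emit).
MODELS = {
    "mostly_zeros": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.8, 0.2]]),
    "mostly_ones": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.2, 0.8]]),
}

if __name__ == "__main__":
    print(prefix_predictions([0, 0, 1, 0, 0], MODELS))
```

Because each class has its own model, adding a new action only requires fitting one more HMM, which is one reason such models can remain practical when training data are scarce.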

Learning linguistic constructions grounded in qualitative action models

Gaspers

2015

Abstract. Aiming at the design of adaptive artificial agents which are able to learn autonomously from experience and human tutoring, in this paper we present a system for learning syntactic constructions grounded in perception. These constructions are learned from examples of natural language utterances and parallel performances of actions, i.e. their trajectories and involved objects. From the input, the system learns linguistic structures and qualitative action models. Action models are represented as Hidden Markov Models over sequences of qualitative relations between a trajector and a landmark and abstract away from concrete action trajectories. Learning of action models is driven by linguistic observations, and linguistic patterns are, in turn, grounded in learned action models. The proposed system is applicable for both language understanding and language generation. We present empirical results, showing that the learned action models generalize well over concrete instances of the same action and also to novel performers, while allowing accurate discrimination between different actions. Further, we show that the system is able to describe novel dynamic scenes and to understand novel utterances describing such scenes.

A multimodal corpus for the evaluation of computational models for (grounded) language acquisition

Gaspers, Panzner, Lemme et al.

2014

This paper describes the design and acquisition of a German multimodal corpus for the development and evaluation of computational models for (grounded) language acquisition and algorithms enabling corresponding capabilities in robots. The corpus contains parallel data from multiple speakers/actors, including speech, visual data from different perspectives and body posture data. The corpus is designed to support the development and evaluation of models learning rather complex grounded linguistic structures, e.g. syntactic patterns, from sub-symbolic input. It provides moreover a valuable resource for evaluating algorithms addressing several other learning processes, e.g. concept formation or acquisition of manipulation skills. The corpus will be made available to the public.

A deep reinforcement learning based model supporting object familiarization

2017

An important ability of cognitive systems is the ability to familiarize themselves with the properties of objects and their environment as well as to develop an understanding of the consequences of their own actions on physical objects. Devising developmental approaches that allow cognitive systems to familiarize themselves with objects in this sense via guided self-exploration is an important challenge within the field of developmental robotics. In this paper we present a novel approach that allows cognitive systems to familiarize themselves with the properties of objects and the effects of their actions on them in a self-exploration fashion. Our approach is inspired by developmental studies that hypothesize that infants have a propensity to systematically explore the connection between their own actions and their perceptual consequences in order to support intermodal calibration of their bodies. We propose a reinforcement-based approach operating in a continuous state space in which the function predicting cumulative future rewards is learned via a deep Q-network. We investigate the impact of the structure of rewards, the impact of different regularization approaches as well as the impact of different exploration strategies.
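The core quantities behind Q-learning with function approximation can be sketched in a few lines. The example below uses a linear stand-in for the Q-network (an assumption made to keep the sketch dependency-free; the paper uses a deep Q-network), together with the standard temporal-difference target and an epsilon-greedy exploration rule. All function names, the discount factor and the learning rate are hypothetical choices, not values from the paper.

```python
import random

def q_values(weights, state):
    """Linear stand-in for the Q-network: one weight vector per action."""
    return [sum(w * x for w, x in zip(w_a, state)) for w_a in weights]

def td_target(reward, next_q_values, gamma=0.9, done=False):
    """Temporal-difference target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)

def epsilon_greedy(q, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q))
    return max(range(len(q)), key=lambda a: q[a])

def sgd_step(weights, state, action, target, lr=0.1):
    """One gradient step on 0.5 * (target - Q(s, a))^2 for the taken action.
    Updates `weights` in place and returns the TD error before the step."""
    error = target - q_values(weights, state)[action]
    weights[action] = [w + lr * error * x for w, x in zip(weights[action], state)]
    return error

if __name__ == "__main__":
    weights = [[0.0, 0.0], [0.0, 0.0]]   # two actions, two state features
    state, next_state = [1.0, 0.0], [0.0, 1.0]
    action = epsilon_greedy(q_values(weights, state), epsilon=0.1)
    target = td_target(1.0, q_values(weights, next_state))
    print(sgd_step(weights, state, action, target), weights)
```

Varying `epsilon` or replacing `epsilon_greedy` with another rule is one simple way to compare exploration strategies in a setup of this kind.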

Modeling the Co-Emergence of Linguistic Constructions and Action Concepts: The Case of Action Verbs

Gaspers

IEEE Trans. Cogn. Dev. Syst.

2019

In this paper, we are concerned with understanding how linguistic and conceptual structures co-emerge, shaping and influencing each other. Most theories and models of language acquisition so far have adopted a 'mapping' paradigm according to which novel words or constructions are 'mapped' onto existing, previously acquired or innate concepts. Departing from this mapping approach, we present a computational model of the co-emergence of linguistic and conceptual structures. We focus in particular on the case of action verbs and develop a model by which a system can learn the grounded meaning of a verbal construction without assuming the prior existence of a corresponding sensorimotorically grounded action concept. Our model spells out how a learner can distill the essence of the meaning of a verbal construction as a process of incremental generalization of the meaning of action verbs, starting from a meaning that is specific to a certain situation in which the verb has been encountered. We understand the meaning of verbs as evoking a grounded simulation rather than a static concept and propose to capture the meaning of verbs via generative statistical models that support simulation, in our case Hidden Markov Models. Statistical models can represent the essence of a verb's meaning while modelling uncertainty and thus variation at the surface level of (observed) action performances. We show that by extending an existing framework for construction learning, our approach can account for the co-emergence of linguistic and conceptual structures. We provide a proof of concept for our model by experimentally evaluating it on matching, choice and generation tasks, showing that our model can not only understand but also produce language.
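The idea that a generative model "supports simulation" can be made concrete by sampling from an HMM: each draw yields one plausible surface realization of the underlying action. The sketch below is a minimal illustration with a hypothetical two-state model; the parameters and the function name `sample_hmm` are assumptions and do not come from the paper.

```python
import random

def sample_hmm(start, trans, emit, length, rng=None):
    """Draw one observation sequence of the given length from a discrete HMM.
    `start`, `trans` and `emit` are probability tables indexed by state."""
    rng = rng or random.Random()
    states = range(len(start))
    symbols = range(len(emit[0]))
    observations = []
    state = rng.choices(states, weights=start, k=1)[0]
    for _ in range(length):
        # Emit a symbol from the current state, then move to the next state.
        observations.append(rng.choices(symbols, weights=emit[state], k=1)[0])
        state = rng.choices(states, weights=trans[state], k=1)[0]
    return observations

if __name__ == "__main__":
    # Hypothetical left-to-right model: state 0 tends to emit symbol 0,
    # state 1 tends to emit symbol 1, and state 1 is absorbing.
    start = [1.0, 0.0]
    trans = [[0.7, 0.3], [0.0, 1.0]]
    emit = [[0.9, 0.1], [0.2, 0.8]]
    print(sample_hmm(start, trans, emit, length=8, rng=random.Random(0)))
```

Repeated sampling produces sequences that differ in surface detail while sharing the same underlying state structure, which is the kind of variation the abstract attributes to observed action performances.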