This paper describes a technique for conducting multiparameter experiments so that the number of data points investigated is reduced to a minimum. The method is based upon the observation that human responses to psychophysiological inputs are lawful rather than random, and hence can be predicted from mathematical equations. The procedure is to: (a) collect data on human responses at a few points in the experimental matrix, (b) fit these data with a low-order polynomial, using a computer program to evaluate the coefficients of the equation from the collected data points, and (c) use the fitted equation to predict, by computer, the values that would be observed at other data points. If the computed values are close enough to the values then observed at those points, the equation is assumed to be correct. If they are not close enough, the new data are entered into the computer and a higher-order equation is fitted by the method of least squares. The procedure is iterative and continues until the residual error between computed and observed values at all points falls below some desired value. The importance of the technique is that, in multiparameter experiments, it can reduce the necessary number of observations by several orders of magnitude compared with conventional techniques.
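The iterative procedure in steps (a) through (c) can be sketched as follows. This is a minimal illustration, not the authors' program: it assumes a single input variable for brevity (the paper concerns multiparameter experiments), and the names `observe`, `adaptive_fit`, `check_batches`, and `tol`, as well as the rule of raising the polynomial order by one per iteration, are hypothetical choices made here.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients, lowest order first."""
    n = degree + 1
    # Build the normal equations (V^T V) c = V^T y for the Vandermonde matrix V.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Solve by Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def adaptive_fit(observe, initial_xs, check_batches, tol, max_degree=5):
    """Fit a low-order polynomial, then raise the order only when new
    observations disagree with the predictions by more than tol.

    observe       -- hypothetical stand-in for running one experimental trial
    initial_xs    -- the few points measured first (step a)
    check_batches -- further points, measured one batch per iteration
    Returns (coeffs, degree, converged).
    """
    xs = list(initial_xs)
    ys = [observe(x) for x in xs]
    degree = 1
    coeffs = fit_polynomial(xs, ys, degree)              # step (b)
    for batch in check_batches:
        observed = [observe(x) for x in batch]
        worst = max(abs(predict(coeffs, x) - y)          # step (c)
                    for x, y in zip(batch, observed))
        if worst <= tol:
            return coeffs, degree, True
        # Predictions were not close enough: enter the new data and refit
        # with a higher-order equation.
        xs.extend(batch)
        ys.extend(observed)
        degree = min(degree + 1, max_degree)
        coeffs = fit_polynomial(xs, ys, degree)
    return coeffs, degree, False


# Example with a made-up quadratic response standing in for real measurements.
coeffs, degree, converged = adaptive_fit(
    observe=lambda x: 2.0 + 3.0 * x + 0.5 * x * x,
    initial_xs=[0.0, 1.0, 2.0],
    check_batches=[[3.0, 4.0], [5.0, 6.0], [7.0, 8.0]],
    tol=1e-6,
)
```

In this sketch the loop stops as soon as one batch of fresh observations is predicted within `tol`, so any remaining batches are never measured; that early stop is where the saving in observations would come from. Solving the normal equations directly is adequate for low orders but becomes ill-conditioned as the order grows, so a production version would likely use an orthogonal-decomposition solver instead.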
While Adversarial Imitation Learning (AIL) algorithms have recently led to state-of-the-art results on various imitation learning benchmarks, it is unclear what impact various design decisions have on performance. To this end, we present an organizing, modular framework called Reinforcement-learning-based Adversarial Imitation Learning (RAIL) that encompasses and generalizes a popular subclass of existing AIL approaches. Using the view espoused by RAIL, we create two new Imitation from Observation (IfO) algorithms, which we term SAIfO (SAC-based Adversarial Imitation from Observation) and SILEM (Skeletal Feature Compensation for Imitation Learning with Embodiment Mismatch). We discuss SILEM in greater depth in a separate technical report [11]. In this paper, we focus on SAIfO, evaluating it on a suite of locomotion tasks from OpenAI Gym and showing that it outperforms contemporaneous RAIL algorithms that perform IfO.