The RADAR project involves a collection of machine learning research thrusts that are integrated into a cognitive personal assistant. Progress is examined with a test developed to measure the impact of learning when the system is used by a human user. Three conditions (conventional tools, Radar without learning, and Radar with learning) are evaluated in a large-scale, between-subjects study. This paper describes the RADAR Test with a focus on test design, test harness development, experiment execution, and analysis. Results for version 1.1 of Radar illustrate the measurement and diagnostic capability of the test. General lessons on such efforts are also discussed.
Overview

The RADAR (Reflective Agents with Distributed Adaptive Reasoning) project¹ within the DARPA PAL (Personalized Assistant that Learns) program is centered on research and development towards a personal cognitive assistant. The underlying scientific advances within the project are predominantly within the realm of machine learning (ML). These ML approaches are varied and the resulting technologies are diverse. As such, the integrated result of this research effort, a system called Radar, is a multi-task machine learning system. Annual evaluation of the integrated system is a major theme for the RADAR project and for the PAL program as a whole. Furthermore, there is an explicit directive to keep the test consistent throughout the program. Considerable effort was therefore devoted to designing, implementing, and executing the evaluation. This document describes the process, the protocol, and some of the results for the Radar 1.1 test. Note that this document is not centered on Radar features or the actual machine learning methods used.

It is also important to note that the RADAR project differs from the bulk of its predecessors and from its companion project in the PAL program, CALO², in that humans are in the loop for both the learning and evaluation steps. Radar is trained by junior members of the team who are largely unfamiliar with ML methods. Generic human subjects are then recruited to use Radar while handling a simulated crisis in a conference planning domain. This allows concrete measurement of performance using a h...