A hybrid learning system for recognizing user tasks from desktop activities and email messages

Shen, Jianqiang; Li, Lida; Dietterich, Thomas G.; Herlocker, Jonathan L.

doi:10.1145/1111449.1111473

Cited by 70 publications

(68 citation statements)

References 9 publications


“…Detection of breakpoints can also contribute to an emerging class of interactive tools that enables knowledge activities to be organized into reusable structures and shared [9,27]. A challenge in building these types of tools is being able to organize user activities without having to repeatedly solicit input [9].…”

Section: How Breakpoints Can Be Used

mentioning

confidence: 99%

Understanding and developing models for detecting and differentiating breakpoints during interactive tasks

Iqbal

Bailey

2007

Proceedings of the SIGCHI Conference on Human Factors in Computing Systems


The ability to detect and differentiate breakpoints during task execution is critical for enabling defer-to-breakpoint policies within interruption management. In this work, we examine the feasibility of building statistical models that can detect and differentiate three granularities (types) of perceptually meaningful breakpoints during task execution, without having to recognize the underlying tasks. We collected ecological samples of task execution data, and asked observers to review the interaction in the collected videos and identify any perceived breakpoints and their type. Statistical methods were applied to learn models that map features of the interaction to each type of breakpoint. Results showed that the models were able to detect and differentiate breakpoints with reasonably high accuracy across tasks. Among many uses, our resulting models can enable interruption management systems to better realize defer-to-breakpoint policies for interactive, free-form tasks.


“…As far as the rest of literature is concerned, there is relatively little literature on evaluation results of cognitive digital assistants and their focus tends to be specific to a narrow range of learning (e.g., [8,9]). This may be because most of assistants of this nature are design exercises, lack resources for comprehensive evaluation, not evaluated with humans in the loop, and/or proprietary and unpublished.…”

Section: Related Work

mentioning

confidence: 99%

Evaluation of an integrated multi-task machine learning system with humans in the loop

Steinfeld

Bennett

Cunningham

et al. 2007

Proceedings of the 2007 Workshop on Performance Metrics for Intelligent Systems


Abstract: Performance of a cognitive personal assistant, RADAR, consisting of multiple machine learning components, natural language processing, and optimization was examined with a test explicitly developed to measure the impact of integrated machine learning when used by a human user in a real world setting. Three conditions (conventional tools, Radar without learning, and Radar with learning) were evaluated in a large-scale, between-subjects study. The study revealed that integrated machine learning does produce a positive impact on overall performance. This paper also discusses how specific machine learning components contributed to human-system performance.


“…TaskPredictor (Shen 2006) is a machine learning system that attempted to predict users' current activities using two parts, one based on the windows in focus and the other based on email. Evaluation of the system involved training the system for a number of days and deploying it within the research group to a total of 9 test subjects.…”

Section: Related Work

mentioning

confidence: 99%

“…Information collected included the acceptance rates of the agent's advice and the time taken for transactions. The authors also performed an experiment with two hundred simulated subjects by creating profiles and measuring the resulting simulated behavior. TaskPredictor (Shen 2006) is a machine learning system that attempted to predict users' current activities using two parts, one based on the windows in focus and the other based on email. Evaluation of the system involved training the system for a number of days and deploying it within the research group to a total of 9 test subjects.…”

mentioning

confidence: 99%

The RADAR Test Methodology: Evaluating a Multi-Task Machine Learning System with Humans in the Loop

Steinfeld

Bennett

Cunningham

et al. 2006


The RADAR project involves a collection of machine learning research thrusts that are integrated into a cognitive personal assistant. Progress is examined with a test developed to measure the impact of learning when used by a human user. Three conditions (conventional tools, Radar without learning, and Radar with learning) are evaluated in a large-scale, between-subjects study. This paper describes the RADAR Test with a focus on test design, test harness development, experiment execution, and analysis. Results for the 1.1 version of Radar illustrate the measurement and diagnostic capability of the test. General lessons on such efforts are also discussed.

Overview

The RADAR (Reflective Agents with Distributed Adaptive Reasoning) project within the DARPA PAL (Personalized Assistant that Learns) program is centered on research and development towards a personal cognitive assistant. The underlying scientific advances within the project are predominantly within the realm of machine learning (ML). These ML approaches are varied and the resulting technologies are diverse. As such, the integration result of this research effort, a system called Radar, is a multi-task machine learning system. Annual evaluation on the integrated system is a major theme for the RADAR project, and the PAL program as a whole. Furthermore, there is an explicit directive to keep the test consistent throughout the program. As such, considerable effort was devoted towards designing, implementing, and executing the evaluation. This document describes this process, protocol, and some of the results for the Radar 1.1 test. Note that this document is not centered on Radar features or the actual machine learning methods used.

It is also important to note that the RADAR project differs from the bulk of its predecessors and its companion PAL program project, CALO, in that humans are in the loop for both the learning and evaluation steps. Radar is trained by junior members of the team who are largely unfamiliar with ML methods. Generic human subjects are then recruited to use Radar while handling a simulated crisis in a conference planning domain. This allows concrete measurement of performance using a h...
