The growing volume and variety of data presents both opportunities and challenges for visual analytics. Addressing these challenges is needed for big data to provide valuable insights and novel solutions for business, security, social media, and healthcare. In the case of temporal event sequence analytics it is the number of events in the data and variety of temporal sequence patterns that challenges users of visual analytic tools. This paper describes 15 strategies for sharpening analytic focus that analysts can use to reduce the data volume and pattern variety. Four groups of strategies are proposed: (1) extraction strategies, (2) temporal folding, (3) pattern simplification strategies, and (4) iterative strategies. For each strategy, we provide examples of the use and impact of this strategy on volume and/or variety. Examples are selected from 20 case studies gathered from either our own work, the literature, or based on email interviews with individuals who conducted the analyses and developers who observed analysts using the tools. Finally, we discuss how these strategies might be combined and report on the feedback from 10 senior event sequence analysts.
Finding the differences and similarities between two datasets is a common analytics task. With temporal event sequence data, this task is complex because of the many ways single events and event sequences can differ between the two datasets (or cohorts) of records: the structure of the event sequences (e.g., event order, co-occurring events, or event frequencies), the attributes of events and records (e.g., patient gender), or metrics about the timestamps themselves (e.g., event duration). In exploratory analyses, running statistical tests to cover all cases is time-consuming and determining which results are significant becomes cumbersome. Current analytics tools for comparing groups of event sequences emphasize a purely statistical or purely visual approach for comparison. This paper presents a taxonomy of metrics for comparing cohorts of temporal event sequences, showing that the problem-space is bounded. We also present a visual analytics tool, CoCo (for "Cohort Comparison"), which implements balanced integration of automated statistics with an intelligent user interface to guide users to significant, distinguishing features between the cohorts. Lastly, we describe two early case studies: the first with a research team studying medical team performance in the emergency department and the second with pharmacy researchers.
Event sequence data is common to a broad range of application domains, from security to health care to scholarly communication. This form of data captures information about the progression of events for an individual entity (e.g., a computer network device; a patient; an author) in the form of a series of time-stamped observations. Moreover, each event is associated with an event type (e.g., a computer login attempt, or a hospital discharge). Analyses of event sequence data have been shown to help reveal important temporal patterns, such as clinical paths resulting in improved outcomes, or an understanding of common career trajectories for scholars. Moreover, recent research has demonstrated a variety of techniques designed to overcome methodological challenges such as large volumes of data and high dimensionality. However, the effective identification and analysis of latent stages of progression, which can allow for variation within different but similarly evolving event sequences, remain a significant challenge with important real-world motivations. In this paper, we propose an unsupervised stage analysis algorithm to identify semantically meaningful progression stages as well as the critical events which help define those stages. The algorithm follows three key steps: (1) event representation estimation, (2) event sequence warping and alignment, and (3) sequence segmentation. We also present a novel visualization system, ET2, which interactively illustrates the results of the stage analysis algorithm to help reveal evolution patterns across stages. Finally, we report three forms of evaluation for ET2: (1) case studies with two real-world datasets, (2) interviews with domain expert users, and (3) a performance evaluation on the progression analysis algorithm and the visualization design.
Animated transition has been a popular design choice for smoothly switching between different visualization views or layouts, in which movement trajectories are created as cues for tracking objects during location shifting. Tracking moving objects, however, becomes difficult when their movement paths overlap or the number of tracking targets increases. We propose a novel design to facilitate tracking moving objects in animated transitions. Instead of simply animating an object along a straight line, we create "bundled" movement trajectories for a group of objects that have spatial proximity and share similar moving directions. To study the effect of bundled trajectories, we untangle variations due to different aspects of tracking complexity in a comprehensive controlled user study. The results indicate that using bundled trajectories is particularly effective when tracking more targets (six vs. three targets) or when the object movement involves a high degree of occlusion or deformation. Based on the study, we discuss the advantages and limitations of the new technique, as well as provide design implications.
Rare category identification is an important task in many application domains, ranging from network security, to financial fraud detection, to personalized medicine. These are all applications which require the discovery and characterization of sets of rare but structurally-similar data entities which are obscured within a larger but structurally different dataset. This paper introduces RCLens, a visual analytics system designed to support user-guided rare category exploration and identification. RCLens adopts a novel active learning-based algorithm to iteratively identify more accurate rare categories in response to user-provided feedback. The algorithm is tightly integrated with an interactive visualization-based interface which supports a novel and effective workflow for rare category identification. This paper (1) defines RCLens' underlying active-learning algorithm; (2) describes the visualization and interaction designs, including a discussion of how the designs support user-guided rare category identification; and (3) presents results from an evaluation demonstrating RCLens' ability to support the rare category identification process.
No abstract
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.