Fig. 1. Comparison of taxi trips from Lower Manhattan to JFK andLGA airports in May 2011. The query on the left selects trips that occurred on Sundays, while the one on the right selects trips that occurred on Mondays. Users specify these queries by visually selecting regions on the map and connecting them. In addition to inspecting the results depicted on the map, i.e., the dots corresponding to pickups (blue) and dropoffs (orange) of the selected trips, they can also explore the results through other visual representations. The scatter plots below the maps show the relationship between hour of the day and trip duration. Points in the plots are colored according to the spatial constraint represented by the arrows between the regions: trips to JFK in blue, and trips to LGA in red. The plots show that many of the trips on Monday between 3PM and 5PM take much longer than trips on Sundays.Abstract-As increasing volumes of urban data are captured and become available, new opportunities arise for data-driven analysis that can lead to improvements in the lives of citizens through evidence-based decision making and policies. In this paper, we focus on a particularly important urban data set: taxi trips. Taxis are valuable sensors and information associated with taxi trips can provide unprecedented insight into many different aspects of city life, from economic activity and human behavior to mobility patterns. But analyzing these data presents many challenges. The data are complex, containing geographical and temporal components in addition to multiple variables associated with each trip. Consequently, it is hard to specify exploratory queries and to perform comparative analyses (e.g., compare different regions over time). This problem is compounded due to the size of the data-there are on average 500,000 taxi trips each day in NYC. We propose a new model that allows users to visually query taxi trips. Besides standard analytics queries, the model supports origin-destination queries that enable the study of mobility across the city. We show that this model is able to express a wide range of spatio-temporal queries, and it is also flexible in that not only can queries be composed but also different aggregations and visual representations can be applied, allowing users to explore and compare results. We have built a scalable system that implements this model which supports interactive response times; makes use of an adaptive level-of-detail rendering strategy to generate clutter-free visualization for large results; and shows hidden details to the users in a summary through the use of overlay heat maps. We present a series of case studies motivated by traffic engineers and economists that show how our model and system enable domain experts to perform tasks that were previously unattainable for them.
We investigate how to automatically recover visual encodings from a chart image, primarily using inferred text elements. We contribute an end‐to‐end pipeline which takes a bitmap image as input and returns a visual encoding specification as output. We present a text analysis pipeline which detects text elements in a chart, classifies their role (e.g., chart title, x‐axis label, y‐axis title, etc.), and recovers the text content using optical character recognition. We also train a Convolutional Neural Network for mark type classification. Using the identified text elements and graphical mark type, we can then infer the encoding specification of an input chart image. We evaluate our techniques on three chart corpora: a set of automatically labeled charts generated using Vega, charts from the Quartz news website, and charts extracted from academic papers. We demonstrate accurate automatic inference of text elements, mark types, and chart specifications across a variety of input chart types.
Multidimensional projection has emerged as an important visualization tool in applications involving the visual analysis of high-dimensional data. However, high precision projection methods are either computationally expensive or not flexible enough to enable feedback from user interaction into the projection process. A built-in mechanism that dynamically adapts the projection based on direct user intervention would make the technique more useful for a larger range of applications and data sets. In this paper we propose the Piecewise Laplacian-based Projection (PLP), a novel multidimensional projection technique, that, due to the local nature of its formulation, enables a versatile mechanism to interact with projected data and to allow interactive changes to alter the projection map dynamically, a capability unique of this technique. We exploit the flexibility provided by PLP in two interactive projection-based applications, one designed to organize pictures visually and another to build music playlists. These applications illustrate the usefulness of PLP in handling high-dimensional data in a flexible and highly visual way. We also compare PLP with the currently most promising projections in terms of precision and speed, showing that it performs very well also according to these quality criteria.
Visualization designers regularly use color to encode quantitative or categorical data. However, visualizations "in the wild" often violate perceptual color design principles and may only be available as bitmap images. In this work, we contribute a method to semi-automatically extract color encodings from a bitmap visualization image. Given an image and a legend location, we classify the legend as describing either a discrete or continuous color encoding, identify the colors used, and extract legend text using OCR methods. We then combine this information to recover the specific color mapping. Users can also correct interpretation errors using an annotation interface. We evaluate our techniques using a corpus of images extracted from scientific papers and demonstrate accurate automatic inference of color mappings across a variety of chart types. In addition, we present two applications of our method: automatic recoloring to improve perceptual effectiveness, and interactive overlays to enable improved reading of static visualizations.
Visualization of high-dimensional data requires a mapping to a visual space. Whenever the goal is to preserve similarity relations a frequent strategy is to use 2D projections, which afford intuitive interactive exploration, e.g., by users locating and selecting groups and gradually drilling down to individual objects. In this paper, we propose a framework for projecting high-dimensional data to 3D visual spaces, based on a generalization of the LeastSquare Projection (LSP). We compare projections to 2D and 3D visual spaces both quantitatively and through a user study considering certain exploration tasks. The quantitative analysis confirms that 3D projections outperform 2D projections in terms of precision. The user study indicates that certain tasks can be more reliably and confidently answered with 3D projections. Nonetheless, as 3D projections are displayed on 2D screens, interaction is more difficult. Therefore, we incorporate suitable interaction functionalities into a framework that supports 3D transformations, predefined optimal 2D views, coordinated 2D and 3D views, and hierarchical 3D cluster definition and exploration. For visually encoding data clusters in a 3D setup, we employ color coding of projected data points as well as four types of surface renderings. A second user study evaluates the suitability of these visual encodings. Several examples illustrate the framework's applicability for both visual exploration of multidimensional abstract (non-spatial) data as well as the feature space of multi-variate spatial data.
Evaluation methodologies in visualization have mostly focused on how well the tools and techniques cater to the analytical needs of the user. While this is important in determining the effectiveness of the tools and advancing the state-of-the-art in visualization research, a key area that has mostly been overlooked is how well established visualization theories and principles are instantiated in practice. This is especially relevant when domain experts, and not visualization researchers, design visualizations for analysis of their data or for broader dissemination of scientific knowledge. There is very little research on exploring the synergistic capabilities of cross-domain collaboration between domain experts and visualization researchers. To fill this gap, in this paper we describe the results of an exploratory study of climate data visualizations conducted in tight collaboration with a pool of climate scientists. The study analyzes a large set of static climate data visualizations for identifying their shortcomings in terms of visualization design. The outcome of the study is a classification scheme that categorizes the design problems in the form of a descriptive taxonomy. The taxonomy is a first attempt for systematically categorizing the types, causes, and consequences of design problems in visualizations created by domain experts. We demonstrate the use of the taxonomy for a number of purposes, such as, improving the existing climate data visualizations, reflecting on the impact of the problems for enabling domain experts in designing better visualizations, and also learning about the gaps and opportunities for future visualization research. We demonstrate the applicability of our taxonomy through a number of examples and discuss the lessons learnt and implications of our findings.
View (c) Hotspot View (e) Global Temporal View (f) Ranking Type View (h) Filter Widget (g) Radial Type View (d) Cumulative Temporal View (a) Control Menu Fig. 1. CrimAnalyzer system: the spatial and temporal interactive views enable the exploration of local regions while revealing their criminal patterns over time.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.