Figure 1: The scatterplots in (a) and (c) visualize dimensionality-reduced representations of two distinct subspaces of a high-dimensional dataset. The matrix visualization (b) shows the discrepancies between the distances in the two projections. The color of each point in the projections encodes its data label and serves as a visual connection between the two views.
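As a rough illustration of what a matrix like (b) could contain, the sketch below compares pairwise distances in two projections of the same points. It assumes Euclidean distances and a normalization by each projection's largest distance; neither choice is stated in the caption, and the function names `pairwise_distances` and `discrepancy_matrix` are hypothetical.

```python
import math


def pairwise_distances(points):
    """Return the full matrix of Euclidean distances between points."""
    return [[math.dist(p, q) for q in points] for p in points]


def discrepancy_matrix(proj_a, proj_b):
    """Entry (i, j) is the difference between the normalized distance of
    points i and j in projection A and in projection B.

    Assumption: both projections list the same points in the same order,
    and distances are scaled by the largest distance in each projection
    so that the two views are comparable.
    """
    if len(proj_a) != len(proj_b):
        raise ValueError("both projections must contain the same points")
    dist_a = pairwise_distances(proj_a)
    dist_b = pairwise_distances(proj_b)
    # Fall back to 1.0 when all points coincide, to avoid dividing by zero.
    max_a = max((max(row) for row in dist_a), default=0.0) or 1.0
    max_b = max((max(row) for row in dist_b), default=0.0) or 1.0
    size = len(proj_a)
    return [
        [dist_a[i][j] / max_a - dist_b[i][j] / max_b for j in range(size)]
        for i in range(size)
    ]
```

Under these assumptions, two projections that differ only by a uniform scaling produce a matrix of zeros, and larger absolute entries mark point pairs whose relative distance changes between the two subspaces.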
We present a new approach to visualizing data that is well-suited for personal and casual applications. The idea is to map the data to another dataset that is already familiar to the user, and then rely on their existing knowledge to illustrate relationships in the data. We construct the map by preserving pairwise distances or by maintaining relative values of specific data attributes. This metaphorical mapping is very flexible and allows us to adapt the visualization to its application and target audience. We present several examples where we map data to different domains and representations. These include mapping data to cat images, encoding research interests with neural style transfer, and representing movies as stars in the night sky. Overall, we find that although metaphors are not as accurate as traditional techniques, they can help design engaging and personalized visualizations.
CCS CONCEPTS • Human-centered computing → Visualization techniques; Visualization theory, concepts and paradigms; • Computing methodologies → Machine learning.
A common enhancement of scatterplots represents points as small multiples, glyphs, or thumbnail images. As this encoding often results in overlaps, a general strategy is to alter the position of the data points, for instance, to a grid-like structure. Previous approaches rely either on solving expensive optimization problems or on dividing the space, which alters the global structure of the scatterplot. To find a good balance between efficiency, neighborhood preservation, and layout preservation, we propose Hagrid, a technique that uses space-filling curves (SFCs) to “gridify” a scatterplot without employing expensive collision detection and handling mechanisms. Using SFCs ensures that the points are plotted close to their original position, retaining approximately the same global structure. The resulting scatterplot is mapped onto a rectangular or hexagonal grid, using Hilbert and Gosper curves, respectively. We discuss and evaluate the theoretical runtime of our approach and quantitatively compare it to three state-of-the-art gridifying approaches, DGrid, Small multiples with gaps (SMWG), and CorrelatedMultiples (CMDS), in an evaluation comprising 339 scatterplots. Here, we compute several quality measures for neighborhood preservation together with an analysis of the actual runtimes. The main results show that, compared to the best other technique, Hagrid is faster by a factor of four, while achieving similar or even better quality of the gridified layout. Due to its computational efficiency, our approach also allows novel applications of gridifying approaches in interactive settings, such as removing local overlap upon hovering over a scatterplot.
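The general idea of gridifying along a space-filling curve can be sketched as follows. This is a simplified illustration for the rectangular (Hilbert) case only, not the authors' implementation: the function `gridify`, its `order` parameter, and the particular way collisions are resolved (shifting points to the next free position along the curve) are assumptions made for the example.

```python
def _rotate(n, x, y, rx, ry):
    """Rotate/flip a quadrant so the curve is oriented correctly."""
    if ry == 0:
        if rx == 1:
            x, y = n - 1 - x, n - 1 - y
        x, y = y, x
    return x, y


def xy2d(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its Hilbert index."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        x, y = _rotate(n, x, y, rx, ry)
        s //= 2
    return d


def d2xy(n, d):
    """Map a Hilbert index back to cell (x, y) of an n-by-n grid."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        x, y = _rotate(s, x, y, rx, ry)
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def gridify(points, order):
    """Assign each 2D point a unique cell on a 2**order by 2**order grid.

    Points are snapped to cells, ordered by Hilbert index, and collisions
    are resolved by moving points along the curve, so no pairwise
    collision test between points is needed.
    """
    n = 1 << order
    if len(points) > n * n:
        raise ValueError("grid has fewer cells than points")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def to_cell(v, lo, hi):
        # Scale a coordinate into [0, n) and clamp the maximum into the grid.
        return 0 if hi == lo else min(n - 1, int((v - lo) / (hi - lo) * n))

    wanted = sorted(
        (xy2d(n, to_cell(x, min(xs), max(xs)), to_cell(y, min(ys), max(ys))), i)
        for i, (x, y) in enumerate(points)
    )
    # Forward pass: push each point to the next free index along the curve.
    slots = []
    prev = -1
    for d, _ in wanted:
        prev = max(d, prev + 1)
        slots.append(prev)
    # Backward pass: pull points back if the forward pass ran off the curve.
    limit = n * n - 1
    for k in range(len(slots) - 1, -1, -1):
        slots[k] = min(slots[k], limit)
        limit = slots[k] - 1
    cells = [None] * len(points)
    for (_, i), d in zip(wanted, slots):
        cells[i] = d2xy(n, d)
    return cells
```

For instance, `gridify([(0.1, 0.1), (0.1, 0.1), (0.9, 0.9)], order=2)` returns three distinct cells of a 4×4 grid, with the two coincident points placed in cells that are adjacent along the curve. Because consecutive Hilbert indices are neighboring cells, a displaced point stays near its original position, which is the property the abstract attributes to SFCs.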
Fig. 1: Overview of the best embeddings, as predicted by our method, for three of the datasets we have collected: MNIST, photos of flowers, and paintings. On the left of each sub-figure is an embedding of embeddings (called a metamap), where each square represents an embedding considered in our study and the contour color coding represents the goodness of the embedding (dark blue tones meaning "good" and dark red tones meaning "bad"). On the right of each sub-figure are the top 3 embeddings. The background of the metamap is colored using the "goodness" score that our model outputs for the embeddings. The top 3 MNIST embeddings (blue background, high score) are of higher quality than the top 3 painting embeddings (red background, low score).