Multidimensional projection techniques have become essential analytical tools. Typically, they map data from a high-dimensional space into a low-dimensional visual space, preserving distance or neighborhood structures in the resulting layout. Despite advances that have produced faster and highly precise techniques, existing methods still have deficiencies that impair their use as exploratory tools. One example is the mismatch that can occur between what the user considers similar or dissimilar and what the visual representation conveys. Recently, a class of projection techniques has emerged to address this limitation, allowing users to control the projection process by changing the distance relationships using small data samples. Among such methods, Local Affine Multidimensional Projection has proved to be the state of the art in the effectiveness of user intervention. Although Local Affine Multidimensional Projection has attained relative success, it is limited to certain application domains. Because it relies on feature vector representations, with data instances described by vectors embedded in a Euclidean space, it cannot handle scenarios that offer only distance information or a distance function. In this article, we present a novel multidimensional projection technique, called User-assisted Projection Technique for Distance Information, which takes advantage of the solid mathematical framework provided by Local Affine Multidimensional Projection and adapts it to scenarios where only distance information or a distance function is available. The results show that User-assisted Projection Technique for Distance Information is as fast, accurate, robust, and flexible as existing state-of-the-art techniques, enabling the refined user control provided by Local Affine Multidimensional Projection to be applied to domains not previously covered. Its versatility is illustrated in an application that organizes book collections and employs an external source of information to recommend new readings.
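To make the "distance information only" setting concrete, the following is a minimal sketch that builds a 2D layout from nothing but a pairwise distance matrix. It uses classical multidimensional scaling, a standard textbook technique, and is not the User-assisted Projection Technique for Distance Information described in the abstract; the function name `classical_mds` and the toy data are assumptions made for illustration.

```python
import numpy as np

def classical_mds(dist, dim=2):
    """Embed points in `dim` dimensions from a symmetric distance matrix.

    Illustrative stand-in only: this is classical MDS, not the
    technique proposed in the abstract above.
    """
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-center the squared distances to recover a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # Keep the `dim` largest eigenpairs; clip tiny negative eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dim]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Toy input (assumed): distances among the corners of a unit square.
corners = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
toy_dist = np.linalg.norm(corners[:, None] - corners[None, :], axis=-1)

# The layout is computed from `toy_dist` alone; `corners` is never passed in.
layout = classical_mds(toy_dist)
layout_dist = np.linalg.norm(layout[:, None] - layout[None, :], axis=-1)
```

The function never sees feature vectors, which is the constraint the abstract highlights. The user-steering step, in which a small sample of points is repositioned to guide the layout, is not modeled here.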
In visual analytics applications, the concept of putting the user in the loop refers to the ability to replace heuristics with user knowledge in machine learning and data mining tasks. In supervised tasks, user engagement occurs through the manipulation of the training data. In unsupervised tasks, however, user involvement is limited to changes in the algorithm parametrization or in the input data representation, also known as features. Depending on the application domain, different types of features can be extracted from the raw data, so the result of an unsupervised algorithm depends heavily on the type of feature employed. Since there is no perfect feature extractor, combining different features has been explored in a process called feature fusion. Feature fusion is straightforward when the machine learning or data mining task has a cost function. When no such function exists, however, user support for the combination must be provided; otherwise the process is impractical. In this paper, we present a novel feature fusion approach that uses small data samples to allow users not only to effortlessly control the combination of different feature sets but also to interpret the attained results. The effectiveness of our approach is confirmed by a comprehensive set of qualitative and quantitative tests, opening up possibilities for user-guided analytical scenarios not yet covered. The ability of our approach to provide real-time feedback for feature fusion is exploited in the context of unsupervised clustering techniques, where the composed groups reflect the semantics of the feature combination.
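One generic way to picture feature fusion without a cost function is as a weighted blend of per-feature distance matrices. The sketch below assumes explicit user-supplied weights, which is a simplification and not the sample-based control the abstract proposes; the names `pairwise_dist` and `fuse_distances` and the toy feature sets are hypothetical.

```python
import numpy as np

def pairwise_dist(features):
    """Euclidean distance between every pair of rows."""
    features = np.asarray(features, dtype=float)
    diff = features[:, None, :] - features[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def fuse_distances(feature_sets, weights):
    """Blend per-feature distance matrices with non-negative weights.

    Assumption: a plain convex combination, not the abstract's method.
    """
    weights = np.asarray(weights, dtype=float)
    if len(feature_sets) != len(weights):
        raise ValueError("one weight per feature set is required")
    if np.any(weights < 0) or weights.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    weights = weights / weights.sum()
    fused = 0.0
    for features, weight in zip(feature_sets, weights):
        dist = pairwise_dist(features)
        peak = dist.max()
        # Rescale to [0, 1] so no feature set dominates by magnitude alone.
        if peak > 0:
            dist = dist / peak
        fused = fused + weight * dist
    return fused

# Toy data (assumed): three items described by two one-dimensional feature sets.
color = [[0.0], [0.0], [1.0]]   # items 0 and 1 share a color
shape = [[0.0], [1.0], [1.0]]   # items 1 and 2 share a shape

by_color = fuse_distances([color, shape], [1.0, 0.0])
by_shape = fuse_distances([color, shape], [0.0, 1.0])
balanced = fuse_distances([color, shape], [0.5, 0.5])
```

Shifting the weights changes which items count as neighbors: items 0 and 1 coincide under `by_color`, while items 1 and 2 coincide under `by_shape`. Any clustering run on the fused matrix would therefore reflect the chosen combination.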