We survey work on the uses of graphical mapping and interaction techniques for visual data mining of large data sets represented as tabular data. Basic terminology related to data mining, data sets, and visualization is introduced. Previous work on information visualization is reviewed in light of different categorizations of techniques and systems. The role of interaction techniques is discussed, along with work that addresses how to select and evaluate visualization techniques. We review representative work on the use of information visualization techniques in the context of mining data. This includes both visual data exploration and visually expressing the outcome of specific mining algorithms. We also review recent approaches that attempt to integrate visualization into the DM/KDD process, using it to enhance user interaction and comprehension.
The development of new methods and concepts to visualize massive amounts of data promises to change the way scientific results are analyzed, especially when tasks such as classification and clustering are involved, as in the case of sensing and biosensing. In this paper we employ a suite of software tools, referred to as PEx-Sensors, in which projection techniques are used to analyze electrical impedance spectroscopy data from electronic tongues and related sensors. The ability to treat high-dimensional datasets with PEx-Sensors is advantageous because the whole impedance vs. frequency curves obtained with various sensing units and for a variety of samples can be analyzed at once. It will be shown that non-linear projection techniques such as Sammon's Mapping or IDMAP provide higher distinction ability than linear methods for sensor arrays containing units capable of molecular recognition, apparently because these techniques are able to capture the cooperative response owing to specific interactions between the sensing unit material and the analyte. In addition to allowing for higher sensitivity and selectivity, the use of PEx-Sensors permits the identification of the major contributors to the distinguishing ability of sensing units and of the optimized frequency range. The latter will be illustrated with sensing units made with layer-by-layer (LbL) films to detect phytic acid, whose capacitance data were visualized with Parallel Coordinates. Significantly, PEx-Sensors was conceived to handle any type of sensor based on any principle of detection, and therefore represents a generic platform for treating large amounts of data from sensors and biosensors.
The one-to-one strategy of mapping each single data item into a graphical marker, adopted in many visualization techniques, has limited usefulness when the number of records and/or the dimensionality of the data set is very high. In this situation, the strong overlapping of graphical markers severely hampers the user's ability to identify patterns in the data from its visual representation. We tackle this problem here with a strategy that computes frequency or density information from the data set and uses it in Parallel Coordinates visualizations to filter the information presented to the user, thus reducing visual clutter and allowing the analyst to observe relevant patterns in the data. We also present the algorithms used to construct such visualizations and the interaction mechanisms supported, which are inspired by traditional image processing techniques such as grayscale manipulation and thresholding. Finally, we illustrate how such algorithms can help users identify clusters in very noisy large data sets.
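The frequency-plus-threshold idea described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the binning scheme, the function names (`bin_index`, `segment_frequencies`, `threshold`), and the `min_count` parameter are all assumptions introduced here.

```python
from collections import Counter

def bin_index(value, lo, hi, bins):
    """Map a value in [lo, hi] to a bin index in [0, bins - 1]."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * bins)
    return min(idx, bins - 1)  # the maximum value falls in the last bin

def segment_frequencies(rows, bins):
    """For each pair of adjacent axes, count the records that pass
    through each (bin on left axis, bin on right axis) segment."""
    n_axes = len(rows[0])
    lows = [min(r[a] for r in rows) for a in range(n_axes)]
    highs = [max(r[a] for r in rows) for a in range(n_axes)]
    counts = [Counter() for _ in range(n_axes - 1)]
    for r in rows:
        b = [bin_index(r[a], lows[a], highs[a], bins) for a in range(n_axes)]
        for a in range(n_axes - 1):
            counts[a][(b[a], b[a + 1])] += 1
    return counts

def threshold(counts, min_count):
    """Keep only segments whose frequency reaches min_count
    (a hypothetical analogue of image thresholding)."""
    return [{seg: c for seg, c in axis.items() if c >= min_count}
            for axis in counts]

# Hypothetical data: three similar records and one outlier, three axes.
rows = [(0.0, 0.0, 1.0), (0.1, 0.05, 0.9), (0.05, 0.1, 0.95), (1.0, 1.0, 0.0)]
dense = threshold(segment_frequencies(rows, bins=2), min_count=2)
```

Drawing only the segments that survive the threshold, with intensity proportional to their count, would suppress the isolated outlier line while keeping the dense bundle visible.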
Multidimensional projections map data points, defined in a high-dimensional data space, into a 1D, 2D or 3D representation space. Such a mapping is typically achieved with dimensional reduction, clustering, or force-directed point placement. Projections can be displayed and navigated by data analysts by means of visual representations, which may vary from points on a plane to graphs, surfaces or volumes. Typically, projections strive to preserve distance relationships amongst data points, as defined in the original space. Information loss is inevitable and the projection approach defines the extent to which the distance-preserving goal is attained. We introduce PEx (the Projection Explorer), a visualization tool for mapping and exploration of high-dimensional data via projections. A set of examples, on both structured (table) and unstructured (text) data, illustrates how projection-based visualizations, coupled with appropriate exploration tools, offer a flexible set-up for multidimensional data exploration. The projections in PEx handle relatively large data sets at a computational cost adequate to user interaction.
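One common way to quantify how well a projection preserves distances is a normalized stress measure over all point pairs. The sketch below is an illustration only: the abstract does not state which quality measure PEx uses, and the function names and sample points are assumptions.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def normalized_stress(high, low):
    """Square root of the summed squared differences between original and
    projected pairwise distances, normalized by the summed squared original
    distances. 0.0 means all pairwise distances are preserved exactly.
    Assumes at least two distinct points in the original space."""
    num = 0.0
    den = 0.0
    n = len(high)
    for i in range(n):
        for j in range(i + 1, n):
            d_high = euclidean(high[i], high[j])
            d_low = euclidean(low[i], low[j])
            num += (d_high - d_low) ** 2
            den += d_high ** 2
    return math.sqrt(num / den)

# Hypothetical 3D points projected to 2D by dropping the third coordinate.
planar = [(0, 0, 0), (3, 4, 0), (6, 0, 0)]    # all points lie in the z = 0 plane
lossy = [(0, 0, 0), (3, 4, 0), (0, 0, 5)]     # third point differs only in z
drop_z = lambda pts: [(x, y) for x, y, _ in pts]

stress_planar = normalized_stress(planar, drop_z(planar))
stress_lossy = normalized_stress(lossy, drop_z(lossy))
```

The planar set loses nothing when the third coordinate is dropped, while the second set collapses two distinct points onto each other, which shows up as a positive stress value.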
The need for reliable, fast diagnostics is closely linked to the need for safe, effective treatment of the so-called “neglected” diseases. The list of diseases with no field-adapted diagnostic tools includes leishmaniasis, shigella, typhoid, and bacterial meningitis. Leishmaniasis, in particular, is a parasitic disease caused by Leishmania spp. and transmitted by infected phlebotomine sandflies. It remains a public health concern in developing countries, with ca. 12 million people infected and 350 million at risk of infection. Despite several attempts, methods for diagnosis are still ineffective, especially with regard to specificity, owing to false positives with Chagas’ disease caused by Trypanosoma cruzi. Accepted gold standards for detecting leishmaniasis involve isolation of parasites either microscopically or by culture, and in both methods specimens are obtained by invasive means. Here, we show that efficient distinction between cutaneous leishmaniasis and Chagas’ disease can be obtained with a low-cost biosensor system made with nanostructured films containing specific Leishmania amazonensis and T. cruzi antigens and employing impedance spectroscopy as the detection method. This unprecedented selectivity was afforded by antigen-antibody molecular recognition processes inherent in the detection with the immobilized antigens, and by statistically correlating the electrical impedance data, which allowed distinction between real samples that tested positive for Chagas’ disease and leishmaniasis. Blood serum samples containing 10⁻⁵ mg/mL of the antibody solution could be distinguished in a few minutes. The methods used here are generic and can be extended to any type of biosensor, which is important for an effective diagnosis of many other diseases.
This paper presents a formal definition for HMBS (Hypermedia Model Based on Statecharts). HMBS uses the structure and execution semantics of statecharts to specify both the structural organization and the browsing semantics of hypermedia applications. Statecharts are an extension of finite-state machines and the model is thus a generalization of hypergraph-based hypertext models. Some of the most important features of HMBS are its ability to model hierarchy and synchronization of information, and its provision of mechanisms for specifying access structures, navigational contexts, access control, multiple tailored versions, and hierarchical views. Analysis of the underlying statechart machine allows verification of page reachability, valid paths, and other properties, thus providing mechanisms to support authors in the development of structured applications.
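The reachability analysis mentioned above can be illustrated on a heavily simplified model. The sketch below assumes a flat state-transition graph in which each state stands for one page; it ignores the hierarchy and concurrency that real statecharts (and HMBS) support, and the state names and `reachable_states` function are hypothetical.

```python
from collections import deque

def reachable_states(transitions, start):
    """Breadth-first search over a flat transition graph.
    transitions maps a state to the states it can move to directly."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for target in transitions.get(state, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

# Hypothetical hyperdocument: "orphan" links into the structure,
# but no page links to it.
transitions = {
    "home": ["index", "about"],
    "index": ["chapter1", "chapter2"],
    "chapter1": ["chapter2", "home"],
    "orphan": ["home"],
}
all_states = set(transitions) | {t for ts in transitions.values() for t in ts}
unreachable = all_states - reachable_states(transitions, "home")
```

Any state left in `unreachable` corresponds to a page that a reader starting from the chosen initial state could never visit, which is the kind of property an author would want flagged.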