We describe a Visual Analytics (VA) infrastructure, rooted in techniques from machine learning and logic-based deductive reasoning, that will assist analysts in making sense of large, complex data sets by facilitating the generation and validation of models representing relationships in the data. We use Logic Programming (LP) as the underlying computing machinery to encode the relations as rules and facts and to compute with them. A unique aspect of our approach is that the LP rules are automatically learned, using Inductive Logic Programming, from examples of data that the analyst deems interesting when viewing the data in the high-dimensional visualization interface. Using this system, analysts will be able to construct models of arbitrary relationships in the data, explore the data for scenarios that fit a model, refine the model if necessary, and query the model to automatically analyze incoming (future) data exhibiting the encoded relationships. In other words, it will support both model-driven data exploration and data-driven model evolution. More importantly, by basing the construction of models on techniques from machine learning and logic-based deduction, the VA process will be both flexible in modeling arbitrary, user-driven relationships in the data and readily scalable across different data domains.

INTRODUCTION

Modern-day enterprises, be they commerce, government, science, engineering, or medicine, have to cope with voluminous amounts of data. Effective decision making based on large, dynamic datasets with many parameters requires a conceptual, high-level understanding of the data. Acquiring such an understanding is a difficult problem, especially in the presence of incomplete, inconsistent, and noisy data acquired from disparate real-world sources. To make progress on this problem, one must draw on the complementary strengths of computing machinery and human insight.
Recognizing this promising human-computer synergy, Visual Analytics (VA), defined as the science of analytical reasoning facilitated by interactive visual interfaces [22], has become a major development thrust. It seeks to engage the fast visual circuitry of the human brain to quickly find relations in complex data, trigger creative thoughts, and use these elements to steer the underlying computational analysis toward the extraction of new information for further insight. VA has widespread applications, including homeland security, the financial industry, and internet security, among others. Research papers and tools related to VA are beginning to emerge (see e.g. …). However, thus far the main emphasis in VA has been on visualization, data management, and user interfaces (see e.g. [5] and other work mentioned in the following). As far as analytical computing goes, VA research has mainly focused on relatively low-level tasks, such as image [24] and video [12] analysis and database operations [21]. In today's VA systems, it is the human analyst who performs the actual reasoning and abstraction. Obviously this ty...
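The rules-and-facts encoding that the abstract above describes can be sketched in a few lines. This is a minimal, hypothetical illustration of the Logic Programming idea (facts as ground tuples, a rule deriving new facts by joining relations); the relation names and constants are invented, not taken from the paper.

```python
# Hypothetical sketch of encoding relations as facts and a rule, in the
# spirit of the Logic Programming approach described above.
# Facts are ground tuples; the rule below derives new facts by a self-join.

facts = {
    ("transfer", "acct_a", "acct_b"),   # invented example facts
    ("transfer", "acct_b", "acct_c"),
}

def derive_linked(facts):
    """Rule: linked(X, Z) :- transfer(X, Y), transfer(Y, Z)."""
    derived = set()
    for (p1, x, y1) in facts:
        for (p2, y2, z) in facts:
            if p1 == p2 == "transfer" and y1 == y2:
                derived.add(("linked", x, z))
    return derived

print(derive_linked(facts))  # derives linked(acct_a, acct_c)
```

In an Inductive Logic Programming setting, a rule like `linked/2` would not be hand-written as above but induced from positive and negative examples the analyst flags in the visualization.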
The process of learning models from raw data typically requires a substantial amount of user input during the model initialization phase. We present an assistive visualization system which greatly reduces the load on the users and makes the process of model initialization and refinement more efficient, problem-driven, and engaging. Using a sequence segmentation task with a Hidden Markov Model as an example, we assign each token in the sequence a feature vector based on its various properties within the sequence. These vectors are then clustered according to similarity, generating a layout of the individual tokens in the form of a node-link diagram where the length of the links is determined by the feature-vector similarity. Users may then tune the weights of the feature-vector components to improve the segmentation, which is visualized as a better separation of the clusters. Also, as individual clusters represent different classes, the user can now work at the cluster level to define token classes, instead of labelling one entry at a time. Inconsistent entries visually identify themselves by settling at the periphery of clusters, and the user then helps refine the model by resolving these inconsistencies. Our system therefore makes efficient use of the knowledge of its users, only requesting user assistance for non-trivial data items. It thus allows users to visually analyze data at a higher, more abstract level, improving scalability.

INTRODUCTION

With the tremendous growth in physical and online data collection technology, we are now experiencing an explosion of digital information. Since a large amount of this data is unstructured, various machine learning techniques have been developed to assign structure to it and make it machine-readable. This process can allow the machine to reason with and draw insight from data almost automatically.
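The user-tunable weighting of feature-vector components described above can be sketched as follows. This is a minimal illustration under invented data: the token names, feature values, and cluster centers are hypothetical, and distances are computed with a per-component weighted Euclidean metric standing in for the paper's similarity measure.

```python
import math

# Hypothetical sketch: tokens carry feature vectors, and user-tunable
# weights rescale each component before distances to cluster centers
# are computed. All names and values are invented for illustration.

tokens = {
    "tok1": [0.9, 0.1],
    "tok2": [0.8, 0.2],
    "tok3": [0.1, 0.9],
}

def weighted_distance(u, v, weights):
    """Euclidean distance after per-component reweighting."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, u, v)))

def nearest_center(vec, centers, weights):
    """Assign a token to the closest cluster center under the current weights."""
    return min(centers, key=lambda c: weighted_distance(vec, centers[c], weights))

centers = {"class_A": [1.0, 0.0], "class_B": [0.0, 1.0]}
weights = [1.0, 1.0]  # the user tunes these to improve cluster separation

labels = {t: nearest_center(v, centers, weights) for t, v in tokens.items()}
print(labels)
```

Raising the weight of a discriminative feature stretches the layout along that axis, which is what the user perceives as clusters separating more cleanly.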
However, all such tasks depend heavily on large amounts of user-tagged data as the starting point, and use various semi-supervised learning methods [19]. Because of the high user input required, such tagged data is difficult to construct. Further, data is dynamic: as a dataset grows and changes, we may need to supplement the tagged data from time to time. We propose to make this task simpler and interactive by designing a system in which the user obtains a visual overview of the dataset and, in that visual interface, tags only those data elements that the system cannot easily resolve itself.

One crucial idea behind our system is that, given good feature vectors to represent each data point, similar points will be close by in the feature-vector space. Here, we mean data points which, though rich in semantics, do not have an explicit high-dimensional feature vector automatically attached to them. In such cases we need to design feature vectors to represent the semantics and structure of the data points. We aim to achieve this in our system by designing feature vectors which encompass a data point's structure, context, and location in the dataset. If some s...
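The "only ask the user about the hard cases" idea can be sketched as a margin test: a point clearly closer to one cluster center than to any other is auto-labeled, while an ambiguous point is deferred to the user. The threshold, centers, and data below are invented for illustration, not taken from the paper.

```python
import math

# Hypothetical sketch of deferring only ambiguous points to the user.
# A point is auto-labeled when the gap between its two nearest cluster
# centers exceeds a margin; otherwise a user tag is requested.

def triage(point, centers, margin=0.3):
    """Return (label, needs_user): auto-label confident points, defer the rest."""
    ranked = sorted((math.dist(point, c), label) for label, c in centers.items())
    (d0, best), (d1, _) = ranked[0], ranked[1]
    if d1 - d0 < margin:
        return None, True   # ambiguous: request a user tag
    return best, False      # confident: label automatically

centers = {"A": [0.0, 0.0], "B": [1.0, 1.0]}
print(triage([0.1, 0.0], centers))  # clearly nearest A: auto-labeled
print(triage([0.5, 0.5], centers))  # equidistant: deferred to the user
```

Only the deferred points consume the user's attention, which is the source of the scalability gain the abstract claims.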
This paper presents a system that helps users by suggesting appropriate colors for inserting text and symbols into an image. The color distribution in the image regions surrounding the annotation area determines the colors that make a good choice, i.e., capture a viewer's attention while remaining legible. Each point in the color space is assigned a distance-map value, where colors with higher values are better choices. This tool works like a "Magic Marker", giving users the power to automatically choose a good annotation color, which can be varied based on their personal preferences.
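The distance-map idea can be sketched by scoring each candidate annotation color by its minimum distance to the colors in the surrounding region and suggesting the highest-scoring one. The RGB metric, the sample region colors, and the candidate palette below are invented for illustration; the paper's actual color space and distance measure may differ.

```python
import math

# Hypothetical sketch of the distance-map scoring: a candidate color's
# score is its minimum RGB distance to the surrounding region's colors,
# and the best-separated (hence most legible) candidate is suggested.

def rgb_distance(c1, c2):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))

def suggest_color(surrounding, candidates):
    """Pick the candidate farthest (by min distance) from the surround."""
    def score(c):
        return min(rgb_distance(c, s) for s in surrounding)
    return max(candidates, key=score)

surrounding = [(200, 30, 30), (180, 40, 20)]           # reddish background region
candidates = [(255, 0, 0), (0, 255, 255), (40, 40, 40)]
print(suggest_color(surrounding, candidates))           # a cyan-like choice wins
```

Taking the minimum over the region makes the score conservative: a candidate is only rated highly if it stands out against every surrounding color, not just the average one.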
Visual analytics seeks to conduct a discourse with the user through images, to stimulate curiosity and a penchant to decipher the unknown. Figure 1 depicts our view of the visual analytics process. The computer supports the user in this interactive analytical reasoning, constructing a formal model of the given data, with the end product being formatted knowledge constituting insight. Yet validation and refinement of this computational model of insight can occur only in the human domain expert's mind, bringing to bear possibly unformatted knowledge as well as intuition and creative thought. So it is left to the human user to guide the computer in the formalization (learning) of more sophisticated models that capture what the human desires and what the computer currently believes about the data domain, perhaps with an associated confidence level. In visual analytics, the computer uses images and text (and possibly sound and haptics) to exchange information with the user about its view of the domain model. Obviously, the better a communicator the computer is, the more assistance it will elicit from the user to help it refine the model. This in turn leads to this article's topic: the need for the computer to master the art of interpersonal communication, that is, communication between it and the human analyst.

The Elements of Interpersonal Communication

Obviously, communication is present in many domains, not just in human behavior. Communication protocols are part of many human-made systems, such as computing and telecommunication, and they follow similar definitions. We focus here on human behavior because we aim for the computer to collaborate with the human user. The interpersonal-communication protocol [1] (see Figure 2) always includes … The interpersonal-communication framework has three components. Direct channels encompass information that the sender directly controls; they are easily recognized by the receiver.
Indirect channels are not always under the sender's direct control and are usually recognized subconsciously by the receiver. The context comprises the conditions surrounding the communication from which the receiver can derive the message's meaning. Communicators use intonation or pitch to emphasize words and passages. Brevity, or economy of words, leads to clear, effective presentations, whereas an aesthetic choice of words (good storytelling) can generate more interest, attention, and even fascination. Finally, personalization of word choice can target a specific receiver, just as word choice can indicate a sender's identity. Clearly, some people are more eloquent in these matters than others; the same is true for human-…