We present Falcon, an interactive, deterministic, and declarative data cleaning system, which uses SQL update queries as the language to repair data. Falcon does not rely on the existence of a set of pre-defined data quality rules. On the contrary, it encourages users to explore the data, identify possible problems, and make updates to fix them. Bootstrapped by one user update, Falcon guesses a set of possible SQL update queries that can be used to repair the data. The main technical challenge addressed in this paper consists in finding a set of SQL update queries that is minimal in size and at the same time fixes the largest number of errors in the data. We formalize this problem as a search in a lattice-shaped space. To guarantee that the chosen updates are semantically correct, Falcon navigates the lattice by interacting with users to gradually validate the set of SQL update queries. Besides using traditional one-hop based traverse algorithms (e.g., BFS or DFS), we describe novel multi-hop search algorithms such that Falcon can dive over the lattice and conduct the search efficiently. Our novel search strategy is coupled with a number of optimization techniques to further prune the search space and efficiently maintain the lattice. We have conducted extensive experiments using both real-world and synthetic datasets to show that Falcon can effectively communicate with users in data repairing
Computational thinking is the capacity of undertaking a problem-solving process in various disciplines (including STEM, i.e. science, technology, engineering and mathematics) using distinctive techniques that are typical of computer science. It is nowadays considered a fundamental skill for students and citizens, that has the potential to affect future generations. At the roots of computational-thinking abilities stands the knowledge of computer programming, i.e. coding. With the goal of fostering computational thinking in young students, we address the challenging and open problem of using methods, tools and techniques to support teaching and learning of computer-programming skills in school curricula of the secondary grade and university courses. This problem is made complex by several factors. In fact, coding requires abstraction capabilities and complex cognitive skills such as procedural and conditional reasoning, planning, and analogical reasoning. In this paper, we introduce a new paradigm called ACME (“Code Animation by Evolved Metaphors”) that stands at the foundation of the Diogene-CT code visualization environment and methodology. We develop consistent visual metaphors for both procedural and object-oriented programming. Based on the metaphors, we introduce a playground architecture to support teaching and learning of the principles of coding. To the best of our knowledge, this is the first scalable code visualization tool using consistent metaphors in the field of the Computing Education Research (CER). It might be considered as a new kind of tools named as code visualization environments.
We consider the problem of handling digital identities within serviceoriented architecture (SOA) architectures. We explore federated, single signon (SSO) solutions based on identity managers and service providers. After an overview of the different standards and protocols, we introduce a middlewarebased architecture to simplify the integration of legacy systems within such platforms. Our solution is based on a middleware module that decouples the legacy system from the identity-management modules.We consider both standard point-to-point service architectures, and complex government interoperability frameworks, and report experiments to show that our solution provides clear advantages both in terms of effectiveness and performance
This paper introduces Greg, ML, a machine-learning tool for generating automatic diagnostic suggestions based on patient profiles. We discuss the architecture that stands at the core of Greg, and some experimental results based on the working prototype we have developed. Finally, we discuss challenges and opportunities related to the use of this kind of tools in medicine, and some important lessons learned developing the tool. In this respect, despite the ironic title of this paper, we underline that Greg should be conceived primarily as a support for expert doctors in their diagnostic decisions, and can hardly replace humans in their judgment.
This paper introduces the Greg, ML platform, a machine-learning engine and toolset conceived to generate automatic diagnostic suggestions based on patient profiles. Greg, ML departs from many other experiences in machine learning for healthcare in the fact that it was designed to handle a large number of different diagnoses, in the order of the hundreds. We discuss the architecture that stands at the core of Greg, ML, designed to handle the complex challenges posed by this ambitious goal, and confirm its effectiveness with experimental results based on the working prototype we have developed. Finally, we discuss challenges and opportunities related to the use of this kind of tools in medicine, and some important lessons learned while developing the tool. In this respect, we underline that Greg, ML should be conceived primarily as a support for expert doctors in their diagnostic decisions, and can hardly replace humans in their judgment.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.