Supporting aggregates in recursive logic rules is a long-standing problem for Datalog. To solve it, we propose a simple extension, called DatalogFS (Datalog extended with frequency support goals), that supports queries and reasoning about the number of distinct variable assignments satisfying given goals, or conjunctions of goals, in rules. This monotonic extension greatly enhances the power of Datalog while preserving (i) its declarative semantics and (ii) its amenability to efficient implementation via differential fixpoint and the other optimization techniques presented in the paper. DatalogFS thus enables the efficient formulation of queries that could not be expressed efficiently, or could not be expressed at all, in Datalog with stratified negation and aggregates. In fact, using a generalized notion of multiplicity called frequency, we show that diffusion models and PageRank computations can be easily expressed and efficiently implemented in DatalogFS.
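To make the idea concrete, the following is a minimal Python sketch, not the DatalogFS system itself, of a monotonic frequency-support goal: a rule fires only once at least K distinct witnesses for its body exist, so the set of derived facts only grows across iterations. The facts, the threshold K, and the rule notation in the comment are illustrative assumptions.

    # A minimal sketch (not the authors' system) of a monotonic "frequency
    # support" goal: a rule fires once at least K distinct witnesses exist.
    # Facts, the threshold K, and the 'active' predicate are illustrative.

    from collections import defaultdict

    edges = {("a", "d"), ("b", "d"), ("c", "d"), ("d", "e"), ("e", "f")}
    seeds = {"a", "b", "c"}
    K = 3  # hypothetical frequency-support threshold

    # Schematic rule: active(Y) <- at least K distinct X with active(X), edge(X, Y)
    active = set(seeds)
    changed = True
    while changed:                      # naive least-fixpoint iteration
        changed = False
        support = defaultdict(set)
        for x, y in edges:
            if x in active:
                support[y].add(x)       # count distinct witnesses, not tuples
        for y, witnesses in support.items():
            if len(witnesses) >= K and y not in active:
                active.add(y)
                changed = True

    print(sorted(active))  # ['a', 'b', 'c', 'd']: 'd' has 3 witnesses; 'e' has only 1

Because witness sets only grow as facts are added, the count goal is monotonic and the least fixpoint is well defined, which is the property the extension relies on.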
Despite the perceived abundance of information collected after a disaster, available data furnish a narrow picture of flood impacts, or they are difficult to compare so as to produce an integrated interpretation of flood events. This is due to the diversity of the purposes for which data are collected and the variety of stakeholders involved in data collection and management. The RISPOSTA procedure addresses the need for standardised ways to collect flood damage data and to create consistent and reliable flood databases that meet the objectives of risk mitigation. In this regard, the procedure satisfies several requirements of loss data: (1) the data should refer to the different exposed sectors so as to supply a comprehensive view of flood impacts; (2) they should be collected at the finest scale so that the proper scale of analysis can be chosen by subsequent data aggregation; (3) they should be linked to the physical event as well as to the features of the different exposed elements so as to supply information on both flood impacts and their explanatory variables; and (4) they should be collected at different times, according to the unfolding of the event, in order to describe the entire range of possible damage.
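As an illustration only, the following hypothetical Python record layout shows how the four requirements above could be reflected in a single loss-data record; all field names and values are invented and are not the RISPOSTA schema.

    # A hypothetical record layout illustrating the four loss-data
    # requirements; fields are illustrative, not the RISPOSTA schema.

    from dataclasses import dataclass
    from datetime import datetime

    @dataclass
    class DamageRecord:
        sector: str            # (1) exposed sector, e.g. "residential"
        element_id: str        # (2) finest-scale unit (building, road segment)
        lat: float
        lon: float
        water_depth_m: float   # (3) link to the physical event ...
        building_type: str     #     ... and to features of the exposed element
        damage_eur: float
        surveyed_at: datetime  # (4) survey time, as the event unfolds

    record = DamageRecord("residential", "bldg-0042", 43.11, 12.39,
                          1.2, "masonry, 2 floors", 35000.0,
                          datetime(2012, 11, 14))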
FS-rules provide a powerful monotonic extension for Horn clauses that supports monotonic aggregates in recursion by reasoning on the multiplicity of occurrences satisfying existential goals. The least fixpoint semantics, and its equivalent least model semantics, hold for logic programs with FS-rules; moreover, generalized notions of stratification and stable models are easily derived when negated goals are allowed. Finally, the generalization of techniques such as the seminaive fixpoint and magic sets makes possible the efficient implementation of DatalogFS, i.e., Datalog with Frequency Support rules (FS-rules) and stratified negation. A large number of applications that could not be supported efficiently, or could not be expressed at all, in stratified Datalog can now be easily expressed and efficiently supported in DatalogFS, and a powerful DatalogFS system is now being developed at UCLA.
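The seminaive (differential) fixpoint mentioned above can be sketched in a few lines of Python on plain transitive closure: each round joins only the newly derived delta against the base facts, so no tuple is re-derived wholesale. The facts are illustrative; per the abstract, DatalogFS generalizes this technique to FS-rules.

    # A minimal sketch of the seminaive (differential) fixpoint, shown on
    # plain transitive closure. Facts are illustrative.

    edge = {("a", "b"), ("b", "c"), ("c", "d")}

    tc = set(edge)          # tc(X,Y) <- edge(X,Y)
    delta = set(edge)
    while delta:
        # tc(X,Z) <- tc(X,Y), edge(Y,Z), joining only the new delta
        new = {(x, z) for (x, y) in delta for (y2, z) in edge if y == y2}
        delta = new - tc    # keep only genuinely new tuples
        tc |= delta

    print(sorted(tc))       # all reachable pairs, each derived once per round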
The Big Data challenge has made the issue of "making sense" of data urgent and unavoidable. This paper introduces exploratory computing (EC), a novel paradigm whose aim is to support a comprehensive "exploratory" experience for the user. "Exploratory" because it supports search and discovery of information through various tasks (investigation, knowledge seeking, serendipitous discovery, comparison of information…) in a dynamic interaction, where meaningful feedback from the system plays a crucial role, closely resembling a human-to-human dialogue. "Computing" because an interaction as complex as the one outlined above requires powerful computational strength for the user to fully profit from, and even enjoy, the interaction. EC is not associated with a predefined set of techniques: rather, it is an approach that can be concretized in different ways. In the paper, two different implementations of the EC approach are presented, both of which interpret the EC high-level requirements. It is the authors' hope that others will follow.
Extracting information from semistructured documents is a very hard task, and it will become increasingly critical as the amount of digital information available on the Internet grows. Indeed, documents are often so large that the data set returned as the answer to a query may be too big to convey interpretable knowledge. In this paper, we describe an approach based on Tree-Based Association Rules (TARs): mined rules that provide approximate, intensional information on both the structure and the contents of Extensible Markup Language (XML) documents, and that can be stored in XML format as well. This mined knowledge is later used to provide: 1) a concise idea (the gist) of both the structure and the content of the XML document and 2) quick, approximate answers to queries. In this paper, we focus on the second feature. A prototype system and experimental results demonstrate the effectiveness of the approach.
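As a toy illustration in the spirit of TARs, and not the paper's mining algorithm, the sketch below counts how often a child tag appears beneath a parent tag in an XML document and prints approximate structural rules with a confidence score; the document and the rule form are invented.

    # A toy sketch in the spirit of tree-based association rules: for each
    # element tag, estimate how often a given child tag appears beneath it,
    # yielding approximate structural rules. Not the paper's TAR algorithm.

    import xml.etree.ElementTree as ET
    from collections import Counter, defaultdict

    doc = ET.fromstring(
        "<library>"
        "<book><title>A</title><author>X</author></book>"
        "<book><title>B</title><author>Y</author></book>"
        "<book><title>C</title></book>"
        "</library>")

    parent_count = Counter()
    child_count = defaultdict(Counter)
    for parent in doc.iter():
        parent_count[parent.tag] += 1
        for child_tag in set(c.tag for c in parent):
            child_count[parent.tag][child_tag] += 1

    for p, children in child_count.items():
        for c, n in children.items():
            print(f"{p} => {c}  (confidence {n / parent_count[p]:.2f})")
    # e.g. book => author (0.67): an approximate, intensional answer to
    # "do books have authors?" without scanning the full document.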
The advent of the Big Data challenge has stimulated research on methods and techniques to deal with the problem of managing data abundance. Many approaches have been developed, but for the most part they attack one specific side of the problem: e.g. efficient querying, analysis techniques that summarize data or reduce its dimensionality, data visualization, etc. The approach proposed in this paper aims instead at taking a comprehensive view. First of all, it takes into account that human exploration is an iterative and multi-step process, and it therefore allows each query to build on the previous one, in a sort of "dialogue" between the user and the system. Second, it aims at supporting a variety of user experiences, like investigation, inspiration seeking, monitoring, comparison, decision-making, research, etc. Third, and probably most important, it adds to the notion of "big" the notion of "rich": Exploratory Computing (EC) aims at dealing with datasets of semantically complex items, whose inspection may reach beyond the user's previous knowledge or expectations. An exploratory experience basically consists in creating, refining, modifying, and comparing various datasets in order to "make sense" of their meaning.
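As a minimal sketch of the iterative "dialogue" described above, the hypothetical Python session below lets each refinement build on the previous result set and returns simple feedback after each step; all names and the feedback format are illustrative, not either of the paper's implementations.

    # A minimal sketch of an iterative exploration "dialogue": each step
    # builds on the previous result set, and the system returns simple
    # feedback (here, just size and a sample). Names are illustrative.

    class ExplorationSession:
        def __init__(self, items):
            self.steps = [list(items)]          # history of result sets

        def refine(self, predicate):
            current = [x for x in self.steps[-1] if predicate(x)]
            self.steps.append(current)          # next query builds on this one
            return current

        def feedback(self):
            cur = self.steps[-1]
            return f"step {len(self.steps) - 1}: {len(cur)} items, e.g. {cur[:3]}"

    s = ExplorationSession(range(100))
    s.refine(lambda x: x % 2 == 0)              # first question
    s.refine(lambda x: x > 50)                  # follow-up, building on it
    print(s.feedback())                         # step 2: 24 items, e.g. [52, 54, 56]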
The aim of this paper is to present a procedure to collect and store damage data in the aftermath of flood events. The activity is performed within the Poli_RISPOSTA project (stRumentI per la protezione civile a Supporto delle POpolazioni nel poST Alluvione), an internal project of Politecnico di Milano whose aim is to supply tools supporting Civil Protection Authorities in dealing with flood emergencies. Specifically, the aim of this paper is to discuss the present implementation of the project, highlighting challenges for data collection, storage, analysis and visualisation. Data can have different formats (e.g. paper-based vs. digital forms, different digital file extensions), refer to different aspects of the phenomenon (i.e. hazard, exposure, vulnerability and damage), refer to different spatial and temporal scales (e.g. micro vs. meso scale, different phases of the flood event) and come from different sources (e.g. local authorities, field surveys, crowdsourcing). Therefore, a multidisciplinary approach that includes expertise from ICT, geomatics, engineering, urban planning, economics, etc. is required. This paper first describes a conceptual map of the issue at stake; it then discusses the state of the art of the implementation, taking the November 2012 Umbria flood as a reference case. Impacts of the project are then discussed.
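Purely as an illustration of the heterogeneity just described, the hypothetical sketch below normalises records arriving from two different sources into one common structure; the source formats, field names, and helpers are invented and are not the project's actual data model.

    # A hypothetical ingestion sketch: records from different sources
    # arrive in different shapes and are normalised into one common
    # structure. Source formats and field names are invented.

    def from_field_survey(row):        # e.g. a digitised paper form
        return {"source": "survey", "element": row["building_id"],
                "damage_eur": float(row["damage"]), "phase": "post-event"}

    def from_crowdsourcing(msg):       # e.g. a geotagged citizen report
        return {"source": "crowd", "element": msg["place"],
                "damage_eur": None,    # often only qualitative
                "phase": msg.get("phase", "during-event")}

    raw = [({"building_id": "B7", "damage": "12000"}, from_field_survey),
           ({"place": "via Roma", "phase": "during-event"}, from_crowdsourcing)]

    records = [parse(item) for item, parse in raw]
    print(records)                     # one uniform structure, many sources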
In many data streaming applications today, the tuples inside a stream may be revised over time. This type of data stream raises new issues and challenges for data mining tasks. We present a theoretical analysis for mining frequent itemsets from sliding windows over such data. We define conditions that determine whether an infrequent itemset will become frequent when some existing tuples inside the streams have been updated. We design simple but effective structures for managing both the evolving tuples and the candidate frequent itemsets. Moreover, we provide a novel verification method that efficiently computes the counts of candidate itemsets. Experiments on real-world datasets show the efficiency and effectiveness of our proposed method.
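A minimal sketch of the bookkeeping this implies is given below: itemset counts are maintained over a window of transactions and updated in place when an existing transaction is revised, after which an itemset may newly cross the support threshold. The data and threshold are illustrative, and this is not the paper's verification method.

    # Itemset counts over a window of transactions, updated in place when
    # an existing transaction is revised. Data and threshold are
    # illustrative; this is not the paper's verification method.

    from itertools import combinations
    from collections import Counter

    def itemsets(txn):
        for k in range(1, len(txn) + 1):
            yield from combinations(sorted(txn), k)

    window = {1: {"a", "b"}, 2: {"a", "c"}, 3: {"b", "c"}}
    counts = Counter(s for txn in window.values() for s in itemsets(txn))
    minsup = 2

    # Revision: transaction 2 is updated from {a, c} to {a, b}.
    old, new = window[2], {"a", "b"}
    counts.subtract(itemsets(old))      # retract the superseded version
    counts.update(itemsets(new))        # count the revised tuple
    window[2] = new

    frequent = [s for s, c in counts.items() if c >= minsup]
    print(frequent)   # ('a','b') newly crosses the minsup threshold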