Data stream processing and analytics (DSPA) applications are widely used to process the ever-increasing volume of data streams produced by highly geographically distributed data sources, such as fixed and mobile IoT devices, in order to extract valuable information in a timely manner for real-time actuation. To handle this volume efficiently, the emerging Edge/Fog computing paradigms are used as a middle tier between the Cloud and the IoT devices, processing data streams closer to their sources and reducing the network resource usage and network delay incurred in reaching the Cloud. In this paper, we account for the fact that both network and computational resources can be limited and shared among multiple DSPA applications in the Edge-Fog-Cloud architecture; hence, it is necessary to ensure their efficient usage. To this end, we propose a resource-aware and time-efficient heuristic called SOO that identifies a good DSPA operator placement on the Edge-Fog-Cloud architecture, aiming to optimize the trade-off between computational and network resource usage. Through thorough simulation experiments, we show that the solution provided by SOO is very close to the optimal one while the execution time is considerably reduced.
This paper focuses on the optimization of navigation through voluminous subsumption hierarchies of topics employed by Portal Catalogs like Netscape Open Directory (ODP). We advocate for the use of labeling schemes for modeling these hierarchies in order to efficiently answer queries such as subsumption check, descendants, ancestors or nearest common ancestor, which usually require costly transitive closure computations. We first give a qualitative comparison of three main families of schemes, namely bit vector, prefix and interval based schemes. We then show that two labeling schemes are good candidates for an efficient implementation of label querying using standard relational DBMS, namely, the Dewey Prefix scheme [6] and an Interval scheme by Agrawal, Borgida and Jagadish [1]. We compare their storage and query evaluation performance for the 16 ODP hierarchies using the PostgreSQL engine.
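To illustrate how labels can replace transitive closure computations, the following is a minimal Python sketch of prefix and interval labeling on a plain tree. It is a simplified illustration, not the cited schemes themselves: the topic names, the function names (`label_tree`, `subsumes_dewey`, `subsumes_interval`, `nearest_common_ancestor`) and the numbering conventions are all assumptions made for this example, and it does not cover hierarchies in which a topic has several parents.

```python
from typing import Dict, List, Tuple

Dewey = Tuple[int, ...]
Interval = Tuple[int, int]


def label_tree(children: Dict[str, List[str]], root: str):
    """Assign a prefix label and an interval label to every node in one DFS.

    Prefix label: the path of child positions from the root (root is ()).
    Interval label: (start, end), where start is the node's preorder number
    and end is the largest preorder number found in its subtree.
    """
    dewey: Dict[str, Dewey] = {}
    interval: Dict[str, Interval] = {}
    counter = 0

    def visit(node: str, path: Dewey) -> None:
        nonlocal counter
        dewey[node] = path
        start = counter
        counter += 1
        for position, child in enumerate(children.get(node, []), start=1):
            visit(child, path + (position,))
        interval[node] = (start, counter - 1)

    visit(root, ())
    return dewey, interval


def subsumes_dewey(ancestor: Dewey, descendant: Dewey) -> bool:
    # An ancestor's label is a prefix of every descendant's label.
    return descendant[: len(ancestor)] == ancestor


def subsumes_interval(ancestor: Interval, descendant: Interval) -> bool:
    # An ancestor's interval contains every descendant's interval.
    return ancestor[0] <= descendant[0] and descendant[1] <= ancestor[1]


def nearest_common_ancestor(a: Dewey, b: Dewey) -> Dewey:
    # The longest common prefix of two prefix labels is the label of
    # their nearest common ancestor.
    common = []
    for x, y in zip(a, b):
        if x != y:
            break
        common.append(x)
    return tuple(common)


# Hypothetical toy hierarchy, used only for illustration.
toy = {"Top": ["Arts", "Science"], "Science": ["Biology", "Physics"]}
dewey, interval = label_tree(toy, "Top")
```

In this sketch both subsumption checks are reflexive (a topic subsumes itself) and each reduces to a comparison of two labels, which is the kind of test that can be expressed as a simple predicate in a relational query rather than as a recursive traversal.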
Numerous algorithms have been proposed for detecting anomalies (outliers, novelties) in an unsupervised manner. Unfortunately, it is generally not trivial to understand why a given sample (record) is labelled as an anomaly, and thus to diagnose its root causes. We propose the following reduced-dimensionality, surrogate-model approach to explain detector decisions: approximate the detection model with another one that employs only a small subset of features. Samples can then be visualized in this low-dimensional space for human understanding. To this end, we develop PROTEUS, an AutoML pipeline that produces the surrogate model and is specifically designed for feature selection on imbalanced datasets. The PROTEUS surrogate model can explain not only the training data but also out-of-sample (unseen) data. In other words, PROTEUS produces predictive explanations by approximating the decision surface of an unsupervised detector. PROTEUS is designed to return an accurate estimate of out-of-sample predictive performance, which serves as a metric of the quality of the approximation. Computational experiments confirm the efficacy of PROTEUS in producing predictive explanations for different families of detectors and in reliably estimating their predictive performance on unseen data. Unlike several ad-hoc feature importance methods, PROTEUS is robust to high-dimensional data.