2016
DOI: 10.1007/s10489-016-0807-x

Quality-optimized predictive analytics

Abstract: On-line statistical and machine learning analytic tasks over large-scale contextual data streams coming from, e.g., wireless sensor networks and Internet of Things environments, have gained high popularity nowadays due to their significance in knowledge extraction, regression and classification tasks, and, more generally, in making sense of large-scale streaming data. The quality of the received contextual information, however, impacts predictive analytics tasks especially when dealing with uncertain data, outlie…

Citations: cited by 8 publications (9 citation statements)
References: 36 publications
“…A comprehensive list of the available transformations can be found at Spark's website. It is worth noting that transformations are not applied immediately. Instead, Spark uses a lineage graph and pipelines successive transformations to the original dataset once an action is called [32].…”
Section: Transformations In Spark (mentioning)
confidence: 99%
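The lazy behaviour described in the statement above can be illustrated with a minimal sketch. This is a toy model written in plain Python, not Spark's API: the class `LazyDataset`, its `lineage` attribute, and the helper `double` are hypothetical names introduced only for illustration. It assumes a simple in-memory list as the original dataset and shows transformations being recorded first and applied, element by element, only when an action (`collect`) is called.

```python
# Toy model of lazy transformations; NOT Spark's API. All names are hypothetical.
class LazyDataset:
    def __init__(self, source, lineage=()):
        self._source = source    # the original dataset
        self.lineage = lineage   # recorded transformations, not yet applied

    def map(self, fn):
        # Transformation: only extends the lineage, touches no data.
        return LazyDataset(self._source, self.lineage + (("map", fn),))

    def filter(self, pred):
        # Transformation: only extends the lineage, touches no data.
        return LazyDataset(self._source, self.lineage + (("filter", pred),))

    def collect(self):
        # Action: pipeline each element through every recorded step.
        out = []
        for item in self._source:
            keep = True
            for kind, fn in self.lineage:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                out.append(item)
        return out


calls = []  # records every time the mapped function actually runs


def double(x):
    calls.append(x)
    return 2 * x


ds = LazyDataset([1, 2, 3, 4]).map(double).filter(lambda x: x > 4)
calls_before_action = len(calls)  # still zero: nothing has been computed
result = ds.collect()             # the action triggers the whole pipeline
```

In this sketch `double` is not invoked until `collect()` runs, which mirrors the deferred evaluation the statement attributes to Spark; real Spark additionally distributes the work across partitions, which the toy model does not attempt.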
“…Platforms such as MapReduce [14], Yarn [29], Spark [32] and Mahout [22] are nowadays commonplace. Predictive modeling [26], [23] and exploratory analysis [2,3,6,20] are commonly based on statistical aggregation operators over the results of exploration queries [4,7]. Such queries involve large datasets (which may themselves be the result of linking of other different datasets) and a number of range predicates over multidimensional data vectorial representation, structured, semi- and unstructured data.…”
Section: Introduction (mentioning)
confidence: 99%
“…Moreover, sensor systems rely on renewable energy, most widely used is solar energy, which could be slower to harvest in environments such as rain forests [19], therefore, having a power efficient device is required in order to have a well-functioning system. Another concern with communicating processed data on every sensing & reporting period is that it is possible that the data contains bias, e.g., missing or corrupted data points [4]. When disseminating them to the gateway, it can potentially be used to make inaccurate predictions of the sensed/monitored environment.…”
Section: Related Work and Contribution A Knowledge Sharing In Edge (mentioning)
confidence: 99%
“…Analytics are derived by models dealing with dynamic optimal decisions for data deliver in light of communication efficiency [27], [6]. Several schemes exploit the computational capability of edge nodes to launch algorithms directly at the data sources [2], [9], [16].…”
Section: Related Work (mentioning)
confidence: 99%