2020
DOI: 10.1007/s11036-020-01530-6

A Cautionary Tale for Machine Learning Design: why we Still Need Human-Assisted Big Data Analysis

citations
Cited by 36 publications
(26 citation statements)
references
References 20 publications
“…So, an additional human involvement is required to produce APIs to adjust incorrect information produced by ML. Roccetti et al. [4] used human experts’ knowledge by defining some semantics to select high-quality instances from the data set, to train neural networks on fifteen million water meter readings with the goal of increasing the accuracy of the ML prediction.…”
Section: Results (mentioning)
confidence: 99%
“…The success of ML methods in real-world applications is dependent on not only the design of the ML procedures but also the quality and semantics of the data used for training ML methods [4]. Data quality has an important role in the success of ML approaches in prediction.…”
Section: Results (mentioning)
confidence: 99%
“…This large dataset spanned a period in time, from the beginning of 2014 to the end of 2018. To train our deep learning model, at the end of a long validation process which is described at length in [8, 9], we decided to use a smaller dataset, comprised of just those water meter devices, with at least three valid numerical readings. This dataset contained exactly 17,714 devices; where 15,652 were non-defective ones, and the remaining 2,062 were defective.…”
Section: Dataset Description: Type Of Variables Deep Learning and Pr (mentioning)
confidence: 99%
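The selection rule quoted above (keep only devices with at least three valid numerical readings) can be illustrated with a minimal Python sketch. The citing paper does not define "valid" in this excerpt, so the check used here (a finite, non-negative number) is an assumption, and the function names and toy data are hypothetical rather than the authors' code.

```python
import math
from collections import defaultdict


def is_valid_reading(value):
    """Assumed validity rule: a finite, non-negative number (not from the source)."""
    return (
        isinstance(value, (int, float))
        and not isinstance(value, bool)
        and math.isfinite(value)
        and value >= 0
    )


def select_devices(readings, min_valid=3):
    """Return the ids of devices with at least `min_valid` valid readings.

    `readings` is an iterable of (device_id, value) pairs.
    """
    valid_counts = defaultdict(int)
    for device_id, value in readings:
        if is_valid_reading(value):
            valid_counts[device_id] += 1
    return {d for d, n in valid_counts.items() if n >= min_valid}


# Toy data for illustration only, not taken from the study.
toy_readings = [
    ("A", 10.0), ("A", 12.5), ("A", 15.0),          # three valid readings: kept
    ("B", 3.0), ("B", None), ("B", float("nan")),   # one valid reading: dropped
    ("C", 1.0), ("C", 2.0),                         # two valid readings: dropped
]
selected = select_devices(toy_readings)

# The class counts quoted above are consistent with the stated total.
assert 15_652 + 2_062 == 17_714
```

On the toy data only device "A" passes the filter. The final assertion simply confirms that the non-defective and defective counts in the quoted statement sum to the reported 17,714 devices.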
“…Along this line, in this paper we describe a deep learning design experience, where we had initially a trouble on developing an appropriate deep learning model able to detect failures in mechanical water meter devices, because we tried to train that model by merging together the numerical information relative to water consumption with some device descriptors based on categorical information, thus resulting into an explosion in data dimensionality, that soon determined a deterioration of the prediction accuracy [8, 9]. After several unsuccessful experiments conducted with alternative methodologies that either permitted to reduce the data space dimensionality or employed more traditional machine learning algorithms, we changed the training strategy.…”
Section: Introduction (mentioning)
confidence: 99%
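The excerpt above does not say how the categorical device descriptors were encoded. Assuming one-hot encoding, a common choice that is used here purely for illustration, the following Python sketch shows how merging such descriptors with a few numerical readings inflates the input width. The column names and values are hypothetical.

```python
def one_hot_width(categorical_columns):
    """Columns produced by one-hot encoding: one per distinct value per column."""
    return sum(len(set(values)) for values in categorical_columns.values())


# Hypothetical device descriptors, not taken from the study.
toy_devices = {
    "manufacturer": ["m1", "m2", "m3", "m1"],
    "material": ["brass", "plastic", "brass", "brass"],
    "install_year": [2009, 2011, 2014, 2016],
}

numeric_features = 3  # assumed: three consumption readings per device
total_width = numeric_features + one_hot_width(toy_devices)
```

Even in this four-device toy case, three numerical inputs grow to twelve columns once the descriptors are encoded. With high-cardinality descriptors over thousands of devices, the encoded part can dominate the input, which is one plausible reading of the "explosion in data dimensionality" the authors report.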