To address the low cleaning rate of traditional models for cleaning multi-source heterogeneous power grid big data, a cleaning model based on a machine learning classification algorithm is designed. The model captures high-quality multi-source heterogeneous power grid big data, assigns weight labels for data source importance, data attributes, and tuples, and constructs a TAN network following the idea of the machine learning classification algorithm; the resulting data probability values are then used to classify and clean inaccurate data. Experiments show that, compared with the traditional model, the model based on the machine learning classification algorithm effectively improves the cleaning rate for multi-source heterogeneous imprecise data.
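The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of probability-based classification of imprecise records. It assumes the TAN network refers to a tree-augmented naive Bayes classifier and, for brevity, swaps in a plain categorical naive Bayes with no attribute-to-attribute edges. The function names, the `clean`/`imprecise` labels, the per-record source weights, and the 0.5 threshold are all hypothetical and do not come from the article.

```python
import math
from collections import defaultdict


def train(records, labels, weights):
    """Fit a weighted categorical naive Bayes model (a simplification, not TAN).

    Each record is a tuple of categorical attribute values; each weight is a
    hypothetical importance score for the source the record came from.
    """
    class_w = defaultdict(float)   # total weight per class label
    feat_w = defaultdict(float)    # weight per (label, attribute index, value)
    values = defaultdict(set)      # distinct values seen per attribute index
    for rec, lab, w in zip(records, labels, weights):
        class_w[lab] += w
        for i, v in enumerate(rec):
            feat_w[(lab, i, v)] += w
            values[i].add(v)
    return dict(class_w), dict(feat_w), dict(values)


def posterior(model, rec):
    """Return P(label | record) for every label, computed in log space."""
    class_w, feat_w, values = model
    total = sum(class_w.values())
    log_scores = {}
    for lab, cw in class_w.items():
        score = math.log(cw / total)
        for i, v in enumerate(rec):
            # Add-one smoothing; the extra +1 in the denominator reserves
            # probability mass for values never seen during training.
            seen = len(values.get(i, ()))
            score += math.log((feat_w.get((lab, i, v), 0.0) + 1.0) / (cw + seen + 1.0))
        log_scores[lab] = score
    peak = max(log_scores.values())
    unnorm = {lab: math.exp(s - peak) for lab, s in log_scores.items()}
    norm = sum(unnorm.values())
    return {lab: u / norm for lab, u in unnorm.items()}


def flag_imprecise(model, records, threshold=0.5):
    """Return the records whose probability of being imprecise meets the threshold."""
    return [r for r in records if posterior(model, r).get("imprecise", 0.0) >= threshold]


if __name__ == "__main__":
    # Toy data: (voltage band, unit) pairs with made-up labels and source weights.
    train_records = [("normal", "kV"), ("normal", "kV"), ("normal", "kV"),
                     ("high", "V"), ("high", "V"), ("normal", "V")]
    train_labels = ["clean", "clean", "clean", "imprecise", "imprecise", "imprecise"]
    source_weights = [1.0, 1.0, 0.5, 1.0, 0.5, 0.5]
    model = train(train_records, train_labels, source_weights)
    print(flag_imprecise(model, [("high", "V"), ("normal", "kV")]))
```

A full TAN classifier would additionally learn a tree of dependencies among the attributes, so each attribute's probability would be conditioned on one parent attribute as well as on the class label.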
The Editor-in-Chief has retracted this article [1], which was published as part of special issue "Multi-source Weak Data Management using Big Data", because its content has been duplicated from an unpublished manuscript authored by Hong Lin without permission. In addition, there is evidence suggesting authorship manipulation and an attempt to subvert the peer review process.