To strengthen the supervision and management of environmental pollution, and to record basic information on potential pollution from enterprises and institutions in a timely manner, this paper proposes a method for surveying and inventorying pollution sources based on big data technology and machine learning. First, the data provided by government departments are evaluated and screened, a machine learning classification model is constructed, and several classification algorithms are compared, following the basic machine learning approach to practical problems. The calibrated data set is then used as the training set to classify the national, provincial, and municipal industrial and commercial data provided by government departments. The experimental results show that the naive Bayes classification algorithm performs best, with the F1 values on the three data sets increased by 32.92%, 21.42%, and 14.91%, respectively. Classification of the screened Internet data shows that the accuracy of the final Internet supplementary data is 17.26%, which is similar to that of the municipal industrial and commercial data. These results demonstrate the usability of the machine learning model established in this paper.
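As an illustration of the general technique named in the abstract, the following is a minimal sketch of a multinomial naive Bayes text classifier with Laplace smoothing, scored with the F1 measure. It is not the authors' implementation. The toy business descriptions, the labels `source` and `other`, and the function names `train_naive_bayes`, `predict`, and `f1_score` are all hypothetical and chosen only for the example; the paper's actual features, data, and preprocessing are not described in the abstract.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Fit a multinomial naive Bayes model on whitespace-tokenized text."""
    class_counts = Counter(labels)          # number of documents per class
    word_counts = defaultdict(Counter)      # token frequencies per class
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log posterior for one document."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)            # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue                                 # skip unseen tokens
            # Laplace (add-one) smoothed log likelihood
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def f1_score(y_true, y_pred, positive):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical calibrated training set: business descriptions with labels.
train_docs = [
    "chemical dye manufacturing wastewater",
    "metal electroplating smelting plant",
    "paper pulp mill discharge",
    "software consulting services",
    "retail clothing store",
    "legal accounting consulting",
]
train_labels = ["source", "source", "source", "other", "other", "other"]

# Hypothetical records to classify, with known labels for scoring.
test_docs = [
    "electroplating plant wastewater",
    "clothing retail consulting",
    "software store",
]
test_labels = ["source", "other", "other"]

model = train_naive_bayes(train_docs, train_labels)
predictions = [predict(model, doc) for doc in test_docs]
f1 = f1_score(test_labels, predictions, positive="source")
print(predictions, f1)
```

In a comparison like the one the abstract describes, the same training and scoring steps would be repeated for each candidate algorithm, and the per-data-set F1 values compared.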