Handling large data streams in the presence of concept drift is one of the main challenges in data mining, particularly when algorithms must deal with concepts that disappear and later reappear. This paper presents a new algorithm, called Fast Adapting Ensemble (FAE), which adapts very quickly to both abrupt and gradual concept drifts and has been specifically designed to deal with recurring concepts. FAE processes the learning examples in blocks of equal size, but it does not have to wait for a block to be complete in order to adapt its base classification mechanism. FAE incorporates a drift detector to improve the handling of abrupt concept drifts, and it stores a set of inactive classifiers that represent old concepts, which are activated very quickly when these concepts reappear. We compare the new algorithm with various well-known learning algorithms on common benchmark datasets. The experiments show promising results for the proposed algorithm, in terms of accuracy and runtime, across different types of concept drift.
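The idea of keeping inactive classifiers for old concepts and reactivating them when a concept returns can be illustrated with a minimal sketch. This is not the FAE algorithm itself: the class `RecurringPool`, the accuracy threshold of 0.7, and the toy one-feature concepts are all assumptions made for illustration, and the drift detector and within-block adaptation described in the abstract are omitted.

```python
from typing import Callable, List, Tuple

Example = Tuple[int, int]          # (feature value, class label)
Classifier = Callable[[int], int]  # maps a feature value to a label


def accuracy(clf: Classifier, block: List[Example]) -> float:
    """Fraction of examples in the block that clf labels correctly."""
    return sum(1 for x, y in block if clf(x) == y) / len(block)


class RecurringPool:
    """Hypothetical pool of active and inactive classifiers (illustrative only)."""

    def __init__(self, threshold: float = 0.7) -> None:
        self.threshold = threshold            # assumed activation threshold
        self.active: List[Classifier] = []
        self.inactive: List[Classifier] = []

    def add(self, clf: Classifier) -> None:
        self.active.append(clf)

    def update(self, block: List[Example]) -> None:
        """Re-score every stored classifier on the newest block.

        Classifiers that fit the current concept become active; the rest
        are kept as inactive so they can return if their concept recurs.
        """
        scored = [(c, accuracy(c, block)) for c in self.active + self.inactive]
        self.active = [c for c, a in scored if a >= self.threshold]
        self.inactive = [c for c, a in scored if a < self.threshold]

    def predict(self, x: int) -> int:
        """Majority vote of the active classifiers (0 if none is active)."""
        votes = [c(x) for c in self.active]
        return max(set(votes), key=votes.count) if votes else 0


# Two toy concepts with opposite labelings of the same feature.
def concept_a(x: int) -> int:
    return int(x > 5)


def concept_b(x: int) -> int:
    return int(x <= 5)


block_a = [(x, concept_a(x)) for x in range(10)]
block_b = [(x, concept_b(x)) for x in range(10)]

pool = RecurringPool()
pool.add(concept_a)
pool.add(concept_b)

pool.update(block_a)   # concept A holds: concept_b is set aside
after_first = (list(pool.active), list(pool.inactive))
pool.update(block_b)   # drift to concept B: roles swap
after_drift = list(pool.active)
pool.update(block_a)   # concept A recurs: the stored classifier is reactivated
after_recur = list(pool.active)
```

In this sketch a stored classifier is reused as soon as one block shows that its concept has returned, so nothing has to be retrained from scratch.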
Since 2008, the E-PRTR has been the European Pollutant Release and Transfer Register, established to implement the UNECE Aarhus Convention on access to information, public participation in decision-making, and access to justice in environmental matters. This pollutant emissions inventory follows a methodology based on the mandatory declaration of emissions to the atmosphere and to water by the potential sources covered by Directive 96/61/CE. As a consequence, the accuracy of the inventory depends on the information declared by the sources. In this work, a systematic methodology to validate the declared emissions was designed and applied to the Galician region. The methodology is based on a plant/activity-process-source-pollutant data structure: a flowsheeting analysis of every plant was developed in order to associate each process with each source (e.g., a chimney) at the same plant. With this approach, the pollutant emissions from every source are estimated by calculating the emissions of each process from the corresponding emission factors. Complementary process data (e.g., fuel consumption, energy production) are required. The E-PRTR results for Galicia in 2008 and 2010 show significant differences in the distribution of emissions by sector, depending on the pollutant; this can be explained by changes in process technologies and performance. Regarding validation, in a first stage fewer than 50% of the sources provided acceptable emissions together with the complementary information needed for validation; some of them completed this information upon request.
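The process-to-source estimation described above can be sketched as follows. This is a minimal illustration, assuming the common form emission = activity data × emission factor; the `Process` record, the 30% tolerance, and all numeric values are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Process:
    name: str
    source: str             # emission point the process is routed to, e.g. a chimney
    activity: float         # complementary process data, e.g. fuel consumption in GJ
    emission_factor: float  # kg of pollutant per unit of activity (illustrative)


def estimate_by_source(processes: List[Process]) -> Dict[str, float]:
    """Sum activity * emission_factor over the processes linked to each source."""
    totals: Dict[str, float] = {}
    for p in processes:
        totals[p.source] = totals.get(p.source, 0.0) + p.activity * p.emission_factor
    return totals


def is_acceptable(declared: float, estimated: float, tolerance: float = 0.3) -> bool:
    """Accept a declared value if it lies within a relative tolerance of the estimate.

    The 30% default is an assumed threshold, not a value from the source.
    """
    if estimated == 0:
        return declared == 0
    return abs(declared - estimated) / estimated <= tolerance


# Hypothetical plant: two processes share chimney_1, one uses chimney_2.
plant = [
    Process("boiler", "chimney_1", activity=1000.0, emission_factor=0.5),
    Process("dryer", "chimney_1", activity=200.0, emission_factor=0.25),
    Process("kiln", "chimney_2", activity=400.0, emission_factor=2.0),
]

estimates = estimate_by_source(plant)
declared = {"chimney_1": 600.0, "chimney_2": 1500.0}
validation = {s: is_acceptable(declared[s], estimates[s]) for s in declared}
```

Grouping by source rather than by plant mirrors the plant/activity-process-source-pollutant structure: a declared value can only be checked once each process has been assigned to the emission point it feeds.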