This paper describes a study on applying data mining techniques to power transformer failure prediction. The dataset consisted not only of dissolved gas analysis (DGA) tests, but also of other tests performed on the transformer's insulating oil. It presented several challenges, such as highly imbalanced classes (common in failure prediction problems) and the temporal nature of the observations. To overcome these challenges, several techniques were applied, both for prediction and to better understand the dataset. Pre-processing and the incorporation of temporality into the dataset are discussed. For prediction, one-class and two-class SVMs, decision trees, random forests, and an LSTM neural network were applied to the dataset. As prediction performance was low (high false-positive rate), we conducted a test to ascertain whether the amount of data collected was sufficient. Results indicate that the frequency of data collection was not adequate, suggesting that the degradation period was shorter than the interval between data collections.
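The abstract does not say how temporality was incorporated into the dataset. One common approach, shown here purely as an illustrative sketch and not as the authors' method, is to append each observation's preceding measurements as extra columns so that a non-sequential classifier can see recent trends. The function name `add_lag_features`, the parameter `n_lags`, and the sample values are all hypothetical.

```python
from typing import List, Sequence


def add_lag_features(
    readings: Sequence[Sequence[float]], n_lags: int
) -> List[List[float]]:
    """Concatenate each observation with its n_lags predecessors.

    readings: oil-test measurements for one transformer, oldest first.
    Returns one row per observation that has a full history, so the
    first n_lags observations produce no row.
    """
    if n_lags < 0:
        raise ValueError("n_lags must be non-negative")
    rows = []
    for t in range(n_lags, len(readings)):
        window = readings[t - n_lags : t + 1]
        # Flatten oldest-to-newest so column order is stable across rows.
        rows.append([value for obs in window for value in obs])
    return rows


# Hypothetical values: (hydrogen ppm, moisture ppm) per sampling date.
history = [(10.0, 5.0), (12.0, 6.0), (30.0, 9.0), (80.0, 15.0)]
features = add_lag_features(history, n_lags=2)
```

With sparse sampling, each window spans a long calendar period, which is one way the data-collection interval mentioned in the abstract could limit what such features capture.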