Towards a More Reliable Interpretation of Machine Learning Outputs for Safety-Critical Systems Using Feature Importance Fusion

Rengasamy, Divish; Rothwell, Benjamin; Figueredo, Grazziela P.

doi:10.3390/app112411854

Cited by 22 publications

(3 citation statements)

References 39 publications

“…If it is set too high, anomalies will be missed, and if it is set too low, the rate of false positives will become high. Typically used methodologies for thresholding are Area Under Curve Percentage (AUCP) [16], Median Absolute Deviation (MAD) [17], Modified Thompson Tau Test (MTT) [18], Variational Autoencoders (VAE) [19], Z-Score [20] or Clustering-based techniques [21].…”

Section: Thresholding | mentioning

confidence: 99%

Generic Diagnostic Framework for Anomaly Detection—Application in Satellite and Spacecraft Systems

Bieber, Verhagen, Cosson et al. 2023

Aerospace

Spacecraft systems collect health-related data continuously, which can give an indication of the systems’ health status. While they rarely occur, the repercussions of such system anomalies, faults, or failures can be severe, safety-critical and costly. Therefore, the data are used to anticipate any kind of anomalous behaviour. Typically this is performed by the use of simple thresholds or statistical techniques. Over the past few years, however, data-driven anomaly detection methods have been further developed and improved. They can help to automate the process of anomaly detection. However, it usually is time intensive and requires expertise to identify and implement suitable anomaly detection methods for specific systems, which is often not feasible for application at scale, for instance, when considering a satellite consisting of numerous systems and many more subsystems. To address this limitation, a generic diagnostic framework is proposed that identifies optimal anomaly detection techniques and data pre-processing and thresholding methods. The framework is applied to two publicly available spacecraft datasets and a real-life satellite dataset provided by the European Space Agency. The results show that the framework is robust and adaptive to different system data, providing a quick way to assess anomaly detection for the underlying system. It was found that including thresholding techniques significantly influences the quality of resulting anomaly detection models. With this, the framework can provide both a way forward in developing data-driven anomaly detection methods for spacecraft systems and guidance relative to the direction of anomaly detection method selection and implementation for specific use cases.
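
The thresholding trade-off described in the citation statement above can be illustrated with two of the listed methods, MAD and Z-score. This is only a minimal sketch, not the framework from the cited paper: the function names, the toy anomaly scores, and the multiplier k = 3 are all assumptions made for illustration.

```python
import statistics

def mad_threshold(scores, k=3.0):
    """Cutoff = median + k * scaled MAD. k is an assumed tuning constant."""
    med = statistics.median(scores)
    mad = statistics.median([abs(s - med) for s in scores])
    # 1.4826 rescales the MAD so it is comparable to a standard deviation
    # when the scores are roughly normally distributed.
    return med + k * 1.4826 * mad

def zscore_threshold(scores, k=3.0):
    """Cutoff = mean + k * population standard deviation."""
    return statistics.mean(scores) + k * statistics.pstdev(scores)

def flag_anomalies(scores, threshold):
    """Return the indices of scores that exceed the threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]

# Hypothetical anomaly scores; the last one is an obvious outlier.
scores = [0.10, 0.12, 0.11, 0.13, 0.09, 0.10, 0.95]

mad_flags = flag_anomalies(scores, mad_threshold(scores))
z_flags = flag_anomalies(scores, zscore_threshold(scores))
```

On this toy data the MAD rule flags only the last point, while the Z-score rule with k = 3 flags nothing, because the outlier itself inflates the standard deviation. That is a small instance of a threshold set too high missing an anomaly.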

“…The purpose of the PFI is to calculate how much the performance measure of the model has decreased by randomly extracting the features from the data set. The amount of increase in the RMSE (Ibrahim and Jafari 2019) or MAE (Rengasamy, Rothwell, and Figueredo 2021) values can be determined by the effects of the used features in the model on the classification. The bigger the change, the more important that feature is.…”

Section: Permutation Feature Importance (PFI) | mentioning

confidence: 99%

Investigation Of Diabetes Data with Permutation Feature Importance Based Deep Learning Methods

Gürsoy, Alkan 2022

Karadeniz Fen Bilimleri Dergisi

Diabetes is a metabolic disease that occurs due to high blood sugar levels in the body. If it is not treated, diabetes-related health problems may occur in many vital organs of the body. With the latest techniques in machine learning technologies, some of the applications can be used to diagnose diabetes at an early stage. In this study, the data set from the laboratories of Medical City Hospital Endocrinology and Diabetes Specialization Center Al Kindy Training Hospital was used. The dataset consists of 3 different classes: normal, pre-diabetes and diabetes. The obtained diabetes dataset was classified using Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN) and Gated Recurrent Unit (GRU) deep learning methods. The classification performance of each algorithm was evaluated with accuracy, precision, sensitivity and F score performance parameters. Among the deep learning methods, 96.5% classification accuracy was obtained with the LSTM algorithm, 94% with the CNN algorithm and 93% with the GRU algorithm. In this study, the Permutation Feature Importance (PFI) method was also used to determine the effect of features in the data set on classification performance. With this method, the study reveals that the HbA1c feature is an important parameter in the used deep learning methods. Both the results obtained with the LSTM algorithm and the determination of the most important feature affecting the classification success reveal the originality of the study. It shows that the obtained results will provide healthcare professionals with a prognostic tool for effective decision-making that can assist in the early detection of the disease.
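
The PFI idea in the citation statement above can be sketched in a few lines. PFI is usually implemented by shuffling the values of one feature at a time and measuring how much the error grows. This is a minimal illustration, not the code of either cited paper: the toy model, the data, and the names `permutation_importance`, `toy_predict`, and `n_repeats` are hypothetical. MAE is used as the error measure, as the statement attributes to Rengasamy, Rothwell, and Figueredo.

```python
import random

def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average increase in MAE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mae(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted_error = mae(y, [predict(row) for row in X_perm])
            increases.append(permuted_error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical model that depends on feature 0 only.
def toy_predict(row):
    return 2.0 * row[0]

X = [[float(i), float((i * 7) % 5)] for i in range(20)]
y = [2.0 * row[0] for row in X]

importances = permutation_importance(toy_predict, X, y)
```

Here shuffling feature 1 leaves the error unchanged, so its importance is zero, while shuffling feature 0 increases the MAE. The bigger the increase, the more important the feature.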

“…Considered one of the most important phases in building a machine learning model, variable selection techniques have been highlighted in several works in the literature (Chen et al, 2020; Rengasamy et al, 2021), because selecting the variables that are truly relevant to the model gives a better interpretation of how each of them affects the predictions. Although variables can be identified and selected empirically (through expert knowledge, popularity in the literature and predictive success in previous research), there is an opportunity to further improve the present model with correlated, non-redundant features, which also reduces complexity, eases understanding and helps improve performance on the accuracy, precision and recall metrics.…”

Section: Variable Selection | unclassified

Uso de dados administrativos hospitalares para o desenvolvimento de modelos preditivos de readmissão hospitalar não planejadas de pacientes idosos em um hospital público terciário na cidade de São Paulo, Brasil

Barros
