2021
DOI: 10.21203/rs.3.rs-448572/v1
Preprint

Interpretable Machine Learning for Genomics

Abstract: High-throughput technologies such as next generation sequencing allow biologists to observe cell function with unprecedented resolution, but the resulting datasets are too large and complicated for humans to understand without the aid of advanced statistical methods. Machine learning (ML) algorithms, which are designed to automatically find patterns in data, are well suited to this task. Yet these models are often so complex as to be opaque, leaving researchers with few clues about underlying mechanisms. Inter…

Cited by 11 publications (10 citation statements)
References 116 publications (79 reference statements)
“…Thus, SHAP is considered to be the most reliable metric for tree-based ML methods at the moment (Molnar, 2021). Furthermore, the SHAP values have a very intuitive explanation.…”
Section: Shapley Additive Explanations Values (mentioning)
confidence: 99%
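The statement above concerns SHAP values for tree-based models. Below is a minimal sketch of how such values are typically computed with the `shap` package's TreeExplainer; the synthetic data, the random-forest model, and all parameter values are illustrative assumptions, not taken from the cited works.

```python
# Hedged sketch: SHAP values for a tree-based model via shap.TreeExplainer.
# The data, model, and settings below are assumptions for illustration only.
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                      # 200 samples, 5 features
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer exploits the tree structure to compute Shapley values efficiently.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)             # shape: (n_samples, n_features)

# The intuitive reading: each prediction decomposes additively into per-feature
# contributions around the expected model output.
print(explainer.expected_value)
print(shap_values[0].sum() + explainer.expected_value, model.predict(X[:1])[0])
```

The additivity check in the last line illustrates the "very intuitive explanation" the statement refers to: the per-feature contributions sum to the gap between an individual prediction and the average prediction.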
“…Alternatively, the model-agnostic tools can be applied to the outputs of any AI algorithm, including very complex and opaque models, in order to provide interpretation on decision drivers for those models. According to Molnar [2021], this set of tools is post-hoc, i.e. they are applied after the model has been trained and they do not require access to its estimates or code, rather it only requires the ability to test the model.…”
Section: Model-specific Vs Model-agnostic (mentioning)
confidence: 99%
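To illustrate the post-hoc, model-agnostic idea in the statement above, the sketch below applies LIME (Ribeiro et al, 2016) to a black-box model through nothing but its predict function. The data, the gradient-boosting model, and the parameter choices are assumptions for illustration, not drawn from the cited texts.

```python
# Hedged sketch: a model-agnostic, post-hoc explanation with LIME.
# Only the ability to query model.predict is needed; no access to internals.
import numpy as np
from lime.lime_tabular import LimeTabularExplainer
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = X[:, 0] ** 2 + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=300)

black_box = GradientBoostingRegressor(random_state=0).fit(X, y)  # any opaque model

explainer = LimeTabularExplainer(
    X,
    feature_names=["f0", "f1", "f2", "f3"],
    mode="regression",
)

# Explain a single prediction by fitting a local surrogate around it;
# the explainer only ever calls the model's predict function.
explanation = explainer.explain_instance(X[0], black_box.predict, num_features=4)
print(explanation.as_list())
```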
“…Algorithm 1: Permutation Feature Importance algorithm [Molnar, 2021], [Fisher et al, 2019]. Initialization: trained model f, feature matrix X, target vector y, error measure L(y, f); estimate the original model error = L(y, f(X)); for each feature j = 0, 1, …”
Section: Feature Importance (mentioning)
confidence: 99%
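A from-scratch reading of the permutation feature importance procedure quoted above might look like the sketch below. Because the quotation is truncated, the loss function (mean squared error), the ratio definition of importance, and the toy data are assumptions rather than details from the cited algorithm.

```python
# Hedged sketch of permutation feature importance [Fisher et al, 2019; Molnar, 2021].
# The MSE loss and the error-ratio definition of importance are assumptions here.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def permutation_importance(model, X, y, loss, n_repeats=5, seed=0):
    rng = np.random.default_rng(seed)
    e_orig = loss(y, model.predict(X))                      # original model error
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):                             # for each feature j
        errors = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])    # break feature-target link
            errors.append(loss(y, model.predict(X_perm)))   # error after permutation
        importances[j] = np.mean(errors) / e_orig           # ratio > 1 => feature matters
    return importances

mse = lambda y_true, y_pred: np.mean((y_true - y_pred) ** 2)

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3))
y = 2 * X[:, 0] + rng.normal(scale=0.1, size=500)           # only feature 0 is informative
model = RandomForestRegressor(random_state=0).fit(X, y)
print(permutation_importance(model, X, y, mse))
```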
“…The lack of correlation may suggest that some models make the right prediction for the wrong reason (Kirchner, 2006). However, to fully understand the basis of the predictions, formal machine learning explanation methods for interpreting the model structures are required (Molnar, 2020). These methods include the SHAP method (Lundberg et al, 2017), the LIME method (Ribeiro et al, 2016), or the integrated gradients method (Sundararajan et al, 2017).…”
Section: Correlations Between Prediction Accuracy and Consistency Of ... (mentioning)
confidence: 99%
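Of the three explanation methods named in the statement above, integrated gradients (Sundararajan et al, 2017) is the one not sketched earlier. Below is a minimal, hedged sketch for a hand-written logistic model whose input gradient has a closed form; the weights, baseline, and step count are illustrative assumptions, not values from the cited papers.

```python
# Hedged sketch of integrated gradients for a simple differentiable model
# (logistic regression), where the gradient w.r.t. the input is analytic.
import numpy as np

w = np.array([1.5, -2.0, 0.5])   # assumed "trained" weights, for illustration only
b = 0.1

def predict(x):
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def input_gradient(x):
    p = predict(x)
    return p * (1 - p) * w        # d predict / d x for logistic regression

def integrated_gradients(x, baseline, steps=50):
    # Average the input gradient along the straight path from baseline to x,
    # then scale by (x - baseline); attributions approximately sum to
    # predict(x) - predict(baseline) (the completeness property).
    alphas = np.linspace(0.0, 1.0, steps)
    grads = np.array([input_gradient(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

x = np.array([0.8, 0.3, -1.2])
baseline = np.zeros_like(x)
attributions = integrated_gradients(x, baseline)
print(attributions, attributions.sum(), predict(x) - predict(baseline))
```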