Tuesday, Aug 6: 10:30 AM - 12:20 PM
1585
Topic-Contributed Paper Session
Oregon Convention Center
Room: CC-G130
Applied: Yes
Main Sponsor
Section on Statistical Learning and Data Science
Co Sponsors
Section on Statistical Graphics
Section on Statistics in Defense and National Security
Presentations
This two-part talk will cover two different meanings of "360" as it relates to explainable AI.
The first part will present AI Explainability 360, an open-source software toolkit that we created featuring ten diverse explanation methods and two evaluation metrics. This diversity was aimed at addressing the needs of multiple stakeholders touched by AI and machine learning algorithms, whether they be affected citizens, domain experts, system developers, or government regulators. The impact of the toolkit will be discussed through several case studies, usage statistics, and community feedback, highlighted by its adoption by the independent LF AI & Data Foundation.
The second part of the talk will offer a selective survey of our recent research directions in explainable AI. In particular, we will discuss work on making perturbation-based explanation methods (such as LIME and SHAP) more reliable and efficient. These ideas may be of interest to statisticians looking to contribute to the area.
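To make the perturbation idea concrete, here is a minimal LIME-style sketch built from scratch with scikit-learn: a black-box model is queried on random perturbations of one instance, and a weighted linear surrogate fit to those queries yields local feature attributions. The model, dataset, kernel width, and sample count below are illustrative assumptions, not the speakers' methods.

```python
# Minimal sketch of a LIME-style perturbation-based local explanation.
# The black-box model, data, kernel width, and sample count are illustrative
# assumptions, not the improved methods discussed in the talk.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import Ridge

X, y = load_breast_cancer(return_X_y=True)
black_box = GradientBoostingClassifier().fit(X, y)

def explain_locally(x0, n_samples=2000, kernel_width=1.0, seed=0):
    """Fit a weighted linear surrogate to the black box around x0."""
    rng = np.random.default_rng(seed)
    scale = X.std(axis=0)
    # 1. Perturb the instance with Gaussian noise on each feature.
    Z = x0 + rng.normal(scale=scale, size=(n_samples, X.shape[1]))
    # 2. Query the black box for its predicted probabilities.
    f_Z = black_box.predict_proba(Z)[:, 1]
    # 3. Weight perturbations by proximity to x0 (exponential kernel).
    d = np.linalg.norm((Z - x0) / scale, axis=1)
    w = np.exp(-(d ** 2) / kernel_width ** 2)
    # 4. The surrogate's coefficients are the local feature attributions.
    surrogate = Ridge(alpha=1.0).fit(Z, f_Z, sample_weight=w)
    return surrogate.coef_

attributions = explain_locally(X[0])
print(np.argsort(np.abs(attributions))[::-1][:5])  # top-5 features locally
```

The reliability and efficiency questions raised in the talk arise from exactly these design choices: how many perturbations to draw, how to weight them, and how stable the resulting coefficients are across random seeds.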
Explainable artificial intelligence (XAI) should not be limited to helping an end user determine whether a machine learning model is reliable; it can be much more powerful. Creative uses of XAI show promise for knowledge discovery, and especially for extracting insights from complex scientific data that lie beyond the current capabilities of domain scientists, while working in close collaboration with them. In this talk, we will explore non-standard uses of XAI, including recent results from a chemistry application, methodological developments, and future potential.
Random forests remain among the most popular off-the-shelf supervised machine learning tools, with a well-established track record of predictive accuracy in both regression and classification settings. Despite this empirical success, a full and satisfying explanation for why they work so well has yet to be put forth. In this talk, we will show that the additional randomness injected into individual trees serves as a form of implicit regularization, making random forests an ideal model in low signal-to-noise ratio (SNR) settings. From a model-complexity perspective, this means that the mtry parameter in random forests serves much the same purpose as the shrinkage penalty in explicit regularization procedures like the lasso. Realizing this, we demonstrate that alternative forms of randomness can provide similarly beneficial stabilization. In particular, we show that augmenting the feature space with additional features consisting of only random noise can substantially improve the predictive accuracy of the model. This surprising fact has been largely overlooked within the statistics community, but has crucial implications for thinking about how best to define and measure variable importance.
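As a rough illustration of the kind of experiment described above, the sketch below compares a random forest fit on the original features against one fit on the same features augmented with pure-noise columns, in a low-SNR regression. The data-generating process, SNR, and forest settings are assumptions made for illustration, and whether the augmented forest wins depends on those choices; scikit-learn's max_features plays the role of mtry.

```python
# Sketch of the low-SNR experiment: does appending pure-noise features
# improve a random forest's test error? Data-generating process, SNR, and
# forest settings below are illustrative assumptions only.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, p, p_noise = 500, 10, 40
X = rng.normal(size=(n, p))
signal = X @ rng.normal(size=p)
# Low SNR: noise standard deviation several times the signal's.
y = signal + rng.normal(scale=3 * signal.std(), size=n)

X_aug = np.hstack([X, rng.normal(size=(n, p_noise))])  # append noise features

for name, features in [("original", X), ("augmented", X_aug)]:
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, y, test_size=0.5, random_state=0
    )
    # max_features is the mtry analogue; left at its default here.
    rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X_tr, y_tr)
    print(name, mean_squared_error(y_te, rf.predict(X_te)))
```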
Little analysis has been performed to determine whether machine learning (ML) explanations accurately represent the target model and should be trusted beyond subjective inspection. Many state-of-the-art ML explainability (MLE) techniques only provide a list of important features based on heuristic measures, or make assumptions about the data and the model that do not hold in the real world. Further, most are designed without considering their usefulness to an end user in a broader context. To address these issues, we present a notion of explanation fidelity based on Shapley values from cooperative game theory and find that many MLE techniques produce explanations that are incongruent with the ML model being explained. We also find that in deployed scenarios, explanations are rarely used. In the cases where explanations are used, there is a danger that they persuade end users to wrongly accept false positives and false negatives.
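For readers less familiar with the game-theoretic machinery, the sketch below computes exact Shapley values for a single prediction by enumerating all feature coalitions; exact values like these can serve as a reference point when judging heuristic explanations, although the fidelity measure in the talk is the speakers' own construction. The value function here (features outside a coalition are replaced by their training means), as well as the model and dataset, are simplifying assumptions.

```python
# Sketch: exact Shapley values for one prediction via coalition enumeration.
# The value function (mean-imputing features outside the coalition) is a
# simplifying assumption, not the fidelity measure presented in the talk.
from itertools import combinations
from math import factorial
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor

X, y = load_diabetes(return_X_y=True)
X = X[:, :6]                      # few features, so 2^p coalitions stay cheap
model = GradientBoostingRegressor().fit(X, y)
baseline = X.mean(axis=0)

def value(coalition, x0):
    """Model output with features outside the coalition set to their means."""
    z = baseline.copy()
    z[list(coalition)] = x0[list(coalition)]
    return model.predict(z.reshape(1, -1))[0]

def shapley_values(x0):
    p = len(x0)
    phi = np.zeros(p)
    others = set(range(p))
    for i in range(p):
        for k in range(p):
            for S in combinations(others - {i}, k):
                weight = factorial(len(S)) * factorial(p - len(S) - 1) / factorial(p)
                phi[i] += weight * (value(S + (i,), x0) - value(S, x0))
    return phi

phi = shapley_values(X[0])
# Efficiency property: attributions sum to prediction minus baseline value.
print(phi.sum(), model.predict(X[:1])[0] - value((), X[0]))
```

The efficiency check at the end is one concrete sense in which Shapley-based attributions are congruent with the model being explained, which is the kind of property a fidelity notion can test heuristic explainers against.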
Scientific computational models are designed and implemented to represent known physical relationships of engineered systems. In situations where mechanistic equations are unknown or the computational burden is too great, machine learning (ML) techniques are increasingly employed in lieu of, to complement, or as surrogates for classic computational models, both to uncover these relationships and to handle computational challenges. We refer to this fusion of traditional mathematical models with machine learning models as scientific machine learning (SciML). When SciML is used in high-consequence applications, the ability to interpret the model is essential for assessment and understanding. However, many ML models are not inherently interpretable. Explainability techniques are intended to provide insight into "black box" ML models, but as with the models themselves, it is imperative that explanations used in high-consequence applications be accurate and meaningful. For this reason, we propose that ML explanations used to aid SciML deployed in support of high-consequence decisions be assessed via a framework of maturity level requirements. We draw inspiration from the Predictive Capability Maturity Model (PCMM) currently used at Sandia National Labs to assess the credibility of scientific computational models. The PCMM was specifically developed to assess scientific computational models for high-consequence applications and thus provides a solid framework to build on for SciML maturity levels. In this talk, we will review the PCMM and discuss our work towards developing maturity level requirements for ML explanations. While our efforts are focused on SciML, we believe that such evaluation requirements are relevant more broadly for ML, and we hope to provoke further discussion on the assessment of ML explainability.