
Perspective
https://doi.org/10.1038/s42256-019-0048-x




Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
Cynthia Rudin
Duke University, Durham, NC, USA

Black box machine learning models are currently being used for high-stakes decision making throughout society, causing problems in healthcare, criminal justice and other domains. Some people hope that creating methods for explaining these black box models will alleviate some of the problems, but trying to explain black box models, rather than creating models that are interpretable in the first place, is likely to perpetuate bad practice and can potentially cause great harm to society. The way forward is to design models that are inherently interpretable. This Perspective clarifies the chasm between explaining black boxes and using inherently interpretable models, outlines several key reasons why explainable black boxes should be avoided in high-stakes decisions, identifies challenges to interpretable machine learning, and provides several example applications where interpretable models could potentially replace black box models in criminal justice, healthcare and computer vision.




There has been an increasing trend in healthcare and criminal justice to leverage machine learning (ML) for high-stakes prediction applications that deeply impact human lives. Many of the ML models are black boxes that do not explain their predictions in a way that humans can understand. The lack of transparency and accountability of predictive models can have (and has already had) severe consequences; there have been cases of people incorrectly denied parole1, poor bail decisions leading to the release of dangerous criminals, ML-based pollution models stating that highly polluted air was safe to breathe2 and generally poor use of limited valuable resources in criminal justice, medicine, energy reliability, finance and in other domains3.

Rather than trying to create models that are inherently interpretable, there has been a recent explosion of work on 'explainable ML', where a second (post hoc) model is created to explain the first black box model. This is problematic. Explanations are often not reliable, and can be misleading, as we discuss below. If we instead use models that are inherently interpretable, they provide their own explanations, which are faithful to what the model actually computes.
In what follows, we discuss the problems with explainable ML, followed by the challenges in interpretable ML. This document is mainly relevant to high-stakes decision making and troubleshooting models, which are the main two reasons one might require an interpretable or explainable model. Interpretability is a domain-specific notion4–7, so there cannot be an all-purpose definition. Usually, however, an interpretable machine learning model is constrained in model form so that it is either useful to someone, or obeys structural knowledge of the domain, such as monotonicity (for example, ref. 8), causality, structural (generative) constraints, additivity9 or physical constraints that come from domain knowledge. Interpretable models could use case-based reasoning for complex domains. Often for structured data, sparsity is a useful measure of interpretability, because humans can handle at most 7 ± 2 cognitive entities at once10,11. Sparse models allow a view of how variables interact jointly rather than individually. We will discuss several forms of interpretable machine learning models for different applications, but there can never be a single definition; for example, in some domains sparsity is useful, and in others it is not. There is a spectrum between fully transparent models (where we understand how all the variables are jointly related to each other) and models that are lightly constrained in model form (such as models that are forced to increase as one of the variables increases, or models that, all else being equal, prefer variables that domain experts have identified as important; see ref. 12).
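As a concrete illustration of these constrained model forms (a minimal sketch of my own, not an example from the Perspective, with dataset and feature choices assumed purely for demonstration), scikit-learn can express both sparsity and monotonicity directly:

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import HistGradientBoostingClassifier

X, y = load_breast_cancer(return_X_y=True)  # stand-in structured data

# Sparsity via an L1 penalty: most coefficients are driven to exactly
# zero, leaving a handful of variables a human can actually inspect.
sparse_model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
sparse_model.fit(X, y)
print("non-zero coefficients:", int(np.sum(sparse_model.coef_ != 0)), "of", X.shape[1])

# Monotonicity as a structural constraint: force the model to be
# non-decreasing in feature 0 (an assumed domain-knowledge constraint).
cst = np.zeros(X.shape[1], dtype=int)
cst[0] = 1  # +1 increasing, -1 decreasing, 0 unconstrained
mono_model = HistGradientBoostingClassifier(monotonic_cst=cst).fit(X, y)

Both constraints restrict the model form itself, as described above, rather than post-processing a finished black box.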
A preliminary version of this manuscript appeared at a workshop, entitled 'Please stop explaining black box machine learning models for high stakes decisions'13.


Key issues with explainable ML

A black box model could be either (1) a function that is too complicated for any human to comprehend or (2) a function that is proprietary (Supplementary Section A). Deep learning models, for instance, tend to be black boxes of the first kind because they are highly recursive. As the term is presently used in its most common form, an explanation is a separate model that is supposed to replicate most of the behaviour of a black box (for example, 'the black box says that people who have been delinquent on current credit are more likely to default on a new loan'). Note that the term 'explanation' here refers to an understanding of how a model works, as opposed to an explanation of how the world works. The terminology 'explanation' will be discussed later; it is misleading. I am concerned that the field of interpretability/explainability/comprehensibility/transparency in ML has strayed away from the needs of real problems. This field dates back to the early 1990s at least (see refs. 4,14), and there are a huge number of papers on interpretable ML in various fields (that often do not have the word 'interpretable' or 'explainable' in the title, as recent papers do). Recent work on the explainability of black boxes—rather than the interpretability of models—contains and perpetuates critical misconceptions that have generally gone unnoticed, but that can have a lasting negative impact on the widespread use of ML models in society. Let us spend some time discussing this before discussing possible solutions.
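The sense in which an 'explanation' is a second model can be made concrete with a short sketch (my own illustration, not from the paper; the dataset and model choices are assumptions). A random forest stands in for the black box, and a shallow decision tree is fitted to the forest's predictions, not the ground-truth labels, to act as the post hoc surrogate:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)  # illustrative data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The "black box": accurate, but no human can trace its logic.
black_box = RandomForestClassifier(n_estimators=500, random_state=0)
black_box.fit(X_train, y_train)

# The post hoc "explanation": a separate, simpler model trained to
# mimic the black box's outputs rather than the true outcomes.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_train, black_box.predict(X_train))

The surrogate tree is a claim about the black box's behaviour, not about how the world works, which is exactly the distinction drawn above.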
It is a myth that there is necessarily a trade-off between accuracy and interpretability. There is a widespread belief that more complex models are more accurate, meaning that a complicated black box is necessary for top predictive performance. However, this is often not true, particularly when the data are structured, with a good representation in terms of naturally meaningful features. When considering problems that have structured data with meaningful features, there is often no significant difference in performance between more complex classifiers (deep neural networks, boosted decision trees, random forests) and much simpler classifiers (logistic regression, decision lists) after preprocessing. (Supplementary Section B discusses this further.) In data science problems, where structured data with meaningful features are constructed as part of the data science process, there tends to be little difference between algorithms, assuming that the data scientist follows a standard process for knowledge discovery (such as KDD, CRISP-DM or BigData; see refs. 15–17). Even for applications such as computer vision, where deep learning has major performance gains, and where interpretability is much more difficult to define, some forms of interpretability can be imbued directly into the models without losing accuracy. This will be discussed more later in the section 'Algorithmic challenges in interpretable ML'. Uninterpretable algorithms can still be useful in high-stakes decisions as part of the knowledge discovery process, for instance to obtain baseline levels of performance, but they are not generally the final goal of knowledge discovery.
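This claim is easy to probe on any structured dataset (the sketch below is my own illustration, not an experiment from the paper, and a single dataset proves nothing by itself): cross-validate a simple classifier against a complex one and compare.

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)  # structured, meaningful features

models = {
    "logistic regression (simple)": make_pipeline(StandardScaler(),
                                                  LogisticRegression(max_iter=1000)),
    "random forest (complex)": RandomForestClassifier(n_estimators=500,
                                                      random_state=0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10)  # 10-fold accuracy
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")

On data of this kind the two accuracies typically land within a small margin of each other, consistent with the argument above.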
[Fig. 1 | A fictional depiction of the accuracy–interpretability trade-off; the vertical axis is learning performance and the horizontal axis is effectiveness of explanations, neither quantified. Adapted from ref. 18, DARPA.]

Figure 1, taken from the DARPA Explainable Artificial Intelligence program's Broad Agency Announcement18, exemplifies a blind belief in the myth of the accuracy–interpretability trade-off. This is not a 'real' figure, in that it was not generated by any data. The axes have no quantification (there is no specific meaning to the horizontal or vertical axes). The image appears to illustrate an experiment with a static data set, where several ML algorithms are applied to the same data set. However, this kind of smooth accuracy/interpretability/explainability trade-off is atypical in data science applications with meaningful features. Even if one were to quantify the interpretability/explainability axis and aim to show that such a trade-off did exist, it is not clear what algorithms would be applied to produce this figure. (Would one actually claim it is fair to compare the 1984 decision tree algorithm CART to a 2018 deep learning model and conclude that interpretable models are not as accurate?) One can always create an artificial trade-off between accuracy and interpretability/explainability by removing parts of a more complex model to reduce accuracy, but this is not representative of the analysis one would perform on a real problem. It is also not clear why the comparison should be performed on a static data set, because any formal process for defining knowledge from data15–17 would require an iterative process, where one refines the data processing after interpreting the results. Generally, in the practice of data science, the small difference in performance between ML algorithms can be overwhelmed by the ability to interpret results and process the data better at the next iteration19. In those cases, the accuracy/interpretability trade-off is reversed—more interpretability leads to better overall accuracy, not worse.

Efforts working within a knowledge discovery process led me to work in interpretable ML20. Specifically, I participated in a large-scale effort to predict electrical grid failures across New York City. The data were messy, including free text documents (trouble tickets), accounting data about electrical cables from as far back as the 1890s and inspections data from a brand new manhole inspections programme; even the structured data were not easily integrated into a database, and there were confounding issues and other problems. Algorithms on a static data set were at most 1% different in performance, but the ability to interpret and reprocess the data led to significant improvements in performance, including correcting problems with the data set and revealing false assumptions about the data generation process. The most accurate predictors we found were sparse models with meaningful features that were constructed through the iterative process.
The belief that there is always a trade-off between accuracy and interpretability has led many researchers to forgo the attempt to produce an interpretable model. This problem is compounded by the fact that researchers are now trained in deep learning, but not in interpretable ML. Worse, toolkits of ML algorithms offer little in the way of useful interfaces for interpretable ML methods.

To our knowledge, all recent review and commentary articles on this topic imply (implicitly or explicitly) that the trade-off between interpretability and accuracy generally occurs. It could be possible that there are application domains where a complete black box is required for a high stakes decision. As of yet, I have not encountered such an application, despite having worked on numerous applications in healthcare and criminal justice (for example, ref. 21), energy reliability (for example, ref. 20) and financial risk assessment (for example, ref. 22).
Explainable ML methods provide explanations that are not faithful to what the original model computes. Explanations must be wrong. They cannot have perfect fidelity with respect to the original model. If the explanation was completely faithful to what the original model computes, the explanation would equal the original model, and one would not need the original model in the first place, only the explanation. (In other words, this is a case where the original model would be interpretable.) This leads to the danger that any explanation method for a black box model can be an inaccurate representation of the original model in parts of the feature space. (See also ref. 23, among others.)

An inaccurate (low-fidelity) explanation model limits trust in the explanation, and by extension, trust in the black box that it is trying to explain. An explainable model that has a 90% agreement with the original model indeed explains the original model most of the time. However, an explanation model that is correct 90% of the time is wrong 10% of the time. If a tenth of the explanations are incorrect, one cannot trust the explanations, and thus one cannot trust the original black box. If we cannot know for certain whether our explanation is correct, we cannot know whether to trust either the explanation or the original model.
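Fidelity, the agreement rate behind this 90%/10% argument, can be measured directly whenever both models can be queried. The sketch below (my own illustration, repeating the toy surrogate set-up from earlier so that it runs on its own) counts how often the 'explanation' misrepresents the model it claims to explain:

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)  # illustrative data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

black_box = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_tr, black_box.predict(X_tr))  # mimic the model, not the labels

# Fidelity: agreement with the black box on held-out data. A fidelity of
# 0.9 would mean one in ten explanations is simply wrong about the model.
bb, sur = black_box.predict(X_te), surrogate.predict(X_te)
fidelity = float(np.mean(bb == sur))
print(f"fidelity: {fidelity:.1%} ({int(np.sum(bb != sur))} of {len(bb)} cases misexplained)")

Note that fidelity is computed against the black box's predictions, not the true outcomes: a surrogate can be a poor classifier yet a faithful explanation, or vice versa.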
A more important misconception about explanations stems from the terminology 'explanation', which is often used in a misleading way, because explanation models do not always attempt to mimic the calculations made by the original model. Even an explanation model that predicts almost identically to a black box model might use completely different features, and is thus not faithful to the computation of the black box. Consider a black box model for criminal recidivism prediction, where the goal is to predict whether someone will be arrested within a certain time after being released from

