Research

Post-hoc explanations

Many machine learning models are so-called black-box models that are not transparent to the user. We try to provide explanations for already trained black-box models. My work focuses on Counterfactual Explanations.

Inherently interpretable models

Another way to approach interpretability in ML is to create models that are themselves transparent and interpretable. My work focuses on generating rule sets that are both interpretable and accurate.

What does interpretability mean?

I am looking into what interpretability means from the human perspective. What type of explanations do we want? My focus here is on the healthcare domain.

Perspectives

It is important to keep the conversation going about interpretability. Commentaries and perspective pieces stimulate the exchange of ideas and provide food for thought for future projects.

Below, I list the projects I have been working on, including manuscripts and code implementations. For questions or thoughts, please do not hesitate to get in touch.

COUNTERFACTUAL EXPLANATIONS

We can provide insights into the outcomes of machine learning models in the form of Counterfactual Explanations (CEs). With such an explanation, we explain the outcome of a model by describing an alternative scenario, similar to the original one, in which the model outcome would be different. Essentially, the question we answer is: why outcome A and not B? This type of explanation has strong support from psychology and philosophy because it is similar to how we give explanations in real life. In our work, we propose a framework to generate CEs that are easy to understand and respect certain criteria so that they also have practical value.
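To make the idea concrete, here is a minimal sketch of a naive counterfactual search. This is not the framework from the papers below; it simply trains a black-box classifier on synthetic tabular data and returns the closest randomly sampled point that receives a different prediction. All function and parameter names are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy setup: a black-box classifier trained on synthetic tabular data.
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

def naive_counterfactual(model, x, n_samples=2000, scale=1.0, seed=0):
    """Return the closest randomly sampled point with a different prediction.

    Only an illustrative baseline: real CE methods additionally enforce
    sparsity, plausibility, and actionability constraints.
    """
    rng = np.random.default_rng(seed)
    original_class = model.predict(x.reshape(1, -1))[0]
    candidates = x + rng.normal(scale=scale, size=(n_samples, x.shape[0]))
    flipped = candidates[model.predict(candidates) != original_class]
    if len(flipped) == 0:
        return None
    return flipped[np.argmin(np.linalg.norm(flipped - x, axis=1))]

x = X[0]
cf = naive_counterfactual(model, x)
print("original prediction      :", model.predict(x.reshape(1, -1))[0])
if cf is not None:
    print("counterfactual prediction:", model.predict(cf.reshape(1, -1))[0])
    print("feature changes          :", np.round(cf - x, 2))
```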

Counterfactual Explanations Using Optimization with Constraint Learning: In this work we propose a flexible framework to generate counterfactual explanations. Our framework is based on optimization with constraint learning, and we compare our results to other approaches from the literature.

Finding Regions of Counterfactual Explanations via Robust Optimization: In this work we propose a method that can find robust CEs, i.e., CEs that remain valid when slightly perturbed. We use algorithmic ideas from robust optimization and demonstrate results on different datasets and classification models.
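The robustness criterion can be illustrated with a simple empirical check. This is not the robust optimization method from the paper, just a stand-in: sample small perturbations around a candidate counterfactual and verify that the desired class is kept for all of them. Names and thresholds are hypothetical.

```python
import numpy as np

def is_empirically_robust(model, x_cf, target_class, radius=0.05, n_samples=500, seed=0):
    """Check whether a counterfactual keeps its class under small perturbations.

    A crude empirical stand-in for robustness: the CE counts as robust here if
    every sampled point in an L-infinity ball of the given radius around it is
    still classified as `target_class`.
    """
    rng = np.random.default_rng(seed)
    noise = rng.uniform(-radius, radius, size=(n_samples, x_cf.shape[0]))
    preds = model.predict(x_cf + noise)
    return bool(np.all(preds == target_class))

# Example with the counterfactual from the sketch above (if one was found):
# is_empirically_robust(model, cf, target_class=model.predict(cf.reshape(1, -1))[0])
```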



RULE GENERATION

In this work we propose an algorithm for rule generation for (multi-class) classification problems. Our approach is based on linear programming, which makes it scalable enough to cope with large datasets and flexible enough to add constraints that address interpretability and fairness. The rules we generate come with weights that can be interpreted as rule importance.
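As a stylised sketch of how rule weights can come out of a linear program (the exact formulation in our paper differs), the example below takes a handful of hand-written candidate rules, records which samples each rule covers correctly or incorrectly, and solves a small LP that assigns non-negative weights with a slack penalty. All rules, names, and the penalty value are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

# Toy data: binary labels and a handful of hand-written candidate rules.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Each candidate rule is (condition on X, class it predicts). Purely illustrative.
rules = [
    (lambda X: X[:, 0] > 0, 1),
    (lambda X: X[:, 1] > 0, 1),
    (lambda X: X[:, 0] <= 0, 0),
    (lambda X: X[:, 2] > 1, 1),   # a mostly irrelevant rule
]

n, R = len(y), len(rules)

# A[i, r] = +1 if rule r covers sample i and predicts its class, -1 if it covers
# the sample but predicts the wrong class, 0 if it does not cover the sample.
A = np.zeros((n, R))
for r, (cond, cls) in enumerate(rules):
    covered = cond(X)
    A[covered, r] = np.where(y[covered] == cls, 1.0, -1.0)

# LP variables: rule weights w (R of them) and per-sample slacks xi (n of them).
#   minimize  sum(w) + C * sum(xi)
#   s.t.      A @ w + xi >= 1   (each sample is covered by correct rules, up to slack)
#             w >= 0, xi >= 0
C = 1.0
c = np.concatenate([np.ones(R), C * np.ones(n)])
A_ub = np.hstack([-A, -np.eye(n)])   # rewrite A @ w + xi >= 1 as -(A w + xi) <= -1
b_ub = -np.ones(n)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * (R + n), method="highs")
weights = res.x[:R]
print("rule weights (interpretable as rule importance):", np.round(weights, 3))
```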



INTERPRETABILITY IN HEALTHCARE

Healthcare is a high-stakes domain where important decisions are made every day. Machine learning models have a lot of potential to support clinicians in their decision-making process. The debate is ongoing about what type of explanations we need in healthcare, or even whether we need them at all. Here are two perspective pieces on interpretability and explainability in healthcare.

Coming up: User study on what is perceived as interpretable in healthcare



SEMANTIC MATCH

A very prominent way of explaining black-box models is to use feature attribution methods, where the contribution of each feature to the outcome is calculated. For low-level features, where every feature has a distinct semantic meaning, these contributions are quite intuitive and easy to understand. However, when we are dealing with high-level features, i.e., patterns in groups of features, an array of feature contributions becomes difficult to interpret because we cannot be sure whether what we see in the explanation matches the model's representation. In our work on Semantic Match, we discuss this problem and how it relates to confirmation bias, and we propose a structured approach to evaluate Semantic Match in practice.
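To make the low-level case concrete, here is a minimal attribution example (not the Semantic Match evaluation procedure itself): for a linear model on tabular data, each feature's contribution relative to a baseline is simply its coefficient times its deviation from that baseline, and every number refers to a named feature with a clear meaning.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

# Tabular data: every column has a distinct semantic meaning (age, bmi, ...).
data = load_diabetes()
X, y, names = data.data, data.target, data.feature_names

model = LinearRegression().fit(X, y)

# Attribution for one patient: contribution_j = coef_j * (x_j - baseline_j),
# with the dataset mean as baseline. The contributions, added to the baseline
# prediction, recover the model output, and each one refers to a named feature.
x = X[0]
baseline = X.mean(axis=0)
contributions = model.coef_ * (x - baseline)

for name, contrib in sorted(zip(names, contributions), key=lambda t: -abs(t[1])):
    print(f"{name:>6}: {contrib:+.1f}")
print("baseline prediction:", round(model.predict(baseline.reshape(1, -1))[0], 1))
print("model prediction   :", round(model.predict(x.reshape(1, -1))[0], 1))
```

For high-level features, such as a pattern spread over many pixels, the same kind of array no longer has this one-to-one correspondence with meaningful quantities, which is exactly where the question of semantic match arises.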



Workshops & Talks

PAST