Given a classifier $b$ that outputs the decision $y = b(x)$ for an instance $x$, a counterfactual explanation consists of an instance $x'$ such that the decision for $b$ on $x'$ is different from $y$ (i.e. $b(x') \neq y$) and such that the difference between $x$ and $x'$ is minimal. ^[Guidotti - Counterfactual explanations and how to find them - literature review and benchmarking]
Note
Minimality can have different meanings according to the setting.
A counterfactual explainer, on the other hand, is a function $f_k$ that takes as input a classifier $b$, a set $X$ of known instances, which can be seen as the dataset, and a given instance $x$, and returns a set $C = f_k(x, b, X)$ of valid counterfactual examples, where $k$ is the number of counterfactuals required. ^[Guidotti - Counterfactual explanations and how to find them - literature review and benchmarking]
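The explainer interface above can be sketched as follows. This is a minimal illustration, not an actual implementation from the literature: the function name `explain` and the use of L2 distance as the notion of minimality are assumptions for the example.

```python
import numpy as np

def explain(b, X, x, k=1):
    """Illustrative counterfactual explainer interface (hypothetical name):
    given a classifier b, known instances X, and an instance x, return up to
    k valid counterfactuals, i.e. instances whose predicted class differs
    from b(x), ranked here by L2 distance to x (one notion of minimality)."""
    y = b(x)
    # Keep only instances the classifier labels differently from x.
    candidates = [c for c in X if b(c) != y]
    # Rank valid candidates by distance to x and return the k closest.
    candidates.sort(key=lambda c: np.linalg.norm(np.asarray(c) - np.asarray(x)))
    return candidates[:k]
```

Note that this sketch only selects counterfactuals from the known instances; many explainers instead generate new instances.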
Counterfactual explanations can be exploited to interpret the decisions returned by AI systems employed in various settings, such as classification, regression, knowledge engineering, planning and recommendation.
For a counterfactual to be good, it should satisfy a set of properties, such as plausibility, similarity, and diversity.
Properties of counterfactual explainers
- Efficiency
- Stability
- Fairness
Categorization of counterfactual explainers
#todo Insert this into Case-based vs Instance-based Counterfactual Generation.
Depending on the technique used to generate the counterfactuals, it is possible to categorize explainers into: #todo
- Optimization-based
- Heuristic Search Strategy-based
- Instance-based
- Decision Tree-based
Counterfactuals can be distinguished between:
- Model Agnostic, if the explainer can take in input any black-box model;
- Model Specific, if the explainer is only able to explain a specific black-box model.
and also:
- Data Agnostic, if the explainer can ingest any type of data;
- Data Specific, if the explainer can only ingest some specific types of data, such as images, text or tables.
Explainers are also divided into:
- Endogenous explainers, which return examples that are selected from the given dataset, or that use feature values from the given dataset;
- Exogenous explainers, which do not guarantee the presence of the generated feature values in the dataset, but they rely on instances which are obtained with interpolation of data and/or random data generation.
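The endogenous/exogenous distinction can be sketched as follows; both helpers are hypothetical illustrations, with the exogenous one using linear interpolation between the query and a differently-labeled dataset instance (one of the generation strategies mentioned above).

```python
import numpy as np

def endogenous_cf(b, X, x):
    """Endogenous: return the nearest dataset instance whose label differs
    from b(x); the result is guaranteed to come from X."""
    y = b(x)
    valid = [c for c in X if b(c) != y]
    return min(valid, key=lambda c: np.linalg.norm(np.asarray(c) - np.asarray(x)))

def exogenous_cf(b, X, x, steps=20):
    """Exogenous: interpolate between x and a differently-labeled dataset
    instance, returning the interpolated point closest to x that still flips
    the decision. The result need not appear in X."""
    target = np.asarray(endogenous_cf(b, X, x), dtype=float)
    x = np.asarray(x, dtype=float)
    y = b(x)
    for t in np.linspace(0.0, 1.0, steps + 1):
        cand = (1 - t) * x + t * target  # walk from x toward the target
        if b(cand) != y:
            return cand
    return target
```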
Techniques
Brute Force
Note for myself
In Thesis Notes it’s possible to see the brute force approach for sequential counterfactual examples, while in page=17 it’s possible to see the same approach but for non sequential counterfactuals. See if you need to insert them here to have a comparison or something.
RCE - Completely Pure Random
This technique is an alternative to brute force: it randomly varies randomly selected features and returns the counterfactual if it is valid. The approach has no guarantee of optimality.
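A minimal sketch of this random strategy, in the spirit of RCE (the exact perturbation scheme is an assumption here): perturb a random subset of features with Gaussian noise and stop at the first instance that flips the decision.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_cf(b, x, n_trials=1000, scale=1.0):
    """Random counterfactual search sketch: repeatedly perturb a randomly
    selected subset of features with random noise and return the first
    perturbed instance that flips the decision. No optimality (minimality)
    guarantee; may return None if no valid counterfactual is found."""
    x = np.asarray(x, dtype=float)
    y = b(x)
    for _ in range(n_trials):
        cand = x.copy()
        # Pick a random non-empty subset of features to vary.
        k = rng.integers(1, len(x) + 1)
        idx = rng.choice(len(x), size=k, replace=False)
        cand[idx] += rng.normal(0.0, scale, size=k)
        if b(cand) != y:  # valid counterfactual found
            return cand
    return None
```

Because the search is undirected, the returned counterfactual may be far from $x$; this is exactly the lack-of-optimality issue noted above.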