
SHAP values, SHALL values

%load_ext autoreload
%autoreload 2
import pandas as pd
import pyagrum as gum
import pyagrum.lib.notebook as gnb
import pyagrum.explain as expl
template = gum.fastBN("X1->X2->Y;X3->Z->Y;X0->Z;X1->Z;X2->R[5];Z->R;X1->Y")
data_path = "res/shap/Data_6var_direct_indirect.csv"
## gum.generateSample(template,1000,data_path)
learner = gum.BNLearner(data_path, template)
bn = learner.learnParameters(template.dag())
bn
(Bayesian network rendering: X0->Z, X1->Z, X1->X2, X1->Y, X2->Y, X2->R, X3->Z, Z->Y, Z->R)

pyAgrum provides all three types of Shapley values—Conditional, Marginal, and Causal—through the classes ConditionalShapValues, MarginalShapValues, and CausalShapValues.
These classes are specifically designed to explain posterior probabilities resulting from Bayesian network inference.
Explanations are provided for each target class, making the framework well-suited for handling multi-class problems.
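
For instance, the quantity being explained for a given instance is the posterior of the target given the observed features (a minimal sketch; the evidence values below are arbitrary):

## Posterior of Y given some evidence: this is the probability that the Shapley values decompose.
gum.getPosterior(bn, evs={"X1": 1, "X2": 0, "Z": 1}, target="Y")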

train = pd.read_csv(data_path)
## ConditionalShapValues requires a Bayesian Network and a target name (or target ID). The logit parameter is a boolean indicating whether to use the logit function instead of raw probabilities.
conditionalExplainer = expl.ConditionalShapValues(bn, "Y", logit=True)

Once the ConditionalShapValues object is created, the main method to call is compute, which returns an Explanation object containing all the information related to the explanation.

compute(data: tuple[pd.DataFrame, bool] | None, N: int = 100) -> Explanation

The idea is to provide a dataset to be explained, and to indicate whether it contains the labels or the positions of the feature values used in the Bayesian network. Alternatively, you can set data to None and specify an integer value for N; in that case, a dataset is generated from the Bayesian network and an explanation is provided for it.

## The explanation with data set to None
conditionalExplanation = conditionalExplainer.compute(data=None, N=50)
conditionalExplanation = conditionalExplainer.compute(data=(train, False))

To obtain the feature importances, simply write:

## The result is provided for each target value.
conditionalExplanation.importances
{0: {'X2': 0.32716064437520076,
'X1': 0.2533375405370653,
'X0': 0.06176712200000174,
'R': 0.05445633444152396,
'X3': 0.10465402104047902,
'Z': 0.5464180054433384},
1: {'X2': 0.3271606443752008,
'X1': 0.25333754053706536,
'X0': 0.06176712200000175,
'R': 0.054456334441523965,
'X3': 0.10465402104047898,
'Z': 0.5464180054433385}}

You can pass the object to be explained as a dictionary, like this:

instance = {"X2": [0, 1], "X1": [1, 1], "X0": [0, 1], "R": [0, 0], "Y": [1, 1], "X3": [1, 1], "Z": [0, 1]}
globalExpl = conditionalExplainer.compute(data=(instance, False))

If you want to explain a single instance, there’s no need to change anything—just specify which one.

localExpl = conditionalExplainer.compute((train.iloc[0], False))

To obtain the feature contributions, simply write:

localExpl._values
{0: {'X2': -0.20330620848406428,
'X1': -0.09507935172769902,
'X0': 0.033732915370269866,
'R': 0.024202249066987196,
'X3': -0.04129914759887273,
'Z': 0.25154641625928853},
1: {'X2': 0.2033062084840644,
'X1': 0.0950793517276991,
'X0': -0.03373291537026983,
'R': -0.02420224906698724,
'X3': 0.04129914759887279,
'Z': -0.25154641625928875}}

You can also pass a single instance to be explained as a dictionary, like this:

instance = {"X2": 0, "X1": 1, "X0": 0, "R": 1, "Y": 1, "X3": 1, "Z": 1}
localExpl = conditionalExplainer.compute((instance, False))

The only difference between the Conditional Shapley Values syntax and the Marginal one lies in the instantiation of the explainer, as the Marginal approach requires a background dataset to perform computations.

background = train.sample(500)
marginalExplainer = expl.MarginalShapValues(bn=bn, target="Y", background=(background, False), logit=True)

Same idea: you can set background to None and specify an integer value for sample_size. In that case, we will generate the background dataset using the Bayesian network.
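
A minimal sketch of that variant, assuming the background and sample_size keywords described above:

## Sketch: no background provided; the explainer samples one of size 500 from the BN itself.
marginalExplainerSampled = expl.MarginalShapValues(bn=bn, target="Y", background=None, sample_size=500, logit=True)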

All other operations remain the same, since compute is a method of the ShapleyValues class, which is inherited by the three subclasses: ConditionalShapValues, MarginalShapValues, and CausalShapValues.

marginalExplanation = marginalExplainer.compute(data=(train, False))

The Causal Shapley Values syntax is the same as the Marginal one.

background = train.sample(100)
causalExplainer = expl.CausalShapValues(bn=bn, target="Y", background=(background, False), logit=True)
causalExplanation = causalExplainer.compute((train, False))

PyAgrum provides three different plots: the waterfall plot for local explanations, the beeswarm plot for global explanations, and the bar plot, which can be used for both. All plots require an Explanation object as input (i.e., the one returned by the compute method), and for the waterfall and beeswarm plots, you also need to specify the target you want to visualize. For the bar plot, if no target is specified, PyAgrum will generate a multi-bar plot for all targets.

import pyagrum.explain.notebook as explnb
explnb.beeswarm(explanation=conditionalExplanation, y=1)

(figure: beeswarm plot of the conditional Shapley values for target Y=1)

explnb.waterfall(explanation=localExpl, y=1)

(figure: waterfall plot of the local explanation for target Y=1)

## Without specifying a target
explnb.bar(explanation=conditionalExplanation)

(figure: multi-bar plot of the importances for all targets)

## Without specifying a target
explnb.bar(explanation=conditionalExplanation, percentage=True)

(figure: multi-bar plot of the importances for all targets, in percentages)

explnb.bar(explanation=conditionalExplanation, y=1)

(figure: bar plot of the importances for target Y=1)

explnb.bar(explanation=conditionalExplanation, y=1, percentage=True)

(figure: bar plot of the importances for target Y=1, in percentages)

You can also visualize Shapley Values directly on the BN. The showShapValues function returns a coloured graph that makes it easier to understand which variable is important and where it is located in the graph.

explnb.showShapValues(bn, conditionalExplanation)

(figure: Bayesian network coloured by Shapley importances)

If you have an instance with unobserved values, you can still explain it without modifying the Bayesian network.

instance = {"X2": 0, "X1": 1, "X0": 0, "R": 1}
## We did not observe 'Z' and 'X3'
partexpl = conditionalExplainer.compute((instance, False))
explnb.waterfall(partexpl, 1)

(figure: waterfall plot for the partially observed instance, target Y=1)

The new version of the library is backward compatible, so existing code will continue to work without changes.

import pyagrum.lib.explain as explib
/var/folders/r1/pj4vdx_n4_d_xpsb04kzf97r0000gp/T/ipykernel_92859/3875023621.py:1: DeprecationWarning: The module 'pyagrum.lib.explain' has been deprecated since version 2.2.2. Please use the 'pyagrum.explain' module instead.
import pyagrum.lib.explain as explib
explainer = explib.ShapValues(bn, "Y")
res = explainer.conditional(train.head(100), plot=True, plot_importance=True)

(figure: conditional Shapley plots produced by the legacy API)

explainer.marginal(train.head(100), plot=True, plot_importance=True)

(figure: marginal Shapley plots produced by the legacy API)

{'X2': 0.3246581323977566,
'X1': 0.306574111797693,
'X0': 0.0,
'R': 0.0,
'X3': 0.0,
'Z': 0.6415963776533493}

Since the showShapValues function existed in previous versions, it remains compatible with the earlier syntax.

explib.showShapValues(bn, res)

(figure: Bayesian network coloured by Shapley importances, legacy API)


The SHALL (SHapley Additive Log-Likelihood) values are derived from the SHAP values.

While SHAP values explain the contribution of a model’s features to a given prediction (relative to a target), SHALL values are used to explain the contribution to the log-likelihood of a row in a database given a Bayesian Network, relative to the mean log-likelihood of the entire database. Hence, no target variable is specified when using SHALL values. The log-likelihood is computed by applying the logarithm directly to each individual probability during the calculation. By using the logarithm, and with a large number of rows, the mean log-likelihood of the database tends to approximate the negative entropy of the data—provided that the model accurately estimates the true underlying data distribution.
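
In other words, writing $P_{BN}$ for the distribution encoded by the Bayesian network and $P$ for the true data distribution, this is the usual law-of-large-numbers identity:

$$\frac{1}{N}\sum_{i=1}^{N}\log P_{BN}(x_i)\;\xrightarrow[N\to\infty]{}\;\mathbb{E}_{x\sim P}\big[\log P_{BN}(x)\big]\;=\;-H(P)-D_{KL}(P\,\|\,P_{BN}),$$

which reduces to the negative entropy $-H(P)$ exactly when $P_{BN}=P$.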

We use the same model as before:

bn
(Bayesian network rendering: X0->Z, X1->Z, X1->X2, X1->Y, X2->Y, X2->R, X3->Z, Z->Y, Z->R)

pyAgrum provides all three types of Shall values adapted from the Shapley values—Conditional, Marginal, and Causal—through the classes ConditionalShallValues, MarginalShallValues, and CausalShallValues.
For Conditional and Marginal Shall values, pyAgrum computes empirical probabilities that reflect the observed data (“true to the data”). In contrast, for Causal Shall values, empirical estimation isn’t feasible, so the probabilities must be derived from the model itself (“true to the model”).

train = pd.read_csv(data_path)
background = train.sample(500)

ConditionalShallValues requires a Bayesian Network and a background dataset for the computations. The background parameter is provided as a tuple containing the DataFrame and a boolean indicating whether the DataFrame contains labels or positional values. The log parameter is a boolean that determines whether to apply the log function instead of using raw probabilities.
Note that all rows in the background data containing NaN values in columns corresponding to variables in the Bayesian Network will be dropped.
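
If you prefer to handle this cleaning step yourself, here is a minimal pandas sketch (the column list is built from the network's variable names, assuming they all appear in the DataFrame):

## Sketch: drop background rows with missing values in the columns used by the BN.
bn_columns = [name for name in bn.names() if name in background.columns]
background = background.dropna(subset=bn_columns)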

conditionalExplainer = expl.ConditionalShallValues(bn=bn, background=(background, False), log=True)

Once the ConditionalShallValues object is created, the main method to call is compute(), which returns an Explanation object containing all the information related to the explanation.

compute(data: tuple[pd.DataFrame, bool] | None, N: int = 100) -> Explanation

The idea is to provide a dataset to be explained, and to indicate whether it contains the labels or the positions of the feature values used in the Bayesian network. Alternatively, you can set data to None and specify an integer value for N. In that case, it will generate a dataset using the Bayesian network and provide the Shall values for it.

## The explanation with data set to None
conditionalExplanation = conditionalExplainer.compute(data=None, N=50)
conditionalExplanation = conditionalExplainer.compute(data=(train, False), N=50)

To obtain the feature importances, simply write:

## The result is provided for each target value.
conditionalExplanation.importances
{'X1': 0.4598938965178432,
'X2': 0.4124971427627158,
'Y': 0.36714032493727283,
'X3': 0.49809386292809,
'Z': 0.527645395442878,
'X0': 0.3942869162010165,
'R': 0.541233143444845}

You can also pass the object to be explained as a dictionary, as shown below:

instance = {"X2": [0, 1], "X1": [1, 1], "X0": [0, 1], "R": [0, 0], "Y": [1, 1], "X3": [1, 1], "Z": [0, 1]}
globalExpl = conditionalExplainer.compute(data=(instance, False))

The syntax is the same for a single instance — just specify which one.

localExpl = conditionalExplainer.compute((train.iloc[0], False))

To obtain the feature contributions, simply write:

localExpl._values
{'X1': -0.12188749787546412,
'X2': 0.10446489723197486,
'Y': 0.1073226226674095,
'X3': -0.3757477837916573,
'Z': 0.14473439724643358,
'X0': 0.17170994019411356,
'R': -0.1776662387709618}

You can also pass the instance to be explained as a dictionary, like this:

instance = {"X2": 0, "X1": 1, "X0": 0, "R": 1, "Y": 1, "X3": 1, "Z": 1}
localExpl = conditionalExplainer.compute((instance, False))

The syntax of the Marginal Shall Values is the same as the Conditional one. All other operations remain the same, since compute is a method of the ShallValues class, which is inherited by the three subclasses: ConditionalShallValues, MarginalShallValues, and CausalShallValues.

background = train.sample(500)
marginalExplainer = expl.MarginalShallValues(bn=bn, background=(background, False), log=True)
marginalExplanation = marginalExplainer.compute(data=(train, False))

The syntax of the Causal Shall Values is the same as the previous two.

background = train.sample(100)
causalExplainer = expl.CausalShallValues(bn=bn, background=(background, False), log=True)
causalExplanation = causalExplainer.compute((train, False))

PyAgrum provides three different plots:

  • the beeswarm plot for global explanations,
  • the waterfall plot for local explanations,
  • and the bar plot, which can be used for both.

All plots require an Explanation object as input (i.e., the one returned by the compute method), and, unlike for Shapley values, no target is specified. Note that for Shall values, partial explanations are not allowed: you must provide all feature values for the instance.
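
For example, a complete instance (with every variable observed) can be taken directly from a row of the dataset (a small sketch; the names fullInstance and fullExpl are only illustrative):

## Sketch: a DataFrame row converted to a dict already contains every feature value.
fullInstance = train.iloc[0].to_dict()
fullExpl = conditionalExplainer.compute((fullInstance, False))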

explnb.beeswarm(explanation=conditionalExplanation)

(figure: beeswarm plot of the Shall values)

explnb.waterfall(explanation=localExpl)

(figure: waterfall plot of the Shall values for the instance)

Bar plot for local explanation

explnb.bar(explanation=localExpl)

(figure: bar plot of the local Shall explanation)

Bar plot for global explanation

explnb.bar(explanation=conditionalExplanation, percentage=True)

(figure: bar plot of the global Shall explanation, in percentages)