Back-Door Criterion (p150)
Authors: Aymen Merrouche and Pierre-Henri Wuillemin.
This notebook follows the example from “The Book Of Why” (Pearl, 2018), chapter 4, page 150.
Back-Door Criterion

    import pyagrum as gum
    import pyagrum.lib.notebook as gnb
    import pyagrum.causal as csl
    import pyagrum.causal.notebook as cslnb

In a causal diagram, confounding bias is due to the flow of non-causal information between treatment and outcome through back-door paths. To neutralize this bias, we need to block these paths.
To block a non-causal path, we must adjust for a variable or a set of variables that blocks the flow of information on that path. Such a set of variables satisfies what we call the “back-door” criterion. A set of variables $Z$ satisfies the back-door criterion for $(X, Y)$ if and only if:
- $Z$ blocks all back-door paths between $X$ and $Y$. A “back-door path” is any path in the causal diagram between $X$ and $Y$ that starts with an arrow pointing towards $X$.
- No variable in $Z$ is a descendant of $X$ on a causal path. If we adjusted for such a variable, we would block a path that carries causal information, so the estimated causal effect of $X$ on $Y$ would be biased.

If a set of variables $Z$ satisfies the back-door criterion for $(X, Y)$, the causal effect of $X$ on $Y$ is given by the formula:

$$P(Y = y \mid do(X = x)) = \sum_{z} P(Y = y \mid X = x, Z = z)\, P(Z = z)$$
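As a small numerical sketch of the back-door adjustment $P(Y = y \mid do(X = x)) = \sum_{z} P(Y = y \mid X = x, Z = z)\, P(Z = z)$, assume a single binary adjustment variable $Z$. All probabilities below are made-up illustrative numbers, not values from the models in this notebook, and `p_y1_do_x` is a hypothetical helper name.

```python
# Hypothetical distributions for binary X, Y, Z (illustrative numbers only).
p_z = {0: 0.7, 1: 0.3}  # P(Z = z)
p_y1_given_xz = {       # P(Y = 1 | X = x, Z = z), keyed by (x, z)
    (0, 0): 0.2, (0, 1): 0.6,
    (1, 0): 0.4, (1, 1): 0.9,
}


def p_y1_do_x(x):
    """Back-door adjustment: sum over z of P(Y=1 | X=x, Z=z) * P(Z=z)."""
    return sum(p_y1_given_xz[(x, z)] * p_z[z] for z in p_z)


# Average causal effect of switching X from 0 to 1 on P(Y = 1).
effect = p_y1_do_x(1) - p_y1_do_x(0)
print(round(p_y1_do_x(0), 4), round(p_y1_do_x(1), 4), round(effect, 4))
```

With these assumed numbers, the adjusted probabilities are 0.32 for $do(X = 0)$ and 0.55 for $do(X = 1)$, so the effect is 0.23.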
Example 1:

    e1 = gum.fastBN("X->A->Y;A->B")
    e1

    m1 = csl.CausalModel(e1)
    cslnb.showCausalImpact(m1, "Y", doing="X", values={})

Impact $P(Y \mid do(X))$:

| $X$ | $Y = 0$ | $Y = 1$ |
|---|---|---|
| 0 | 0.5159 | 0.4841 |
| 1 | 0.5055 | 0.4945 |

    # Returns a set of variables that satisfies the back-door criterion for (X, Y),
    # or None if no such set is found (for example, when no back-door path needs blocking).
    setOfVars = m1.backDoor("X", "Y")
    print("The set of variables which satisfies the back-door criterion for (X, Y) is :", setOfVars)

Output:

    The set of variables which satisfies the back-door criterion for (X, Y) is : None

There are no incoming arrows into $X$, so there are no back-door paths between $X$ and $Y$ (as if we had performed graph surgery according to the do operator). The direct causal path is $X \rightarrow A \rightarrow Y$.
Example 2:

    e2 = gum.fastBN("A->B->C;A->X->E->Y;B<-D->E")
    e2

    m2 = csl.CausalModel(e2)
    gnb.show(m2)
    cslnb.showCausalImpact(m2, "Y", doing="X", values={})

Impact $P(Y \mid do(X))$:

| $X$ | $Y = 0$ | $Y = 1$ |
|---|---|---|
| 0 | 0.2473 | 0.7527 |
| 1 | 0.2099 | 0.7901 |

    # Returns a set of variables that satisfies the back-door criterion for (X, Y),
    # or None if no such set is found (for example, when no back-door path needs blocking).
    setOfVars = m2.backDoor("X", "Y")
    print("The set of variables which satisfies the back-door criterion for (X, Y) is :", setOfVars)

Output:

    The set of variables which satisfies the back-door criterion for (X, Y) is : None

There is one back-door path from $X$ to $Y$:

$$X \leftarrow A \rightarrow B \leftarrow D \rightarrow E \rightarrow Y$$

We don’t need to control for any set of variables: this back-door path is blocked by the collider node $B$ (two incoming arrows). Controlling for the collider $B$ would open this non-causal path (controlling for colliders increases bias). The direct causal path is $X \rightarrow E \rightarrow Y$.
Example 3:

    e3 = gum.fastBN("B->X->Y;X->A<-B->Y")
    e3

    m3 = csl.CausalModel(e3)
    cslnb.showCausalImpact(m3, "Y", doing="X", values={})

Impact $P(Y \mid do(X))$:

| $X$ | $Y = 0$ | $Y = 1$ |
|---|---|---|
| 0 | 0.6552 | 0.3448 |
| 1 | 0.1871 | 0.8129 |

    # Returns a set of variables that satisfies the back-door criterion for (X, Y),
    # or None if no such set is found (for example, when no back-door path needs blocking).
    setOfVars = m3.backDoor("X", "Y")
    print("The set of variables which satisfies the back-door criterion for (X, Y) is :", setOfVars)

Output:

    The set of variables which satisfies the back-door criterion for (X, Y) is : {'B'}

There is one back-door path from $X$ to $Y$:

$$X \leftarrow B \rightarrow Y$$

We need to block it by controlling for $B$, which satisfies the back-door criterion.
Example 4 (M-bias):

    e4 = gum.fastBN("X<-A->B<-C->Y")
    e4

    m4 = csl.CausalModel(e4)
    cslnb.showCausalImpact(m4, "Y", doing="X", values={})

Impact $P(Y \mid do(X))$:

| $Y = 0$ | $Y = 1$ |
|---|---|
| 0.5043 | 0.4957 |

    # Returns a set of variables that satisfies the back-door criterion for (X, Y),
    # or None if no such set is found (for example, when no back-door path needs blocking).
    setOfVars = m4.backDoor("X", "Y")
    print("The set of variables which satisfies the back-door criterion for (X, Y) is :", setOfVars)

Output:

    The set of variables which satisfies the back-door criterion for (X, Y) is : None

There is one back-door path from $X$ to $Y$:

$$X \leftarrow A \rightarrow B \leftarrow C \rightarrow Y$$

We don’t need to control for any set of variables: this back-door path is blocked by the collider node $B$, so the two variables are d-separated (deconfounded, independent). Controlling for the collider $B$ would make them dependent (introducing the M-bias).
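The M-bias can be illustrated numerically on the graph $X \leftarrow A \rightarrow B \leftarrow C \rightarrow Y$. The sketch below does not use pyAgrum: it assumes hypothetical conditional probability tables for binary variables and computes conditionals exactly by enumerating the joint distribution. The names `bern`, `joint` and `p_y1_given` are illustrative helpers.

```python
from itertools import product

# Hypothetical CPTs for the M-shaped graph X <- A -> B <- C -> Y (binary variables).
p_a1 = 0.3                                   # P(A = 1)
p_c1 = 0.6                                   # P(C = 1)
p_x1_given_a = {0: 0.2, 1: 0.8}              # P(X = 1 | A = a)
p_y1_given_c = {0: 0.1, 1: 0.7}              # P(Y = 1 | C = c)
p_b1_given_ac = {(0, 0): 0.1, (0, 1): 0.5,   # P(B = 1 | A = a, C = c)
                 (1, 0): 0.5, (1, 1): 0.9}


def bern(p1, value):
    """Probability of `value` for a binary variable with P(1) = p1."""
    return p1 if value == 1 else 1.0 - p1


def joint(a, b, c, x, y):
    """Joint probability factorized along the graph."""
    return (bern(p_a1, a) * bern(p_c1, c) * bern(p_b1_given_ac[(a, c)], b)
            * bern(p_x1_given_a[a], x) * bern(p_y1_given_c[c], y))


def p_y1_given(x, b=None):
    """P(Y = 1 | X = x), or P(Y = 1 | X = x, B = b) when b is given."""
    num = den = 0.0
    for a, bb, c, y in product((0, 1), repeat=4):
        if b is not None and bb != b:
            continue
        p = joint(a, bb, c, x, y)
        den += p
        if y == 1:
            num += p
    return num / den


# Without adjustment, Y does not depend on X.
print(p_y1_given(0), p_y1_given(1))
# After conditioning on the collider B, a spurious dependence appears.
print(p_y1_given(0, b=1), p_y1_given(1, b=1))
```

Under these assumed tables, the first two probabilities coincide, while the two probabilities conditioned on $B = 1$ differ even though $X$ has no causal effect on $Y$ in this graph.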
Example 5:

    e5 = gum.fastBN("X<-B<-A->X->Y<-C->B")
    e5

    m5 = csl.CausalModel(e5)
    cslnb.showCausalImpact(m5, "Y", doing="X", values={})

Impact $P(Y \mid do(X))$:

| $X$ | $Y = 0$ | $Y = 1$ |
|---|---|---|
| 0 | 0.9698 | 0.0302 |
| 1 | 0.8708 | 0.1292 |


    # Returns a set of variables that satisfies the back-door criterion for (X, Y),
    # or None if no such set is found (for example, when no back-door path needs blocking).
    setOfVars = m5.backDoor("X", "Y")
    print("The set of variables which satisfies the back-door criterion for (X, Y) is :", setOfVars)

Output:

    The set of variables which satisfies the back-door criterion for (X, Y) is : {'C'}

Compared with the previous example, we added an arrow between $B$ and $X$ ($B \rightarrow X$), along with the causal arrow $X \rightarrow Y$. The arrow $B \rightarrow X$ opens a new back-door path between $X$ and $Y$ that isn’t blocked by any collider:

$$X \leftarrow B \leftarrow C \rightarrow Y$$

We need to block the non-causal information that flows through it. Controlling for $B$ closes this back-door path (it prevents information from getting from $X$ to $Y$ along it). However, this action opens the back-door path that was formerly blocked by the collider node $B$, which we are now adjusting for:

$$X \leftarrow A \rightarrow B \leftarrow C \rightarrow Y$$

In this case, in addition to $B$, we would also control for $A$ or for $C$ to reblock the path we opened and to block the new path.
Another solution is to control for $C$ alone (it prevents information from getting from $X$ to $Y$ along the new path). This satisfies the back-door criterion: it blocks the new path without reopening the one that is blocked by the collider $B$.
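The candidate adjustment sets discussed for this example can be checked mechanically. The sketch below is a plain-Python illustration, not pyAgrum's implementation. It assumes Pearl's standard formulation of the criterion (no member of $Z$ is a descendant of $X$, and $Z$ d-separates $X$ from $Y$ once the arrows leaving $X$ are removed) and tests d-separation with the moralized ancestral graph. The helper names `descendants`, `d_separated` and `satisfies_backdoor` are hypothetical.

```python
# Edges of Example 5, copied from the fastBN string "X<-B<-A->X->Y<-C->B".
EDGES = [("B", "X"), ("A", "B"), ("A", "X"), ("X", "Y"), ("C", "Y"), ("C", "B")]


def descendants(edges, node):
    """Strict descendants of `node` in the DAG."""
    children = {}
    for u, v in edges:
        children.setdefault(u, set()).add(v)
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def d_separated(edges, x, y, z):
    """True if z d-separates x and y (moralized ancestral graph test)."""
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    # 1. Keep only x, y, the members of z, and all their ancestors.
    keep, stack = set(), [x, y, *z]
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents.get(node, ()))
    # 2. Moralize: link each node to its parents and its parents to each other.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = sorted(parents.get(child, ()))
        for i, p in enumerate(ps):
            adj[child].add(p)
            adj[p].add(child)
            for q in ps[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)
    # 3. x and y are d-separated iff every undirected route between them hits z.
    seen, stack = {x}, [x]
    while stack:
        for n in adj[stack.pop()]:
            if n == y:
                return False
            if n not in seen and n not in z:
                seen.add(n)
                stack.append(n)
    return True


def satisfies_backdoor(edges, x, y, z):
    """Back-door check for the set z relative to (x, y), under the stated assumptions."""
    z = set(z)
    if z & descendants(edges, x):
        return False
    without_arrows_from_x = [(u, v) for u, v in edges if u != x]
    return d_separated(without_arrows_from_x, x, y, z)


for candidate in [set(), {"B"}, {"A", "B"}, {"B", "C"}, {"C"}]:
    print(sorted(candidate), satisfies_backdoor(EDGES, "X", "Y", candidate))
```

Under these assumptions, the empty set and $\{B\}$ alone fail the check, while $\{A, B\}$, $\{B, C\}$ and $\{C\}$ pass it.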
Example 6:

    e6 = gum.fastBN("A->X;A->B;D->A;B->X;C->B;C->E;C->Y;D->C;E->Y;E->X;F->C;F->X;F->Y;G->X;G->Y;X->Y")
    e6

    m6 = csl.CausalModel(e6)
    cslnb.showCausalImpact(m6, "Y", doing="X", values={})

Impact $P(Y \mid do(X))$:

| $X$ | $Y = 0$ | $Y = 1$ |
|---|---|---|
| 0 | 0.6867 | 0.3133 |
| 1 | 0.5960 | 0.4040 |

    # Returns a set of variables that satisfies the back-door criterion for (X, Y),
    # or None if no such set is found (for example, when no back-door path needs blocking).
    setOfVars = m6.backDoor("X", "Y")
    print("The set of variables which satisfies the back-door criterion for (X, Y) is :", setOfVars)

Output:

    The set of variables which satisfies the back-door criterion for (X, Y) is : {'F', 'C', 'G', 'E'}

Back-door paths are:
-
-
-
- and any other back-door paths that go through
- and any other back-door paths that go through
-
- and any other back-door paths that go through
- and any other back-door paths that go through
-
- Blocked by collider : and any other back-door paths that go through will go through
- Blocked by collider : and any other back-door paths that go through will go through
-
- and any other back-door paths that go through will go through
Two sets of variables that satisfy the back-door criterion are:
- and any other back-door paths that go through will go through
- $\{C, E, F, G\}$, blocking (1), (2), (3) and (5)
- {,,,,} blocking (1), (2), (3), (5), opening (4) and reblocking it.
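The back-door paths of this graph can also be enumerated programmatically. The sketch below is a plain-Python illustration that does not rely on pyAgrum: it lists every simple path from $X$ to $Y$ whose first edge points into $X$, then applies the usual blocking rules (a non-collider in the adjustment set blocks the path, and a collider blocks it unless the collider or one of its descendants is in the set). The helper names are hypothetical; the edge list is copied from the `fastBN` string above.

```python
EDGE_SPEC = ("A->X;A->B;D->A;B->X;C->B;C->E;C->Y;D->C;E->Y;E->X;"
             "F->C;F->X;F->Y;G->X;G->Y;X->Y")
EDGES = {tuple(edge.split("->")) for edge in EDGE_SPEC.split(";")}


def backdoor_paths(x, y):
    """Simple paths from x to y (edge direction ignored) whose first edge points into x."""
    neighbours = {}
    for u, v in EDGES:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    found = []

    def extend(path):
        if path[-1] == y:
            found.append(path)
            return
        for n in sorted(neighbours[path[-1]]):
            if n in path:
                continue
            if len(path) == 1 and (n, x) not in EDGES:
                continue  # the first edge must be n -> x
            extend(path + [n])

    extend([x])
    return found


def descendants(node):
    """Strict descendants of `node` in the DAG."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for u, v in EDGES:
            if u == current and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def is_blocked(path, z):
    """True if the adjustment set z blocks this path."""
    for left, mid, right in zip(path, path[1:], path[2:]):
        collider = (left, mid) in EDGES and (right, mid) in EDGES
        if collider:
            if mid not in z and not (descendants(mid) & z):
                return True
        elif mid in z:
            return True
    return False


def render(path):
    """Format a path with arrows, e.g. 'X <- F -> Y'."""
    text = path[0]
    for u, v in zip(path, path[1:]):
        text += (" -> " if (u, v) in EDGES else " <- ") + v
    return text


adjustment = {"C", "E", "F", "G"}  # the set returned by backDoor above
for path in backdoor_paths("X", "Y"):
    print(render(path),
          "| blocked with no adjustment:", is_blocked(path, set()),
          "| blocked by adjustment set:", is_blocked(path, adjustment))
```

Each printed line shows one back-door path, whether a collider already blocks it when nothing is adjusted for, and whether the set $\{C, E, F, G\}$ blocks it.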
