Where is my Bag ? (p115)
Authors: Aymen Merrouche and Pierre-Henri Wuillemin.
This notebook follows the example from “The Book Of Why” (Pearl, 2018), chapter 3, page 115.
```python
import pyagrum as gum
import pyagrum.lib.notebook as gnb
import pyagrum.causal as csl
import pyagrum.causal.notebook as cslnb

import matplotlib.pyplot as plt
import seaborn as sns
```

After making a stopover, a passenger is waiting for his luggage. What are the chances that his luggage wasn’t properly routed and got lost? The appearance of his luggage on the carousel has two causes: its presence on the plane (if it is not on the plane, he has no chance of recovering it) and the time he has spent waiting at the carousel (the longer he waits, the higher his chances of seeing his luggage).
causal graph for the Bag problem
The corresponding causal diagram is the following:
```python
ab = gum.fastBN("Elapsed time[11]->Bag on Carousel<-Bag on Plane")
ab
```

```python
# We fill the CPTs
ab.cpt("Bag on Plane").fillWith(1).normalize()
ab.cpt("Elapsed time").fillWith(1).normalize()
ab.cpt("Bag on Carousel").fillWith([1.0, 0.0] * 11 + [1 - i / 20 if i % 2 == 0 else (i - 1) / 20 for i in range(22)])
```

| Bag on Plane | Elapsed time | Bag on Carousel = 0 | Bag on Carousel = 1 |
|---|---|---|---|
| 0 | 0 | 1.0000 | 0.0000 |
| 0 | 1 | 1.0000 | 0.0000 |
| 0 | 2 | 1.0000 | 0.0000 |
| 0 | 3 | 1.0000 | 0.0000 |
| 0 | 4 | 1.0000 | 0.0000 |
| 0 | 5 | 1.0000 | 0.0000 |
| 0 | 6 | 1.0000 | 0.0000 |
| 0 | 7 | 1.0000 | 0.0000 |
| 0 | 8 | 1.0000 | 0.0000 |
| 0 | 9 | 1.0000 | 0.0000 |
| 0 | 10 | 1.0000 | 0.0000 |
| 1 | 0 | 1.0000 | 0.0000 |
| 1 | 1 | 0.9000 | 0.1000 |
| 1 | 2 | 0.8000 | 0.2000 |
| 1 | 3 | 0.7000 | 0.3000 |
| 1 | 4 | 0.6000 | 0.4000 |
| 1 | 5 | 0.5000 | 0.5000 |
| 1 | 6 | 0.4000 | 0.6000 |
| 1 | 7 | 0.3000 | 0.7000 |
| 1 | 8 | 0.2000 | 0.8000 |
| 1 | 9 | 0.1000 | 0.9000 |
| 1 | 10 | 0.0000 | 1.0000 |
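The flat list passed to `fillWith` can be hard to read. The sketch below rebuilds the same 44 numbers in plain Python, assuming (as the table suggests) that the values are listed with “Bag on Carousel” varying fastest, then “Elapsed time”, then “Bag on Plane”. The helper name `carousel_cpt_values` is hypothetical and is not part of pyAgrum.

```python
def carousel_cpt_values(n_times=11):
    """Flat CPT values for P(Bag on Carousel | Elapsed time, Bag on Plane).

    Hypothetical helper: a bag that is on the plane has arrived with
    probability t / 10 after t minutes; a bag that is not on the plane
    never arrives.
    """
    values = []
    for on_plane in (0, 1):
        for t in range(n_times):
            p_arrived = on_plane * t / (n_times - 1)
            # Each row is [P(carousel = 0), P(carousel = 1)]
            values += [1 - p_arrived, p_arrived]
    return values


values = carousel_cpt_values()

# The list comprehension used in the notebook cell
notebook_values = [1.0, 0.0] * 11 + [1 - i / 20 if i % 2 == 0 else (i - 1) / 20 for i in range(22)]

same = all(abs(a - b) < 1e-12 for a, b in zip(values, notebook_values))
print(len(values), same)
```

Written this way, each pair of consecutive values is one row of the table and visibly sums to 1.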
```python
gnb.sideBySide(
    ab,
    ab.cpt("Bag on Plane"),
    ab.cpt("Elapsed time"),
    captions=["the BN", "marginal for $BagOnPlane$", "marginal for $Elapsed time$"],
)
```

It is obvious that:
- If the bag is on the plane, you will receive it within 10 minutes:

```python
# Knowing that 'Bag on Plane': 1 and 'Elapsed time': 10
gnb.showInference(ab, targets={"Bag on Carousel"}, evs={"Bag on Plane": 1, "Elapsed time": 10})
```

- If the bag is not on the plane in the first place, there is no chance of receiving it:

```python
# Knowing that 'Bag on Plane': 0
gnb.showInference(ab, targets={"Bag on Carousel"}, evs={"Bag on Plane": 0})
```

First inference
If $x$ minutes have passed and I still haven’t gotten my bag, what is the probability that it was on the plane (that I will eventually receive it)?
We are interested in the probability $P(\text{Bag on Plane} = 1 \mid \text{Elapsed time} = x, \text{Bag on Carousel} = 0)$. In other words, you are waiting by the carousel at the airport, $x$ minutes have passed and you still haven’t received your bag: what is the probability that you will eventually receive it, i.e. that your bag was on the plane, as we saw above?
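For this small network the query can also be worked out by hand with Bayes' rule. The derivation below is a sketch that relies only on the CPTs defined above: a uniform prior of $\tfrac12$ on “Bag on Plane”, independence between “Bag on Plane” and “Elapsed time”, and $P(\text{Bag on Carousel}=0 \mid \text{Bag on Plane}=1, \text{Elapsed time}=x) = 1 - \tfrac{x}{10}$.

$$
\begin{aligned}
P(\text{Bag on Plane}=1 \mid \text{Elapsed time}=x,\ \text{Bag on Carousel}=0)
&= \frac{\tfrac12\left(1-\tfrac{x}{10}\right)}{\tfrac12\left(1-\tfrac{x}{10}\right) + \tfrac12 \cdot 1} \\
&= \frac{10-x}{20-x}
\end{aligned}
$$

The second term of the denominator is the case where the bag is not on the plane, in which the carousel stays empty with probability 1. The result is $\tfrac12$ at $x=0$ and $0$ at $x=10$.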
```python
# inference engine: LazyPropagation
ie1 = gum.LazyPropagation(ab)
ie1.setEvidence({"Elapsed time": 0, "Bag on Carousel": 0})
time = {}
# For every value of elapsed time
for t in range(0, 11):
    # We get the probability of eventually receiving the bag
    ie1.chgEvidence("Elapsed time", t)
    ie1.makeInference()
    time[t] = ie1.posterior("Bag on Plane")[1]
# time is a dictionary: for x in [0, 10] {x minutes: P(Bag on Plane = true | Elapsed time = x minutes, Bag on Carousel = false)}
```

The curve of Abandoning Hope:
```python
# plot style
sns.set_style("darkgrid")
# label size
plt.rc("xtick", labelsize=6)
plt.rc("ytick", labelsize=6)
# figure size
plt.figure(figsize=(12, 4))
plt.xlabel('Time Elapsed "min"', fontsize=12)
plt.ylabel("Probability that your bag is on plane", fontsize=10)
# title
plt.title("Probability of seeing your bag on the carousel given that x minutes have passed", fontsize=12)
plt.bar(time.keys(), time.values())
plt.show()
```

After waiting half of the total time, don’t panic: there is still a probability of 33% that you will receive your luggage. (Starting from a probability of 50%, you should only lose one-third of your hope!)
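The “one-third of your hope” figure is a relative drop: the probability goes from 1/2 to 1/3. As a minimal check in plain Python, assuming the uniform prior and the CPT values defined earlier in the notebook (the function name `posterior_on_plane` is hypothetical):

```python
from fractions import Fraction

PRIOR_ON_PLANE = Fraction(1, 2)


def posterior_on_plane(minutes):
    """P(Bag on Plane = 1 | Elapsed time = minutes, Bag on Carousel = 0)."""
    # P(no bag yet | bag on plane): decreases linearly from 1 to 0 over 10 minutes
    likelihood_on_plane = 1 - Fraction(minutes, 10)
    # P(no bag yet | bag not on plane): the bag can never show up
    likelihood_not_on_plane = Fraction(1)
    evidence = (
        PRIOR_ON_PLANE * likelihood_on_plane
        + (1 - PRIOR_ON_PLANE) * likelihood_not_on_plane
    )
    return PRIOR_ON_PLANE * likelihood_on_plane / evidence


halfway = posterior_on_plane(5)
relative_loss = (PRIOR_ON_PLANE - halfway) / PRIOR_ON_PLANE
print(halfway, relative_loss)  # 1/3 1/3
```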
```python
gnb.showPosterior(ab, target="Bag on Plane", evs={"Elapsed time": 5, "Bag on Carousel": 0})
```

Concerning Colliders:
In the junction “Elapsed Time” → “Bag On Carousel” ← “Bag On Plane”, the node “Bag On Carousel” is called a collider (a node where two or more arrowheads meet). A collider blocks the path that goes through it, so there is no open path, and in particular no back-door path, between “Bag On Plane” and “Elapsed Time”. We don’t need to account for “Bag On Carousel” to assess the causal effect of “Bag On Plane” on “Elapsed Time”. The two variables “Elapsed Time” and “Bag On Plane” are d-separated and therefore independent, even though both are causes of “Bag On Carousel”.
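The d-separation claim can be checked by brute force on the joint distribution. The following is a sketch in plain Python (not pyAgrum), assuming the uniform priors and the carousel CPT defined earlier; all helper names are hypothetical. It compares P(Bag on Plane = 1 | Elapsed time = t) with the same probability once “Bag on Carousel = 0” is also observed.

```python
from fractions import Fraction
from itertools import product

TIMES = range(11)


def p_carousel(c, t, plane):
    """P(Bag on Carousel = c | Elapsed time = t, Bag on Plane = plane)."""
    p_arrived = Fraction(t, 10) if plane == 1 else Fraction(0)
    return p_arrived if c == 1 else 1 - p_arrived


# Joint distribution over (plane, t, c), with uniform priors on plane and t
joint = {
    (plane, t, c): Fraction(1, 2) * Fraction(1, 11) * p_carousel(c, t, plane)
    for plane, t, c in product((0, 1), TIMES, (0, 1))
}


def p_plane_given(t, c=None):
    """P(plane = 1 | t), or P(plane = 1 | t, c) when c is given."""
    matching = {
        key: p for key, p in joint.items()
        if key[1] == t and (c is None or key[2] == c)
    }
    on_plane = sum(p for key, p in matching.items() if key[0] == 1)
    return on_plane / sum(matching.values())


marginal = [p_plane_given(t) for t in TIMES]          # collider not observed
conditioned = [p_plane_given(t, c=0) for t in TIMES]  # collider observed at 0

print(set(marginal))
print([str(p) for p in conditioned])
```

Without the collider, the probability is the same for every value of t, which is what independence means here. Once the empty carousel is observed, the probability changes with t.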
Assessing causal effect of “Bag on Plane”
We don’t need to account for anything to assess the causal effect of “Bag On Plane” on “Elapsed Time”:
<img src="/images/reference/figure_3_5.png" width="500" alt="PyAgrum inline image">
```python
abModele = csl.CausalModel(ab, [("l", ("Elapsed time", "Bag on Plane"))])
cslnb.showCausalImpact(abModele, "Elapsed time", doing="Bag on Plane", values={})
```

Collider bias!
Although a little bit counter-intuitive, conditioning on “Bag On Carousel” makes the two variables dependent! Adjusting for a collider introduces bias.
If the bag is not on the carousel and 9 minutes have passed, then it is more likely that the bag is not on the plane.
To illustrate collider bias, we will assess the causal effect of “Elapsed Time” on “Bag On Plane” first without adjusting for the collider “Bag On Carousel”, and then when adjusting for it (for example, by looking only at the cases where “Bag on Carousel” = 0).
Without adjusting for collider “Bag On Carousel”
```python
abModele = csl.CausalModel(ab, [("l", ("Elapsed time", "Bag on Plane"))])
cslnb.showCausalImpact(abModele, on={"Bag on Plane"}, doing={"Elapsed time"}, values={"Elapsed time": 9})
```

| Bag on Plane = 0 | Bag on Plane = 1 |
|---|---|
| 0.5000 | 0.5000 |
The two variables are independent.
When adjusting for collider “Bag On Carousel”
If we observe that the bag is not on the carousel (Bag On Carousel = 0), the more time passes, the more likely it is that the luggage is not on the plane.
```python
abModele = csl.CausalModel(ab, [("l", ("Elapsed time", "Bag on Plane"))])
cslnb.showCausalImpact(
    abModele,
    on={"Bag on Plane"},
    doing={"Elapsed time"},
    knowing={"Bag on Carousel"},
    values={"Elapsed time": 7, "Bag on Carousel": 0},
)
```

| Bag on Plane = 0 | Bag on Plane = 1 |
|---|---|
| 0.7692 | 0.2308 |
The two variables become dependent.
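Collider bias can also be seen as a selection effect: looking only at passengers whose bag has not appeared filters out many of those whose bag was on the plane. The simulation below is a rough sketch in plain Python, not the pyAgrum computation. It samples passengers from the CPTs defined earlier, ignores the latent variable of the causal model, and uses hypothetical helper names and an arbitrary sample size and seed.

```python
import random


def simulate(n_passengers, seed=0):
    """Sample (on_plane, elapsed, on_carousel) triples from the network's CPTs."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_passengers):
        on_plane = rng.random() < 0.5   # uniform prior on "Bag on Plane"
        elapsed = rng.randrange(11)     # uniform prior on 0..10 minutes
        # A bag on the plane has arrived with probability elapsed / 10
        on_carousel = on_plane and rng.random() < elapsed / 10
        samples.append((on_plane, elapsed, on_carousel))
    return samples


def share_on_plane(samples, elapsed, only_without_bag):
    """Fraction of bags on the plane among the selected passengers."""
    selected = [
        on_plane
        for on_plane, t, on_carousel in samples
        if t == elapsed and not (only_without_bag and on_carousel)
    ]
    return sum(selected) / len(selected)


samples = simulate(200_000)
share_all = share_on_plane(samples, elapsed=7, only_without_bag=False)
share_no_bag = share_on_plane(samples, elapsed=7, only_without_bag=True)
print(round(share_all, 2), round(share_no_bag, 2))
```

Among all passengers at minute 7, about half have their bag on the plane. Among those still waiting, the share should be close to the 0.2308 shown in the table above.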
Effect of “Elapsed Time” on “Bag on Plane”
We can draw a curve of the effect of “Elapsed Time” on “Bag on Plane” when adjusting for “Bag On Carousel” (= 0):
```python
formula, impact, explanation = csl.causalImpact(
    abModele, on={"Bag on Plane"}, doing={"Elapsed time"}, knowing={"Bag on Carousel"}
)
formula
```
As we saw previously, if you still haven’t received your bag, the longer you wait, the more likely it is that your bag is not on the plane.
```python
# plot style
sns.set_style("darkgrid")
# label size
plt.rc("xtick", labelsize=6)
plt.rc("ytick", labelsize=6)
# figure size
plt.figure(figsize=(12, 4))
plt.xlabel('Time Elapsed "min"', fontsize=12)
plt.ylabel("Probability that your bag is not on the plane", fontsize=10)
# title
plt.title(
    "Probability of not having your bag on the plane given that x minutes have passed and that you still didn't receive it",
    fontsize=12,
)
plt.bar(range(11), impact.extract({"Bag on Carousel": 0, "Bag on Plane": 0}).tolist())
plt.show()
```
