Multinomial Simpson Paradox
This notebook builds a model that exhibits a multinomial Simpson paradox.
import pandas as pd
import pyagrum as gum
import pyagrum.lib.notebook as gnb
import pyagrum.causal as csl
import pyagrum.causal.notebook as cslnb

Building the models

# building a model including a Simpson's paradox
import scipy.stats as stats

bn = gum.fastBN("A[0,99]->B[0:40:200]<-C[0,5]->A")
bn.cpt("C").fillFromDistribution(stats.uniform, loc=0, scale=5)
bn.cpt("A").fillFromDistribution(stats.uniform, loc="C*12", scale=30)
bn.cpt("B").fillFromDistribution(stats.norm, loc="5+C*4-int(A/8)", scale=2);

# generating a CSV, taking this model as the causal one.
gum.generateSample(bn, 400, "out/sample.csv", with_labels=False)
df = pd.read_csv("out/sample.csv")
df.plot.scatter(x="A", y="B", c="C", colormap="tab20");

cm = csl.CausalModel(bn)
_, p, _ = csl.causalImpact(cm, on="B", doing="A")

# building a Markov-equivalent model, generating a CSV, taking this model as the causal one.
bn2 = gum.BayesNet(bn)
bn2.reverseArc("C", "A")
gum.generateSample(bn2, 400, "out/sample2.csv", with_labels=False)
df2 = pd.read_csv("out/sample2.csv")

cm2 = csl.CausalModel(bn2)
_, p2, _ = csl.causalImpact(cm2, on="B", doing="A")

The observational model and its paradoxical structure (exactly the same with the second Markov-equivalent model)

gnb.flow.row(
    gnb.getBN(bn),
    df.plot.scatter(x="A", y="B"),
    df.plot.scatter(x="A", y="B", c="C", colormap="tab20"),
    captions=["the observational model", "the trend is increasing", "the trend is decreasing for any value of C!"],
)
gnb.flow.row(
    gnb.getBN(bn2),
    df2.plot.scatter(x="A", y="B"),
    df2.plot.scatter(x="A", y="B", c="C", colormap="tab20"),
    captions=["the Markov-equivalent model", "the trend is increasing", "the trend is decreasing for any value of C!"],
)

The paradox is revealed in the trend of the inferred means: the means increase with the value of $A$ overall, yet decrease with $A$ for any fixed value of $C$.

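The two trends can also be checked without pyAgrum. The sketch below is a rough, standard-library stand-in for the model, not the notebook's own code: it assumes $C$ is uniform on the integers 0 to 5, $A$ is uniform on the integers from $12C$ to $12C+30$, and $B$ is Gaussian with mean $5+4C-\lfloor A/8 \rfloor$ and standard deviation 2, ignoring the discretization and the bounds of $B$. The helper names `slope` and `sample` are hypothetical.

```python
import random


def slope(points):
    """Ordinary least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    return cov / var


def sample(n, seed=0):
    """Draw n rows (a, b, c) from the assumed stand-in model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        c = rng.randint(0, 5)                 # C uniform on 0..5
        a = rng.randint(12 * c, 12 * c + 30)  # A shifts upward with C
        b = rng.gauss(5 + 4 * c - a // 8, 2)  # B rises with C, falls with A
        rows.append((a, b, c))
    return rows


rows = sample(6000)

# Trend of B against A when C is ignored.
overall = slope([(a, b) for a, b, _ in rows])

# Trend of B against A inside each stratum of C.
per_c = {c: slope([(a, b) for a, b, k in rows if k == c]) for c in range(6)}

print(f"overall slope of B on A: {overall:+.3f}")
for c, s in per_c.items():
    print(f"slope within C={c}: {s:+.3f}")
```

Under these assumptions the pooled slope is positive while every within-stratum slope is negative, which is the sign reversal shown in the scatter plots.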
gum.config["notebook", "histogram_epsilon"] = 0.001
gum.config["notebook", "histogram_discretized_scale"] = 0.4

for a in [10, 20, 30]:
    gnb.flow.add_html(gnb.getPosterior(bn, target="B", evs={"A": a}), f"$P(B|A={a})$")
gnb.flow.new_line()
for a in [10, 20, 30]:
    gnb.flow.add_html(gnb.getPosterior(bn, target="B", evs={"A": a, "C": 0}), f"$P(B|A={a},C=0)$")
gnb.flow.new_line()
for a in [10, 20, 30]:
    gnb.flow.add_html(gnb.getPosterior(bn, target="B", evs={"A": a, "C": 2}), f"$P(B|A={a},C=2)$")
gnb.flow.new_line()
for a in [10, 20, 30]:
    gnb.flow.add_html(gnb.getPosterior(bn, target="B", evs={"A": a, "C": 4}), f"$P(B|A={a},C=4)$")
gnb.flow.display()

Now that the paradoxical structure is understood and the paradox is revealed, should we observe $C$ (or not) before deciding to increase or decrease $A$ (with the goal of maximizing $B$)?

Of course, it depends on the causal structure of the problem!

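For reference, the two candidate structures identify the effect of $A$ on $B$ differently. The formulas below are a sketch based on the standard back-door adjustment, stated here as background rather than taken from the notebook's output:

$$
\text{if } C \to A:\qquad P(B \mid do(A=a)) = \sum_{c} P(B \mid A=a,\, C=c)\, P(C=c)
$$

$$
\text{if } A \to C:\qquad P(B \mid do(A=a)) = P(B \mid A=a)
$$

In the first case $C$ is a confounder that must be adjusted for. In the second case $C$ lies on a causal path from $A$ to $B$, so no adjustment is needed and conditioning on $C$ would hide part of the effect of $A$.
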
gnb.flow.add_html(cslnb.getCausalModel(cm), "the first causal model")
gnb.flow.new_line()
for v in [10, 20, 30]:
    gnb.flow.add_html(gnb.getProba(p.extract({"A": v})), f"Doing $A={v}$")
gnb.flow.display()

If $C$ is a cause of $A$, observing $C$ really gives new information about $B$.

gnb.flow.add_html(cslnb.getCausalModel(cm2), "the second causal model")
gnb.flow.new_line()
for v in [10, 20, 30]:
    gnb.flow.add_html(gnb.getProba(p2.extract({"A": v})), f"Doing $A={v}$")
gnb.flow.display()

If $A$ is a cause of $C$, observing $C$ may lead to misinterpretations about the causal role of $A$.
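
To make the contrast between the two causal readings concrete, here is a small exact calculation of the mean of $B$ under $do(A=a)$ for each structure. It is only a sketch under simplifying assumptions: $C$ uniform on the integers 0 to 5, $A$ uniform on the integers from $12C$ to $12C+30$, and the mean of $B$ taken as $5+4C-\lfloor A/8 \rfloor$ with the discretization and bounds of $B$ ignored, so the numbers need not match the pyAgrum displays exactly. The function names are hypothetical. When $C$ causes $A$, the code averages over the prior of $C$; when $A$ causes $C$, it uses the observational conditional on $A$, relying on the fact that both models share the same joint distribution.

```python
from fractions import Fraction

C_VALUES = range(6)
P_C = Fraction(1, 6)  # assumed uniform prior on C


def p_a_given_c(a, c):
    """Assumed P(A=a | C=c): uniform on the 31 integers 12c .. 12c+30."""
    return Fraction(1, 31) if 12 * c <= a <= 12 * c + 30 else Fraction(0)


def mean_b(a, c):
    """Assumed conditional mean of B given A=a and C=c."""
    return 5 + 4 * c - a // 8


def mean_b_do_a_confounder(a):
    """C -> A: adjust for C by averaging over its prior."""
    return sum(P_C * mean_b(a, c) for c in C_VALUES)


def mean_b_do_a_mediator(a):
    """A -> C: do(A=a) coincides with conditioning on A=a in the joint."""
    weights = {c: P_C * p_a_given_c(a, c) for c in C_VALUES}
    total = sum(weights.values())
    return sum(w * mean_b(a, c) for c, w in weights.items()) / total


first = [mean_b_do_a_confounder(a) for a in (10, 20, 30)]
second = [mean_b_do_a_mediator(a) for a in (10, 20, 30)]
print("C causes A:", [float(m) for m in first])
print("A causes C:", [float(m) for m in second])
```

Under these assumptions the interventional mean of $B$ decreases with $a$ when $C$ causes $A$ and increases with $a$ when $A$ causes $C$, so the same data recommend opposite actions depending on the causal structure.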
