
Kullback-Leibler for Bayesian networks

%matplotlib inline
from pylab import *

import pyagrum and pyagrum.lib.notebook (for … notebooks :-) )

import pyagrum as gum
import pyagrum.lib.notebook as gnb
bn = gum.loadBN("res/asia.bif")
## randomly re-generate parameters for every Conditional Probability Table
bn.generateCPTs()
bn
(graph of bn: visit_to_Asia → tuberculosis; smoking → bronchitis, lung_cancer; tuberculosis, lung_cancer → tuberculos_or_cancer; tuberculos_or_cancer → dyspnoea, positive_XraY; bronchitis → dyspnoea)
bn2 = gum.loadBN("res/asia.bif")
bn2.generateCPTs()
bn2
(graph of bn2: same structure as bn)
gnb.flow.row(bn.cpt(3), bn2.cpt(3), captions=["a CPT in bn", "same CPT in bn2 (with different parameters)"])
a CPT in bn:

tuberculos_or_cancer | positive_XraY=0 | positive_XraY=1
0                    | 0.6700          | 0.3300
1                    | 0.4316          | 0.5684

same CPT in bn2 (with different parameters):

tuberculos_or_cancer | positive_XraY=0 | positive_XraY=1
0                    | 0.2943          | 0.7057
1                    | 0.4315          | 0.5685

Exact and (Gibbs) approximated KL-divergence


To compute a KL-divergence, we just have to make sure that the two distributions are defined over the same domain (same variables with the same modalities).
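For instance, one can check beforehand that the two networks share the same variables. A minimal sketch, assuming bn.names() returns the node names of the network (the variables' modalities must match as well):

# hedged sketch: a quick domain-compatibility check before computing a distance
if set(bn.names()) == set(bn2.names()):
    print("same variable names: a distance can be computed (if modalities also match)")
else:
    print("different domains: the distance classes below would raise an error")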

Exact KL

g1 = gum.ExactBNdistance(bn, bn2)
print(g1.compute())
{'klPQ': 2.476584381649645, 'errorPQ': 0, 'klQP': 2.244520928404808, 'errorQP': 0, 'hellinger': 0.813592705605187, 'bhattacharya': 0.4019212130892461, 'jensen-shannon': 0.4136335100698562}
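For reference, klPQ and klQP are the two (asymmetric) Kullback-Leibler divergences between the joint distributions $P$ (encoded by bn) and $Q$ (encoded by bn2):

$$KL(P\Vert Q)=\sum_{x} P(x)\,\log\frac{P(x)}{Q(x)} \qquad KL(Q\Vert P)=\sum_{x} Q(x)\,\log\frac{Q(x)}{P(x)}$$

The output also reports the Hellinger and Bhattacharya distances and the Jensen-Shannon divergence between the two joint distributions.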

If the models are not defined on the same domain:

bn_different_domain = gum.loadBN("res/alarm.dsl")

# g = gum.ExactBNdistance(bn, bn_different_domain)  # a KL-divergence between asia and alarm ... :(
#
# would raise:
# ---------------------------------------------------------------------------
# OperationNotAllowed                       Traceback (most recent call last)
#
# OperationNotAllowed: this operation is not allowed : KL : the 2 BNs are not compatible (not the same vars : visit_to_Asia?)
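This incompatibility surfaces as an exception that can be caught. A minimal sketch, assuming pyAgrum exposes this exception as gum.OperationNotAllowed:

try:
    gum.ExactBNdistance(bn, bn_different_domain)
except gum.OperationNotAllowed as e:
    # the two BNs do not share the same variables
    print("incompatible networks:", e)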

Gibbs-approximated KL

g = gum.GibbsBNdistance(bn, bn2)
g.setVerbosity(True)
g.setMaxTime(120)     # stop after at most 120 seconds
g.setBurnIn(5000)     # discard the first 5000 samples before estimating
g.setEpsilon(1e-7)    # stop when the estimate changes by less than 1e-7 over a period
g.setPeriodSize(500)  # check the stopping criteria every 500 samples
print(g.compute())
print("Computed in {0} s".format(g.currentTime()))
{'klPQ': 2.475361213496724, 'errorPQ': 0, 'klQP': 2.1957241806099814, 'errorQP': 0, 'hellinger': 0.8105538873770256, 'bhattacharya': 0.3989042244816927, 'jensen-shannon': 0.411228001161747}
Computed in 1.338172 s
print("--")
print(g.messageApproximationScheme())
print("--")
print("Temps de calcul : {0}".format(g.currentTime()))
print("Nombre d'itérations : {0}".format(g.nbrIterations()))
--
stopped with epsilon=1e-07
--
Computation time: 1.338172
Number of iterations: 380500
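As a quick sanity check, the Gibbs estimates can be compared with the exact values computed earlier. A minimal sketch reusing the classes shown above (note that calling compute() again re-runs the sampler):

exact = gum.ExactBNdistance(bn, bn2).compute()
approx = g.compute()
for key in ["klPQ", "klQP", "hellinger", "bhattacharya", "jensen-shannon"]:
    print(f"{key}: exact={exact[key]:.4f}, approx={approx[key]:.4f}")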
p = plot(g.history(), "g")  # history recorded at each period (requires verbosity), plotted in green

(figure: plot of g.history())

Since it may be difficult to know what happens during an approximation algorithm, pyAgrum allows you to follow the iterations with an animated matplotlib figure:

g = gum.GibbsBNdistance(bn, bn2)
g.setMaxTime(60)
g.setBurnIn(500)
g.setEpsilon(1e-7)
g.setPeriodSize(5000)
gnb.animApproximationScheme(g)  # logarithmic scale for Y
g.compute()
{'klPQ': 2.469542051480538,
'errorPQ': 0,
'klQP': 2.1146518638711105,
'errorQP': 0,
'hellinger': 0.8035301805274394,
'bhattacharya': 0.404240337375901,
'jensen-shannon': 0.40444581031110494}

(figure: animated convergence plot, log scale on Y)