Learning
pyAgrum provides a complete framework for learning Bayesian networks from data. It includes various algorithms for structure learning, parameter learning, and model evaluation. The library supports both score-based and constraint-based approaches, allowing users to choose the method that best fits their needs.
pyAgrum brings together all Bayesian network learning processes in a single, easy-to-use class: pyagrum.BNLearner. This class provides direct access to complete learning algorithms and their parameters (such as priors, scores, constraints, etc.), and also offers low-level functions that facilitate the development of new learning algorithms (for example, computing chi² or conditional likelihood on the dataset).
BNLearner allows you to choose:
- the structure learning algorithm (MIIC, Greedy Hill Climbing, K2, etc.),
- the parameter learning method (including EM),
- the scoring function (BDeu, AIC, etc.) for score-based algorithms,
- the prior (smoothing, Dirichlet, etc.),
- the constraints (e.g., forbidding certain arcs, specifying a partial order among variables, etc.),
- the correction method (NML, etc.) for the MIIC algorithm,
- and many low-level functions, such as computing the chi² or G² statistics, or the conditional likelihood on the dataset.
pyagrum.BNLearner can learn a Bayesian network from a database (a pandas.DataFrame) or from a CSV file.
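For instance, here is a minimal sketch of the whole pipeline (the file name data.csv and its columns are hypothetical):

```python
import pyagrum as gum

# learn both the structure and the parameters from a CSV file
learner = gum.BNLearner("data.csv")  # "data.csv" is a hypothetical dataset
bn = learner.learnBN()
print(bn)

# a pandas.DataFrame works the same way:
# import pandas as pd
# learner = gum.BNLearner(pd.read_csv("data.csv"))
```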
Class BNLearner
Methods for performing learning:
| fitParameters | latentVariables | learnBN |
|---|---|---|
| learnDAG | learnParameters | learnPDAG |
Structure learning algorithms:
| isConstraintBased | isScoreBased | useGreedyHillClimbing |
|---|---|---|
| useK2 | useLocalSearchWithTabuList | useMIIC |
Managing structure learning constraints:
| addForbiddenArc | addMandatoryArc | addNoChildrenNode |
|---|---|---|
| addNoParentNode | addPossibleEdge | eraseForbiddenArc |
| eraseMandatoryArc | eraseNoChildrenNode | eraseNoParentNode |
| erasePossibleEdge | setMaxIndegree | setPossibleEdges |
| setPossibleSkeleton | setSliceOrder | setInitialDAG |
Scores and priors (for structure learning):
| useBDeuPrior | useDirichletPrior | useSmoothingPrior |
|---|---|---|
| useScoreAIC | useScoreBD | useScoreBDeu |
| useScoreBIC | useScoreK2 | useScoreLog2Likelihood |
| useMDLCorrection | useNMLCorrection | useNoCorrection |
EM parameter learning:
| useEM | useEMWithDiffCriterion | useEMWithRateCriterion |
|---|---|---|
| forbidEM | isUsingEM | EMsetVerbosity |
Database inspection / direct requesting:
| chi2 | G2 | logLikelihood |
|---|---|---|
| mutualInformation | correctedMutualInformation | score |
| pseudoCount | rawPseudoCount | names |
| nbRows | nbCols | hasMissingValues |
Fine-tuning the behavior of the BNLearner:
| copyState | getNumberOfThreads | isGumNumberOfThreadsOverriden |
|---|---|---|
| setNumberOfThreads |
class pyagrum.BNLearner(*args)
This class provides functionality for learning Bayesian networks from data.
BNLearner(source, missingSymbols=['?'], inducedTypes=True) -> BNLearner
- Parameters:
  - source (str or pandas.DataFrame) – the data to learn from
  - missingSymbols (List[str]) – the list of strings that will be interpreted as missing values (default: ['?'])
  - inducedTypes (bool) – whether the BNLearner should try to automatically find the type of each variable
BNLearner(source, src, missingSymbols=['?']) -> BNLearner
- Parameters:
  - source (str or pandas.DataFrame) – the data to learn from
  - src (pyagrum.BayesNet) – the Bayesian network used to find the variables' modalities
  - missingSymbols (List[str]) – the list of strings that will be interpreted as missing values (default: ['?'])
BNLearner(learner) -> BNLearner
- Parameters:
  - learner (pyagrum.BNLearner) – the BNLearner to copy
EMEpsilon()
Returns a float corresponding to the minimal difference between two consecutive log-likelihoods under which the EM parameter learning algorithm stops.
- Returns: the minimal difference between two consecutive log-likelihoods under which EM stops.
- Return type: float
EMHistory()
Returns a list containing the log-likelihoods recorded after each expectation/maximization iteration of the EM parameter learning algorithm.
- Returns: A list of all the log-likelihoods recorded during EM’s execution
- Return type: List[float]
Warning
Recording log-likelihoods is enabled only when EM is executed in verbose mode. See method EMsetVerbosity().
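A minimal sketch of recording the history (assuming a hypothetical dataset data.csv that contains missing values):

```python
import pyagrum as gum

learner = gum.BNLearner("data.csv")    # hypothetical dataset with missing values ('?')
learner.useEMWithRateCriterion(1e-4)   # enable EM with a rate stopping criterion
learner.EMsetVerbosity(True)           # verbose mode is required to record the history
bn = learner.learnBN()
print(learner.EMnbrIterations())       # number of EM iterations performed
print(learner.EMHistory())             # one log-likelihood per iteration
```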
EMMaxIter()
Returns an int containing the max number of iterations the EM parameter learning algorithm is allowed to perform when the max iterations stopping criterion is enabled.
- Returns: the max number of expectation/maximization iterations EM is allowed to perform
- Return type: int
EMMaxTime()
Returns a float indicating EM's time limit when the max time stopping criterion is used by the EM parameter learning algorithm.
- Returns: the max time EM is allowed to execute its expectation/maximization iterations
- Return type: float
EMMinEpsilonRate()
Returns a float corresponding to the minimal log-likelihood's evolution rate under which the EM parameter learning algorithm stops its iterations.
- Returns: the limit under which EM stops its expectation/maximization iterations
- Return type: float
EMPeriodSize()
- Return type: int
EMStateApproximationScheme()
- Return type: int
EMStateMessage()
- Return type: str
EMVerbosity()
Returns a Boolean indicating whether the EM parameter learning algorithm is in verbose mode.
Note that EM verbosity is necessary for recording the history of the log-likelihoods computed at each expectation/maximization step.
- Returns: indicates whether EM’s verbose mode is active or not
- Return type: bool
EMdisableEpsilon()
Disables the minimal difference between two consecutive log-likelihoods as a stopping criterion for the EM parameter learning algorithm.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
EMdisableMaxIter()
Removes the limit on the number of iterations that EM is allowed to perform.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
EMdisableMaxTime()
Removes EM's time limit: EM may learn parameters for an unbounded amount of time.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
EMdisableMinEpsilonRate()
Disables the minimal log-likelihood's evolution rate as an EM parameter learning stopping criterion.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
EMenableEpsilon()
Enforces that the minimal difference between two consecutive log-likelihoods is a stopping criterion for the EM parameter learning algorithm.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
Warning
Setting the min difference between two consecutive log-likelihoods as a stopping criterion disables the min log-likelihood evolution rate as a stopping criterion.
EMenableMaxIter()
Enables a limit on the number of iterations performed by EM. This number is equal to the last number specified with Method EMsetMaxIter(). See Method EMMaxIter() to get its current value.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
EMenableMaxTime()
Forbids EM from running for more than a given amount of time.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
EMenableMinEpsilonRate()
Enables the minimal log-likelihood's evolution rate as an EM parameter learning stopping criterion.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
Warning
Setting this stopping criterion disables the min log-likelihood difference criterion.
EMisEnabledEpsilon()
Returns a Boolean indicating whether the minimal difference between two consecutive log-likelihoods is a stopping criterion for the EM parameter learning algorithm.
- Return type: bool
EMisEnabledMaxIter()
Returns a Boolean indicating whether the max number of iterations is used by EM as a stopping criterion.
- Return type: bool
EMisEnabledMaxTime()
Returns a Boolean indicating whether the max time criterion is used as an EM stopping criterion.
- Return type: bool
EMisEnabledMinEpsilonRate()
Returns a Boolean indicating whether the minimal log-likelihood's evolution rate is considered as a stopping criterion by the EM parameter learning algorithm.
- Return type: bool
EMnbrIterations()
Returns the number of iterations performed by the EM parameter learning algorithm.
- Return type: int
EMsetEpsilon(eps)
Enforces that the minimal difference between two consecutive log-likelihoods is chosen as a stopping criterion of the EM parameter learning algorithm and specifies the threshold on this criterion.
- Parameters: eps (float) – the log-likelihood difference below which EM stops its iterations
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
- Raises: pyagrum.OutOfBounds – If eps <= 0.
EMsetMaxIter(max)
Enforces a limit on the number of expectation/maximization steps performed by EM.
- Parameters: max (int) – the maximal number of iterations that EM is allowed to perform
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
- Raises: pyagrum.OutOfBounds – If max <= 1.
EMsetMaxTime(timeout)
Adds a constraint on the time that EM is allowed to run for learning parameters.
- Parameters: timeout (float) – the timeout in milliseconds
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
- Raises: pyagrum.OutOfBounds – If timeout<=0.0
EMsetMinEpsilonRate(rate)
Enforces that the minimal log-likelihood's evolution rate is considered by the EM parameter learning algorithm as a stopping criterion.
- Parameters: rate (float) – the log-likelihood evolution rate below which EM stops its iterations
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
- Raises: pyagrum.OutOfBounds – If rate <= 0.
Warning
Setting this stopping criterion disables the min log-likelihood difference criterion (if it was enabled).
EMsetPeriodSize(p)
- Parameters: p (int)
- Return type: BNLearner
EMsetVerbosity(v)
Sets or unsets the verbosity of the EM parameter learning algorithm.
Verbosity is necessary for keeping track of the history of the learning. See Method EMHistory().
- Parameters: v (bool) – sets EM’s verbose mode if and only if v = True.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
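Since the EMxxx() methods return the learner itself, EM's stopping criteria can be configured fluently. A sketch (the dataset and the thresholds are arbitrary):

```python
import pyagrum as gum

learner = gum.BNLearner("data.csv")             # hypothetical dataset with missing values
learner.useEMWithDiffCriterion(1e-3)            # stop on small log-likelihood differences
learner.EMsetMaxIter(100).EMsetVerbosity(True)  # also cap the iterations, record history
bn = learner.learnBN()
```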
G2(*args)
Computes the G2 statistic and the p-value of two variables, conditionally on a list of other variables.
The variables correspond to columns in the database and are specified by the names of these columns. The conditioning set can be empty; in this case, there is no need to specify it.
Usage: G2(name1, name2, knowing=[])
- Parameters:
  - name1 (str) – the name of a variable/column in the database
  - name2 (str) – the name of another variable/column
  - knowing (List[str]) – the list of the column names of the conditioning variables
- Returns: the G2 statistic and the corresponding p-value as a Tuple
- Return type: Tuple[float,float]
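A sketch of a conditional independence test on the database (the column names A, B and C are hypothetical):

```python
import pyagrum as gum

learner = gum.BNLearner("data.csv")         # hypothetical CSV with columns A, B, C
stat, pvalue = learner.G2("A", "B", ["C"])  # test independence of A and B given C
if pvalue > 0.05:
    print(f"A and B look independent given C (G2={stat:.2f}, p={pvalue:.3f})")
```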
addForbiddenArc(*args)
Forbids the arc passed in argument from being added during structure learning (methods learnDAG() or learnBN()).
Usage:
1. addForbiddenArc(tail, head)
2. addForbiddenArc(arc)
- Parameters:
  - arc (pyagrum.Arc) – an arc
  - head (int | str) – a variable's id or name
  - tail (int | str) – a variable's id or name
- Return type: BNLearner
addMandatoryArc(*args)
Forces the arc passed in argument to be present in the structure learnt by methods learnDAG() or learnBN().
Usage:
1. addMandatoryArc(tail, head)
2. addMandatoryArc(arc)
- Parameters:
  - arc (pyagrum.Arc) – an arc
  - head (int | str) – a variable's id or name
  - tail (int | str) – a variable's id or name
- Raises: pyagrum.InvalidDirectedCycle – If the added arc creates a directed cycle in the DAG
- Return type: BNLearner
addNoChildrenNode(*args)
Adds to structure learning algorithms the constraint that this node cannot have any children.
- Parameters: node (int | str) – a variable's id or name
- Return type: BNLearner
addNoParentNode(*args)
Adds the constraint that this node cannot have any parent.
- Parameters: node (int | str) – a variable's id or name
- Return type: BNLearner
addPossibleEdge(*args)
Assigns a new possible edge.
Warning
By default, every edge is possible. However, once at least one possible edge has been defined, all edges not declared possible are considered impossible.
- Parameters:
  - arc (pyagrum.Arc) – an arc
  - head (int | str) – a variable's id or name
  - tail (int | str) – a variable's id or name
- Return type: BNLearner
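A sketch combining several structural constraints before learning (the variable names are hypothetical):

```python
import pyagrum as gum

learner = gum.BNLearner("data.csv")  # hypothetical CSV with columns A, B, C
learner.addMandatoryArc("A", "B")    # force A -> B in the learnt structure
learner.addForbiddenArc("C", "A")    # never add C -> A
learner.addNoParentNode("A")         # A must remain a root node
bn = learner.learnBN()
```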
chi2(*args)
Computes the chi2 statistic and the p-value of two variables, conditionally on a list of other variables.
The variables correspond to columns in the database and are specified by the names of these columns. The conditioning set can be empty; in this case, there is no need to specify it.
Usage: chi2(name1, name2, knowing=[])
- Parameters:
  - name1 (str) – the name of a variable/column in the database
  - name2 (str) – the name of another variable/column
  - knowing (List[str]) – the list of the column names of the conditioning variables
- Returns: the chi2 statistic and the associated p-value as a Tuple
- Return type: Tuple[float,float]
copyState(learner)
Copies the state of the pyagrum.BNLearner passed in argument.
- Parameters: learner (pyagrum.BNLearner) – the learner whose state is copied
- Return type: None
correctedMutualInformation(*args)
Computes the (log2) mutual information between two columns, given a list of other columns.
Warning
This function takes the correction and the prior into account. If you want the 'raw' mutual information, use pyagrum.BNLearner.mutualInformation.
- Parameters:
  - name1 (str) – the name of the first column
  - name2 (str) – the name of the second column
  - knowing (List[str]) – the list of names of the conditioning columns
- Returns: the corrected mutual information
- Return type: float
currentTime()
- Returns: the current running time of the learning algorithm (in seconds)
- Return type: float
databaseWeight()
Gets the database weight, which is given as an equivalent sample size.
- Returns: The weight of the database
- Return type: float
domainSize(*args)
Returns the domain size of the variable with the given name or id.
- Parameters: n (str | int) – the name or the id of the variable
- Return type: int
epsilon()
- Returns: the value of epsilon
- Return type: float
eraseForbiddenArc(*args)
Removes the arc from the set of forbidden arcs: the arc may now be added during structure learning if necessary.
- Parameters:
  - arc (pyagrum.Arc) – an arc
  - head (int | str) – a variable's id or name
  - tail (int | str) – a variable's id or name
- Return type: BNLearner
eraseMandatoryArc(*args)
Removes the arc from the set of mandatory arcs.
- Parameters:
  - arc (pyagrum.Arc) – an arc
  - head (int | str) – a variable's id or name
  - tail (int | str) – a variable's id or name
- Return type: BNLearner
eraseNoChildrenNode(*args)
Removes from structure learning algorithms the constraint that this node cannot have any children.
- Parameters: node (int | str) – a variable's id or name
- Return type: BNLearner
eraseNoParentNode(*args)
Removes the constraint that this node cannot have any parent.
- Parameters: node (int | str) – a variable's id or name
- Return type: BNLearner
erasePossibleEdge(*args)
Removes the edge passed in argument from the set of possible edges.
- Parameters:
  - arc (pyagrum.Arc) – an arc
  - head (int | str) – a variable's id or name
  - tail (int | str) – a variable's id or name
- Return type: BNLearner
fitParameters(bn, take_into_account_score=True)
Learns the parameters of the Bayes net bn in place from the BNLearner's database, keeping bn's structure unchanged (see learnParameters() for the meaning of take_into_account_score).
forbidEM()
Forbids the use of EM for parameter learning.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
getNumberOfThreads()
Returns the number of threads used by the BNLearner during structure and parameter learning.
- Returns: the number of threads used by the BNLearner during structure and parameter learning
- Return type: int
hasMissingValues()
Indicates whether there are missing values in the database.
- Returns: True if there are some missing values in the database.
- Return type: bool
history()
- Returns: the scheme history
- Return type: tuple
- Raises: pyagrum.OperationNotAllowed – If the scheme has not been performed or if verbosity is set to False
idFromName(var_name)
- Parameters: var_name (str) – a variable's name
- Returns: the column id corresponding to the variable name
- Return type: int
- Raises: pyagrum.MissingVariableInDatabase – If the variable is not found in the database.
isConstraintBased()
Returns whether the current learning method is constraint-based or not.
- Returns: True if the current learning method is constraint-based.
- Return type: bool
isGumNumberOfThreadsOverriden()
Checks whether the number of threads used by the learner is the default one or not.
- Returns: True if the number of threads used by the BNLearner has been set.
- Return type: bool
isScoreBased()
Returns whether the current learning method is score-based or not.
- Returns: True if the current learning method is score-based.
- Return type: bool
isUsingEM()
Returns a Boolean indicating whether EM is used for parameter learning when the database contains missing values.
- Return type: bool
latentVariables()
Warning
The learner must be using the MIIC algorithm.
- Returns: the list of latent variables
- Return type: list
learnBN()
Learns a BayesNet (both structure and parameters) from the BNLearner's database.
- Returns: the learnt BayesNet
- Return type: pyagrum.BayesNet
learnDAG()
Learns a DAG structure from the BNLearner's database.
- Returns: the learned DAG
- Return type: pyagrum.DAG
learnEssentialGraph()
Learns an essential graph from the BNLearner's database.
- Returns: the learned essential graph
- Return type: pyagrum.EssentialGraph
learnPDAG()
Learns a partially directed acyclic graph (PDAG) from the BNLearner's database.
- Returns: the learned PDAG
- Return type: pyagrum.PDAG
Warning
The learning method must be constraint-based (MIIC, etc.) and not score-based (K2, GreedyHillClimbing, etc.)
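A sketch of constraint-based structure learning with MIIC (the dataset is hypothetical):

```python
import pyagrum as gum

learner = gum.BNLearner("data.csv")  # hypothetical dataset
learner.useMIIC()                    # constraint-based algorithm
pdag = learner.learnPDAG()           # partially directed structure
print(pdag)
print(learner.latentVariables())     # latent variables suggested by MIIC
```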
learnParameters(*args)
Creates a Bayes net whose structure corresponds to the one passed in argument, or to the last structure learnt by Method learnDAG(), and whose parameters are learnt from the BNLearner's database.
Usage:
1. learnParameters(dag, take_into_account_score=True)
2. learnParameters(bn, take_into_account_score=True)
3. learnParameters(take_into_account_score=True)
When the first argument of Method learnParameters() is a DAG or a Bayes net (usages 1 and 2), it specifies the graphical structure of the returned Bayes net. Otherwise (usage 3), Method learnParameters() is implicitly called with the last DAG learnt by the BNLearner.
The difference between calling this method with a DAG (usages 1 and 3) or a Bayes net (usage 2) arises when the database contains missing values and EM is used to learn the parameters. EM needs to initialize the conditional probability tables (CPTs) before iterating the expectation/maximization steps. When a DAG is passed in argument, these initializations are performed using a specific estimator that does not take into account the missing values in the database; the resulting CPTs are then perturbed randomly (see the noise parameter in method useEM()). When a Bayes net is passed in argument, its CPT for a node A is either filled exclusively with zeroes or not. In the first case, the initialization is performed as described above. In the second case, the value of A's CPT is used as-is, and a subsequent perturbation controlled by the noise level is applied.
- Parameters:
- dag (pyagrum.DAG) – specifies the graphical structure of the returned Bayes net.
- bn (pyagrum.BayesNet) – specifies the graphical structure of the returned Bayes net and, when the database contains missing values and EM is used for learning, force EM to initialize the CPTs of the resulting Bayes net to the values of those passed in argument (when they are not fully filled with zeroes) before iterating over the expectation/maximization steps.
- take_into_account_score (bool, default=True) – the graphical structure passed in argument may have been learnt by a structure learning algorithm. In this case, if the score used to learn the structure has an implicit prior (like K2, which has a 1-smoothing prior), it is important to also take this implicit prior into account for parameter learning. By default (take_into_account_score=True), the parameters are learnt by taking into account the prior specified by the usePriorXXX() methods plus the implicit prior of the score (if any). If take_into_account_score=False, only the prior specified by usePriorXXX() is taken into account.
- Returns: the learnt BayesNet
- Return type: pyagrum.BayesNet
- Raises:
- pyagrum.MissingVariableInDatabase – If a variable of the Bayes net is not found in the database
- pyagrum.MissingValueInDatabase – If the database contains some missing values and EM is not used for the learning
- pyagrum.OperationNotAllowed – If EM is used but no stopping criterion has been selected
- pyagrum.UnknownLabelInDatabase – If a label is found in the database that does not correspond to the variable
Warning
When using a pyagrum.DAG as input parameter, the NodeIds in the dag and index of rows in the database must fit in order to coherently fix the structure of the BN. Generally, it is safer to use a pyagrum.BayesNet as input or even to use pyagrum.BNLearner.fitParameters.
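A sketch of this safer usage, fixing the structure with a hand-written Bayes net (the structure string and the dataset are hypothetical):

```python
import pyagrum as gum

# fix the structure by hand, then learn only the CPTs from the data
bn_template = gum.fastBN("A->B->C")  # hypothetical structure over the database columns
learner = gum.BNLearner("data.csv")  # hypothetical dataset with columns A, B, C
bn = learner.learnParameters(bn_template)
print(bn.cpt("B"))
```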
logLikelihood(*args)
Computes the log-likelihood for the columns in vars, given the columns in the (optional) list knowing.
- Parameters:
  - vars (List[str]) – the names of the columns of interest
  - knowing (List[str]) – the (optional) list of names of the conditioning columns
- Returns: the log-likelihood (base 2)
- Return type: float
maxIter()
- Returns: the criterion on the number of iterations
- Return type: int
maxTime()
- Returns: the timeout (in seconds)
- Return type: float
messageApproximationScheme()
- Returns: the approximation scheme message
- Return type: str
minEpsilonRate()
- Returns: the value of the minimal epsilon rate
- Return type: float
mutualInformation(*args)
Computes the (log2) mutual information between two columns, given a list of other columns.
Warning
This function gives the 'raw' mutual information. If you want a version taking the correction and the prior into account, use pyagrum.BNLearner.correctedMutualInformation.
- Parameters:
  - name1 (str) – the name of the first column
  - name2 (str) – the name of the second column
  - knowing (List[str]) – the list of names of the conditioning columns
- Returns: the log2 mutual information
- Return type: float
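A sketch contrasting the raw and corrected quantities (the dataset and the column names are hypothetical):

```python
import pyagrum as gum

learner = gum.BNLearner("data.csv")  # hypothetical CSV with columns A, B
learner.useNMLCorrection()           # correction used by correctedMutualInformation
raw = learner.mutualInformation("A", "B")
corrected = learner.correctedMutualInformation("A", "B")
print(f"raw I(A;B)={raw:.4f}, corrected I(A;B)={corrected:.4f}")
```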
nameFromId(id)
- Parameters: id (int) – a node id
- Returns: the variable's name
- Return type: str
names()
- Returns: the names of the variables in the database
- Return type: Tuple[str]
nbCols()
Returns the number of columns in the database.
- Returns: the number of columns in the database
- Return type: int
nbRows()
Returns the number of rows in the database.
- Returns: the number of rows in the database
- Return type: int
nbrIterations()
- Returns: the number of iterations
- Return type: int
periodSize()
- Returns: the number of samples between two stopping tests
- Return type: int
pseudoCount(vars)
Accesses the pseudo-counts (priors taken into account).
- Parameters: vars (List[str]) – the list of names of the variables for which the pseudo-counts are computed
- Returns: a Tensor containing the pseudo-counts
- Return type: pyagrum.Tensor
rawPseudoCount(*args)
Computes the pseudo-counts (taking priors into account) of the list of variables, as a list of floats.
- Parameters: vars (List[int | str]) – the list of variables
- Returns: the pseudo-counts as a list of floats
- Return type: List[float]
recordWeight(i)
Gets the weight of the ith record.
- Parameters: i (int) – the position of the record in the database
- Raises: pyagrum.OutOfBounds – if i is outside the set of indices of the records
- Returns: The weight of the ith record of the database
- Return type: float
score(*args)
Returns the value of the score currently used by the BNLearner for a variable given a set of other variables.
- Parameters:
  - name1 (str) – the name of the variable on the left-hand side of the conditioning bar
  - knowing (List[str]) – the list of names of the conditioning variables
- Returns: the value of the score
- Return type: float
setDatabaseWeight(new_weight)
Sets the database weight, which is given as an equivalent sample size.
Warning
The same weight is assigned to all the rows of the learning database so that the sum of their weights is equal to the value of the parameter new_weight.
- Parameters: new_weight (float) – the database weight
- Return type: None
setEpsilon(eps)
- Parameters: eps (float) – the epsilon we want to use
- Raises: pyagrum.OutOfBounds – If eps < 0
- Return type: None
setInitialDAG(dag)
Sets the initial structure (DAG) used by the structure learning algorithm.
- Parameters: dag (pyagrum.DAG) – an initial pyagrum.DAG structure
- Return type: BNLearner
setMaxIndegree(max_indegree)
- Parameters: max_indegree (int) – the limit on the number of parents
- Return type: BNLearner
setMaxIter(max)
- Parameters: max (int) – the maximum number of iterations
- Raises: pyagrum.OutOfBounds – If max <= 1
- Return type: None
setMaxTime(timeout)
- Parameters: timeout (float) – stopping criterion on timeout (in seconds)
- Raises: pyagrum.OutOfBounds – If timeout <= 0.0
- Return type: None
setMinEpsilonRate(rate)
- Parameters: rate (float) – the minimal epsilon rate
- Return type: None
setNumberOfThreads(nb)
If the parameter nb is different from 0, the BNLearner will use nb threads during learning, hence overriding pyAgrum's default number of threads. If, on the contrary, nb is equal to 0, the BNLearner will comply with pyAgrum's default number of threads.
- Parameters: nb (int) – the number of threads to be used by the BNLearner
- Return type: None
setPeriodSize(p)
- Parameters: p (int) – the number of samples between two stopping tests
- Raises: pyagrum.OutOfBounds – If p < 1
- Return type: None
setPossibleEdges(*args)
Adds a constraint to the structure learning algorithm by fixing the set of possible edges.
- Parameters: edges (Set[Tuple[int,int]]) – a set of edges given as couples of nodeIds
- Return type: None
setPossibleSkeleton(skeleton)
Adds a constraint by fixing the set of possible edges from a pyagrum.UndiGraph.
- Parameters: skeleton (pyagrum.UndiGraph) – the fixed skeleton
- Return type: BNLearner
setRecordWeight(i, weight)
Sets the weight of the ith record.
- Parameters:
  - i (int) – the position of the record in the database
  - weight (float) – the weight assigned to this record
- Raises: pyagrum.OutOfBounds – if i is outside the set of indices of the records
- Return type: None
setSliceOrder(*args)
Sets a partial order on the nodes.
- Parameters: l (List) – a list of sequences (composed of variable ids or names)
- Return type: BNLearner
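A sketch of a partial (for instance temporal) ordering; the dataset and the variable names are hypothetical:

```python
import pyagrum as gum

learner = gum.BNLearner("data.csv")  # hypothetical dataset
# variables in the first sequence come before those in the second one
learner.setSliceOrder([["smoking", "age"], ["cancer"]])
bn = learner.learnBN()
```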
setVerbosity(v)
- Parameters: v (bool) – verbosity
- Return type: None
state()
Returns a dictionary containing the current state of the BNLearner.
- Returns: a dictionary containing the current state of the BNLearner.
- Return type: Dict[str,Any]
useBDeuPrior(weight=1.0)
The BDeu prior adds weight to all the cells of the counting tables. In other words, it adds weight rows to the database, with all values equally probable.
- Parameters: weight (float) – the prior weight
- Return type: BNLearner
useDirichletPrior(*args)
Use the Dirichlet prior.
- Parameters:
  - source (str | pyagrum.BayesNet) – the source of the Dirichlet prior (the filename of a database or a Bayesian network)
  - weight (float, optional) – the weight of the prior (the 'size' of the corresponding 'virtual database')
- Return type: BNLearner
useEM(*args)
Sets whether we use EM for parameter learning or not, depending on the value of epsilon.
Usage: useEM(epsilon, noise=0.1)
When epsilon is equal to 0.0, EM is forbidden; otherwise EM is used for parameter learning whenever the database contains missing values. In this case, its stopping criterion is a threshold on the log-likelihood evolution rate: if llc and llo refer to the log-likelihoods at the current and previous EM steps respectively, EM stops when (llc - llo) / llc drops below epsilon. If you wish to be more specific about which stopping criterion to use, you may prefer methods useEMWithRateCriterion() or useEMWithDiffCriterion().
- Parameters:
  - epsilon (float) – if epsilon > 0, EM is used and stops whenever the relative difference between two consecutive log-likelihoods (the log-likelihood evolution rate) drops below epsilon; if epsilon = 0.0, EM is not used. If you wish to forbid the use of EM, prefer executing Method forbidEM() rather than useEM(0.0), as it is more unequivocal.
  - noise (float, default=0.1) – during EM's initialization, the CPTs are randomly perturbed using the formula new_CPT = (1-noise) * CPT + noise * random_CPT. Parameter noise must belong to the interval [0,1]; by default, noise is equal to 0.1.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
- Raises: pyagrum.OutOfBounds – if epsilon is strictly negative or if noise does not belong to the interval [0,1].
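A sketch of EM-based parameter learning on a database with missing values (the file name is hypothetical; '?' is the default missing symbol):

```python
import pyagrum as gum

learner = gum.BNLearner("data_with_holes.csv")  # hypothetical dataset
if learner.hasMissingValues():
    learner.useEM(1e-4)  # EM with a 1e-4 threshold on the log-likelihood evolution rate
bn = learner.learnBN()
```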
useEMWithDiffCriterion(*args)
Enforces that EM with the min log-likelihood difference stopping criterion will be used for parameter learning whenever the dataset contains missing values.
- Parameters:
  - epsilon (float) – sets the stopping criterion: EM stops whenever the difference between two consecutive log-likelihoods drops below epsilon. Note that epsilon must be strictly positive.
  - noise (float, default=0.1) – during EM's initialization, the CPTs are randomly perturbed using the formula new_CPT = (1-noise) * CPT + noise * random_CPT. Parameter noise must belong to the interval [0,1]; by default, noise is equal to 0.1.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
- Raises: pyagrum.OutOfBounds – if epsilon is not strictly positive or if noise does not belong to the interval [0,1].
useEMWithRateCriterion(*args)
Enforces that EM with the min log-likelihood evolution rate stopping criterion will be used for parameter learning whenever the dataset contains missing values.
- Parameters:
  - epsilon (float) – sets the stopping criterion: EM stops whenever the absolute value of the relative difference between two consecutive log-likelihoods drops below epsilon. Note that epsilon must be strictly positive.
  - noise (float, default=0.1) – during EM's initialization, the CPTs are randomly perturbed using the formula new_CPT = (1-noise) * CPT + noise * random_CPT. Parameter noise must belong to the interval [0,1]; by default, noise is equal to 0.1.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
- Raises: pyagrum.OutOfBounds – if epsilon is not strictly positive or if noise does not belong to the interval [0,1].
useGreedyHillClimbing()
Indicate that we wish to use a greedy hill climbing algorithm.
- Return type: BNLearner
useK2(*args)
Indicate that we wish to use the K2 algorithm (which needs a total ordering of the variables).
- Parameters: order (List[int | str]) – a sequence of variable ids or names
- Return type: BNLearner
useLocalSearchWithTabuList(tabu_size=100, nb_decrease=2)
Indicate that we wish to use a local search with a tabu list.
- Parameters:
  - tabu_size (int) – the size of the tabu list
  - nb_decrease (int) – the maximum number of consecutive score-decreasing changes that we allow to apply
- Return type: BNLearner
useMDLCorrection()
Indicate that we wish to use the MDL correction for MIIC.
- Return type: BNLearner
useMIIC()
Indicate that we wish to use MIIC.
- Return type: BNLearner
useNMLCorrection()
Indicate that we wish to use the NML correction for MIIC.
- Return type: BNLearner
useNoCorrection()
Indicate that we wish to use no correction for MIIC.
- Return type: BNLearner
useNoPrior()
Use no prior.
- Return type: BNLearner
useScoreAIC()
Indicate that we wish to use an AIC score.
- Return type: BNLearner
useScoreBD()
Indicate that we wish to use a BD score.
- Return type: BNLearner
useScoreBDeu()
Indicate that we wish to use a BDeu score.
- Return type: BNLearner
useScoreBIC()
Indicate that we wish to use a BIC score.
- Return type: BNLearner
useScoreK2()
Indicate that we wish to use a K2 score.
- Return type: BNLearner
useScoreLog2Likelihood()
Indicate that we wish to use a Log2Likelihood score.
- Return type: BNLearner
useSmoothingPrior(weight=1)
Use the smoothing prior.
- Parameters: weight (float) – pass a weight in argument if you wish to assign a weight to the smoothing; otherwise the current weight of the learner will be used.
- Return type: BNLearner
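Since the useXXX() methods return the learner, a score, a prior, and a search algorithm can be combined fluently. A sketch (the dataset is hypothetical):

```python
import pyagrum as gum

learner = gum.BNLearner("data.csv")  # hypothetical dataset
learner.useGreedyHillClimbing().useScoreBIC().useSmoothingPrior(1.0)
bn = learner.learnBN()
```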
verbosity()
- Returns: True if the verbosity is enabled
- Return type: bool