Learning
pyAgrum provides a complete framework for learning Bayesian networks from data. It includes various algorithms for structure learning, parameter learning, and model evaluation. The library supports both score-based and constraint-based approaches, allowing users to choose the method that best fits their needs.
pyAgrum brings together all Bayesian network learning processes in a single, easy-to-use class: pyagrum.BNLearner. This class provides direct access to complete learning algorithms and their parameters (such as priors, scores, constraints, etc.), and also offers low-level functions that facilitate the development of new learning algorithms (for example, computing chi² or conditional likelihood on the dataset).
BNLearner allows you to choose:
- the structure learning algorithm (MIIC, Greedy Hill Climbing, K2, etc.),
- the parameter learning method (including EM),
- the scoring function (BDeu, AIC, etc.) for score-based algorithms,
- the prior (smoothing, Dirichlet, etc.),
- the constraints (e.g., forbidding certain arcs, specifying a partial order among variables, etc.),
- the correction method (NML, etc.) for the MIIC algorithm,
- and many low-level functions, such as computing the chi² or G² statistics, or the conditional likelihood on the dataset.
pyagrum.BNLearner can learn a Bayesian network from a database (a pandas.DataFrame) or from a CSV file.
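For instance, here is a minimal sketch of the whole pipeline (the file name data.csv and its columns are hypothetical):

```python
import pyagrum as gum

# learn both the structure and the parameters from a CSV file
learner = gum.BNLearner("data.csv")  # "data.csv" is a hypothetical dataset
bn = learner.learnBN()
print(bn)

# a pandas.DataFrame works the same way:
# import pandas as pd
# learner = gum.BNLearner(pd.read_csv("data.csv"))
```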
Class BNLearner
Methods for performing learning:
| fitParameters | latentVariables | learnBN |
|---|---|---|
| learnDAG | learnParameters | learnPDAG |
Structure learning algorithms:
| isConstraintBased | isScoreBased | useGreedyHillClimbing |
|---|---|---|
| useK2 | useLocalSearchWithTabuList | useMIIC |
Managing structure learning constraints:
| addForbiddenArc | addMandatoryArc | addNoChildrenNode |
|---|---|---|
| addNoParentNode | addPossibleEdge | eraseForbiddenArc |
| eraseMandatoryArc | eraseNoChildrenNode | eraseNoParentNode |
| erasePossibleEdge | setMaxIndegree | setPossibleEdges |
| setPossibleSkeleton | setSliceOrder | setInitialDAG |
Scores and priors (for structure learning):
| useBDeuPrior | useDirichletPrior | useSmoothingPrior |
|---|---|---|
| useScoreAIC | useScoreBD | useScoreBDeu |
| useScoreBIC | useScoreK2 | useScoreLog2Likelihood |
| useMDLCorrection | useNMLCorrection | useNoCorrection |
EM parameter learning:
| useEM | useEMWithDiffCriterion | useEMWithRateCriterion |
|---|---|---|
| forbidEM | isUsingEM | EMsetVerbosity |
Database inspection / direct requesting:
| chi2 | G2 | logLikelihood |
|---|---|---|
| mutualInformation | correctedMutualInformation | score |
| pseudoCount | rawPseudoCount | names |
| nbRows | nbCols | hasMissingValues |
Fine-tuning the behavior of the BNLearner:
| copyState | getNumberOfThreads | isGumNumberOfThreadsOverriden |
|---|---|---|
| setNumberOfThreads |
class pyagrum.BNLearner(*args)
This class provides functionality for learning Bayesian networks from data.
BNLearner(source, missingSymbols=['?'], inducedTypes=True) -> BNLearner
- Parameters:
  - source (str or pandas.DataFrame) – the data to learn from
  - missingSymbols (List[str]) – the list of strings that will be interpreted as missing values (default: ['?'])
  - inducedTypes (bool) – whether the BNLearner should try to automatically find the type of each variable
BNLearner(source, src, missingSymbols=['?']) -> BNLearner
- Parameters:
  - source (str or pandas.DataFrame) – the data to learn from
  - src (pyagrum.BayesNet) – the Bayesian network used to find the variables' modalities
  - missingSymbols (List[str]) – the list of strings that will be interpreted as missing values (default: ['?'])
BNLearner(learner) -> BNLearner
- Parameters:
  - learner (pyagrum.BNLearner) – the BNLearner to copy
EMEpsilon()
Returns a float corresponding to the minimal difference between two consecutive log-likelihoods under which the EM parameter learning algorithm stops.
- Returns: the minimal difference between two consecutive log-likelihoods under which EM stops.
- Return type: float
EMHistory()
Returns a list containing the log-likelihoods recorded after each expectation/maximization iteration of the EM parameter learning algorithm.
- Returns: A list of all the log-likelihoods recorded during EM’s execution
- Return type: List[float]
Warning
Recording log-likelihoods is enabled only when EM is executed in verbose mode. See method EMsetVerbosity().
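A minimal sketch of recording the history (assuming a hypothetical dataset data.csv that contains missing values):

```python
import pyagrum as gum

learner = gum.BNLearner("data.csv")    # hypothetical dataset with missing values ('?')
learner.useEMWithRateCriterion(1e-4)   # enable EM with a rate stopping criterion
learner.EMsetVerbosity(True)           # verbose mode is required to record the history
bn = learner.learnBN()
print(learner.EMnbrIterations())       # number of EM iterations performed
print(learner.EMHistory())             # one log-likelihood per iteration
```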
EMMaxIter()
Returns an int containing the max number of iterations the EM parameter learning algorithm is allowed to perform when the max iterations stopping criterion is enabled.
- Returns: the max number of expectation/maximization iterations EM is allowed to perform
- Return type: int
EMMaxTime()
Returns a float indicating EM's time limit when the max time stopping criterion is used by the EM parameter learning algorithm.
- Returns: the max time EM is allowed to execute its expectation/maximization iterations
- Return type: float
EMMinEpsilonRate()
Returns a float corresponding to the minimal log-likelihood's evolution rate under which the EM parameter learning algorithm stops its iterations.
- Returns: the limit under which EM stops its expectation/maximization iterations
- Return type: float
EMPeriodSize()
- Return type: int
EMStateApproximationScheme()
- Return type: int
EMStateMessage()
- Return type: str
EMVerbosity()
Returns a Boolean indicating whether the EM parameter learning algorithm is in verbose mode.
Note that EM verbosity is necessary for recording the history of the log-likelihoods computed at each expectation/maximization step.
- Returns: indicates whether EM’s verbose mode is active or not
- Return type: bool
EMdisableEpsilon()
Disables the minimal difference between two consecutive log-likelihoods as a stopping criterion for the EM parameter learning algorithm.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
EMdisableMaxIter()
Removes the limit on the number of iterations that EM is allowed to perform.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
EMdisableMaxTime()
Removes EM's time limit: EM may learn parameters for an unbounded amount of time.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
EMdisableMinEpsilonRate()
Disables the minimal log-likelihood's evolution rate as an EM parameter learning stopping criterion.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
EMenableEpsilon()
Enforces that the minimal difference between two consecutive log-likelihoods is a stopping criterion for the EM parameter learning algorithm.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
Warning
Setting the min difference between two consecutive log-likelihoods as a stopping criterion disables the min log-likelihood evolution rate as a stopping criterion.
EMenableMaxIter()
Enables a limit on the number of iterations performed by EM. This number is equal to the last number specified with Method EMsetMaxIter(). See Method EMMaxIter() to get its current value.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
EMenableMaxTime()
Forbids EM from running for more than a given amount of time.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
EMenableMinEpsilonRate()
Enables the minimal log-likelihood's evolution rate as an EM parameter learning stopping criterion.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
Warning
Setting this stopping criterion disables the min log-likelihood difference criterion.
EMisEnabledEpsilon()
Returns a Boolean indicating whether the minimal difference between two consecutive log-likelihoods is a stopping criterion for the EM parameter learning algorithm.
- Return type: bool
EMisEnabledMaxIter()
Returns a Boolean indicating whether the max number of iterations is used by EM as a stopping criterion.
- Return type: bool
EMisEnabledMaxTime()
Returns a Boolean indicating whether the max time criterion is used as an EM stopping criterion.
- Return type: bool
EMisEnabledMinEpsilonRate()
Returns a Boolean indicating whether the minimal log-likelihood's evolution rate is considered as a stopping criterion by the EM parameter learning algorithm.
- Return type: bool
EMnbrIterations()
Returns the number of iterations performed by the EM parameter learning algorithm.
- Return type: int
EMsetEpsilon(eps)
Enforces that the minimal difference between two consecutive log-likelihoods is chosen as a stopping criterion of the EM parameter learning algorithm and specifies the threshold on this criterion.
- Parameters: eps (float) – the log-likelihood difference below which EM stops its iterations
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
- Raises: pyagrum.OutOfBounds – If eps <= 0.
EMsetMaxIter(max)
Enforces a limit on the number of expectation/maximization steps performed by EM.
- Parameters: max (int) – the maximal number of iterations that EM is allowed to perform
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
- Raises: pyagrum.OutOfBounds – If max <= 1.
EMsetMaxTime(timeout)
Adds a constraint on the time that EM is allowed to run for learning parameters.
- Parameters: timeout (float) – the timeout in milliseconds
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
- Raises: pyagrum.OutOfBounds – If timeout<=0.0
EMsetMinEpsilonRate(rate)
Enforces that the minimal log-likelihood's evolution rate is considered by the EM parameter learning algorithm as a stopping criterion.
- Parameters: rate (float) – the log-likelihood evolution rate below which EM stops its iterations
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
- Raises: pyagrum.OutOfBounds – If rate <= 0.
Warning
Setting this stopping criterion disables the min log-likelihood difference criterion (if it was enabled).
EMsetPeriodSize(p)
- Parameters: p (int)
- Return type: BNLearner
EMsetVerbosity(v)
Sets or unsets the verbosity of the EM parameter learning algorithm.
Verbosity is necessary for keeping track of the history of the learning. See Method EMHistory().
- Parameters: v (bool) – sets EM’s verbose mode if and only if v = True.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
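Since the EMxxx() methods return the learner itself, EM's stopping criteria can be configured fluently. A sketch (the dataset and the thresholds are arbitrary):

```python
import pyagrum as gum

learner = gum.BNLearner("data.csv")             # hypothetical dataset with missing values
learner.useEMWithDiffCriterion(1e-3)            # stop on small log-likelihood differences
learner.EMsetMaxIter(100).EMsetVerbosity(True)  # also cap the iterations, record history
bn = learner.learnBN()
```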
G2(*args)
Computes the G2 statistic and the p-value of two variables, conditionally on a list of other variables.
The variables correspond to columns in the database and are specified by the names of these columns. The conditioning set can be empty; in this case, there is no need to specify it.
Usage: G2(name1, name2, knowing=[])
- Parameters:
  - name1 (str) – the name of a variable/column in the database
  - name2 (str) – the name of another variable/column
  - knowing (List[str]) – the list of the column names of the conditioning variables
- Returns: the G2 statistic and the corresponding p-value as a Tuple
- Return type: Tuple[float,float]
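A sketch of a conditional independence test on the database (the column names A, B and C are hypothetical):

```python
import pyagrum as gum

learner = gum.BNLearner("data.csv")         # hypothetical CSV with columns A, B, C
stat, pvalue = learner.G2("A", "B", ["C"])  # test independence of A and B given C
if pvalue > 0.05:
    print(f"A and B look independent given C (G2={stat:.2f}, p={pvalue:.3f})")
```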
addForbiddenArc(*args)
Forbids the arc passed in argument from being added during structure learning (methods learnDAG() or learnBN()).
Usage:
1. addForbiddenArc(tail, head)
2. addForbiddenArc(arc)
- Parameters:
  - arc (pyagrum.Arc) – an arc
  - head (int | str) – a variable's id or name
  - tail (int | str) – a variable's id or name
- Return type: BNLearner
addMandatoryArc(*args)
Forces the arc passed in argument to be present in the structure learnt by methods learnDAG() or learnBN().
Usage:
1. addMandatoryArc(tail, head)
2. addMandatoryArc(arc)
- Parameters:
  - arc (pyagrum.Arc) – an arc
  - head (int | str) – a variable's id or name
  - tail (int | str) – a variable's id or name
- Raises: pyagrum.InvalidDirectedCycle – If the added arc creates a directed cycle in the DAG
- Return type: BNLearner
addNoChildrenNode(*args)
Adds to structure learning algorithms the constraint that this node cannot have any children.
- Parameters: node (int | str) – a variable's id or name
- Return type: BNLearner
addNoParentNode(*args)
Adds the constraint that this node cannot have any parent.
- Parameters: node (int | str) – a variable's id or name
- Return type: BNLearner
addPossibleEdge(*args)
Assigns a new possible edge.
Warning
By default, every edge is possible. However, once at least one possible edge has been defined, all edges not declared possible are considered impossible.
- Parameters:
  - arc (pyagrum.Arc) – an arc
  - head (int | str) – a variable's id or name
  - tail (int | str) – a variable's id or name
- Return type: BNLearner
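A sketch combining several structural constraints before learning (the variable names are hypothetical):

```python
import pyagrum as gum

learner = gum.BNLearner("data.csv")  # hypothetical CSV with columns A, B, C
learner.addMandatoryArc("A", "B")    # force A -> B in the learnt structure
learner.addForbiddenArc("C", "A")    # never add C -> A
learner.addNoParentNode("A")         # A must remain a root node
bn = learner.learnBN()
```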
chi2(*args)
Computes the chi2 statistic and the p-value of two variables, conditionally on a list of other variables.
The variables correspond to columns in the database and are specified by the names of these columns. The conditioning set can be empty; in this case, there is no need to specify it.
Usage: chi2(name1, name2, knowing=[])
- Parameters:
  - name1 (str) – the name of a variable/column in the database
  - name2 (str) – the name of another variable/column
  - knowing (List[str]) – the list of the column names of the conditioning variables
- Returns: the chi2 statistic and the associated p-value as a Tuple
- Return type: Tuple[float,float]
copyState(learner)
Copies the state of the pyagrum.BNLearner passed in argument.
- Parameters: learner (pyagrum.BNLearner) – the learner whose state is copied
- Return type: None
correctedMutualInformation(*args)
Computes the (log2) mutual information between two columns, given a list of other columns.
Warning
This function takes the correction and the prior into account. If you want the 'raw' mutual information, use pyagrum.BNLearner.mutualInformation.
- Parameters:
  - name1 (str) – the name of the first column
  - name2 (str) – the name of the second column
  - knowing (List[str]) – the list of names of the conditioning columns
- Returns: the corrected mutual information
- Return type: float
currentTime()
- Returns: the current running time of the learning algorithm (in seconds)
- Return type: float
databaseWeight()
Gets the database weight, which is given as an equivalent sample size.
- Returns: The weight of the database
- Return type: float
domainSize(*args)
Returns the domain size of the variable with the given name or id.
- Parameters: n (str | int) – the name or the id of the variable
- Return type: int
epsilon()
- Returns: the value of epsilon
- Return type: float
eraseForbiddenArc(*args)
Removes the arc from the set of forbidden arcs: the arc may now be added during structure learning if necessary.
- Parameters:
  - arc (pyagrum.Arc) – an arc
  - head (int | str) – a variable's id or name
  - tail (int | str) – a variable's id or name
- Return type: BNLearner
eraseMandatoryArc(*args)
Removes the arc from the set of mandatory arcs.
- Parameters:
  - arc (pyagrum.Arc) – an arc
  - head (int | str) – a variable's id or name
  - tail (int | str) – a variable's id or name
- Return type: BNLearner
eraseNoChildrenNode(*args)
Removes from structure learning algorithms the constraint that this node cannot have any children.
- Parameters: node (int | str) – a variable's id or name
- Return type: BNLearner
eraseNoParentNode(*args)
Removes the constraint that this node cannot have any parent.
- Parameters: node (int | str) – a variable's id or name
- Return type: BNLearner
erasePossibleEdge(*args)
Removes the edge passed in argument from the set of possible edges.
- Parameters:
  - arc (pyagrum.Arc) – an arc
  - head (int | str) – a variable's id or name
  - tail (int | str) – a variable's id or name
- Return type: BNLearner
fitParameters(bn, take_into_account_score=True)
Learns the parameters of the Bayes net bn in place from the BNLearner's database, keeping bn's structure unchanged (see learnParameters() for the meaning of take_into_account_score).
forbidEM()
Forbids the use of EM for parameter learning.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
getNumberOfThreads()
Returns the number of threads used by the BNLearner during structure and parameter learning.
- Returns: the number of threads used by the BNLearner during structure and parameter learning
- Return type: int
hasMissingValues()
Indicates whether there are missing values in the database.
- Returns: True if there are some missing values in the database.
- Return type: bool
history()
- Returns: the scheme history
- Return type: tuple
- Raises: pyagrum.OperationNotAllowed – If the scheme has not been performed or if verbosity is set to False
idFromName(var_name)
- Parameters: var_name (str) – a variable's name
- Returns: the column id corresponding to the variable name
- Return type: int
- Raises: pyagrum.MissingVariableInDatabase – If the variable is not found in the database.
isConstraintBased()
Returns whether the current learning method is constraint-based or not.
- Returns: True if the current learning method is constraint-based.
- Return type: bool
isGumNumberOfThreadsOverriden()
Checks whether the number of threads used by the learner is the default one or not.
- Returns: True if the number of threads used by the BNLearner has been set.
- Return type: bool
isScoreBased()
Returns whether the current learning method is score-based or not.
- Returns: True if the current learning method is score-based.
- Return type: bool
isUsingEM()
Returns a Boolean indicating whether EM is used for parameter learning when the database contains missing values.
- Return type: bool
latentVariables()
Warning
The learner must be using the MIIC algorithm.
- Returns: the list of latent variables
- Return type: list
learnBN()
Learns a BayesNet (both structure and parameters) from the BNLearner's database.
- Returns: the learnt BayesNet
- Return type: pyagrum.BayesNet
learnDAG()
Learns a DAG structure from the BNLearner's database.
- Returns: the learned DAG
- Return type: pyagrum.DAG
learnEssentialGraph()
Learns an essential graph from the BNLearner's database.
- Returns: the learned essential graph
- Return type: pyagrum.EssentialGraph
learnPDAG()
Learns a partially directed acyclic graph (PDAG) from the BNLearner's database.
- Returns: the learned PDAG
- Return type: pyagrum.PDAG
Warning
The learning method must be constraint-based (MIIC, etc.) and not score-based (K2, GreedyHillClimbing, etc.)
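A sketch of constraint-based structure learning with MIIC (the dataset is hypothetical):

```python
import pyagrum as gum

learner = gum.BNLearner("data.csv")  # hypothetical dataset
learner.useMIIC()                    # constraint-based algorithm
pdag = learner.learnPDAG()           # partially directed structure
print(pdag)
print(learner.latentVariables())     # latent variables suggested by MIIC
```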
learnParameters(*args)
Creates a Bayes net whose structure corresponds to the one passed in argument, or to the last structure learnt by Method learnDAG(), and whose parameters are learnt from the BNLearner's database.
Usage:
1. learnParameters(dag, take_into_account_score=True)
2. learnParameters(bn, take_into_account_score=True)
3. learnParameters(take_into_account_score=True)
When the first argument of Method learnParameters() is a DAG or a Bayes net (usages 1 and 2), it specifies the graphical structure of the returned Bayes net. Otherwise (usage 3), Method learnParameters() is implicitly called with the last DAG learnt by the BNLearner.
The difference between calling this method with a DAG (usages 1 and 3) or a Bayes net (usage 2) arises when the database contains missing values and EM is used to learn the parameters. EM needs to initialize the conditional probability tables (CPTs) before iterating the expectation/maximization steps. When a DAG is passed in argument, these initializations are performed using a specific estimator that does not take into account the missing values in the database; the resulting CPTs are then perturbed randomly (see the noise parameter in method useEM()). When a Bayes net is passed in argument, its CPT for a node A is either filled exclusively with zeroes or not. In the first case, the initialization is performed as described above. In the second case, the value of A's CPT is used as-is, and a subsequent perturbation controlled by the noise level is applied.
- Parameters:
- dag (pyagrum.DAG) – specifies the graphical structure of the returned Bayes net.
- bn (pyagrum.BayesNet) – specifies the graphical structure of the returned Bayes net and, when the database contains missing values and EM is used for learning, force EM to initialize the CPTs of the resulting Bayes net to the values of those passed in argument (when they are not fully filled with zeroes) before iterating over the expectation/maximization steps.
- take_into_account_score (bool, default=True) – the graphical structure passed in argument may have been learnt by a structure learning algorithm. In this case, if the score used to learn the structure has an implicit prior (like K2, which has a 1-smoothing prior), it is important to also take this implicit prior into account for parameter learning. By default (take_into_account_score=True), the parameters are learnt by taking into account the prior specified by the usePriorXXX() methods plus the implicit prior of the score (if any). If take_into_account_score=False, only the prior specified by usePriorXXX() is taken into account.
- Returns: the learnt BayesNet
- Return type: pyagrum.BayesNet
- Raises:
- pyagrum.MissingVariableInDatabase – If a variable of the Bayes net is not found in the database
- pyagrum.MissingValueInDatabase – If the database contains some missing values and EM is not used for the learning
- pyagrum.OperationNotAllowed – If EM is used but no stopping criterion has been selected
- pyagrum.UnknownLabelInDatabase – If a label is found in the database that does not correspond to the variable
Warning
When using a pyagrum.DAG as input parameter, the NodeIds in the dag and index of rows in the database must fit in order to coherently fix the structure of the BN. Generally, it is safer to use a pyagrum.BayesNet as input or even to use pyagrum.BNLearner.fitParameters.
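A sketch of this safer usage, fixing the structure with a hand-written Bayes net (the structure string and the dataset are hypothetical):

```python
import pyagrum as gum

# fix the structure by hand, then learn only the CPTs from the data
bn_template = gum.fastBN("A->B->C")  # hypothetical structure over the database columns
learner = gum.BNLearner("data.csv")  # hypothetical dataset with columns A, B, C
bn = learner.learnParameters(bn_template)
print(bn.cpt("B"))
```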
logLikelihood(*args)
Computes the log-likelihood for the columns in vars, given the columns in the (optional) list knowing.
- Parameters:
  - vars (List[str]) – the names of the columns of interest
  - knowing (List[str]) – the (optional) list of names of the conditioning columns
- Returns: the log-likelihood (base 2)
- Return type: float
maxIter()
- Returns: the criterion on the number of iterations
- Return type: int
maxTime()
- Returns: the timeout (in seconds)
- Return type: float
messageApproximationScheme()
- Returns: the approximation scheme message
- Return type: str
minEpsilonRate()
- Returns: the value of the minimal epsilon rate
- Return type: float
mutualInformation(*args)
Computes the (log2) mutual information between two columns, given a list of other columns.
Warning
This function gives the 'raw' mutual information. If you want a version taking the correction and the prior into account, use pyagrum.BNLearner.correctedMutualInformation.
- Parameters:
  - name1 (str) – the name of the first column
  - name2 (str) – the name of the second column
  - knowing (List[str]) – the list of names of the conditioning columns
- Returns: the log2 mutual information
- Return type: float
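A sketch contrasting the raw and corrected quantities (the dataset and the column names are hypothetical):

```python
import pyagrum as gum

learner = gum.BNLearner("data.csv")  # hypothetical CSV with columns A, B
learner.useNMLCorrection()           # correction used by correctedMutualInformation
raw = learner.mutualInformation("A", "B")
corrected = learner.correctedMutualInformation("A", "B")
print(f"raw I(A;B)={raw:.4f}, corrected I(A;B)={corrected:.4f}")
```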
nameFromId(id)
- Parameters: id (int) – a node id
- Returns: the variable's name
- Return type: str
names()
- Returns: the names of the variables in the database
- Return type: Tuple[str]
nbCols()
Returns the number of columns in the database.
- Returns: the number of columns in the database
- Return type: int
nbRows()
Returns the number of rows in the database.
- Returns: the number of rows in the database
- Return type: int
nbrIterations()
- Returns: the number of iterations
- Return type: int
periodSize()
- Returns: the number of samples between two stopping tests
- Return type: int
pseudoCount(vars)
Accesses the pseudo-counts (priors taken into account).
- Parameters: vars (List[str]) – the list of names of the variables for which the pseudo-counts are computed
- Returns: a Tensor containing the pseudo-counts
- Return type: pyagrum.Tensor
rawPseudoCount(*args)
Computes the pseudo-counts (taking priors into account) of the list of variables, as a list of floats.
- Parameters: vars (List[int | str]) – the list of variables
- Returns: the pseudo-counts as a list of floats
- Return type: List[float]
recordWeight(i)
Gets the weight of the ith record.
- Parameters: i (int) – the position of the record in the database
- Raises: pyagrum.OutOfBounds – if i is outside the set of indices of the records
- Returns: The weight of the ith record of the database
- Return type: float
score(*args)
Returns the value of the score currently used by the BNLearner for a variable given a set of other variables.
- Parameters:
  - name1 (str) – the name of the variable on the left-hand side of the conditioning bar
  - knowing (List[str]) – the list of names of the conditioning variables
- Returns: the value of the score
- Return type: float
setDatabaseWeight(new_weight)
Sets the database weight, which is given as an equivalent sample size.
Warning
The same weight is assigned to all the rows of the learning database so that the sum of their weights is equal to the value of the parameter new_weight.
- Parameters: new_weight (float) – the database weight
- Return type: None
setEpsilon(eps)
- Parameters: eps (float) – the epsilon we want to use
- Raises: pyagrum.OutOfBounds – If eps < 0
- Return type: None
setInitialDAG(dag)
Sets the initial structure (DAG) used by the structure learning algorithm.
- Parameters: dag (pyagrum.DAG) – an initial pyagrum.DAG structure
- Return type: BNLearner
setMaxIndegree(max_indegree)
- Parameters: max_indegree (int) – the limit on the number of parents
- Return type: BNLearner
setMaxIter(max)
- Parameters: max (int) – the maximum number of iterations
- Raises: pyagrum.OutOfBounds – If max <= 1
- Return type: None
setMaxTime(timeout)
- Parameters: timeout (float) – stopping criterion on timeout (in seconds)
- Raises: pyagrum.OutOfBounds – If timeout <= 0.0
- Return type: None
setMinEpsilonRate(rate)
- Parameters: rate (float) – the minimal epsilon rate
- Return type: None
setNumberOfThreads(nb)
If the parameter nb is different from 0, the BNLearner will use nb threads during learning, hence overriding pyAgrum's default number of threads. If, on the contrary, nb is equal to 0, the BNLearner will comply with pyAgrum's default number of threads.
- Parameters: nb (int) – the number of threads to be used by the BNLearner
- Return type: None
setPeriodSize(p)
- Parameters: p (int) – the number of samples between two stopping tests
- Raises: pyagrum.OutOfBounds – If p < 1
- Return type: None
setPossibleEdges(*args)
Adds a constraint to the structure learning algorithm by fixing the set of possible edges.
- Parameters: edges (Set[Tuple[int,int]]) – a set of edges given as couples of nodeIds
- Return type: None
setPossibleSkeleton(skeleton)
Adds a constraint by fixing the set of possible edges from a pyagrum.UndiGraph.
- Parameters: skeleton (pyagrum.UndiGraph) – the fixed skeleton
- Return type: BNLearner
setRecordWeight(i, weight)
Sets the weight of the ith record.
- Parameters:
  - i (int) – the position of the record in the database
  - weight (float) – the weight assigned to this record
- Raises: pyagrum.OutOfBounds – if i is outside the set of indices of the records
- Return type: None
setSliceOrder(*args)
Sets a partial order on the nodes.
- Parameters: l (List) – a list of sequences (composed of variable ids or names)
- Return type: BNLearner
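A sketch of a partial (for instance temporal) ordering; the dataset and the variable names are hypothetical:

```python
import pyagrum as gum

learner = gum.BNLearner("data.csv")  # hypothetical dataset
# variables in the first sequence come before those in the second one
learner.setSliceOrder([["smoking", "age"], ["cancer"]])
bn = learner.learnBN()
```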
setVerbosity(v)
- Parameters: v (bool) – verbosity
- Return type: None
state()
Returns a dictionary containing the current state of the BNLearner.
- Returns: a dictionary containing the current state of the BNLearner.
- Return type: Dict[str,Any]
useBDeuPrior(weight=1.0)
The BDeu prior adds weight to all the cells of the counting tables. In other words, it adds weight rows to the database, with all values equally probable.
- Parameters: weight (float) – the prior weight
- Return type: BNLearner
useDirichletPrior(*args)
Use the Dirichlet prior.
- Parameters:
  - source (str | pyagrum.BayesNet) – the source of the Dirichlet prior (the filename of a database or a Bayesian network)
  - weight (float, optional) – the weight of the prior (the 'size' of the corresponding 'virtual database')
- Return type: BNLearner
useEM(*args)
Sets whether we use EM for parameter learning or not, depending on the value of epsilon.
Usage: useEM(epsilon, noise=0.1)
When epsilon is equal to 0.0, EM is forbidden; otherwise EM is used for parameter learning whenever the database contains missing values. In this case, its stopping criterion is a threshold on the log-likelihood evolution rate: if llc and llo refer to the log-likelihoods at the current and previous EM steps respectively, EM stops when (llc - llo) / llc drops below epsilon. If you wish to be more specific about which stopping criterion to use, you may prefer methods useEMWithRateCriterion() or useEMWithDiffCriterion().
- Parameters:
  - epsilon (float) – if epsilon > 0, EM is used and stops whenever the relative difference between two consecutive log-likelihoods (the log-likelihood evolution rate) drops below epsilon; if epsilon = 0.0, EM is not used. If you wish to forbid the use of EM, prefer executing Method forbidEM() rather than useEM(0.0), as it is more unequivocal.
  - noise (float, default=0.1) – during EM's initialization, the CPTs are randomly perturbed using the formula new_CPT = (1-noise) * CPT + noise * random_CPT. Parameter noise must belong to the interval [0,1]; by default, noise is equal to 0.1.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
- Raises: pyagrum.OutOfBounds – if epsilon is strictly negative or if noise does not belong to the interval [0,1].
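A sketch of EM-based parameter learning on a database with missing values (the file name is hypothetical; '?' is the default missing symbol):

```python
import pyagrum as gum

learner = gum.BNLearner("data_with_holes.csv")  # hypothetical dataset
if learner.hasMissingValues():
    learner.useEM(1e-4)  # EM with a 1e-4 threshold on the log-likelihood evolution rate
bn = learner.learnBN()
```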
useEMWithDiffCriterion(*args)
Enforces that EM with the min log-likelihood difference stopping criterion will be used for parameter learning whenever the dataset contains missing values.
- Parameters:
  - epsilon (float) – sets the stopping criterion: EM stops whenever the difference between two consecutive log-likelihoods drops below epsilon. Note that epsilon must be strictly positive.
  - noise (float, default=0.1) – during EM's initialization, the CPTs are randomly perturbed using the formula new_CPT = (1-noise) * CPT + noise * random_CPT. Parameter noise must belong to the interval [0,1]; by default, noise is equal to 0.1.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
- Raises: pyagrum.OutOfBounds – if epsilon is not strictly positive or if noise does not belong to the interval [0,1].
useEMWithRateCriterion(*args)
Enforces that EM with the min log-likelihood evolution rate stopping criterion will be used for parameter learning whenever the dataset contains missing values.
- Parameters:
  - epsilon (float) – sets the stopping criterion: EM stops whenever the absolute value of the relative difference between two consecutive log-likelihoods drops below epsilon. Note that epsilon must be strictly positive.
  - noise (float, default=0.1) – during EM's initialization, the CPTs are randomly perturbed using the formula new_CPT = (1-noise) * CPT + noise * random_CPT. Parameter noise must belong to the interval [0,1]; by default, noise is equal to 0.1.
- Returns: the BNLearner itself, so that we can chain useXXX() methods.
- Return type: pyagrum.BNLearner
- Raises: pyagrum.OutOfBounds – if epsilon is not strictly positive or if noise does not belong to the interval [0,1].
useGreedyHillClimbing()
Indicate that we wish to use a greedy hill climbing algorithm.
- Return type: BNLearner
useK2(*args)
Indicate that we wish to use the K2 algorithm (which needs a total ordering of the variables).
- Parameters: order (List[int | str]) – a sequence of variable ids or names
- Return type: BNLearner
useLocalSearchWithTabuList(tabu_size=100, nb_decrease=2)
Indicate that we wish to use a local search with a tabu list.
- Parameters:
  - tabu_size (int) – the size of the tabu list
  - nb_decrease (int) – the maximum number of consecutive score-decreasing changes that we allow to apply
- Return type: BNLearner
useMDLCorrection()
Indicate that we wish to use the MDL correction for MIIC.
- Return type: BNLearner
useMIIC()
Indicate that we wish to use MIIC.
- Return type: BNLearner
useNMLCorrection()
Indicate that we wish to use the NML correction for MIIC.
- Return type: BNLearner
useNoCorrection()
Indicate that we wish to use no correction for MIIC.
- Return type: BNLearner
useNoPrior()
Use no prior.
- Return type: BNLearner
useScoreAIC()
Indicate that we wish to use an AIC score.
- Return type: BNLearner
useScoreBD()
Indicate that we wish to use a BD score.
- Return type: BNLearner
useScoreBDeu()
Indicate that we wish to use a BDeu score.
- Return type: BNLearner
useScoreBIC()
Indicate that we wish to use a BIC score.
- Return type: BNLearner
useScoreK2()
Indicate that we wish to use a K2 score.
- Return type: BNLearner
useScoreLog2Likelihood()
Indicate that we wish to use a Log2Likelihood score.
- Return type: BNLearner
useSmoothingPrior(weight=1)
Use the smoothing prior.
- Parameters: weight (float) – pass a weight in argument if you wish to assign a weight to the smoothing; otherwise the current weight of the learner will be used.
- Return type: BNLearner
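Since the useXXX() methods return the learner, a score, a prior, and a search algorithm can be combined fluently. A sketch (the dataset is hypothetical):

```python
import pyagrum as gum

learner = gum.BNLearner("data.csv")  # hypothetical dataset
learner.useGreedyHillClimbing().useScoreBIC().useSmoothingPrior(1.0)
bn = learner.learnBN()
```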
verbosity()
- Returns: True if the verbosity is enabled
- Return type: bool