The CLG model
A CLG is:
- A pyagrum.DiGraph to represent the dependencies between random variables. The model does not allow cycles.
- A dictionary id2var to map each NodeId to a pyagrum.clg.GaussianVariable random variable.
- A dictionary name2id to map each variable’s name to its NodeId.
- A dictionary arc2coef to map each arc to its coefficient.
A CLG is equivalent to a SEM (Structural Equation Model) with Gaussian variables.
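To make the equivalence with a SEM concrete, the sketch below evaluates a small linear Gaussian model using plain dictionaries that mimic id2var, name2id and arc2coef. It does not use pyagrum: the `(name, mu, sigma)` tuple layout and the helper `sample_once` are hypothetical, and it assumes that each variable equals its own mean plus the weighted sum of its parents plus Gaussian noise of standard deviation sigma.

```python
import random

# Hypothetical stand-ins for the CLG's dictionaries (plain Python, not pyagrum objects).
# Each variable is stored as (name, mu, sigma).
id2var = {0: ("A", 4.0, 5.0), 1: ("B", 3.0, 5.0), 2: ("D", 0.0, 0.2), 3: ("E", 1.0, 2.0)}
name2id = {name: node for node, (name, _, _) in id2var.items()}
# Arcs are (parent, child) pairs encoding D = A and E = 1 + D + 2*B.
arc2coef = {(0, 2): 1.0, (2, 3): 1.0, (1, 3): 2.0}


def sample_once(order, rng, noise=True):
    """Evaluate every structural equation once, parents before children."""
    values = {}
    for node in order:
        _, mu, sigma = id2var[node]
        weighted_parents = sum(
            coef * values[parent]
            for (parent, child), coef in arc2coef.items()
            if child == node
        )
        epsilon = rng.gauss(0.0, sigma) if noise else 0.0
        values[node] = mu + weighted_parents + epsilon
    return values


order = [0, 1, 2, 3]  # a topological order of the graph
noiseless = sample_once(order, random.Random(0), noise=False)
draw = sample_once(order, random.Random(0))
```

With the noise switched off, each value is simply the mean implied by the equations, which is a quick way to check that the arcs and coefficients encode the intended model.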
class pyagrum.clg.CLG(clg=None)
CompareStructure(clg_to_compare)
Compare the causal structures of two CLGs using the f-score. Two BNs with the same structures as the two CLGs are created, and these two BNs are then compared.
- Parameters: clg_to_compare (CLG) – The CLG to compare with.
- Returns: The f-score of the comparison.
- Return type: float
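As a rough illustration of what an f-score over structures measures, the sketch below compares two sets of arcs directly. This is a generic precision/recall computation, not pyAgrum's implementation: the function name `structure_fscore` is hypothetical, and the library's comparison through BNs may treat cases such as reversed arcs differently.

```python
def structure_fscore(reference_arcs, candidate_arcs):
    """F-score of candidate arcs against reference arcs (both as (parent, child) pairs)."""
    reference, candidate = set(reference_arcs), set(candidate_arcs)
    true_positives = len(reference & candidate)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(candidate)  # share of candidate arcs that are correct
    recall = true_positives / len(reference)     # share of reference arcs that were found
    return 2 * precision * recall / (precision + recall)


reference = [(0, 1), (1, 2), (0, 2)]
candidate = [(0, 1), (1, 2), (2, 3)]
score = structure_fscore(reference, candidate)
```

Here two of the three arcs agree on each side, so precision and recall are both 2/3 and so is the f-score.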
add(var)
Add a new variable to the CLG.
- Parameters: var (GaussianVariable) – The variable to be added to the CLG.
- Returns: The id of the added variable.
- Return type: NodeId
- Raises:
- ValueError – if the argument is None.
- NameError – if the name of the variable is empty.
- NameError – if a variable with the same name already exists in the CLG.
addArc(val1, val2, coef=1)
Add an arc val1->val2 with a coefficient coef to the CLG.
- Parameters:
- val1 (NameOrId) – The name or the NodeId of the parent variable.
- val2 (NameOrId) – The name or the NodeId of the child variable.
- coef (float or int) – The coefficient of the arc.
- Returns: The tuple of the NodeIds of the parent and the child variables.
- Return type: Tuple[NodeId, NodeId]
- Raises:
- gum.NotFound – if one of the names is not found in the CLG.
- ValueError – if the coefficient is 0.
arcs()
Return the list of arcs in the CLG.
- Returns: The list of arcs in the CLG.
- Return type: List[Tuple[NodeId, NodeId]]
children(val)
Return the set of children ids from the name or the id of a node.
- Parameters: val (NameOrId) – The name or the NodeId of the variable.
- Returns: The set of children nodes’ ids.
- Return type: Set[NodeId]
children_names(val)
Return the list of children names from the name or the id of a node.
- Parameters: val (NameOrId) – The name or the NodeId of the variable.
- Returns: The list of val’s children’s names.
- Return type: List[str]
coefArc(val1, val2)
Return the coefficient of the arc val1->val2.
- Parameters:
- val1 (NameOrId) – The name or the NodeId of the parent variable.
- val2 (NameOrId) – The name or the NodeId of the child variable.
- Returns: The coefficient of the arc.
- Return type: float
- Raises:
- pyagrum.NotFound – if one of the names is not found in the CLG.
- pyagrum.NotFound – if the arc does not exist.
copy(clg)
Return the graph of the CLG (which is a DAG).
- Returns: The graph of the CLG.
- Return type: gum.DAG
dag2dict()
Return a dictionary representing the DAG of the CLG.
- Returns: C – A directed graph DAG representing the causal structure.
- Return type: Dict[NodeId, Set[NodeId]]
eraseArc(val1, val2)
Erase the arc val1->val2.
existsArc(val1, val2)
Check if an arc val1->val2 exists.
- Parameters:
- val1 (NameOrId) – The name or the NodeId of the parent variable.
- val2 (NameOrId) – The name or the NodeId of the child variable.
- Returns: True if the arc exists.
- Return type: bool
- Raises: gum.NotFound – if one of the names is not found in the CLG.
idFromName(name)
Return the NodeId from the name.
- Parameters: name (str) – The name of the variable.
- Returns: The NodeId of the variable.
- Return type: NodeId
- Raises: gum.NotFound – if the name is not found in the CLG.
logLikelihood(data)
Return the log-likelihood of the data.
- Parameters: data (csv file) – The data.
- Returns: The log-likelihood of the data for the CLG.
- Return type: float
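The quantity involved can be sketched in plain Python. The example below is not pyAgrum's code: it takes in-memory rows instead of a csv file, the names `gaussian_logpdf` and `log_likelihood` are hypothetical, and it assumes that each variable is Gaussian around its own mean plus the weighted sum of its parents' observed values.

```python
import math


def gaussian_logpdf(x, mean, sigma):
    """Log-density of a normal distribution with the given mean and standard deviation."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mean) ** 2 / (2 * sigma**2)


def log_likelihood(rows, params, parents):
    """Sum the conditional log-densities of every variable over every row.

    params maps a name to (mu, sigma); parents maps a name to {parent_name: coef}.
    """
    total = 0.0
    for row in rows:
        for name, (mu, sigma) in params.items():
            mean = mu + sum(coef * row[p] for p, coef in parents.get(name, {}).items())
            total += gaussian_logpdf(row[name], mean, sigma)
    return total


params = {"A": (0.0, 1.0), "D": (0.0, 1.0)}
parents = {"D": {"A": 1.0}}
rows = [{"A": 0.0, "D": 0.0}]
ll = log_likelihood(rows, params, parents)
```

Because the joint density factorizes along the graph, the log-likelihood is a sum of one term per variable and per row, which is what the nested loop computes.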
name(node)
Return the associated name of the variable.
- Parameters: node (NodeId) – The id of the variable.
- Returns: The associated name of the variable.
- Return type: str
- Raises: gum.NotFound – if the node is not found in the CLG.
nameOrId(val)
Return the NodeId from the name or the NodeId.
- Parameters: val (NameOrId) – The name or the NodeId of the variable.
- Returns: The NodeId of the variable.
- Return type: NodeId
names()
Return the list of names in the CLG.
- Returns: The list of names in the CLG.
- Return type: List[str]
nodes()
Return the list of NodeIds in the CLG.
- Returns: The list of NodeIds in the CLG.
- Return type: List[NodeId]
parent_names(val)
Return the list of parents’ names from the name or the id of a node.
- Parameters: val (NameOrId) – The name or the NodeId of the variable.
- Returns: The list of val’s parents’ names.
- Return type: List[str]
parents(val)
Return the set of parent ids from the name or the id of a node.
- Parameters: val (NameOrId) – The name or the NodeId of the variable.
- Returns: The set of parent nodes’ ids.
- Return type: Set[NodeId]
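The relationship between the arc list and the parent and children sets can be sketched without pyagrum. The helper `neighbours` below is hypothetical and only illustrates how both sets follow from (parent, child) pairs.

```python
def neighbours(arcs):
    """Build parent and children sets for every node that appears in an arc."""
    parents, children = {}, {}
    for parent, child in arcs:
        parents.setdefault(child, set()).add(parent)
        children.setdefault(parent, set()).add(child)
        # Make sure both endpoints have an entry in both dictionaries.
        parents.setdefault(parent, set())
        children.setdefault(child, set())
    return parents, children


arcs = [(0, 2), (2, 3), (1, 3)]
parents, children = neighbours(arcs)
```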
setCoef(val1, val2, coef)
Set the coefficient of an arc val1->val2.
- Parameters:
- val1 (NameOrId) – The name or the NodeId of the parent variable.
- val2 (NameOrId) – The name or the NodeId of the child variable.
- coef (float or int) – The new coefficient of the arc.
- Raises:
- gum.NotFound – if one of the names is not found in the CLG.
- ValueError – if the coefficient is 0.
- ValueError – if the arc does not exist.
setMu(node, mu)
Set the mean of a variable.
- Parameters:
- node (NodeId) – The id of the variable.
- mu (float) – The new mean of the variable.
- Raises: gum.NotFound – if the node is not found in the CLG.
setSigma(node, sigma)
Set the standard deviation of a variable.
- Parameters:
- node (NodeId) – The id of the variable.
- sigma (float) – The new standard deviation of the variable.
- Raises: gum.NotFound – if the node is not found in the CLG.
toDot()
topologicalOrder()
Return the topological order of the CLG.
- Returns: The list of NodeIds in the topological order.
- Return type: List[NodeId]
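For readers unfamiliar with the notion, a topological order lists every parent before its children. The sketch below uses Kahn's algorithm on plain node ids and arcs. It is an illustration only: the function `topological_order` is hypothetical, and since a DAG can have several valid orders, the library may return a different one.

```python
from collections import deque


def topological_order(nodes, arcs):
    """Return the nodes so that every parent precedes its children (Kahn's algorithm)."""
    indegree = {node: 0 for node in nodes}
    children = {node: [] for node in nodes}
    for parent, child in arcs:
        children[parent].append(child)
        indegree[child] += 1

    ready = deque(node for node in nodes if indegree[node] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:  # all parents of child are already placed
                ready.append(child)

    if len(order) != len(nodes):
        raise ValueError("the graph contains a cycle")
    return order


order = topological_order([0, 1, 2, 3], [(0, 2), (2, 3), (1, 3)])
```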
variable(val)
Return the variable from the NodeId or from the name.
- Parameters: val (NameOrId) – The name or the NodeId of the variable.
- Returns: The variable.
- Return type: GaussianVariable
- Raises: gum.NotFound – if val is not found in the CLG.
variables()
Return the list of the variables in the CLG.
- Returns: The list of the variables in the CLG.
- Return type: List[GaussianVariable]
class pyagrum.clg.SEM
This class is used to parse a SEM into a CLG model or convert a CLG model into a SEM.
    sem = SEM('''
    # hyper parameters
    A = 4[5]
    B = 3[5]
    C = -2[5]

    # equations
    D = A[.2] # D is a noisy version of A
    E = 1 + D + 2 B[2]
    F = E + C + 3.5*B + E[0.001]
    ''')
FIND_FLOAT = '^([0-9]*\\.?[0-9]*)$'
FIND_STDDEV = '^\\[([0-9]*\\.?[0-9]*)\\]$'
FIND_TERM = '^([0-9]*\\.?[0-9]*)\\*?([a-zA-Z]\\w*)$'
FIND_VAR = '^([a-zA-Z]\\w*)$'
ID = '[a-zA-Z]\\w*'
NUMBER = '[0-9]*\\.?[0-9]*'
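The term and standard-deviation patterns listed above can be tried out with Python's `re` module. In the sketch below they are rebuilt as raw strings, so each doubled backslash in the listing becomes a single one. How the SEM class applies these patterns internally is not shown here; the helper `parse_term` is hypothetical, and reading a missing coefficient as 1 is an assumption.

```python
import re

# Building blocks, written as raw strings.
NUMBER = r"[0-9]*\.?[0-9]*"
ID = r"[a-zA-Z]\w*"
# An optional coefficient, an optional '*', then a variable name.
FIND_TERM = rf"^({NUMBER})\*?({ID})$"
# A number enclosed in square brackets.
FIND_STDDEV = rf"^\[({NUMBER})\]$"


def parse_term(token):
    """Split a token such as '3.5*B' into (coefficient, variable name)."""
    match = re.match(FIND_TERM, token)
    if match is None:
        raise ValueError(f"not a valid term: {token!r}")
    coef_text, name = match.groups()
    # Assumption: an absent coefficient stands for 1.
    return (float(coef_text) if coef_text else 1.0), name


with_star = parse_term("3.5*B")
without_star = parse_term("2B")
bare_name = parse_term("D")
stddev_text = re.match(FIND_STDDEV, "[0.001]").group(1)
```

Note that the number pattern also matches an empty string, which is why a bare variable name such as `D` is accepted as a term.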
static loadCLG(filename)
Load the CLG from the file containing a SEM.
- Parameters: filename (str) – The name of the file containing the SEM of CLG.
- Returns: The loaded CLG.
- Return type: CLG
static saveCLG(clg, filename)
Save the CLG as a SEM to a file.
- Parameters:
- clg (CLG) – The CLG model to be saved.
- filename (str) – The name of the file containing the SEM of CLG.
static toclg(sem)
This function parses a SEM into a CLG model.
- Parameters: sem (str) – The SEM to be parsed.
- Returns: The CLG model corresponding to the SEM.
- Return type: CLG
static tosem(clg)
This function converts a CLG model into a SEM.
- Parameters: clg (CLG) – The CLG model to be converted.
- Returns: lines – The SEM corresponding to the CLG model.
- Return type: str
Other functions for CLG
pyagrum.clg.randomCLG(nb_variables, names, MuMax=5, MuMin=-5, SigmaMax=10, SigmaMin=1, ArcCoefMax=10, ArcCoefMin=5)
This function generates a random CLG with nb_variables variables.
- Parameters:
- nb_variables (int) – The number of variables in the CLG.
- names (List[str]) – The list of names of the variables.
- MuMax (float) – The maximum value of mu.
- MuMin (float) – The minimum value of mu.
- SigmaMax (float) – The maximum value of sigma.
- SigmaMin (float) – The minimum value of sigma.
- ArcCoefMax (float) – The maximum value of the coefficient of the arc.
- ArcCoefMin (float) – The minimum value of the coefficient of the arc.
- Returns: The random CLG.
- Return type: CLG
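One way to picture what such a generator has to do is sketched below in plain Python. This is not pyAgrum's algorithm: the function `random_linear_gaussian`, the uniform draws, and the `arc_probability` parameter are all assumptions made for illustration. The default ranges mirror the defaults in the signature above, and acyclicity is guaranteed by only drawing arcs from earlier to later variables in a random order.

```python
import random


def random_linear_gaussian(names, rng, mu_range=(-5.0, 5.0), sigma_range=(1.0, 10.0),
                           coef_range=(5.0, 10.0), arc_probability=0.3):
    """Draw random parameters and a random acyclic arc set over the given names."""
    order = list(names)
    rng.shuffle(order)  # random ordering; arcs only go forward in this order
    params = {
        name: (rng.uniform(*mu_range), rng.uniform(*sigma_range)) for name in order
    }
    arcs = {}
    for i, parent in enumerate(order):
        for child in order[i + 1:]:
            if rng.random() < arc_probability:
                arcs[(parent, child)] = rng.uniform(*coef_range)
    return order, params, arcs


order, params, arcs = random_linear_gaussian(["A", "B", "C", "D"], random.Random(42))
```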