Learning a CTBN

One of the main features of this library is the possibility to learn a CTBN.

More precisely, what can be learned is:

  • The dependency graph of a CTBN
  • The CIMs of a CTBN
  • (The variables and their labels from a sample)

Tools to extract data from samples are necessary. This is the role of the class pyagrum.ctbn.Trajectory and the function pyagrum.ctbn.CTBNFromData().

Before introducing the algorithms, here are some definitions:

  • $M_{xx'|u}$ is the number of times a variable X goes from a state x to a state x', conditioned by an instance u of its parents. It is filled using samples.
  • $M_{x|u}$ is the number of times X leaves state x, conditioned by an instance u of its parents.
  • $T_{x|u}$ is the time spent in state x, conditioned by an instance u of its parents.
  • $M_{xx'|y,u}$ and $T_{x|y,u}$ are the same but with another conditioning variable Y in state y.

Those can be stored in pyagrum.Tensor.

Being conditioned by an instance means that the extracted data comes from time intervals where conditioning variables take specific values.
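To make the idea of conditioning on an instance concrete, here is a minimal pure-Python sketch that accumulates the time spent and the transition counts of one variable, split by the state of its parents. It does not use pyAgrum. The function name `sufficient_stats`, the dict-based storage, and the convention that a tuple `(time, var, state)` means "`var` enters `state` at `time`" are all assumptions made for illustration; the library's own convention and storage (pyagrum.Tensor) may differ.

```python
from collections import defaultdict

def sufficient_stats(events, target, parents, end_time):
    """Return (T, M) for `target`, keyed by the parent instance u (hypothetical helper).

    T[(u, x)]     : time spent by `target` in state x while the parents equal u
    M[(u, x, x2)] : number of transitions x -> x2 of `target` while the parents equal u
    Assumption: an event (time, var, state) means `var` enters `state` at `time`.
    """
    T = defaultdict(float)
    M = defaultdict(int)
    current = {}       # last known state of every variable
    last_time = None
    # A sentinel event closes the last interval at end_time.
    for time, var, state in sorted(events) + [(end_time, None, None)]:
        known = target in current and all(p in current for p in parents)
        if last_time is not None and known:
            u = tuple(current[p] for p in parents)
            # The interval [last_time, time) was spent in the current state under u.
            T[(u, current[target])] += time - last_time
            if var == target and state != current[target]:
                M[(u, current[target], state)] += 1
        if var is not None:
            current[var] = state
        last_time = time
    return dict(T), dict(M)

events = [
    (0.0, "X", "a"), (0.0, "Y", "0"),
    (1.0, "X", "b"), (1.5, "Y", "1"), (3.0, "X", "a"),
]
T, M = sufficient_stats(events, target="X", parents=["Y"], end_time=4.0)
```

In this toy trajectory, the transition of X from "a" to "b" is counted under Y = "0", while the transition back to "a" is counted under Y = "1", because Y changed state in between.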

Goal: finding the $q_{i,j|u}$ coefficients (i.e. $q_{x|u}$ and $q_{x \rightarrow x'|u}$).

Idea: $q_{x|u} = \frac{M_{x|u}}{T_{x|u}}$ and $P_X(x \rightarrow x') = \frac{M_{x \rightarrow x'|u}}{M_{x|u}} = \frac{q_{x \rightarrow x'|u}}{q_{x|u}}$. Then $q_{x \rightarrow x'|u} = \frac{M_{x \rightarrow x'|u}}{T_{x|u}}$.
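As a worked illustration of these ratios, the sketch below computes the intensities for a single parent instance u from made-up counts and durations. The helper name `intensities` and the plain dicts are hypothetical; the library stores these quantities in pyagrum.Tensor. Every state is assumed to have a strictly positive time spent, otherwise the division is undefined.

```python
def intensities(M, T):
    """Estimate intensities for one fixed parent instance u (hypothetical helper).

    M[(x, x2)] : number of transitions x -> x2
    T[x]       : time spent in state x (assumed > 0)
    Returns (q_leave, q_to) with q_leave[x] = M_x / T_x and q_to[(x, x2)] = M_xx2 / T_x.
    """
    q_leave, q_to = {}, {}
    for x, t in T.items():
        m_x = sum(c for (a, _), c in M.items() if a == x)  # transitions out of x
        q_leave[x] = m_x / t
        for (a, b), c in M.items():
            if a == x:
                q_to[(a, b)] = c / t
    return q_leave, q_to

# Made-up statistics for a variable with states a, b, c.
M = {("a", "b"): 6, ("a", "c"): 2, ("b", "a"): 3, ("c", "a"): 1}
T = {"a": 4.0, "b": 1.5, "c": 0.5}
q_leave, q_to = intensities(M, T)

# Probability of going to b when leaving a, computed both ways.
p_counts = M[("a", "b")] / (M[("a", "b")] + M[("a", "c")])
p_rates = q_to[("a", "b")] / q_leave["a"]
```

Here state a is left 8 times over 4.0 time units, so its leaving intensity is 2.0, and the two ways of computing the transition probability agree at 0.75.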

To learn the graph of a CTBN (i.e. the dependencies between variables), we use the CTPC algorithm from Bregoli et al. [BSS20] (also drawing on Nodelman et al. [NSK02]). The independence test is based on Fisher (F) and chi2 tests that compare exponential distributions.

Class used to learn a CTBN (independence between variables and CIMs) using samples.

  • Parameters: source (str | Dict[int, List[Tuple[float, str, str]]]) – Path to the csv file containing the samples (trajectories), or the trajectories themselves as a Python dict.

Learns the parameters of ctbn’s CIMs.

  • Parameters: ctbn (CTBN) – CTBN containing the CIMs to learn.

Learns a CTBN using the CTPC (continuous-time PC) algorithm. Reference: A. Bregoli, M. Scutari, F. Stella, Constraint-Based Learning for Continuous-Time Bayesian Networks, arXiv:2007.03248, 2020.

  • Parameters: template (CTBN) – CTBN used to find variables. If not given, variables are searched for inside the trajectories (if the trajectory is very short, some variables can be missed).
  • Returns: The learned ctbn.
  • Return type: CTBN
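The general shape of a constraint-based search of this kind can be sketched as follows. This is a simplified illustration and not the library's implementation: it starts from a complete directed graph and removes an arc Y → X as soon as some conditioning set drawn from the other current parents of X makes X independent of Y. The names `ctpc_sketch` and `oracle` are hypothetical, and the oracle (which answers from a known set of arcs) stands in for a statistical test on data.

```python
from itertools import combinations

def ctpc_sketch(variables, is_independent):
    """Simplified constraint-based structure search (illustrative only).

    is_independent(x, y, u) -> bool must say whether x is independent of y given u.
    Returns the set of remaining arcs as (parent, child) pairs.
    """
    # Start from the complete directed graph without self-loops.
    parents = {x: [y for y in variables if y != x] for x in variables}
    for x in variables:
        size = 0  # size of the conditioning sets currently tried
        while len(parents[x]) > size:
            for y in list(parents[x]):
                others = [p for p in parents[x] if p != y]
                if any(is_independent(x, y, list(s))
                       for s in combinations(others, size)):
                    parents[x].remove(y)  # drop the arc y -> x
            size += 1
    return {(y, x) for x in variables for y in parents[x]}

# Toy oracle: independence is read off a known graph A -> B -> C.
true_arcs = {("A", "B"), ("B", "C")}

def oracle(x, y, u):
    # False when there is an arc from y to x, True otherwise.
    return (y, x) not in true_arcs

learned = ctpc_sketch(["A", "B", "C"], oracle)
```

With a perfect oracle the search recovers exactly the arcs of the known graph; with a statistical test, the result depends on the amount of data and on the test's thresholds.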

Reads trajectories from a csv file. Storage format: {IdSample, time, var, state}

  • Parameters: filename (str) – Path to the file.
  • Returns: The trajectories, a trajectory for every index.
  • Return type: Dict[int, List[Tuple[float, str, str]]]
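For readers who want to see the target data structure, here is a small standard-library sketch that turns rows of the form {IdSample, time, var, state} into a dict of trajectories. It is not the library's reader: the function name `read_trajectories`, the presence of a header row with exactly these column names, and the comma delimiter are assumptions made for the example.

```python
import csv
import io
from collections import defaultdict

def read_trajectories(lines):
    """Group CSV rows by sample id (hypothetical helper, assumed file layout).

    `lines` is any iterable of text lines, e.g. an open file.
    Returns Dict[int, List[Tuple[float, str, str]]].
    """
    reader = csv.DictReader(lines)  # assumes a header: IdSample,time,var,state
    trajectories = defaultdict(list)
    for row in reader:
        event = (float(row["time"]), row["var"], row["state"])
        trajectories[int(row["IdSample"])].append(event)
    return dict(trajectories)

# In-memory stand-in for a file, so the example runs without touching disk.
sample_csv = io.StringIO(
    "IdSample,time,var,state\n"
    "0,0.5,X,a\n"
    "0,1.25,Y,b\n"
    "1,0.75,X,b\n"
)
trajs = read_trajectories(sample_csv)
```

Each key of the resulting dict is a sample index, and each value is the list of (time, variable name, state label) tuples of that sample, in file order.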

Constructs a CTBN and adds the variables found in the trajectories.

Warning

If data is too short, some variables or state labels might be missed.

  • Parameters: data (Dict[int, List[Tuple[float, str, str]]]) – The trajectories used to look for variables.
  • Returns: The resulting CTBN.
  • Return type: CTBN
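The warning about short data can be illustrated without pyAgrum. The sketch below only collects variable names and state labels from a dict of trajectories; it does not build a CTBN, and the helper name `variables_and_labels` is hypothetical.

```python
def variables_and_labels(data):
    """Collect every variable name and the state labels seen for it (illustrative)."""
    labels = {}
    for events in data.values():
        for _time, var, state in events:
            labels.setdefault(var, set()).add(state)
    return {var: sorted(states) for var, states in labels.items()}

# A short sample only shows one label per variable...
short = {0: [(0.5, "X", "a"), (1.25, "Y", "b")]}
# ...while a longer one reveals labels the short sample missed.
longer = {0: [(0.5, "X", "a"), (1.25, "Y", "b"), (2.0, "X", "c"), (2.5, "Y", "a")]}

found_short = variables_and_labels(short)
found_longer = variables_and_labels(longer)
```

Any label that never appears in the data cannot be discovered this way, which is why passing an explicit CTBN is safer when the domain of each variable is already known.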

Computes a CIM (Conditional Intensity Matrix) using stats from a trajectory. Variables in the tensor are not copied but directly used in the result to avoid memory issues.

  • Parameters:
    • X (str) – Name of the variable to compute CIM for.
    • M (pyagrum.Tensor) – Tensor containing the number of transitions for each pair of X’s states.
    • T (pyagrum.Tensor) – Tensor containing the time spent in each state of X before a transition.
  • Returns: The resulting tensor, X’s CIM.
  • Return type: pyagrum.Tensor

class pyagrum.ctbn.Trajectory(source, ctbn=None)

Tools to extract useful information from a trajectory. It is used for parameter and graph learning. It can be created from a dict of trajectories or from a file that contains one.

  • Parameters:
    • source (str | Dict[int, List[Tuple[float, str, str]]]) – The path to a csv file containing the samples, or the dict of trajectories itself.
    • ctbn (CTBN) – Links the variable names in the trajectory to their pyAgrum variables. If not given, a new CTBN is created with the variables and labels found in the trajectory (warning: if the trajectory is short, some variables may not be found).

The samples.

  • Type: Dict[int, List[Tuple[float, str, str]]]

The CTBN used to link the names in the trajectory to pyAgrum variables.

The time length of the trajectory.

  • Type: float

Computes the CIMs of the variables in self.ctbn. Conditioning is given by the graph of self.ctbn.

Computes the time spent in each state and the number of transitions of X, and returns them as pyagrum.Tensor objects.

  • Parameters:
    • X (str) – Name of the variable.
    • U (List[str]) – List of the conditioning variables' names.
  • Returns: The resulting tensors.
  • Return type: Tuple[pyagrum.Tensor, pyagrum.Tensor]

Computes the time spent in each state and the number of transitions of X when conditioned by Y and U, and returns them as pyagrum.Tensor objects. Used for independence testing.

  • Parameters:
    • X (str) – Name of the variable.
    • Y (str) – Name of a conditioning variable not in U.
    • U (List[str]) – List of the conditioning variables' names.
  • Returns: The resulting tensors.
  • Return type: Tuple[pyagrum.Tensor, pyagrum.Tensor, pyagrum.Tensor]

Fills the tensors given.

  • Parameters:
    • X (str) – Name of the variable.
    • inst_u (Dict[str, str]) – Instance of conditioning variables.
    • Txu (pyagrum.Tensor) – Tensor to fill. Contains the time spent in each state.
    • Mxu (pyagrum.Tensor) – Tensor to fill. Contains the number of transitions between each pair of states.

setStatsForTests(X, Y, inst_u, Txu, Txyu, Mxyu)

Fills the tensors given. They are used for independence testing.

  • Parameters:
    • X (str) – Name of the variable.
    • Y (str) – Name of a conditioning variable.
    • inst_u (Dict[str, str]) – Instance of conditioning variables.
    • Txu (pyagrum.Tensor) – Tensor to fill. Contains the time spent in each state. Conditioned by variables in inst_u.
    • Txyu (pyagrum.Tensor) – Tensor to fill. Contains the time spent in each state. Conditioned by Y and variables in inst_u.
    • Mxyu (pyagrum.Tensor) – Tensor to fill. Contains the number of transitions between each pair of states. Conditioned by Y and variables in inst_u.

class pyagrum.ctbn.Stats(trajectory, X, Y, par)

Stores all tensors used for learning.

  • Parameters:
    • trajectory (Trajectory) – Samples used to find stats.
    • X (str) – Name of the variable to study.
    • Y (str) – Name of the variable used for conditioning variable X.
    • par (List[str]) – List of conditioning variables of X.

Tensor containing the number of transitions of variable X from each of its states, for every instance of its parents and of variable Y.

Tensor containing the number of transitions of variable X from each of its states, for every instance of its parents.

Tensor containing the time X spends in each state before transitioning to another, for every instance of its parents.

Tensor containing the time X spends in each state before transitioning to another, for every instance of its parents and of Y.

Conditional Intensity Matrix (CIM) of X.

Conditional Intensity Matrix (CIM) of X that includes the conditioning variable Y.

class pyagrum.ctbn.StatsIndepTest.FChi2Test(tr)

Bases: IndepTest

This class uses two independence tests: the Fisher test (F-test) and the chi2 test. To test independence between two variables, we start from the hypothesis that they are independent. The hypothesis is rejected as soon as one of the two tests (F or chi2) contradicts it. If neither test rejects it, the variables are considered independent.

  • Parameters: tr (Trajectory) – Samples used to extract stats.
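To give an intuition for how two exponential distributions can be compared with an F-test, here is a sketch of the textbook statistic. It is not taken from the library: the function name `rate_ratio_statistic` is hypothetical, and the result relies on the assumption that each duration `t` is the sum of `m` complete, independent exponential sojourn times. Under that assumption, and if both rates are equal, the ratio of mean sojourn times follows an F distribution with (2·m1, 2·m2) degrees of freedom.

```python
def rate_ratio_statistic(m1, t1, m2, t2):
    """Statistic for comparing two exponential rates (textbook form, illustrative).

    m1, m2 : numbers of observed transitions (must be > 0)
    t1, t2 : total times spent before those transitions (must be > 0)
    Returns (statistic, (dfn, dfd)); under equal rates the statistic follows
    an F distribution with (2*m1, 2*m2) degrees of freedom.
    """
    statistic = (t1 / m1) / (t2 / m2)  # ratio of mean sojourn times
    return statistic, (2 * m1, 2 * m2)

# Made-up numbers: mean sojourn 0.5 in one condition versus 2.0 in the other.
stat, dof = rate_ratio_statistic(m1=10, t1=5.0, m2=20, t2=40.0)
```

A statistic far from 1, judged against the quantiles of the corresponding F distribution (for instance with `scipy.stats.f.ppf`), suggests that the two rates differ, that is, that the extra conditioning variable changes the time to transition.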

Saves variables X and Y and the conditioning set U, and generates stats to be used in statistical tests.

  • Parameters:
    • X (str) – Name of the variable.
    • Y (str) – Name of the variable to test independence from, not in U.
    • U (List[str]) – List of conditioning variables.

Computes the chi2-test value for every instance of the variables.

Computes the F-test value for every instance of the variables.

  • Parameters:
    • M (pyagrum.Tensor) – A matrix M_{x, x’ | y, U}, for some instantiation U of the conditioning set and y of a specific parent.
    • Y (str) – A parent.
  • Returns: The tensor M_{x, x’ | U} by summing over all values of y.
  • Return type: pyagrum.Tensor
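Summing over the values of y can be pictured with plain dicts in place of tensors. In this sketch, the helper name `sum_over_y` and the key layout `(x, x2, y)` are hypothetical, and the conditioning instance U is taken as fixed.

```python
def sum_over_y(M_xxy):
    """Marginalize transition counts over y (illustrative, dict-based).

    M_xxy[(x, x2, y)] : number of transitions x -> x2 observed while Y = y
    Returns M_xx[(x, x2)], the counts with Y summed out.
    """
    M_xx = {}
    for (x, x2, _y), count in M_xxy.items():
        M_xx[(x, x2)] = M_xx.get((x, x2), 0) + count
    return M_xx

# Made-up counts for a fixed instance of the conditioning set.
M_xxy = {("a", "b", "0"): 3, ("a", "b", "1"): 2, ("b", "a", "0"): 1}
M_xx = sum_over_y(M_xxy)
```

The marginal counts are what the statistics would have been if Y had not been part of the conditioning, which is what the independence tests compare against.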

nullStateToStateTransitionHypothesisChi2(X, Y, _)

Decides if the null state to state transition hypothesis is rejected using chi2-test.

  • Parameters:
    • X (str) – A random variable.
    • Y (str) – A parent of X.
    • _ (List[str]) – A subset of the parents of X that does not contain Y.
  • Returns: False if X is not independent of Y given the conditioning set U.
  • Return type: bool

Decides if the null time to transition hypothesis is rejected using F-test.

  • Parameters:
    • X (str) – A random variable.
    • Y (str) – A parent of X.
    • _ (List[str]) – A subset of the parents of X that does not contain Y.
  • Returns: False if X is not independent of Y given the conditioning set U.
  • Return type: bool
  • Parameters:
    • X (str) – Name of the variable.
    • Y (str) – Name of the variable to test independence from, not in U.
    • U (List[str]) – List of conditioning variables.
  • Returns: True if X is independent of Y given U, False otherwise.
  • Return type: bool

class pyagrum.ctbn.StatsIndepTest.IndepTest

Bases: object

Base class used to test independence between two variables given some other parents.

  • Parameters:
    • X (str) – Head of the arc we want to test.
    • Y (str) – Tail of the arc we want to test.
    • U (List[str]) – Known parents.
  • Return type: bool

class pyagrum.ctbn.StatsIndepTest.Oracle(ctbn)

Bases: IndepTest

Independence test that acts as an oracle: it answers from the arcs of a given CTBN rather than from data.

  • Parameters: ctbn (CTBN)
  • Parameters:
    • X (str) – Head of the arc we want to test.
    • Y (str) – Tail of the arc we want to test.
    • U (List[str]) – Known parents.
  • Returns: False if there is an arc from Y to X knowing U, True otherwise.
  • Return type: bool

pyagrum.ctbn.StatsIndepTest.sqrtTensor(tensor)

Applies the square root function to every value in the tensor.