Learning a CTBN
One of the main features of this library is the possibility to learn a CTBN.
More precisely, what can be learned is:
- The dependency graph of a CTBN
- The CIMs of a CTBN
- (The variables and their labels from a sample)
Learning requires tools to extract data from samples. This is the role of the class pyagrum.ctbn.Trajectory and the function pyagrum.ctbn.CTBNFromData().
Before introducing the algorithms, here are some definitions:
- M_{x, x' | u} is the number of times a variable X goes from a state x to a state x', conditioned by an instance u of its parents. It is filled using samples.
- M_{x | u} is the number of times X goes to state x.
- T_{x | u} is the time spent in state x, conditioned by an instance u of its parents.
- M_{x, x' | y, u} and T_{x | y, u} are the same but with another conditioning variable Y in state y.
These can be stored in pyagrum.Tensor objects.
Being conditioned by an instance means that the extracted data comes from time intervals where conditioning variables take specific values.
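The bookkeeping behind these statistics can be sketched in plain Python, without pyAgrum. This is only an illustration: the function name `sufficient_stats`, the dict-based storage, and the reading of each `(time, var, state)` tuple as "var enters state at time" are assumptions made for the example, and the library's own convention may differ.

```python
from collections import defaultdict


def sufficient_stats(traj, X, parents, horizon):
    """Accumulate time spent and transition counts for X given its parents.

    traj: list of (time, var, state) events sorted by time (assumed meaning:
          `var` enters `state` at `time`; events at time 0 give initial states).
    Returns (T, M) with T[(x, u)] the time spent in x under parent instance u
    and M[(x, x_next, u)] the number of x -> x_next transitions under u.
    """
    current = {}              # last known state of every variable
    T = defaultdict(float)
    M = defaultdict(int)
    last_time = 0.0
    needed = [X, *parents]
    for time, var, state in traj:
        if all(v in current for v in needed):
            u = tuple(current[p] for p in parents)
            # Time elapsed since the previous event is spent in the current state.
            T[(current[X], u)] += time - last_time
            if var == X and state != current[X]:
                M[(current[X], state, u)] += 1
        last_time = time
        current[var] = state
    # Close the last interval up to the end of the observation window.
    if all(v in current for v in needed):
        u = tuple(current[p] for p in parents)
        T[(current[X], u)] += horizon - last_time
    return dict(T), dict(M)


events = [(0.0, "X", "a"), (0.0, "Y", "u0"), (1.0, "X", "b"),
          (3.0, "Y", "u1"), (4.0, "X", "a")]
T, M = sufficient_stats(events, "X", ["Y"], horizon=5.0)
```

Each key carries the parent instance `u`, which is what "conditioned by an instance" means here: only the intervals where the parents take those values contribute to that entry.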
Learning parameters : learning the CIMs
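Goal: finding the CIM coefficients (i.e. q_{x | u} and q_{x, x' | u}).
Idea: q_{x, x' | u} = M_{x, x' | u} / T_{x | u} for x' ≠ x, where M_{x, x' | u} counts the transitions from x to x' and T_{x | u} is the time spent in x under the parent instance u; then q_{x | u} = Σ_{x' ≠ x} q_{x, x' | u}.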
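As a worked illustration of this estimate, the sketch below builds an intensity matrix from transition counts and times spent, for one fixed parent instance. It uses the usual maximum-likelihood form (count divided by time spent, with the diagonal set so that each row sums to zero); the function name `cim_from_stats` and the dict inputs are hypothetical and are not the library's `computeCIMFromStats`.

```python
def cim_from_stats(states, M, T):
    """Estimate an intensity matrix for one parent instance.

    states: ordered list of state labels.
    M[(x, x_next)]: number of observed transitions x -> x_next.
    T[x]: total time spent in state x.
    """
    n = len(states)
    Q = [[0.0] * n for _ in range(n)]
    for i, x in enumerate(states):
        time_in_x = T.get(x, 0.0)
        if time_in_x <= 0.0:
            continue  # state never observed: leave the row at zero
        for j, x_next in enumerate(states):
            if i != j:
                Q[i][j] = M.get((x, x_next), 0) / time_in_x
        # Diagonal entry is minus the total exit intensity of state x.
        Q[i][i] = -sum(Q[i][j] for j in range(n) if j != i)
    return Q


Q = cim_from_stats(["a", "b"], {("a", "b"): 4, ("b", "a"): 2}, {"a": 2.0, "b": 4.0})
# Q == [[-2.0, 2.0], [0.5, -0.5]]
```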
Learning the graph
To learn the graph of a CTBN (i.e. the dependencies between variables), we use the CTPC algorithm from Bregoli et al. [BSS20] (also drawing on Nodelman et al. [NSK02]). The independence test is based on Fisher (F) and chi2 tests that compare exponential distributions.
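The overall shape of a constraint-based search of this kind can be sketched as follows. This is a simplified reading of the CTPC loop, not the library's implementation: every variable starts with all other variables as parents, and a parent Y is dropped from X as soon as some conditioning set of the current size makes them independent. The names `ctpc_arcs` and `indep` are hypothetical, and the independence test is passed in as a plain function.

```python
from itertools import combinations


def ctpc_arcs(variables, indep):
    """Return the set of arcs (Y, X) kept by the search.

    indep(X, Y, U) must return True when X is independent of Y given U.
    """
    # Start from the complete directed graph without self-loops.
    parents = {x: [y for y in variables if y != x] for x in variables}
    for x in variables:
        size = 0  # size of the conditioning sets currently tried
        while size < len(parents[x]):
            for y in list(parents[x]):
                others = [p for p in parents[x] if p != y]
                if any(indep(x, y, list(s)) for s in combinations(others, size)):
                    parents[x].remove(y)  # arc y -> x is not needed
            size += 1
    return {(y, x) for x in variables for y in parents[x]}


# Toy check with a test that already knows the true arcs A -> B and B -> C.
true_arcs = {("A", "B"), ("B", "C")}
learned = ctpc_arcs(["A", "B", "C"], lambda x, y, u: (y, x) not in true_arcs)
```

Because arcs in a CTBN are tested per child, the search only removes arcs and never needs an orientation step.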
class pyagrum.ctbn.Learner(source)
Class used to learn a CTBN (dependencies between variables and CIMs) using samples.
- Parameters: source (str | Dict[int, List[Tuple[float, str, str]]]) – Path to the csv file containing the samples (trajectories), or the trajectories themselves as a Python dict.
fitParameters(ctbn)
Learns the parameters of ctbn's CIMs.
- Parameters: ctbn (CTBN) – CTBN containing the CIMs to learn.
learnCTBN(template=None)
Learns a CTBN using the CTPC (continuous-time PC) algorithm. Reference: A. Bregoli, M. Scutari, F. Stella, Constraint-Based Learning for Continuous-Time Bayesian Networks, arXiv:2007.03248, 2020.
- Parameters: template (CTBN) – CTBN used to find variables. If not given, variables are searched for inside the trajectories (if a trajectory is very short, some variables can be missed).
- Returns: The learned CTBN.
- Return type: CTBN
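A minimal usage sketch of the class, using only the calls documented above. The helper name `learn_ctbn_from_csv` and the file name are hypothetical, and calling `fitParameters` after `learnCTBN` is an assumption about a reasonable workflow; it may be redundant if `learnCTBN` already estimates the CIMs.

```python
def learn_ctbn_from_csv(path):
    """Learn a CTBN graph from a trajectory csv file, then fit its CIMs."""
    # Imported lazily so that defining this helper does not require pyAgrum.
    import pyagrum.ctbn as ct

    learner = ct.Learner(path)    # path to a csv file of trajectories
    ctbn = learner.learnCTBN()    # structure learning with CTPC
    learner.fitParameters(ctbn)   # estimate the CIMs on the learned graph
    return ctbn


# Usage, assuming a file "trajectories.csv" exists (hypothetical name):
# ctbn = learn_ctbn_from_csv("trajectories.csv")
```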
pyagrum.ctbn.readTrajectoryCSV(filename)
Reads trajectories from a csv file. Storage format: {IdSample, time, var, state}
- Parameters: filename (str) – Path to the file.
- Returns: The trajectories, one per sample index.
- Return type: Dict[int, List[Tuple[float, str, str]]]
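To make the storage format and the returned structure concrete, here is a stand-in parser written with the standard library only. It assumes a comma-separated file with a header row naming the four columns exactly `IdSample`, `time`, `var` and `state`; the function name `read_trajectories` is hypothetical and this is not the library's reader.

```python
import csv
import io
from collections import defaultdict


def read_trajectories(stream):
    """Group csv rows by sample id into lists of (time, var, state) tuples."""
    trajectories = defaultdict(list)
    for row in csv.DictReader(stream):
        event = (float(row["time"]), row["var"], row["state"])
        trajectories[int(row["IdSample"])].append(event)
    return dict(trajectories)


# In-memory file standing in for a real csv on disk.
sample = io.StringIO(
    "IdSample,time,var,state\n"
    "0,0.5,X,a\n"
    "0,1.25,Y,b\n"
    "1,0.75,X,b\n"
)
trajs = read_trajectories(sample)
```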
pyagrum.ctbn.CTBNFromData(data)
Constructs a CTBN and adds the variables found in the trajectories.
Warning
If data is too short, some variables or state labels might be missed.
- Parameters: data (Dict[int, List[Tuple[float, str, str]]]) – The trajectories used to look for variables.
- Returns: The resulting CTBN.
- Return type: CTBN
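The warning about short data can be illustrated with a small scan over trajectories that collects the variables and state labels it encounters. This is a plain-Python stand-in with a hypothetical name (`find_variables`); it returns a dict rather than a CTBN.

```python
def find_variables(data):
    """Map every variable seen in the trajectories to its sorted state labels."""
    labels = {}
    for trajectory in data.values():
        for _time, var, state in trajectory:
            labels.setdefault(var, set()).add(state)
    return {var: sorted(states) for var, states in labels.items()}


short = {0: [(0.4, "X", "a")]}
longer = {0: [(0.4, "X", "a"), (0.9, "Y", "u"), (1.3, "X", "b")]}

found_short = find_variables(short)    # Y and the label "b" of X are missed
found_longer = find_variables(longer)
```

Only what appears in the samples can be discovered, which is why passing a template CTBN is safer when trajectories are short.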
pyagrum.ctbn.computeCIMFromStats(X, M, T)
Computes a CIM (Conditional Intensity Matrix) using stats from a trajectory. Variables in the tensors are not copied but used directly in the result to avoid memory issues.
- Parameters:
- X (str) – Name of the variable to compute the CIM for.
- M (pyagrum.Tensor) – Tensor containing the number of transitions for each pair of X's states.
- T (pyagrum.Tensor) – Tensor containing the time spent in each state of X before a transition.
- Returns: The resulting tensor, X's CIM.
- Return type: pyagrum.Tensor
class pyagrum.ctbn.Trajectory(source, ctbn=None)
Tools to extract useful information from a trajectory. It is used for parameter and graph learning. It can be created from a dict of trajectories or from a file that contains one.
- Parameters:
- source (str | Dict[int, List[Tuple[float, str, str]]]) – The path to a csv file containing the samples, or the dict of trajectories itself.
- ctbn (CTBN) – Links the variable names in the trajectory to their pyAgrum variables. If not given, a new CTBN is created with the variables and labels found in the trajectory (warning: if the trajectory is short, some variables may not be found).
The samples.
- Type: Dict[int, List[Tuple[float, str, str]]]
The CTBN used to link the names in the trajectory to pyAgrum variables.
- Type: CTBN
timeHorizon
The time length of the trajectory.
- Type: float
computeAllCIMs()
Computes the CIMs of the variables in self.ctbn. Conditioning is given by the graph of self.ctbn.
computeStats(X, U)
Computes the time spent and number of transitions of X and returns them as pyagrum.Tensor.
- Parameters:
- X (str) – Name of the variable.
- U (List[str]) – List of the conditioning variables' names.
- Returns: The resulting tensors.
- Return type: Tuple[pyagrum.Tensor, pyagrum.Tensor]
computeStatsForTests(X, Y, U)
Computes the time spent and number of transitions of X when conditioned by Y and U, and returns them as pyagrum.Tensor. Used for independence testing.
- Parameters:
- X (str) – Name of the variable.
- Y (str) – Name of a conditioning variable not in U.
- U (List[str]) – List of the conditioning variables' names.
- Returns: The resulting tensors.
- Return type: Tuple[pyagrum.Tensor, pyagrum.Tensor, pyagrum.Tensor]
setStatValues(X, inst_u, Txu, Mxu)
Fills the given tensors.
- Parameters:
- X (str) – Name of the variable.
- inst_u (Dict[str, str]) – Instance of the conditioning variables.
- Txu (pyagrum.Tensor) – Tensor to fill. Contains the time spent in each state.
- Mxu (pyagrum.Tensor) – Tensor to fill. Contains the number of transitions for each pair of states.
setStatsForTests(X, Y, inst_u, Txu, Txyu, Mxyu)
Fills the given tensors. They are used for independence testing.
- Parameters:
- X (str) – Name of the variable.
- Y (str) – Name of a conditioning variable.
- inst_u (Dict[str, str]) – Instance of the conditioning variables.
- Txu (pyagrum.Tensor) – Tensor to fill. Contains the time spent in each state, conditioned by the variables in inst_u.
- Txyu (pyagrum.Tensor) – Tensor to fill. Contains the time spent in each state, conditioned by Y and the variables in inst_u.
- Mxyu (pyagrum.Tensor) – Tensor to fill. Contains the number of transitions for each pair of states, conditioned by Y and the variables in inst_u.
class pyagrum.ctbn.Stats(trajectory, X, Y, par)
Stores all tensors used for learning.
- Parameters:
- trajectory (Trajectory) – Samples used to find stats.
- X (str) – Name of the variable to study.
- Y (str) – Name of the variable used to condition X.
- par (List[str]) – List of the conditioning variables of X.
Tensor containing the number of transitions of variable X from each of its states, for every instance of its parents and of variable Y.
- Type: pyagrum.Tensor
Tensor containing the number of transitions of variable X from each of its states, for every instance of its parents.
- Type: pyagrum.Tensor
Tensor containing the time X spends in each state before transitioning, for every instance of its parents.
- Type: pyagrum.Tensor
Tensor containing the time X spends in each state before transitioning, for every instance of its parents and of Y.
- Type: pyagrum.Tensor
Conditional Intensity Matrix (CIM) of X.
- Type: pyagrum.Tensor
Conditional Intensity Matrix (CIM) of X that includes the conditioning variable Y.
- Type: pyagrum.Tensor
class pyagrum.ctbn.StatsIndepTest.FChi2Test(tr)
Bases: IndepTest
This class uses two independence tests: the Fisher test (F-test) and the chi2 test. To test independence between two variables, they are first assumed independent. Independence holds until one of the two tests (F or chi2) rejects the independence hypothesis. If the hypothesis is not rejected, the variables are considered independent.
- Parameters: tr (Trajectory) – Samples used to extract stats.
addVariables(X, Y, U)
Saves variables X and Y and the conditioning set U, and generates stats to be used in statistical tests.
- Parameters:
- X (str) – Name of the variable.
- Y (str) – Name of the variable to test independence from, not in U.
- U (List[str]) – List of conditioning variables.
computeChi2()
Computes the chi2-test value for every instance of the variables.
- Returns: chi2-test value.
- Return type: pyagrum.Tensor
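For intuition about what a chi2 value over transition counts measures, the sketch below computes a standard two-sample chi-square statistic for binned counts with unequal totals. Whether `computeChi2` uses exactly this form is an assumption; the function name `two_sample_chi2` and the list inputs are hypothetical. Here `counts_a` could hold the transition counts out of a state given U, and `counts_b` the same counts given U and one value of Y.

```python
import math


def two_sample_chi2(counts_a, counts_b):
    """Chi-square statistic comparing two binned samples of unequal size."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each sample needs at least one observation")
    # Scaling factors that put both samples on a comparable footing.
    k = math.sqrt(total_b / total_a)
    l = 1.0 / k
    stat = 0.0
    for a, b in zip(counts_a, counts_b):
        if a + b > 0:  # bins empty in both samples carry no information
            stat += (k * a - l * b) ** 2 / (a + b)
    return stat


same_shape = two_sample_chi2([10, 20], [5, 10])   # proportional counts
opposite = two_sample_chi2([10, 0], [0, 10])      # completely different counts
```

Proportional counts give a value close to zero, while counts concentrated in different bins give a large value, which is what leads to rejecting the null hypothesis.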
computeF()
Computes the F-test value for every instance of the variables.
- Returns: F-test value.
- Return type: pyagrum.Tensor
getMxxGivenU(M, Y)
- Parameters:
- M (pyagrum.Tensor) – A matrix M_{x, x' | y, U}, for some instantiation U of the conditioning set and y of a specific parent.
- Y (str) – A parent.
- Returns: The tensor M_{x, x' | U}, obtained by summing over all values of y.
- Return type: pyagrum.Tensor
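The summation itself is simple to picture with counts stored in a dict keyed by (x, x', y), for one fixed instantiation of U. This is a plain-Python analogue with a hypothetical name (`sum_out_y`), not the tensor operation used by the library.

```python
from collections import defaultdict


def sum_out_y(m_xxy):
    """Sum transition counts over all values of y, keeping (x, x_next) keys."""
    m_xx = defaultdict(int)
    for (x, x_next, _y), count in m_xxy.items():
        m_xx[(x, x_next)] += count
    return dict(m_xx)


m_given_y = {("a", "b", "y0"): 3, ("a", "b", "y1"): 2, ("b", "a", "y0"): 1}
m_marginal = sum_out_y(m_given_y)
```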
nullStateToStateTransitionHypothesisChi2(X, Y, _)
Decides whether the null state-to-state transition hypothesis is rejected, using the chi2-test.
- Parameters:
- X (str) – A random variable.
- Y (str) – A parent of X.
- _ (List[str]) – A subset of the parents of X that does not contain Y.
- Returns: False if X is not independent of Y given the conditioning set U.
- Return type: bool
nullTimeToTransitionHypothesisF(X, Y, _)
Decides whether the null time-to-transition hypothesis is rejected, using the F-test.
- Parameters:
- X (str) – A random variable.
- Y (str) – A parent of X.
- _ (List[str]) – A subset of the parents of X that does not contain Y.
- Returns: False if X is not independent of Y given the conditioning set U.
- Return type: bool
testIndep(X, Y, U)
- Parameters:
- X (str) – Name of the variable.
- Y (str) – Name of the variable to test independence from, not in U.
- U (List[str]) – List of conditioning variables.
- Returns: True if X is independent of Y given U, False otherwise.
- Return type: bool
class pyagrum.ctbn.StatsIndepTest.IndepTest
Bases: object
Base class used to test independence between two variables given some other parents.
abstractmethod testIndep(X, Y, U)
- Parameters:
- X (str) – Head of the arc we want to test.
- Y (str) – Tail of the arc we want to test.
- U (List[str]) – Known parents.
- Return type: bool
class pyagrum.ctbn.StatsIndepTest.Oracle(ctbn)
Bases: IndepTest
Oracle’s testing tools.
- Parameters: ctbn (CTBN)
testIndep(X, Y, U)
- Parameters:
- X (str) – Head of the arc we want to test.
- Y (str) – Tail of the arc we want to test.
- U (List[str]) – Known parents.
- Returns: False if there is an arc from Y to X knowing U, True otherwise.
- Return type: bool
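The relationship between the abstract test and an oracle can be sketched as below. The class names `IndepTestSketch` and `ArcOracle` are hypothetical stand-ins, the oracle is built from a set of arcs rather than from a CTBN, and it ignores U as a simplification.

```python
from abc import ABC, abstractmethod


class IndepTestSketch(ABC):
    """Minimal interface: one method answering an independence query."""

    @abstractmethod
    def testIndep(self, X, Y, U):
        """Return True if X is independent of Y given U."""


class ArcOracle(IndepTestSketch):
    """Answers queries from a known set of arcs (tail, head)."""

    def __init__(self, arcs):
        self.arcs = set(arcs)

    def testIndep(self, X, Y, U):
        # False exactly when the known graph contains the arc Y -> X.
        return (Y, X) not in self.arcs


oracle = ArcOracle({("A", "B")})
dependent = oracle.testIndep("B", "A", [])     # arc A -> B exists
independent = oracle.testIndep("A", "B", [])   # no arc B -> A
```

An oracle of this kind is useful for checking a structure-learning loop in isolation, since it removes the statistical noise of tests run on samples.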
pyagrum.ctbn.StatsIndepTest.sqrtTensor(tensor)
Applies the sqrt function to all values inside the tensor.
- Parameters: tensor (pyagrum.Tensor) – Tensor to apply sqrt to.
- Returns: sqrt of tensor.
- Return type: pyagrum.Tensor