other pyagrum.lib modules

bn2roc

The purpose of this module is to provide tools for building ROC and PR curves from a Bayesian network.

pyagrum.lib.bn2roc.animPR(bn, datasrc, target='Y', label='1')

Interactive selection of a threshold on the PR curve (precision and recall) for a BN and data

Parameters:
- bn (pyagrum.BayesNet) – a Bayesian network
- datasrc (str | DataFrame) – a csv filename or a pandas.DataFrame
- target (str) – the target
- label (str) – the target label

pyagrum.lib.bn2roc.animROC(bn, datasrc, target='Y', label='1')

Interactive selection of a threshold using TPR and FPR for a BN and data

Parameters:
- bn (pyagrum.BayesNet) – a Bayesian network
- datasrc (str | DataFrame) – a csv filename or a pandas.DataFrame
- target (str) – the target
- label (str) – the target label

pyagrum.lib.bn2roc.getPRpoints(bn, datasrc, target, label, with_labels=True, significant_digits=10)

Compute the points of the PR curve

Parameters:
- bn (pyagrum.BayesNet) – a Bayesian network
- datasrc (str | DataFrame) – a csv filename or a pandas.DataFrame
- target (str) – the target
- label (str) – the target’s label
- with_labels (bool) – whether labels or ids are used (especially for the label parameter)
- significant_digits – number of significant digits when computing probabilities
Returns: List[Tuple[float,float]] : the list of points (precision,recall)
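
As an illustration of what a list of (precision, recall) points looks like, here is a minimal pure-Python sketch that sweeps a threshold over scored samples. It is not the code used by getPRpoints: the function name pr_points and the (probability, is_positive) input format are assumptions made for this example, and ties between probabilities are not handled specially.

```python
from typing import List, Tuple


def pr_points(scored: List[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Return (precision, recall) points, one per sample, highest probability first.

    `scored` holds hypothetical (probability of the target label, is_positive) pairs.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    total_pos = sum(1 for _, is_pos in ranked if is_pos)
    if total_pos == 0:
        raise ValueError("at least one positive sample is required")

    points = []
    tp = fp = 0
    for _, is_pos in ranked:
        # Lowering the threshold just below this sample classifies it as positive.
        if is_pos:
            tp += 1
        else:
            fp += 1
        points.append((tp / (tp + fp), tp / total_pos))
    return points


samples = [(0.9, True), (0.8, False), (0.7, True), (0.2, False)]
points = pr_points(samples)
```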

pyagrum.lib.bn2roc.getROCpoints(bn, datasrc, target, label, with_labels=True, significant_digits=10)

Compute the points of the ROC curve

Parameters:
- bn (pyagrum.BayesNet) – a Bayesian network
- datasrc (str | DataFrame) – a csv filename or a DataFrame
- target (str) – the target
- label (str) – the target’s label
- with_labels (bool) – whether labels or ids are used (especially for the label parameter)
- significant_digits – number of significant digits when computing probabilities
Returns: List[Tuple[float,float]] : the list of points (false positive rate, true positive rate)
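
A list of ROC points can be summarized by the area under the curve. The following sketch builds (false positive rate, true positive rate) points from scored samples and integrates them with the trapezoidal rule. It is only an illustration under assumptions: roc_points and auc are hypothetical helpers, not part of bn2roc, and the input format is invented for the example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def roc_points(scored: List[Tuple[float, bool]]) -> List[Point]:
    """Return (FPR, TPR) points obtained by lowering the threshold sample by sample."""
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    n_pos = sum(1 for _, is_pos in ranked if is_pos)
    n_neg = len(ranked) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both positive and negative samples are required")

    points = [(0.0, 0.0)]  # threshold above every probability
    tp = fp = 0
    for _, is_pos in ranked:
        if is_pos:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: List[Point]) -> float:
    """Area under a piecewise-linear curve, using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


samples = [(0.9, True), (0.8, False), (0.7, True), (0.2, False)]
curve = roc_points(samples)
area = auc(curve)
```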

pyagrum.lib.bn2roc.showPR(bn, datasrc, target, label, *, beta=1, show_progress=True, show_fig=True, save_fig=False, with_labels=True, significant_digits=10)

Compute the PR curve and save the result in the folder of the csv file.

Parameters:
- bn (pyagrum.BayesNet) – a Bayesian network
- datasrc (str | DataFrame) – a csv filename or a pandas.DataFrame
- target (str) – the target
- label (str) – the target label
- beta (float) – the value of beta for the F-beta score
- show_progress (bool) – indicates if the progress bar must be printed
- save_fig (bool) – whether to save the result
- show_fig (bool) – whether to plot the result
- with_labels (bool) – whether labels (rather than ids) are used in the csv
- significant_digits – number of significant digits when computing probabilities

pyagrum.lib.bn2roc.showROC(bn, datasrc, target, label, show_progress=True, show_fig=True, save_fig=False, with_labels=True, significant_digits=10)

Compute the ROC curve and save the result in the folder of the csv file.

Parameters:
- bn (pyagrum.BayesNet) – a Bayesian network
- datasrc (str | DataFrame) – a csv filename or a pandas.DataFrame
- target (str) – the target
- label (str) – the target label
- show_progress (bool) – indicates if the progress bar must be printed
- save_fig (bool) – whether to save the result
- show_fig (bool) – whether to plot the result
- with_labels (bool) – whether labels (rather than ids) are used in the csv
- significant_digits – number of significant digits when computing probabilities

pyagrum.lib.bn2roc.showROC_PR(bn, datasrc, target, label, *, beta=1, show_progress=True, show_fig=True, save_fig=False, with_labels=True, show_ROC=True, show_PR=True, significant_digits=10, bgcolor=None)

Compute the ROC and PR curves and save the result in the folder of the csv file.

Parameters:
- bn (pyagrum.BayesNet) – a Bayesian network
- datasrc (str | DataFrame) – a csv filename or a pandas.DataFrame
- target (str) – the target
- label (str) – the target label
- beta (float) – the value of beta for the F-beta score
- show_progress (bool) – indicates if the progress bar must be printed
- save_fig (bool) – whether to save the result
- show_fig (bool) – whether to plot the result
- with_labels (bool) – whether labels (rather than ids) are used in the csv
- show_ROC (bool) – whether we show the ROC figure
- show_PR (bool) – whether we show the PR figure
- significant_digits – number of significant digits when computing probabilities
- bgcolor – HTML background color for the figure (default: None, meaning transparent)
Returns: (pointsROC, thresholdROC, pointsPR, thresholdPR)
Return type: tuple
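
The beta parameter refers to the F-beta score, which combines precision and recall. The sketch below shows the standard formula and how beta shifts the preferred operating point. How the library actually derives thresholdPR is not stated here, so this is only an illustration; f_beta and the candidate points are hypothetical.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score: beta > 1 favors recall, beta < 1 favors precision."""
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + b2) * precision * recall / denominator


# Hypothetical (precision, recall) points, one per candidate threshold.
candidates = [(0.9, 0.5), (0.6, 0.8)]

# With beta = 0.5 the high-precision point wins; with beta = 2 the high-recall one does.
best_low_beta = max(candidates, key=lambda p: f_beta(p[0], p[1], beta=0.5))
best_high_beta = max(candidates, key=lambda p: f_beta(p[0], p[1], beta=2))
```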

bn2scores

The purpose of this module is to provide tools for computing different scores from a BN.

pyagrum.lib.bn2scores.checkCompatibility(bn, fields, csv_name)

Check that the variables of the BN are in the fields.

Parameters:
- bn (gum.BayesNet) – the model
- fields (Dict[str, int]) – dict of (name, position) in the file
- csv_name (str) – name of the csv file
Raises: gum.DatabaseError – if a BN variable is not in fields
Returns: a dictionary of positions for the BN variables in fields
Return type: Dict[int,str]

pyagrum.lib.bn2scores.computeScores(bn_name, csv_name, visible=False, dialect=None)

Compute scores (likelihood, aic, bic, mdl, etc.) for a BN w.r.t. a csv file

Parameters:
- bn_name (pyagrum.BayesNet | str) – a pyagrum.BayesNet or a filename for a BN
- csv_name (str) – a filename for the CSV database
- visible (bool) – whether to show the progress
- dialect (csv.Dialect) – if not provided, dialect will be inferred using csv.Sniffer().sniff(csvfile.read(1024))
Returns: percentDatabaseUsed,scores
Return type: Tuple[float,Dict[str,float]]
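
The dialect inference mentioned for the dialect parameter relies on the standard csv module. A minimal sketch of that step, using an in-memory io.StringIO buffer in place of a real csv file (the sample content is invented for the example):

```python
import csv
import io

# Hypothetical semicolon-separated data standing in for the csv file.
sample = "A;B;C\n0;1;1\n1;0;1\n0;0;1\n1;1;0\n"
csvfile = io.StringIO(sample)

# Infer the dialect from the first 1024 characters, then rewind before reading.
dialect = csv.Sniffer().sniff(csvfile.read(1024))
csvfile.seek(0)

rows = list(csv.reader(csvfile, dialect))
header, data = rows[0], rows[1:]
```

Passing an explicit csv.Dialect avoids the inference when the file format is already known.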

pyagrum.lib.bn2scores.lines_count(filename)

Count the lines in a file.

bn_vs_bn

The purpose of this module is to provide tools for comparing different BNs.

class pyagrum.lib.bn_vs_bn.GraphicalBNComparator(bn1, bn2, delta=1e-06)

Bases: object

GraphicalBNComparator compares two BNs in several ways. The minimal assumption is that the variables have the same names in both BNs, but some comparisons also check the type and domainSize of the variables.

The two BNs do not play exactly the same role: _bn1 is the reference model for the comparison, whereas _bn2 is the model compared against that reference.

Parameters:
- bn1 (str or pyagrum.BayesNet) – a BN or a filename for reference
- bn2 (str or pyagrum.BayesNet) – another BN or another filename for comparison

dotDiff()

Return a pydot graph that compares the arcs of _bn1 (reference) with those of self._bn2:

- full black line: the arc is common to both
- full red line: the arc is common but inverted in _bn2
- dotted black line: the arc is added in _bn2
- dotted red line: the arc is removed in _bn2

Warning

If pydot is not installed, this function just returns None.

Returns: the resulting dot graph, or None if pydot cannot be imported
Return type: pydot.Dot

equivalentBNs()

Check whether the two BNs are equivalent:

- same variables
- same graphical structure
- same parameters

Returns: "OK" if the BNs are the same, a description of the error otherwise
Return type: str

hamming()

Compute the Hamming and structural Hamming distances.

The Hamming distance is the difference in edges between the two skeletons; the structural Hamming distance is the difference between the two CPDAGs, including the orientation of the arcs.

Returns: A dictionary containing PURE_HAMMING, STRUCTURAL_HAMMING
Return type: dict[double,double]
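
To make the skeleton comparison concrete, here is a small sketch that treats a skeleton as a set of undirected edges and counts the edges present in only one of the two. It assumes that "difference of edges" means the size of the symmetric difference, and it does not cover the structural Hamming distance, which requires the CPDAGs. The helper names are hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Arc = Tuple[str, str]


def skeleton(arcs: Iterable[Arc]) -> Set[FrozenSet[str]]:
    """Drop orientation: each arc becomes an unordered pair of node names."""
    return {frozenset(arc) for arc in arcs}


def skeleton_hamming(arcs1: Iterable[Arc], arcs2: Iterable[Arc]) -> int:
    """Number of undirected edges that appear in exactly one of the two skeletons."""
    return len(skeleton(arcs1) ^ skeleton(arcs2))


ref_arcs = {("A", "B"), ("B", "C")}
cmp_arcs = {("B", "A"), ("A", "C")}  # A-B is only reversed, so it does not count
distance = skeleton_hamming(ref_arcs, cmp_arcs)
```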

scores()

Compute Precision, Recall, F-score for self._bn2 compared to self._bn1

precision and recall are computed considering BN1 as the reference

Fscore is 2*(recall*precision)/(recall+precision), i.e. the harmonic mean of precision and recall.

dist2opt = square root of (1-precision)^2+(1-recall)^2 and represents the Euclidean distance to the ideal point (precision=1, recall=1)

Returns: A dictionary containing 'precision', 'recall', 'fscore', 'dist2opt' and so on.
Return type: dict[str,double]
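
The formulas above can be checked on a toy example. The sketch below assumes that a true positive is an arc present with the same orientation in both BNs, a false positive is an arc only in the compared BN, and a false negative is an arc only in the reference. That counting convention is an assumption of this example, not a statement about the library, and arc_scores is a hypothetical helper.

```python
import math
from typing import Dict, Set, Tuple

Arc = Tuple[str, str]


def arc_scores(ref_arcs: Set[Arc], cmp_arcs: Set[Arc]) -> Dict[str, float]:
    """Precision, recall, F-score and dist2opt of cmp_arcs against ref_arcs."""
    tp = len(ref_arcs & cmp_arcs)   # same arc, same orientation (assumed convention)
    fp = len(cmp_arcs - ref_arcs)   # only in the compared BN
    fn = len(ref_arcs - cmp_arcs)   # only in the reference BN

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    fscore = 2 * recall * precision / total if total else 0.0
    dist2opt = math.sqrt((1 - precision) ** 2 + (1 - recall) ** 2)
    return {"precision": precision, "recall": recall,
            "fscore": fscore, "dist2opt": dist2opt}


ref = {("A", "B"), ("B", "C"), ("C", "D")}
cmp_ = {("A", "B"), ("B", "C"), ("A", "D")}
result = arc_scores(ref, cmp_)
```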

skeletonScores()

Compute Precision, Recall, F-score for skeletons of self._bn2 compared to self._bn1

precision and recall are computed considering BN1 as the reference

Fscore is 2*(recall*precision)/(recall+precision), i.e. the harmonic mean of precision and recall.

dist2opt = square root of (1-precision)^2+(1-recall)^2 and represents the Euclidean distance to the ideal point (precision=1, recall=1)

Returns: A dictionary containing 'precision', 'recall', 'fscore', 'dist2opt' and so on.
Return type: dict[str,double]

pyagrum.lib.bn_vs_bn.graphDiff(bnref, bncmp, noStyle=False)

Return a pydot graph that compares the arcs of bnref to bncmp. graphDiff allows bncmp to have fewer nodes than bnref (this is not the case in GraphicalBNComparator.dotDiff()).

If noStyle is False, 4 styles (fixed in pyagrum.config) are used:

- the arc is common to both
- the arc is common but inverted in bncmp
- the arc is added in bncmp
- the arc is removed in bncmp

See graphDiffLegend() to add a legend to the graph.

Warning

If pydot is not installed, this function just returns None.

Returns: the resulting dot graph, or None if pydot cannot be imported
Return type: pydot.Dot


pyagrum.lib.bn_vs_bn.graphDiffLegend()