Data Utilities
Binary Policies
ExclusivePolicy
- class BinaryPolicy.ExclusivePolicy(digraph: DiGraph, X: ndarray, y: ndarray, sample_weight=None)
Bases: BinaryPolicy
Implement the exclusive policy of the referenced paper.
- __init__(digraph: DiGraph, X: ndarray, y: ndarray, sample_weight=None)
Initialize a BinaryPolicy with the required data.
- Parameters
digraph (nx.DiGraph) – DiGraph used to infer the relationships between nodes.
X (np.ndarray) – Features used for fitting a model.
y (np.ndarray) – Labels assigned to the samples. Must be a 2D array.
sample_weight (array-like of shape (n_samples,), default=None) – Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.
- get_binary_examples(node) tuple
Gather all positive and negative examples for a given node.
- Parameters
node – Node for which the positive and negative examples should be searched.
- Returns
X (np.ndarray) – The subset with positive and negative features.
y (np.ndarray) – The subset with positive and negative labels.
- negative_examples(node) ndarray
Gather all negative examples corresponding to the given node.
This includes all examples except the positive ones.
- Parameters
node – Node for which the negative examples should be searched.
- Returns
negative_examples – A mask for which examples are included (True) and which are not.
- Return type
np.ndarray
- positive_examples(node) ndarray
Gather all positive examples corresponding to the given node.
This only includes examples for the given node.
- Parameters
node – Node for which the positive examples should be searched.
- Returns
positive_examples – A mask for which examples are included (True) and which are not.
- Return type
np.ndarray
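The exclusive policy described above can be sketched in plain Python. This is an illustration only, not the library's implementation: the toy hierarchy, the `labels` list, and the helper `exclusive_masks` are hypothetical, each sample is assumed to be represented by its most specific class, and plain lists stand in for the NumPy masks.

```python
# Hypothetical toy hierarchy: root -> A, B; A -> A1, A2; B -> B1.
# Assumption: each sample is identified by its most specific class.
labels = ["A", "A1", "A2", "B", "B1", "A1"]


def exclusive_masks(node, labels):
    """Return (positive, negative) boolean masks for the exclusive policy."""
    # Positive: only samples whose most specific class is the node itself.
    positive = [label == node for label in labels]
    # Negative: every remaining sample.
    negative = [not is_positive for is_positive in positive]
    return positive, negative


positive, negative = exclusive_masks("A1", labels)
print(positive)  # [False, True, False, False, False, True]
print(negative)  # [True, False, True, True, True, False]
```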
LessExclusivePolicy
- class BinaryPolicy.LessExclusivePolicy(digraph: DiGraph, X: ndarray, y: ndarray, sample_weight=None)
Bases: ExclusivePolicy
Implement the less exclusive policy of the referenced paper.
- __init__(digraph: DiGraph, X: ndarray, y: ndarray, sample_weight=None)
Initialize a BinaryPolicy with the required data.
- Parameters
digraph (nx.DiGraph) – DiGraph used to infer the relationships between nodes.
X (np.ndarray) – Features used for fitting a model.
y (np.ndarray) – Labels assigned to the samples. Must be a 2D array.
sample_weight (array-like of shape (n_samples,), default=None) – Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.
- get_binary_examples(node) tuple
Gather all positive and negative examples for a given node.
- Parameters
node – Node for which the positive and negative examples should be searched.
- Returns
X (np.ndarray) – The subset with positive and negative features.
y (np.ndarray) – The subset with positive and negative labels.
- negative_examples(node) ndarray
Gather all negative examples corresponding to the given node.
This includes all examples except the examples for the current node and its children.
- Parameters
node – Node for which the negative examples should be searched.
- Returns
negative_examples – A mask for which examples are included (True) and which are not.
- Return type
np.ndarray
- positive_examples(node) ndarray
Gather all positive examples corresponding to the given node.
This only includes examples for the given node.
- Parameters
node – Node for which the positive examples should be searched.
- Returns
positive_examples – A mask for which examples are included (True) and which are not.
- Return type
np.ndarray
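As a rough sketch of the less exclusive policy as documented here, the negatives leave out the node and its children. The `children` mapping, the `labels` list, and `less_exclusive_masks` are hypothetical names for illustration; each sample is assumed to be represented by its most specific class, and lists stand in for the NumPy masks.

```python
# Hypothetical toy hierarchy stored as a parent -> children mapping.
children = {"root": ["A", "B"], "A": ["A1", "A2"], "B": ["B1"]}
# Assumption: each sample is identified by its most specific class.
labels = ["A", "A1", "A2", "B", "B1", "A1"]


def less_exclusive_masks(node, labels, children):
    """Return (positive, negative) boolean masks for the less exclusive policy."""
    # Positive: only samples of the node itself.
    positive = [label == node for label in labels]
    # Negative: everything except the node and its children.
    excluded = {node, *children.get(node, [])}
    negative = [label not in excluded for label in labels]
    return positive, negative


positive, negative = less_exclusive_masks("A", labels, children)
print(positive)  # [True, False, False, False, False, False]
print(negative)  # [False, False, False, True, True, False]
```

Samples of A1 and A2 end up in neither mask, which is the difference from the exclusive policy.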
InclusivePolicy
- class BinaryPolicy.InclusivePolicy(digraph: DiGraph, X: ndarray, y: ndarray, sample_weight=None)
Bases: BinaryPolicy
Implement the inclusive policy of the referenced paper.
- __init__(digraph: DiGraph, X: ndarray, y: ndarray, sample_weight=None)
Initialize a BinaryPolicy with the required data.
- Parameters
digraph (nx.DiGraph) – DiGraph used to infer the relationships between nodes.
X (np.ndarray) – Features used for fitting a model.
y (np.ndarray) – Labels assigned to the samples. Must be a 2D array.
sample_weight (array-like of shape (n_samples,), default=None) – Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.
- get_binary_examples(node) tuple
Gather all positive and negative examples for a given node.
- Parameters
node – Node for which the positive and negative examples should be searched.
- Returns
X (np.ndarray) – The subset with positive and negative features.
y (np.ndarray) – The subset with positive and negative labels.
- negative_examples(node) ndarray
Gather all negative examples corresponding to the given node.
This includes all examples, except the examples for the given node, its descendants, and its ancestors.
- Parameters
node – Node for which the negative examples should be searched.
- Returns
negative_examples – A mask for which examples are included (True) and which are not.
- Return type
np.ndarray
- positive_examples(node) ndarray
Gather all positive examples corresponding to the given node.
This includes examples for the given node and its descendants.
- Parameters
node – Node for which the positive examples should be searched.
- Returns
positive_examples – A mask for which examples are included (True) and which are not.
- Return type
np.ndarray
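A minimal sketch of the inclusive policy follows. It is not the library's code. It assumes that, besides the node and its descendants, the node's ancestors are also excluded from the negatives; the `parent` mapping, `labels`, and the helper functions are hypothetical, and lists stand in for the NumPy masks.

```python
# Hypothetical toy hierarchy stored as a child -> parent mapping.
parent = {"A": "root", "B": "root", "A1": "A", "A2": "A", "B1": "B"}
# Assumption: each sample is identified by its most specific class.
labels = ["A", "A1", "A2", "B", "B1", "A1"]


def ancestors(node, parent):
    """Collect every node on the path from `node` up to the root."""
    found = set()
    while node in parent:
        node = parent[node]
        found.add(node)
    return found


def descendants(node, parent):
    """Collect every node that has `node` among its ancestors."""
    return {n for n in parent if node in ancestors(n, parent)}


def inclusive_masks(node, labels, parent):
    """Return (positive, negative) boolean masks for the inclusive policy."""
    positive_classes = {node} | descendants(node, parent)
    # Assumption: ancestors are neither positive nor negative.
    excluded = positive_classes | ancestors(node, parent)
    positive = [label in positive_classes for label in labels]
    negative = [label not in excluded for label in labels]
    return positive, negative


positive, negative = inclusive_masks("A1", labels, parent)
print(positive)  # [False, True, False, False, False, True]
print(negative)  # [False, False, True, True, True, False]
```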
LessInclusivePolicy
- class BinaryPolicy.LessInclusivePolicy(digraph: DiGraph, X: ndarray, y: ndarray, sample_weight=None)
Bases: InclusivePolicy
Implement the less inclusive policy of the referenced paper.
- __init__(digraph: DiGraph, X: ndarray, y: ndarray, sample_weight=None)
Initialize a BinaryPolicy with the required data.
- Parameters
digraph (nx.DiGraph) – DiGraph used to infer the relationships between nodes.
X (np.ndarray) – Features used for fitting a model.
y (np.ndarray) – Labels assigned to the samples. Must be a 2D array.
sample_weight (array-like of shape (n_samples,), default=None) – Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.
- get_binary_examples(node) tuple
Gather all positive and negative examples for a given node.
- Parameters
node – Node for which the positive and negative examples should be searched.
- Returns
X (np.ndarray) – The subset with positive and negative features.
y (np.ndarray) – The subset with positive and negative labels.
- negative_examples(node) ndarray
Gather all negative examples corresponding to the given node.
This includes all examples, except the examples for the given node and its descendants.
- Parameters
node – Node for which the negative examples should be searched.
- Returns
negative_examples – A mask for which examples are included (True) and which are not.
- Return type
np.ndarray
- positive_examples(node) ndarray
Gather all positive examples corresponding to the given node.
This includes examples for the given node and its descendants.
- Parameters
node – Node for which the positive examples should be searched.
- Returns
positive_examples – A mask for which examples are included (True) and which are not.
- Return type
np.ndarray
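Under the less inclusive policy the two masks are complements of each other, which the following sketch illustrates. The `children` mapping, `labels`, and both helper functions are hypothetical; each sample is assumed to be represented by its most specific class, and lists stand in for the NumPy masks.

```python
# Hypothetical toy hierarchy stored as a parent -> children mapping.
children = {"root": ["A", "B"], "A": ["A1", "A2"], "B": ["B1"]}
# Assumption: each sample is identified by its most specific class.
labels = ["A", "A1", "A2", "B", "B1", "A1"]


def descendants(node, children):
    """Recursively collect all nodes below `node`."""
    found = set()
    for child in children.get(node, []):
        found.add(child)
        found |= descendants(child, children)
    return found


def less_inclusive_masks(node, labels, children):
    """Return (positive, negative) boolean masks for the less inclusive policy."""
    positive_classes = {node} | descendants(node, children)
    positive = [label in positive_classes for label in labels]
    # Negative: everything that is not the node or one of its descendants.
    negative = [label not in positive_classes for label in labels]
    return positive, negative


positive, negative = less_inclusive_masks("A", labels, children)
print(positive)  # [True, True, True, False, False, True]
print(negative)  # [False, False, False, True, True, False]
```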
SiblingsPolicy
- class BinaryPolicy.SiblingsPolicy(digraph: DiGraph, X: ndarray, y: ndarray, sample_weight=None)
Bases: InclusivePolicy
Implement the siblings policy of the referenced paper.
- __init__(digraph: DiGraph, X: ndarray, y: ndarray, sample_weight=None)
Initialize a BinaryPolicy with the required data.
- Parameters
digraph (nx.DiGraph) – DiGraph used to infer the relationships between nodes.
X (np.ndarray) – Features used for fitting a model.
y (np.ndarray) – Labels assigned to the samples. Must be a 2D array.
sample_weight (array-like of shape (n_samples,), default=None) – Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.
- get_binary_examples(node) tuple
Gather all positive and negative examples for a given node.
- Parameters
node – Node for which the positive and negative examples should be searched.
- Returns
X (np.ndarray) – The subset with positive and negative features.
y (np.ndarray) – The subset with positive and negative labels.
- negative_examples(node) ndarray
Gather all negative examples corresponding to the given node.
This includes all examples for nodes that have the same ancestors as the given node, as well as their descendants.
- Parameters
node – Node for which the negative examples should be searched.
- Returns
negative_examples – A mask for which examples are included (True) and which are not.
- Return type
np.ndarray
- positive_examples(node) ndarray
Gather all positive examples corresponding to the given node.
This includes examples for the given node and its descendants.
- Parameters
node – Node for which the positive examples should be searched.
- Returns
positive_examples – A mask for which examples are included (True) and which are not.
- Return type
np.ndarray
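The siblings policy can be sketched as follows, reading "nodes that have the same ancestors" as "nodes that share the given node's parent"; that reading is an assumption of this example. The hierarchy, `labels`, and helper names are hypothetical, the sketch does not handle the root node, and lists stand in for the NumPy masks.

```python
# Hypothetical toy hierarchy stored as a parent -> children mapping.
children = {
    "root": ["A", "B"],
    "A": ["A1", "A2"],
    "A1": ["A1a"],
    "A2": ["A2a"],
    "B": ["B1"],
}
# Assumption: each sample is identified by its most specific class.
labels = ["A1", "A1a", "A2", "A2a", "B1", "A"]


def descendants(node, children):
    """Recursively collect all nodes below `node`."""
    found = set()
    for child in children.get(node, []):
        found.add(child)
        found |= descendants(child, children)
    return found


def siblings_masks(node, labels, children):
    """Return (positive, negative) boolean masks for the siblings policy."""
    parent = {kid: p for p, kids in children.items() for kid in kids}
    positive_classes = {node} | descendants(node, children)
    # Siblings share the node's parent; the node itself is not its own sibling.
    siblings = set(children[parent[node]]) - {node}
    negative_classes = set(siblings)
    for sibling in siblings:
        negative_classes |= descendants(sibling, children)
    positive = [label in positive_classes for label in labels]
    negative = [label in negative_classes for label in labels]
    return positive, negative


positive, negative = siblings_masks("A1", labels, children)
print(positive)  # [True, True, False, False, False, False]
print(negative)  # [False, False, True, True, False, False]
```

Samples of B1 and A fall into neither mask, so a classifier for A1 would be trained only against its sibling subtree.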
ExclusiveSiblingsPolicy
- class BinaryPolicy.ExclusiveSiblingsPolicy(digraph: DiGraph, X: ndarray, y: ndarray, sample_weight=None)
Bases: ExclusivePolicy
Implement the exclusive siblings policy of the referenced paper.
- __init__(digraph: DiGraph, X: ndarray, y: ndarray, sample_weight=None)
Initialize a BinaryPolicy with the required data.
- Parameters
digraph (nx.DiGraph) – DiGraph used to infer the relationships between nodes.
X (np.ndarray) – Features used for fitting a model.
y (np.ndarray) – Labels assigned to the samples. Must be a 2D array.
sample_weight (array-like of shape (n_samples,), default=None) – Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.
- get_binary_examples(node) tuple
Gather all positive and negative examples for a given node.
- Parameters
node – Node for which the positive and negative examples should be searched.
- Returns
X (np.ndarray) – The subset with positive and negative features.
y (np.ndarray) – The subset with positive and negative labels.
- negative_examples(node) ndarray
Gather all negative examples corresponding to the given node.
This includes examples for all nodes that have the same parent as the given node.
- Parameters
node – Node for which the negative examples should be searched.
- Returns
negative_examples – A mask for which examples are included (True) and which are not.
- Return type
np.ndarray
- positive_examples(node) ndarray
Gather all positive examples corresponding to the given node.
This only includes examples for the given node.
- Parameters
node – Node for which the positive examples should be searched.
- Returns
positive_examples – A mask for which examples are included (True) and which are not.
- Return type
np.ndarray
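A small sketch of the exclusive siblings policy is given below. It assumes the given node is not counted among its own siblings and does not handle the root node; the `children` mapping, `labels`, and `exclusive_siblings_masks` are hypothetical, and lists stand in for the NumPy masks.

```python
# Hypothetical toy hierarchy stored as a parent -> children mapping.
children = {"root": ["A", "B"], "A": ["A1", "A2"], "B": ["B1"]}
# Assumption: each sample is identified by its most specific class.
labels = ["A", "A1", "A2", "B", "B1", "A1"]


def exclusive_siblings_masks(node, labels, children):
    """Return (positive, negative) boolean masks for the exclusive siblings policy."""
    parent = {kid: p for p, kids in children.items() for kid in kids}
    # Positive: only samples of the node itself.
    positive = [label == node for label in labels]
    # Negative: only samples of nodes sharing the same parent (descendants excluded).
    siblings = set(children[parent[node]]) - {node}
    negative = [label in siblings for label in labels]
    return positive, negative


positive, negative = exclusive_siblings_masks("A1", labels, children)
print(positive)  # [False, True, False, False, False, True]
print(negative)  # [False, False, True, False, False, False]
```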
Hierarchical Metrics
Precision
- metrics.precision(y_true: ndarray, y_pred: ndarray)
Compute precision score for hierarchical classification.
\(hP = \displaystyle{\frac{\sum_{i}| \alpha_i \cap \beta_i |}{\sum_{i}| \alpha_i |}}\), where \(\alpha_i\) is the set consisting of the most specific classes predicted for test example \(i\) and all their ancestor classes, while \(\beta_i\) is the set containing the true most specific classes of test example \(i\) and all their ancestors, with summations computed over all test examples.
- Parameters
y_true (np.array of shape (n_samples, n_levels)) – Ground truth (correct) labels.
y_pred (np.array of shape (n_samples, n_levels)) – Predicted labels, as returned by a classifier.
- Returns
precision – Fraction of the predicted classes, including their ancestors, that also appear among the true classes.
- Return type
float
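The formula for \(hP\) can be worked through with plain Python sets. This is a sketch, not the library's implementation: `hierarchical_precision` is a hypothetical helper, each row is assumed to list a sample's classes from the top level down, and class names are assumed to be unique across levels.

```python
def hierarchical_precision(y_true, y_pred):
    """Sum of |alpha_i & beta_i| divided by sum of |alpha_i| over all samples."""
    intersection = sum(len(set(p) & set(t)) for t, p in zip(y_true, y_pred))
    predicted = sum(len(set(p)) for p in y_pred)
    return intersection / predicted


# Hypothetical labels: one row per sample, one column per hierarchy level.
y_true = [["A", "A1"], ["B", "B1"]]
y_pred = [["A", "A2"], ["B", "B1"]]

hp = hierarchical_precision(y_true, y_pred)
print(hp)  # 0.75: three of the four predicted classes are correct
```

The first sample still contributes one correct class (A) even though the leaf prediction is wrong, which is what distinguishes the hierarchical score from a flat one.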
Recall
- metrics.recall(y_true: ndarray, y_pred: ndarray)
Compute recall score for hierarchical classification.
\(\displaystyle{hR = \frac{\sum_i|\alpha_i \cap \beta_i|}{\sum_i|\beta_i|}}\), where \(\alpha_i\) is the set consisting of the most specific classes predicted for test example \(i\) and all their ancestor classes, while \(\beta_i\) is the set containing the true most specific classes of test example \(i\) and all their ancestors, with summations computed over all test examples.
- Parameters
y_true (np.array of shape (n_samples, n_levels)) – Ground truth (correct) labels.
y_pred (np.array of shape (n_samples, n_levels)) – Predicted labels, as returned by a classifier.
- Returns
recall – Fraction of the true classes, including their ancestors, that were also predicted.
- Return type
float
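The following sketch evaluates \(hR\) on a toy example where one prediction stops at the first level. `hierarchical_recall` is a hypothetical helper rather than the library function, and class names are assumed to be unique across levels.

```python
def hierarchical_recall(y_true, y_pred):
    """Sum of |alpha_i & beta_i| divided by sum of |beta_i| over all samples."""
    intersection = sum(len(set(p) & set(t)) for t, p in zip(y_true, y_pred))
    actual = sum(len(set(t)) for t in y_true)
    return intersection / actual


# Hypothetical labels; the first prediction only reaches level one.
y_true = [["A", "A1"], ["B", "B1"]]
y_pred = [["A"], ["B", "B1"]]

hr = hierarchical_recall(y_true, y_pred)
print(hr)  # 0.75: three of the four true classes were predicted
```

Every predicted class here is correct, so precision would be 1.0 while recall is penalized for the missing leaf.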
F-score
- metrics.f1(y_true: ndarray, y_pred: ndarray)
Compute f1 score for hierarchical classification.
\(\displaystyle{hF = \frac{2 \times hP \times hR}{hP + hR}}\), where \(hP\) is the hierarchical precision and \(hR\) is the hierarchical recall.
- Parameters
y_true (np.array of shape (n_samples, n_levels)) – Ground truth (correct) labels.
y_pred (np.array of shape (n_samples, n_levels)) – Predicted labels, as returned by a classifier.
- Returns
f1 – Harmonic mean of the hierarchical precision and recall.
- Return type
float
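To make the relation between \(hF\), \(hP\), and \(hR\) concrete, the sketch below computes all three from set intersections. `hierarchical_f1` is a hypothetical helper, not the library function, and class names are assumed to be unique across levels.

```python
def hierarchical_f1(y_true, y_pred):
    """Harmonic mean of hierarchical precision and recall."""
    intersection = sum(len(set(p) & set(t)) for t, p in zip(y_true, y_pred))
    hp = intersection / sum(len(set(p)) for p in y_pred)  # hierarchical precision
    hr = intersection / sum(len(set(t)) for t in y_true)  # hierarchical recall
    return 2 * hp * hr / (hp + hr)


# Hypothetical labels: one row per sample, top level first.
y_true = [["A", "A1"], ["B", "B1"]]
y_pred = [["A"], ["B", "B2"]]

hf = hierarchical_f1(y_true, y_pred)
print(round(hf, 4))  # 0.5714, from hP = 2/3 and hR = 1/2
```

The sketch divides by zero when no class is shared between predictions and ground truth; a production version would need to handle that case explicitly.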