Data Utilities¶

Policies¶

ExclusivePolicy¶

LessExclusivePolicy¶

InclusivePolicy¶

LessInclusivePolicy¶

SiblingsPolicy¶

ExclusiveSiblingsPolicy¶

Utility functions¶

Helper functions for data manipulation and graph creation.

data.convert_categorical_labels_to_numeric(data: ndarray, placeholder_label: str = '') → Tuple[ndarray, dict]¶

Take a string array and convert the values to enumerated integers.

Parameters:

data (np.array) – Data to be processed.
placeholder_label (str) – Label representing the absence of a class.

Returns:

conversion (np.array) – The converted integer array.
mapping (dict) – Mapping from the original value to the new enumeration.
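
The conversion described above can be illustrated with a small sketch in plain Python. It is not the library's implementation: the helper name `convert_labels_sketch` is hypothetical, nested lists stand in for the NumPy array, and both the order of enumeration (first appearance) and the encoding of the placeholder as -1 are assumptions made for illustration.

```python
def convert_labels_sketch(data, placeholder_label=""):
    """Map each distinct string to an integer, in order of first appearance."""
    mapping = {}
    converted = []
    for row in data:
        new_row = []
        for value in row:
            if value == placeholder_label:
                # Assumption: the placeholder is encoded as -1.
                new_row.append(-1)
                continue
            if value not in mapping:
                # Next free integer for a label seen for the first time.
                mapping[value] = len(mapping)
            new_row.append(mapping[value])
        converted.append(new_row)
    return converted, mapping


labels = [["animal", "dog"], ["animal", "cat"], ["plant", ""]]
conversion, mapping = convert_labels_sketch(labels)
```

Here `mapping` records which integer replaced which original string, so the conversion can be inverted later.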

data.find_max_depth(graph: DiGraph, root: int | str | str_) → int¶

Find the maximum depth of a DAG given a node as root.

Parameters:

graph (nx.DiGraph) – Graph for which the maximum depth should be found.
root (int or str) – Node from which the distance should be measured.

Returns:

max_depth – The number of nodes on the longest path starting from the root node. A graph that contains only the root returns 1.

Return type:

int
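
To make the node-counting convention concrete, the following sketch computes the same quantity on a plain adjacency dictionary. The helper `max_depth_sketch` and the dictionary representation are assumptions for illustration; the library itself operates on an nx.DiGraph.

```python
def max_depth_sketch(children, root):
    """Count the nodes on the longest path that starts at root."""
    best = 0
    for child in children.get(root, []):
        # Longest path below this child, measured in nodes.
        best = max(best, max_depth_sketch(children, child))
    # Add one for the current node itself.
    return 1 + best


hierarchy = {"root": ["a", "b"], "a": ["c"], "c": ["d"]}
depth = max_depth_sketch(hierarchy, "root")  # longest path: root, a, c, d
single = max_depth_sketch({}, "root")        # graph with only the root
```

The recursion assumes the graph is acyclic and small; it does not cache results for nodes that are reachable along several paths.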

data.find_root(graph: DiGraph) → int | str | str_¶

Take a graph and return one of its root nodes.

Parameters:

graph (nx.DiGraph) – Graph for which the root should be found.

Returns:

root – The label of the root node.

Return type:

int or str
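
A root is a node without incoming edges. The sketch below finds such nodes from a list of (parent, child) pairs; the helper `find_roots_sketch` is hypothetical and returns every root, whereas the function documented here returns only one of them.

```python
def find_roots_sketch(edges):
    """Return all nodes that never appear as a child, sorted for stability."""
    parents = {parent for parent, _ in edges}
    children = {child for _, child in edges}
    return sorted(parents - children)


edges = [("animal", "dog"), ("animal", "cat"), ("dog", "beagle")]
roots = find_roots_sketch(edges)
```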

data.flatten_labels(labels: ndarray, placeholder_label: int = -1) → ndarray¶

Flattens hierarchical labels to only the most specific label per entry.

Expects the most specific label to be on the rightmost side of the label array for every entry.

Parameters:

labels (np.array) – 2D label matrix, formatted as (row, column).
placeholder_label (int or str) – Value marking an undefined label, used to support uneven hierarchies and labels.

Returns:

new_labels – 1d array of the most specific label for each row.

Return type:

np.array
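
The flattening step can be sketched with nested lists as follows. The helper `flatten_labels_sketch` is hypothetical, and it assumes that the most specific label of a row is its rightmost entry that differs from the placeholder; the library may handle edge cases differently.

```python
def flatten_labels_sketch(labels, placeholder_label=-1):
    """Keep the rightmost non-placeholder label of every row."""
    flat = []
    for row in labels:
        # Fall back to the placeholder if the row holds no real label.
        specific = placeholder_label
        for value in reversed(row):
            if value != placeholder_label:
                specific = value
                break
        flat.append(specific)
    return flat


labels = [[0, 1, 2], [0, 3, -1], [4, -1, -1]]
flat = flatten_labels_sketch(labels)
```

The second and third rows are shorter than the deepest path, so the trailing placeholders are skipped.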

data.graph_from_edge_pairs(file: str, delimiter: str = ',', skip_header: int = 1) → DiGraph¶

Create a DAG from a file containing (parent, child) pairs.

Parameters:

file (str) – File containing the edge pairs.
delimiter (str, default=',') – The delimiter of the file.
skip_header (int, default=1) – The number of header rows that should be skipped.

Returns:

graph – The graph with all corresponding edges and their nodes.

Return type:

nx.DiGraph
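
The expected file layout is one (parent, child) pair per line, optionally preceded by header rows. The sketch below parses such content with the standard csv module; the helper `read_edge_pairs_sketch` is hypothetical and, unlike the documented function, takes an open file-like object and returns a list of pairs instead of a graph.

```python
import csv
import io


def read_edge_pairs_sketch(handle, delimiter=",", skip_header=1):
    """Read (parent, child) pairs, skipping the first skip_header rows."""
    reader = csv.reader(handle, delimiter=delimiter)
    for _ in range(skip_header):
        # Discard header rows; stop quietly if the input is shorter.
        next(reader, None)
    return [(row[0].strip(), row[1].strip()) for row in reader if len(row) >= 2]


text = "parent,child\nanimal,dog\nanimal,cat\n"
edges = read_edge_pairs_sketch(io.StringIO(text))
```

A list of pairs like this can be passed to `DiGraph.add_edges_from` in networkx to obtain a directed graph.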

data.graph_from_hierarchical_labels(data: ndarray, placeholder: str | int | None = None) → DiGraph¶

Construct a DAG from hierarchical labels.

In the case that multiple root nodes are found, a new root node is inserted and all previous root nodes are connected to it.

Parameters:

data (np.array) – Hierarchical labels, formatted as (row, col). The columns should be ordered from the least specific to the most specific class. If some entries are missing (i.e. a row has fewer labels than there are columns), they should be marked by a placeholder.
placeholder (str or int, default=None) – Value for non-existent nodes in the data. Has to match the data type of data.

Returns:

graph – The graph with all corresponding edges and their nodes.

Return type:

nx.DiGraph
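
The construction described above, including the insertion of a common root, can be sketched on nested lists. The helper `edges_from_labels_sketch` and the name "root" for the inserted node are assumptions for illustration; the identifier the library uses for the new root is not stated here.

```python
def edges_from_labels_sketch(data, placeholder=None, new_root="root"):
    """Derive (parent, child) edges from rows of hierarchical labels."""
    nodes = set()
    edges = set()
    for row in data:
        # Drop placeholders so that short rows end at their last real label.
        path = [value for value in row if value != placeholder]
        nodes.update(path)
        # Consecutive labels in a row form parent -> child edges.
        edges.update(zip(path, path[1:]))
    roots = nodes - {child for _, child in edges}
    if len(roots) > 1:
        # Assumption: the inserted root is named by new_root.
        edges.update((new_root, node) for node in roots)
    return sorted(edges)


data = [["animal", "dog"], ["animal", "cat"], ["plant", ""]]
edges = edges_from_labels_sketch(data, placeholder="")
```

Because "animal" and "plant" both lack a parent, the sketch connects them to a single new root.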

data.is_numeric_label(array: ndarray) → bool¶

Determine whether an array has a numerical label format.

Supported formats are booleans, unsigned integers, signed integers and floats.

Parameters:

array (np.array) – The array to check.

Returns:

result – True if the array has a supported format, False otherwise.

Return type:

bool

data.minimal_graph_depth(graph: DiGraph) → int¶

Calculate the minimal depth at which all nodes can be reached.

Parameters:

graph (nx.DiGraph) – Graph to be analyzed.

Returns:

depth – The minimal depth.

Return type:

int

data.minimal_per_node_depth(graph: DiGraph) → dict¶

Calculate, for every node, the minimal depth needed to reach it.

Parameters:

graph (nx.DiGraph) – Graph to be analyzed.

Returns:

node_depth – A mapping from each node to its minimal depth.

Return type:

dict
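
One way to picture both depth functions is a breadth-first search from the root, which reaches every node along its shortest path first. The sketch below is an assumption-laden illustration, not the library's code: the helper `minimal_depths_sketch` is hypothetical, the graph is a plain adjacency dictionary, and depth is counted in edges with the root at 0, which may differ from the convention the library uses.

```python
from collections import deque


def minimal_depths_sketch(children, root):
    """Breadth-first search: shortest distance from root to every node."""
    depths = {root: 0}  # Assumption: the root has depth 0.
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in depths:
                # First visit is along a shortest path, so this is minimal.
                depths[child] = depths[node] + 1
                queue.append(child)
    return depths


dag = {"root": ["a", "b"], "a": ["c"], "b": ["c", "d"], "d": ["e"]}
node_depth = minimal_depths_sketch(dag, "root")
# Depth at which every node has been reached: the largest minimal depth.
graph_depth = max(node_depth.values())
```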
