Calibrating a Classifier

A minimal example showing how to calibrate a HiClass local classifier per node (LCN) model. The calibration method is selected with the calibration_method parameter, for example:

  • Isotonic Regression

rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='isotonic'
)

  • Platt scaling

rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='platt'
)

  • Beta scaling

rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='beta'
)

  • IVAP

rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='ivap'
)

  • CVAP

rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='cvap'
)
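
To give a feel for what a calibrator does, here is a minimal, library-free sketch of isotonic regression fitted with the pool-adjacent-violators algorithm. It illustrates the general technique only and is not HiClass's implementation; the function name fit_isotonic and the toy scores are hypothetical.

```python
def fit_isotonic(scores, labels):
    """Fit a non-decreasing map from scores to probabilities (PAV sketch)."""
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block is [sum_of_labels, count]
    for _, label in pairs:
        blocks.append([float(label), 1])
        # Pool adjacent blocks while their means violate monotonicity
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Expand each pooled block back to one calibrated value per sample
    calibrated = []
    for total, count in blocks:
        calibrated.extend([total / count] * count)
    sorted_scores = [score for score, _ in pairs]
    return sorted_scores, calibrated


# Hypothetical uncalibrated scores and true binary outcomes
scores = [0.2, 0.3, 0.6, 0.7, 0.9]
labels = [0, 1, 0, 1, 1]

sorted_scores, calibrated = fit_isotonic(scores, labels)
for score, prob in zip(sorted_scores, calibrated):
    print(f"score={score:.1f} -> calibrated={prob:.2f}")
```

The scores 0.3 and 0.6 are pooled because their outcomes (1, then 0) would otherwise make the fitted curve decrease; both receive the pooled mean of 0.5.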

Furthermore, the probabilities of multiple levels can be aggregated by setting a probability combiner with the probability_combiner parameter:

  • Multiply (Default)

rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='isotonic',
    probability_combiner='multiply'
)

  • Geometric Mean

rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='isotonic',
    probability_combiner='geometric'
)

  • Arithmetic Mean

rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='isotonic',
    probability_combiner='arithmetic'
)

  • No Aggregation

rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='isotonic',
    probability_combiner=None
)
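
The arithmetic behind the three aggregating combiners can be sketched on a single path through the hierarchy. This uses the textbook definitions of product, geometric mean, and arithmetic mean; the probabilities are made up, and HiClass may differ in details such as which levels are included or whether results are renormalized.

```python
import math

# Hypothetical calibrated probabilities along one path: Animal -> Mammal -> Cow
path_probabilities = [0.9, 0.8, 0.5]

product = math.prod(path_probabilities)
combined = {
    "multiply": product,
    "geometric": product ** (1 / len(path_probabilities)),
    "arithmetic": sum(path_probabilities) / len(path_probabilities),
}

for name, value in combined.items():
    print(f"{name}: {value:.4f}")
```

Multiplying shrinks the result with every additional level, whereas the two means stay on the scale of the individual probabilities.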

A hierarchical classifier can be calibrated by calling calibrate on the model or by using a Pipeline:

  • Default

rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='isotonic'
)

classifier.fit(X_train, Y_train)
classifier.calibrate(X_cal, Y_cal)
classifier.predict_proba(X_test)

  • Pipeline

from hiclass import Pipeline

rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='isotonic'
)

pipeline = Pipeline([
    ('classifier', classifier),
])

pipeline.fit(X_train, Y_train)
pipeline.calibrate(X_cal, Y_cal)
pipeline.predict_proba(X_test)
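
The snippets above take X_cal and Y_cal as given and keep them separate from the training data. One way to obtain such a calibration set is a three-way split, sketched below with the standard library only. The helper three_way_split, its default fractions, and the toy data are hypothetical and not part of HiClass.

```python
import random


def three_way_split(X, Y, cal_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into train, calibration, and test parts."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_cal = int(len(X) * cal_fraction)
    n_test = int(len(X) * test_fraction)
    cal_idx = indices[:n_cal]
    test_idx = indices[n_cal:n_cal + n_test]
    train_idx = indices[n_cal + n_test:]

    def pick(data, idx):
        return [data[i] for i in idx]

    return (
        (pick(X, train_idx), pick(Y, train_idx)),
        (pick(X, cal_idx), pick(Y, cal_idx)),
        (pick(X, test_idx), pick(Y, test_idx)),
    )


# Toy data: ten samples with hierarchical labels
X = [[i] for i in range(10)]
Y = [
    ["Animal", "Mammal", "Cow"] if i % 2 == 0 else ["Animal", "Reptile", "Snake"]
    for i in range(10)
]

(X_train, Y_train), (X_cal, Y_cal), (X_test, Y_test) = three_way_split(X, Y)
print(len(X_train), len(X_cal), len(X_test))
```

With real data, each part should be large enough to contain every class that the classifier is expected to predict.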

The full script below uses isotonic regression to calibrate the model. Its output, the labels of the last level followed by the predicted probabilities, is:

['Cow' 'Lizard' 'Sheep' 'Snake']
[[0.25 0.25 0.25 0.25]
 [0.25 0.25 0.25 0.25]
 [0.25 0.25 0.25 0.25]
 [0.25 0.25 0.25 0.25]]

from sklearn.ensemble import RandomForestClassifier

from hiclass import LocalClassifierPerNode

# Define data
X_train = [[1], [2], [3], [4]]
X_test = [[4], [3], [2], [1]]
X_cal = [[5], [6], [7], [8]]
Y_train = [
    ["Animal", "Mammal", "Sheep"],
    ["Animal", "Mammal", "Cow"],
    ["Animal", "Reptile", "Snake"],
    ["Animal", "Reptile", "Lizard"],
]

Y_cal = [
    ["Animal", "Mammal", "Cow"],
    ["Animal", "Mammal", "Sheep"],
    ["Animal", "Reptile", "Lizard"],
    ["Animal", "Reptile", "Snake"],
]

# Use random forest classifiers for every node
rf = RandomForestClassifier()

# Use local classifier per node with isotonic regression as calibration method
classifier = LocalClassifierPerNode(
    local_classifier=rf, calibration_method="isotonic", probability_combiner="multiply"
)

# Train local classifier per node
classifier.fit(X_train, Y_train)

# Calibrate local classifier per node
classifier.calibrate(X_cal, Y_cal)

# Predict probabilities
probabilities = classifier.predict_proba(X_test)

# Print probabilities and labels for the last level
print(classifier.classes_[2])
print(probabilities)
