Calibrating a Classifier
A minimal example showing how to calibrate a HiClass local classifier per node (LCN) model. The calibration method is selected with the calibration_method parameter, for example:
# Isotonic regression
rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='isotonic'
)

# Platt scaling
rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='platt'
)

# Beta calibration
rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='beta'
)

# Inductive Venn-Abers predictor (IVAP)
rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='ivap'
)

# Cross Venn-Abers predictor (CVAP)
rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='cvap'
)
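To give an intuition for what the 'isotonic' option refers to, the following is a minimal pure-Python sketch of isotonic calibration for a single binary problem. It is not how HiClass implements calibration internally: the helper names (pool_adjacent_violators, calibrate_score) and the toy scores are hypothetical, and the lookup is simplified to a step function.

```python
import bisect


def pool_adjacent_violators(labels):
    """Least-squares non-decreasing fit to labels already sorted by score."""
    blocks = []  # each block holds [sum of labels, number of samples]
    for label in labels:
        blocks.append([label, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


# Hypothetical calibration set: uncalibrated scores (sorted) and true outcomes.
cal_scores = [0.1, 0.3, 0.4, 0.6, 0.8, 0.9]
cal_labels = [0, 1, 0, 0, 1, 1]
fitted = pool_adjacent_violators(cal_labels)


def calibrate_score(score):
    """Return the fitted value at the largest calibration score <= score."""
    index = bisect.bisect_right(cal_scores, score) - 1
    index = min(max(index, 0), len(fitted) - 1)  # clamp out-of-range scores
    return fitted[index]


print(fitted)
print(calibrate_score(0.5))
```

The fitted values never decrease as the raw score increases, which is the defining property of isotonic calibration: the ranking produced by the classifier is preserved while the scores are mapped to observed frequencies.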
Furthermore, the probabilities of multiple levels can be aggregated by defining a probability combiner:
# Multiplication
rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='isotonic',
    probability_combiner='multiply'
)

# Geometric mean
rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='isotonic',
    probability_combiner='geometric'
)

# Arithmetic mean
rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='isotonic',
    probability_combiner='arithmetic'
)

# No combiner
rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='isotonic',
    probability_combiner=None
)
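The three named combiners can be illustrated with a small sketch that aggregates the probabilities along a single path of the hierarchy. This is only an assumption about the general idea behind the names 'multiply', 'geometric' and 'arithmetic'; the combine function and the path probabilities are hypothetical, and the exact computation in HiClass (for example, any normalization) may differ.

```python
import math


def combine(path_probabilities, method):
    """Aggregate the per-level probabilities of one root-to-leaf path."""
    if method == "multiply":
        return math.prod(path_probabilities)
    if method == "geometric":
        return math.prod(path_probabilities) ** (1 / len(path_probabilities))
    if method == "arithmetic":
        return sum(path_probabilities) / len(path_probabilities)
    raise ValueError(f"unknown method: {method}")


# Hypothetical per-level probabilities for the path Animal -> Mammal -> Cow.
path = [1.0, 0.8, 0.5]

for method in ("multiply", "geometric", "arithmetic"):
    print(method, round(combine(path, method), 4))
```

Multiplication penalizes long paths the most, because every additional level can only lower the product, while the two means keep the result on the same scale as the individual probabilities.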
A hierarchical classifier can be calibrated by calling calibrate on the model or by using a Pipeline:
# Calling calibrate on the model
rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='isotonic'
)
classifier.fit(X_train, Y_train)
classifier.calibrate(X_cal, Y_cal)
classifier.predict_proba(X_test)

# Using a Pipeline
from hiclass import Pipeline

rf = RandomForestClassifier()
classifier = LocalClassifierPerNode(
    local_classifier=rf,
    calibration_method='isotonic'
)
pipeline = Pipeline([
    ('classifier', classifier),
])
pipeline.fit(X_train, Y_train)
pipeline.calibrate(X_cal, Y_cal)
pipeline.predict_proba(X_test)
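Both variants above assume three separate datasets: one for fitting, one for calibration and one for prediction. One possible way to obtain them from a single labelled dataset is sketched below in plain Python. The helper three_way_split, its default fractions and the toy data are hypothetical and not part of HiClass.

```python
import random


def three_way_split(X, Y, cal_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle once, then cut the data into disjoint train/cal/test parts."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(X) * test_fraction)
    n_cal = int(len(X) * cal_fraction)
    test_idx = indices[:n_test]
    cal_idx = indices[n_test:n_test + n_cal]
    train_idx = indices[n_test + n_cal:]

    def pick(data, idx):
        return [data[i] for i in idx]

    return (
        (pick(X, train_idx), pick(Y, train_idx)),
        (pick(X, cal_idx), pick(Y, cal_idx)),
        (pick(X, test_idx), pick(Y, test_idx)),
    )


# Hypothetical dataset with ten samples and two-level hierarchical labels.
X = [[i] for i in range(10)]
Y = [["Animal", "Mammal"] if i % 2 == 0 else ["Animal", "Reptile"] for i in range(10)]

(X_train, Y_train), (X_cal, Y_cal), (X_test, Y_test) = three_way_split(X, Y)
print(len(X_train), len(X_cal), len(X_test))
```

Keeping the calibration samples disjoint from the training samples matters because a calibrator fitted on data the model has already seen tends to learn from overconfident scores.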
In the code below, isotonic regression is used to calibrate the model. Running it prints the labels of the last level, followed by the predicted probabilities:
['Cow' 'Lizard' 'Sheep' 'Snake']
[[0.25 0.25 0.25 0.25]
[0.25 0.25 0.25 0.25]
[0.25 0.25 0.25 0.25]
[0.25 0.25 0.25 0.25]]
from sklearn.ensemble import RandomForestClassifier

from hiclass import LocalClassifierPerNode

# Define data
X_train = [[1], [2], [3], [4]]
X_test = [[4], [3], [2], [1]]
X_cal = [[5], [6], [7], [8]]
Y_train = [
    ["Animal", "Mammal", "Sheep"],
    ["Animal", "Mammal", "Cow"],
    ["Animal", "Reptile", "Snake"],
    ["Animal", "Reptile", "Lizard"],
]
Y_cal = [
    ["Animal", "Mammal", "Cow"],
    ["Animal", "Mammal", "Sheep"],
    ["Animal", "Reptile", "Lizard"],
    ["Animal", "Reptile", "Snake"],
]

# Use random forest classifiers for every node
rf = RandomForestClassifier()

# Use local classifier per node with isotonic regression as calibration method
classifier = LocalClassifierPerNode(
    local_classifier=rf, calibration_method="isotonic", probability_combiner="multiply"
)

# Train local classifier per node
classifier.fit(X_train, Y_train)

# Calibrate local classifier per node
classifier.calibrate(X_cal, Y_cal)

# Predict probabilities
probabilities = classifier.predict_proba(X_test)

# Print probabilities and labels for the last level
print(classifier.classes_[2])
print(probabilities)
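To read the printed output, each probability can be paired with a label. The sketch below copies the labels and the matrix from the output shown on this page and assumes that the columns follow the order of classifier.classes_[2]; the variable names are hypothetical.

```python
# Values copied from the printed output of the example.
labels = ["Cow", "Lizard", "Sheep", "Snake"]
probabilities = [
    [0.25, 0.25, 0.25, 0.25],
    [0.25, 0.25, 0.25, 0.25],
    [0.25, 0.25, 0.25, 0.25],
    [0.25, 0.25, 0.25, 0.25],
]

# Assumption: column j of each row belongs to labels[j].
rows = [dict(zip(labels, row)) for row in probabilities]

for sample_index, per_class in enumerate(rows):
    print(sample_index, per_class)
```

With only four training and four calibration samples, the example assigns the same probability to every leaf, so this output illustrates the API rather than a well-calibrated model.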