Computing Optimal Cut-Offs
Probabilities from classification models can have two problems:
- Miscalibration: a predicted probability of .9 often does not correspond to a 90% chance that y = 1 (assuming a dichotomous y). (You can calibrate the probabilities using isotonic regression.)
- Unknown optimal cut-offs: we do not know a priori which probability threshold will maximize accuracy, the F1 score, or any other metric that trades off false positives against false negatives. This holds for binary classifiers and even more so for multi-class ones.
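For the first problem, a minimal sketch of isotonic calibration using scikit-learn's `CalibratedClassifierCV` (the base model, dataset, and `cv=5` choice here are illustrative assumptions, not prescriptions):

```python
# Sketch: calibrating predicted probabilities with isotonic regression.
# The logistic-regression base model and synthetic data are placeholders.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV

X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base = LogisticRegression(max_iter=1000)
# CalibratedClassifierCV fits the base model on cross-validation folds and
# learns an isotonic (monotone, nonparametric) map from raw scores to
# calibrated probabilities.
calibrated = CalibratedClassifierCV(base, method="isotonic", cv=5)
calibrated.fit(X_train, y_train)
probs = calibrated.predict_proba(X_test)[:, 1]
```

After calibration, a predicted probability of .9 should more closely track an empirical 90% rate of y = 1 among similar cases.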
One way to solve #2 is to run the true labels and the predicted probabilities (out of sample, otherwise the estimate is optimistically biased) through a brute-force optimizer, which returns the cut-off that maximizes the chosen metric. Here's a script that does this, along with an illustration.
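The brute-force idea can be sketched as follows; the grid resolution, the synthetic data, and the `best_cutoff` helper are illustrative assumptions, not the author's exact script:

```python
# Sketch: exhaustive grid search for the probability cut-off that
# maximizes a metric (F1 here) on out-of-sample labels and probabilities.
import numpy as np
from sklearn.metrics import f1_score

def best_cutoff(y_true, probs, metric=f1_score, grid=None):
    """Return (best_threshold, best_score) by exhaustive search over a grid."""
    if grid is None:
        grid = np.linspace(0.01, 0.99, 99)  # candidate thresholds
    # Score each candidate threshold by binarizing the probabilities.
    scores = [metric(y_true, (probs >= t).astype(int)) for t in grid]
    i = int(np.argmax(scores))
    return float(grid[i]), float(scores[i])

# Toy illustration with synthetic labels and noisy probabilities.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=500)
p = np.clip(y * 0.6 + rng.normal(0.2, 0.25, size=500), 0, 1)
threshold, score = best_cutoff(y, p)
```

Because the search is a one-dimensional sweep over at most a few hundred candidates, brute force is cheap here; the important caveat is to tune the threshold on held-out data, never on the training set.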