Ch 6: Remove Bias from ML Output¶
Introduction¶
What if you already have a model in production making predictions? You can't go back and retrain. Chapter 5's techniques (reweighting, ACF) require intervention before or during training. This chapter covers post-prediction bias mitigation: correcting a model's outputs without touching the model itself.
When to Use Output-Level Correction¶
- Model is already in production
- Can't retrain due to cost or complexity
- Need a safety net even after pre-processing and in-processing treatments
- Need model-agnostic approach that works regardless of algorithm
Reject Option Classifier (ROC)¶
The ROC approach operates on the decision boundary — the zone where the model is least confident.
Intuition¶
In binary classification, predictions near the decision boundary (P ≈ 0.5) are the least certain. The ROC says: in this uncertain zone, flip predictions that disadvantage the unprivileged group.
The Decision Boundary¶
For a standard binary classifier, the prediction is a simple threshold on the predicted probability:

\[
\hat{y} = \begin{cases} 1 & \text{if } P(Y=1 \mid x) > 0.5 \\ 0 & \text{otherwise} \end{cases}
\]

The ROC introduces a critical region around the boundary:

\[
\theta^- \le P(Y=1 \mid x) \le \theta^+
\]

where \(\theta^-\) and \(\theta^+\) define the uncertainty band (typically \(0.5 \pm\) some margin).
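A minimal sketch of the band test, assuming a symmetric band around 0.5; the function name `in_critical_region` and the `margin` parameter are illustrative, not from a specific library:

```python
def in_critical_region(p_pos: float, margin: float = 0.1) -> bool:
    """Return True when P(Y=1) falls inside the uncertainty band.

    theta_minus/theta_plus are hypothetical names for the band limits,
    here taken as 0.5 - margin and 0.5 + margin.
    """
    theta_minus, theta_plus = 0.5 - margin, 0.5 + margin
    return theta_minus <= p_pos <= theta_plus
```

For example, with the default margin of 0.1, a score of 0.55 lands inside the band while 0.9 is a confident positive and is left alone.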
```mermaid
graph LR
A["P(Y=1) < θ⁻"] -->|"Confident Negative"| B[Keep Prediction]
C["θ⁻ ≤ P(Y=1) ≤ θ⁺"] -->|"Critical Region"| D[Apply ROC Rules]
E["P(Y=1) > θ⁺"] -->|"Confident Positive"| B
D --> F{Protected Group?}
F -->|Unprivileged| G[Flip to Favourable]
F -->|Privileged| H[Flip to Unfavourable]
```
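The decision flow above can be sketched per example as follows. This is a hedged illustration, not a reference implementation: the function name, the `is_privileged` flag, and the symmetric `margin` band are assumptions for the sake of the example.

```python
def reject_option_classify(p_pos: float, is_privileged: bool,
                           margin: float = 0.1) -> int:
    """Apply the ROC rules to one example, given its P(Y=1) score."""
    theta_minus, theta_plus = 0.5 - margin, 0.5 + margin
    if theta_minus <= p_pos <= theta_plus:
        # Critical region: flip toward the favourable outcome (1) for the
        # unprivileged group and the unfavourable outcome (0) for the
        # privileged group.
        return 0 if is_privileged else 1
    # Outside the band, keep the model's confident prediction.
    return int(p_pos > 0.5)
```

Note that outside the critical region the correction is a no-op, which is why ROC tends to preserve overall accuracy: only the least-confident predictions are ever changed.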