Human-Uncertainty Distillation for Calibrated Vision Models on CIFAR-10H

Authors

  • Ziliang Samuel Zhong, New York University, NY, USA
  • Ruiyan Ma, Software Engineering, UC Irvine, CA, USA
  • Hailey Zhao, Business Analytics, Columbia University, NY, USA

DOI:

https://doi.org/10.69987/JACS.2023.30206

Keywords:

uncertainty calibration, CIFAR-10H, human soft labels, knowledge distillation, label distributions, selective prediction, robustness

Abstract

Human uncertainty is informative when a visual example is genuinely ambiguous, because a full label distribution captures plausible class confusions that a one-hot label suppresses. This paper evaluates human-uncertainty distillation (HUD) directly on CIFAR-10H, which provides human label distributions for the 10,000-image CIFAR-10 test set. The study uses a stratified 60/20/20 split of CIFAR-10H, yielding 6,000 training images, 2,000 validation images, and 2,000 test images. A compact vision classifier is trained on standardized HOG and color descriptors so that the effect of the supervision signal can be isolated from the effect of a larger backbone. HUD combines three terms: label-smoothed hard-label supervision, human soft-label distillation whose weight increases on high-entropy human targets, and a small entropy-alignment penalty. On the held-out test split, HUD reached 58.43% top-1 accuracy, a negative log-likelihood (NLL) of 1.1872, an expected calibration error (ECE) of 0.0284, a Brier score of 0.5449, and an area under the risk-coverage curve (AURC) of 0.2188. Relative to standard cross-entropy training, HUD improved accuracy by 1.50 percentage points and reduced NLL by 3.6%, ECE by 50.9%, and Brier score by 4.0%. Label smoothing remained a strong baseline, but HUD produced the best student NLL, Brier score, human-label cross-entropy, and selective-prediction AURC. Under five corruption families at three severities, HUD improved mean corrupted accuracy from 0.4145 to 0.4201 and reduced mean corrupted ECE from 0.1574 to 0.1233. The results show that real human soft labels can improve likelihood, calibration, selective prediction, and robustness even when top-1 gains are modest.
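
The abstract names the three ingredients of the HUD objective but not its exact form. The NumPy sketch below shows one plausible way to combine them and is not the authors' implementation. The function name hud_loss, the parameters smoothing, alpha, and beta, the normalization of human entropy by log K, and the squared-difference form of the entropy-alignment penalty are all assumptions made for illustration.

```python
import numpy as np

def softmax(logits):
    # Numerically stable row-wise softmax.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def entropy(p, eps=1e-12):
    # Shannon entropy (in nats) of each row of p.
    return -(p * np.log(p + eps)).sum(axis=1)

def hud_loss(logits, hard_labels, human_probs,
             smoothing=0.1, alpha=1.0, beta=0.05):
    """Hypothetical HUD-style objective (all hyperparameters are assumed).

    logits:      (n, k) student outputs
    hard_labels: (n,)   integer class labels
    human_probs: (n, k) human label distributions (rows sum to 1)
    """
    n, k = logits.shape
    probs = softmax(logits)
    log_probs = np.log(probs + 1e-12)

    # Term 1: label-smoothed hard-label cross-entropy.
    smoothed = np.full((n, k), smoothing / k)
    smoothed[np.arange(n), hard_labels] += 1.0 - smoothing
    ce_hard = -(smoothed * log_probs).sum(axis=1)

    # Term 2: cross-entropy against the human distribution, weighted
    # more heavily when the human target has high entropy (assumed form:
    # entropy normalized by its maximum, log k).
    h_human = entropy(human_probs)
    weight = alpha * h_human / np.log(k)
    ce_human = -(human_probs * log_probs).sum(axis=1)

    # Term 3: entropy alignment (assumed form: squared difference between
    # student entropy and human entropy).
    align = (entropy(probs) - h_human) ** 2

    return float((ce_hard + weight * ce_human + beta * align).mean())

# Usage: two 10-class examples with uninformative (all-zero) logits.
logits = np.zeros((2, 10))
labels = np.array([3, 7])
one_hot = np.eye(10)[labels]          # humans fully agree
uniform = np.full((2, 10), 0.1)       # humans maximally uncertain
loss_confident = hud_loss(logits, labels, one_hot)
loss_ambiguous = hud_loss(logits, labels, uniform)
```

Under this sketch, an image on which annotators fully agree contributes almost nothing through the distillation term, so training on it reduces to label-smoothed cross-entropy plus the alignment penalty, while an ambiguous image pulls the student toward the full human distribution.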

Author Biography

  • Hailey Zhao, Business Analytics, Columbia University, NY, USA

Published

2023-02-17

How to Cite

Ziliang Samuel Zhong, Ruiyan Ma, & Hailey Zhao. (2023). Human-Uncertainty Distillation for Calibrated Vision Models on CIFAR-10H. Journal of Advanced Computing Systems, 3(2), 77-89. https://doi.org/10.69987/JACS.2023.30206
