DriftGuard: Multi-Signal Drift Early Warning and Safe Re-Training/Rollback for CTR/CVR Models

Authors

  • Hanqi Zhang, Computer Science, University of Michigan, Ann Arbor, MI, USA

DOI:

https://doi.org/10.69987/JACS.2023.30703

Keywords:

distribution shift, concept drift, CTR, CVR, monitoring, change-point detection, MMD, PSI, retraining, rollback, MLOps

Abstract

When CTR/CVR prediction models are deployed in production, their offline validation accuracy is rarely stable. Real traffic evolves continuously due to changes in user intent, inventory, policies, and the surrounding ecosystem. As a result, both distribution shift (covariate and label shift) and concept drift can silently degrade business KPIs. This paper presents DriftGuard, a practical monitoring-and-mitigation framework that couples multi-signal drift scoring with change-point detection and an action policy layer that can automatically trigger re-training and safely roll back. DriftGuard fuses (i) a covariate-shift score based on a linear Maximum Mean Discrepancy (MMD) between reference and recent feature distributions, (ii) a prediction-shift score based on the Population Stability Index (PSI) over predicted probability histograms, and (iii) a performance-shift score that monitors batch log-loss and a Page–Hinkley increment statistic. A lightweight CUSUM-style change-point detector converts scores into alarms, while an action policy decides between periodic retraining, alarm-triggered retraining, and alarm-triggered retraining with rollback based on post-deployment canary performance. We conduct end-to-end experimental evaluations on three public benchmarks that cover both tabular/text CTR-like streams and KPI time series: the WILDS CivilComments dataset, the Wild-Time HuffPost benchmark, and the Numenta Anomaly Benchmark (NAB). To obtain controlled ground-truth drift types, we construct reproducible synthetic deployment streams from the full datasets via stratified sampling that induces covariate shift, label shift, and temporal shift. Across the two CTR/CVR-style streams, DriftGuard achieves strong transition-level drift detection (mean AUROC 0.933 on CivilComments and 0.917 on HuffPost) and supports mitigation policies that reduce cumulative log-loss by 20.6% and 25.8%, respectively, versus a no-action baseline.
On NAB KPI streams, DriftGuard is competitive with strong statistical baselines and provides interpretable alarms aligned with annotated anomaly windows. These results suggest that combining complementary signals with conservative mitigation policies yields a robust, engineering-ready approach to drift management.
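The three signals named in the abstract (PSI over prediction histograms, linear MMD over feature means, and a Page–Hinkley statistic over batch log-loss) are standard constructions. As a rough illustrative sketch only, not the paper's implementation (bin counts, the `delta` drift allowance, and alarm thresholds below are placeholder choices), they can be computed along these lines:

```python
import numpy as np

def psi(ref_probs, cur_probs, bins=10):
    """Population Stability Index between reference and recent
    predicted-probability samples; bins come from reference quantiles."""
    edges = np.quantile(ref_probs, np.linspace(0.0, 1.0, bins + 1))
    edges[0], edges[-1] = 0.0, 1.0
    ref_hist, _ = np.histogram(ref_probs, bins=edges)
    cur_hist, _ = np.histogram(cur_probs, bins=edges)
    eps = 1e-6  # avoid log(0) for empty bins
    p = ref_hist / ref_hist.sum() + eps
    q = cur_hist / cur_hist.sum() + eps
    return float(np.sum((p - q) * np.log(p / q)))

def linear_mmd(ref_feats, cur_feats):
    """Linear-kernel MMD^2: squared distance between feature-mean vectors."""
    diff = ref_feats.mean(axis=0) - cur_feats.mean(axis=0)
    return float(diff @ diff)

class PageHinkley:
    """Page–Hinkley test for an upward shift in a monitored statistic,
    e.g. per-batch log-loss. Alarms when the cumulative positive deviation
    from the running mean exceeds `threshold`; `delta` is the tolerated
    magnitude of drift per step."""
    def __init__(self, delta=0.005, threshold=1.0):
        self.delta, self.threshold = delta, threshold
        self.mean, self.n, self.cum, self.min_cum = 0.0, 0, 0.0, 0.0

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return (self.cum - self.min_cum) > self.threshold
```

In a setup like this, each score stream would be thresholded (or fed to a CUSUM-style detector such as the `PageHinkley` class above) to emit alarms, which the action-policy layer then maps to retrain/rollback decisions.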

Author Biography

  • Hanqi Zhang, Computer Science, University of Michigan, Ann Arbor, MI, USA

Published

2023-07-16

How to Cite

Hanqi Zhang. (2023). DriftGuard: Multi-Signal Drift Early Warning and Safe Re-Training/Rollback for CTR/CVR Models. Journal of Advanced Computing Systems, 3(7), 24–40. https://doi.org/10.69987/JACS.2023.30703