Profit-Maximizing Cost-Sensitive Credit Scoring with LLM-Extracted Policy Constraints
DOI: https://doi.org/10.69987/JACS.2024.40307

Keywords: Credit scoring, cost-sensitive learning, expected profit, threshold optimization, policy extraction, large language model

Abstract
Credit scoring models are typically trained to optimize statistical accuracy (e.g., AUC) and are later thresholded using ad-hoc business rules. This separation can be economically suboptimal because the lender’s actual objective is expected profit under policy constraints (e.g., minimum approval rate, maximum bad rate) and compliance requirements (e.g., adverse-action reason codes). This paper presents a profit-maximizing, cost-sensitive credit scoring framework in which credit policy text is converted into a profit function and decision constraints by a policy language model (Policy-LLM). The extracted parameters define an example-dependent utility matrix that uses each applicant’s credit limit as exposure-at-default, together with APR, funding rate, loss-given-default, and operational/collection costs. We then train probability-of-default (PD) models with cost-sensitive objectives and select cutoffs by directly maximizing empirical profit subject to the extracted policy constraints. On the UCI Default of Credit Card Clients dataset (30,000 applicants), we evaluate logistic regression, LightGBM, XGBoost, and a multilayer perceptron (MLP) under standard and cost-sensitive training, including weighted cross-entropy, focal loss, and a differentiable profit surrogate. Using a held-out tuning set to select cutoffs, XGBoost achieves the highest test-set mean profit of 4,046 NT$ per applicant at an approval rate of 48.2% and a bad rate of 8.1% under the base policy. The results show that (i) direct profit maximization yields materially different cutoffs than accuracy-driven thresholds, (ii) cost-sensitive training improves profitability most for linear models and neural networks, and (iii) policy constraints and compliant reason codes can be enforced without retraining by optimizing within a feasible threshold set.
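The cutoff-selection step described above (maximizing empirical profit over a feasible threshold set defined by a minimum approval rate and a maximum bad rate) can be sketched as follows. This is an illustrative simplification, not the paper's implementation: it uses a flat per-applicant margin and loss in place of the example-dependent utility matrix (which in the paper depends on credit limit, APR, funding rate, and loss-given-default), and the constraint values are hypothetical.

```python
import numpy as np

def select_cutoff(pd_scores, defaults, profit_per_good, loss_per_bad,
                  min_approval=0.4, max_bad_rate=0.1):
    """Grid-search a PD cutoff that maximizes empirical mean profit per
    applicant, subject to policy constraints on approval rate and bad rate.

    pd_scores  -- model-estimated probability of default per applicant
    defaults   -- observed default indicator (1 = defaulted)
    """
    best_t, best_profit = None, -np.inf
    for t in np.linspace(0.01, 0.99, 99):
        approve = pd_scores < t                 # approve below the PD cutoff
        approval_rate = approve.mean()
        if approval_rate < min_approval:        # policy: minimum approval rate
            continue
        bad_rate = defaults[approve].mean() if approve.any() else 0.0
        if bad_rate > max_bad_rate:             # policy: maximum bad rate
            continue
        # Simplified utilities: margin on good loans, loss on defaulted
        # loans, zero for declined applicants.
        per_applicant = np.where(defaults == 1, -loss_per_bad, profit_per_good)
        mean_profit = (per_applicant * approve).mean()
        if mean_profit > best_profit:
            best_t, best_profit = t, mean_profit
    return best_t, best_profit
```

Because the feasible set is expressed purely through the cutoff, tightening or relaxing a policy constraint only changes which thresholds pass the checks; the trained PD model itself is untouched, which is the "no retraining" property claimed in the abstract.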
