Sullivan Hué
Researcher, Aix-Marseille Université, Faculté d'économie et de gestion (FEG)
- Position
- Associate Professor (Maître de conférences)
- Research field(s)
- Econometrics, Finance
- PhD thesis
- 2020, Laboratoire d'Economie d'Orléans
- Download
- CV
- Contact
- sullivan.hue[at]univ-amu.fr
- Address
Maison de l'économie et de la gestion d'Aix
424 chemin du viaduc, CS80429
13097 Aix-en-Provence Cedex 2
Sullivan Hué, Christophe Hurlin, Christophe Pérignon, Sébastien Saurin, Management Science, 04/2026
Abstract
Because they play an increasingly important role in determining access to credit, credit scoring models are under growing scrutiny from banking supervisors and internal model validators. These authorities need to monitor model performance and identify its key drivers. To facilitate this, we introduce the explainable performance (XPER) methodology to decompose a performance metric (e.g., area under the curve (AUC), R²) into specific contributions associated with the various features of a forecasting model. XPER is theoretically grounded on Shapley values and is both model-agnostic and performance metric-agnostic. Furthermore, it can be implemented either at the model level or at the individual level. Using a novel data set of car loans, we decompose the AUC of a machine-learning model trained to forecast the default probability of loan applicants. We show that a small number of features can explain a surprisingly large part of the model performance. Notably, the features that contribute the most to the predictive performance of the model may not be the ones that contribute the most to individual forecasts (Shapley additive explanation). Finally, we show how XPER can be used to deal with heterogeneity issues and improve performance.
This paper was accepted by Kay Giesecke, finance.
Funding: The authors thank the Institut Universitaire de France, the Autorité de contrôle prudentiel et de résolution Chair in Regulation and Systemic Risk, the HEC-Deloitte Chair on Artificial Intelligence for Business Innovation, the Excellence Initiative of Aix-Marseille University [A*MIDEX], and the French National Research Agency [Grants AMSE ANR-17-EURE-0020, Ecodec ANR-11-LABX-0047, and MLEforRisk ANR-21-CE26-0007] for supporting our research.
Supplemental Material: The online appendices and data files are available at https://doi.org/10.1287/mnsc.2023.02025.
Keywords
Explainability, Credit scoring, Performance metrics, Shapley values
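To make the idea of decomposing a performance metric with Shapley values more concrete, here is a minimal sketch in Python. It is not the authors' XPER implementation. It assumes a hypothetical linear scoring model with made-up weights, simulated data, and a simplified coalition value in which features outside the coalition are frozen at their sample mean. Every name (`auc`, `coalition_auc`, `shapley_performance`, `weights`) is illustrative.

```python
from itertools import combinations
from math import factorial
import random

def auc(scores, labels):
    """Share of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def coalition_auc(X, y, weights, coalition, means):
    """AUC when only the features in `coalition` keep their observed values.
    Assumption: all other features are frozen at their sample mean."""
    scores = [
        sum(w * (row[j] if j in coalition else means[j])
            for j, w in enumerate(weights))
        for row in X
    ]
    return auc(scores, y)

def shapley_performance(X, y, weights):
    """Exact Shapley contribution of each feature to the AUC."""
    n = len(weights)
    means = [sum(row[j] for row in X) / len(X) for j in range(n)]
    phi = [0.0] * n
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for size in range(n):
            w = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_j = coalition_auc(X, y, weights, set(subset) | {j}, means)
                without_j = coalition_auc(X, y, weights, set(subset), means)
                phi[j] += w * (with_j - without_j)
    return phi, means

# Simulated data: the third feature is pure noise and gets zero weight.
random.seed(1)
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [1 if r[0] + 0.5 * r[1] + random.gauss(0, 1) > 0 else 0 for r in X]
weights = [1.0, 0.5, 0.0]  # hypothetical fitted scoring weights

contributions, means = shapley_performance(X, y, weights)
full_auc = coalition_auc(X, y, weights, {0, 1, 2}, means)
benchmark = coalition_auc(X, y, weights, set(), means)  # constant score

print("AUC:", round(full_auc, 3), "benchmark:", benchmark)
print("contributions:", [round(c, 3) for c in contributions])
```

In this sketch the contributions sum to the gap between the model's AUC and the benchmark AUC of a constant score, which is the efficiency property of Shapley values. The exact enumeration over coalitions grows exponentially with the number of features, so it is only practical for a handful of them.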
Emmanuel Flachaire, Sullivan Hué, Sébastien Laurent, Gilles Hacheme, Oxford Bulletin of Economics and Statistics, 12/2023
Abstract
Despite their high predictive performance, random forest and gradient boosting are often considered black boxes, which has raised concerns among practitioners and regulators. As an alternative, we suggest using partial linear models that are inherently interpretable. Specifically, we propose to combine parametric and non‐parametric functions to accurately capture linearities and non‐linearities prevailing between dependent and explanatory variables, and a variable selection procedure to control for overfitting issues. Estimation relies on a two‐step procedure building upon the double residual method. We illustrate the predictive performance and interpretability of our approach on a regression problem.
Keywords
Machine learning, Lasso, Autometrics, GAM
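The two-step double residual idea mentioned in the abstract above can be sketched on simulated data. This is a simplified illustration, not the authors' procedure. It assumes a single linear regressor `x`, a single non-linear regressor `z`, and swaps in a Nadaraya-Watson kernel smoother with a hand-picked bandwidth for the non-parametric step; it performs no variable selection. All names and the data-generating process are hypothetical.

```python
import math
import random

def kernel_smooth(z, target, h=0.1):
    """Nadaraya-Watson estimate of E[target | z] at each sample point,
    using a Gaussian kernel with bandwidth h (an assumed value)."""
    fitted = []
    for zi in z:
        w = [math.exp(-0.5 * ((zi - zj) / h) ** 2) for zj in z]
        fitted.append(sum(wj * tj for wj, tj in zip(w, target)) / sum(w))
    return fitted

# Simulated partially linear model: y = beta * x + g(z) + noise.
random.seed(0)
n = 300
beta_true = 2.0
z = [random.random() for _ in range(n)]
x = [zi + random.gauss(0, 1) for zi in z]  # x is correlated with z
y = [beta_true * xi + math.sin(2 * math.pi * zi) + random.gauss(0, 0.1)
     for xi, zi in zip(x, z)]

# Step 1: remove the part of y and of x that is explained by z.
res_y = [yi - fi for yi, fi in zip(y, kernel_smooth(z, y))]
res_x = [xi - fi for xi, fi in zip(x, kernel_smooth(z, x))]

# Step 2: regress the residual of y on the residual of x (no intercept).
beta_hat = (sum(a * b for a, b in zip(res_x, res_y))
            / sum(a * a for a in res_x))

# Recover the non-linear part by smoothing what the linear part leaves over.
g_hat = kernel_smooth(z, [yi - beta_hat * xi for yi, xi in zip(y, x)])

print("estimated beta:", round(beta_hat, 3))
```

The linear coefficient stays directly readable, while the fitted `g_hat` can be plotted against `z` to inspect the non-linear effect, which is the sense in which such a model remains interpretable.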
Elena Ivona Dumitrescu, Sullivan Hué, Christophe Hurlin, Sessi Tokpavi, European Journal of Operational Research, Vol. 297, No. 3, pp. 1178-1192, 01/2022
Abstract
In the context of credit scoring, ensemble methods based on decision trees, such as the random forest method, provide better classification performance than standard logistic regression models. However, logistic regression remains the benchmark in the credit risk industry mainly because the lack of interpretability of ensemble methods is incompatible with the requirements of financial regulators. In this paper, we propose a high-performance and interpretable credit scoring method called penalised logistic tree regression (PLTR), which uses information from decision trees to improve the performance of logistic regression. Formally, rules extracted from various short-depth decision trees built with original predictive variables are used as predictors in a penalised logistic regression model. PLTR allows us to capture non-linear effects that can arise in credit scoring data while preserving the intrinsic interpretability of the logistic regression model. Monte Carlo simulations and empirical applications using four real credit default datasets show that PLTR predicts credit risk significantly more accurately than logistic regression and compares competitively to the random forest method.
Keywords
Risk management, Credit scoring, Machine learning, Interpretability, Econometrics
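The mechanism described in the abstract above, rules from short decision trees feeding a penalised logistic regression, can be illustrated with a small standard-library sketch. It is not the PLTR implementation. It assumes Gini-based threshold splits, follows only the upper branch of each first split, uses a plain L1 penalty fitted by proximal gradient descent, and reports in-sample accuracy on simulated data. All function names and tuning values are hypothetical.

```python
import math
import random

def gini(labels):
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_threshold(values, labels):
    """Threshold t minimising the weighted Gini impurity of {v <= t}, {v > t}."""
    best_t, best_imp = None, float("inf")
    for t in sorted(set(values))[:-1]:  # skip the max: right child never empty
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        imp = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if imp < best_imp:
            best_t, best_imp = t, imp
    return best_t

def extract_rules(X, y):
    """One univariate rule per feature plus one bivariate rule per ordered pair.
    Simplification: the second split is only searched in the upper branch."""
    n_feat, rules = len(X[0]), []
    for j in range(n_feat):
        t_j = best_threshold([row[j] for row in X], y)
        rules.append(lambda row, j=j, a=t_j: float(row[j] > a))
        for k in range(n_feat):
            if k == j:
                continue
            sub = [(row[k], lab) for row, lab in zip(X, y) if row[j] > t_j]
            t_k = best_threshold([v for v, _ in sub], [lab for _, lab in sub])
            rules.append(lambda row, j=j, k=k, a=t_j, b=t_k:
                         float(row[j] > a and row[k] > b))
    return rules

def fit_l1_logistic(Z, y, lam=0.01, lr=0.5, n_iter=500):
    """Logistic regression with an L1 penalty, fitted by proximal gradient."""
    n, d = len(Z), len(Z[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        p = [1 / (1 + math.exp(-(b + sum(wj * zj for wj, zj in zip(w, z)))))
             for z in Z]
        err = [pi - yi for pi, yi in zip(p, y)]
        b -= lr * sum(err) / n  # the intercept is not penalised
        for j in range(d):
            step = w[j] - lr * sum(e * z[j] for e, z in zip(err, Z)) / n
            w[j] = math.copysign(max(abs(step) - lr * lam, 0.0), step)
    return w, b

# Simulated data with a pure interaction effect: y = 1 only if both x's are positive.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(300)]
y = [1 if a > 0 and b > 0 else 0 for a, b in X]

rules = extract_rules(X, y)
Z = [[rule(row) for rule in rules] for row in X]  # binary rule indicators
w, b = fit_l1_logistic(Z, y)
pred = [1 if b + sum(wj * zj for wj, zj in zip(w, z)) > 0 else 0 for z in Z]
accuracy = sum(p == t for p, t in zip(pred, y)) / len(y)

print("rule weights:", [round(v, 2) for v in w], "accuracy:", round(accuracy, 3))
```

Each coefficient in `w` attaches to a readable rule such as "feature 0 above a threshold and feature 1 above a threshold", so the fitted model keeps the form of a logistic regression while picking up the interaction that a purely linear specification would miss.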