
Sullivan Hué

Researcher, Aix-Marseille Université, Faculté d'économie et de gestion (FEG)

Econometrics, finance and mathematical methods
Status
Associate Professor (Maître de conférences)
Research area(s)
Econometrics, Finance
Thesis
2020, Laboratoire d'Economie d'Orléans
Download
CV
Address

Maison de l'économie et de la gestion d'Aix
424 chemin du viaduc, CS80429
13097 Aix-en-Provence Cedex 2

Abstract: Despite their high predictive performance, random forests and gradient boosting are often considered black boxes, which has raised concerns among practitioners and regulators. As an alternative, we suggest using partial linear models, which are inherently interpretable. Specifically, we propose combining parametric and non-parametric functions to accurately capture the linearities and non-linearities between dependent and explanatory variables, together with a variable selection procedure to control overfitting. Estimation relies on a two-step procedure building on the double residual method. We illustrate the predictive performance and interpretability of our approach on a regression problem.
Keywords: Machine learning, Lasso, Autometrics, GAM
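The two-step, double residual idea mentioned in the abstract above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the model form y = x·β + g(z) + noise, the Gaussian-kernel Nadaraya-Watson smoother, the fixed bandwidth, the synthetic data, and the function names `nw_smooth` and `double_residual`. The variable selection step (Lasso or Autometrics) is omitted.

```python
import numpy as np

def nw_smooth(z, target, bandwidth):
    """Nadaraya-Watson estimate of E[target | z] at each sample point (Gaussian kernel)."""
    diffs = (z[:, None] - z[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs ** 2)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ target

def double_residual(y, x, z, bandwidth=0.1):
    """Two-step estimate of beta and g in y = x @ beta + g(z) + noise (illustrative)."""
    # Step 1: partial z out of y and of every column of x non-parametrically.
    y_res = y - nw_smooth(z, y, bandwidth)
    x_res = x - nw_smooth(z, x, bandwidth)
    # Step 2: OLS of the residualised y on the residualised x gives the linear part.
    beta, *_ = np.linalg.lstsq(x_res, y_res, rcond=None)
    # Recover the non-linear part by smoothing what the linear part leaves unexplained.
    g_hat = nw_smooth(z, y - x @ beta, bandwidth)
    return beta, g_hat

# Synthetic data (hypothetical): two linear regressors correlated with z, one non-linear effect.
rng = np.random.default_rng(0)
n = 500
z = rng.uniform(0, 1, n)
x = rng.normal(size=(n, 2)) + z[:, None]
y = x @ np.array([1.5, -0.5]) + np.sin(2 * np.pi * z) + 0.1 * rng.normal(size=n)

beta_hat, g_hat = double_residual(y, x, z)
print("estimated linear coefficients:", beta_hat)
```

The estimated coefficients remain directly readable as marginal effects, while `g_hat` can be plotted against `z` to inspect the non-linear component.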
Abstract: In the context of credit scoring, ensemble methods based on decision trees, such as the random forest method, provide better classification performance than standard logistic regression models. However, logistic regression remains the benchmark in the credit risk industry, mainly because the lack of interpretability of ensemble methods is incompatible with the requirements of financial regulators. In this paper, we propose a high-performance and interpretable credit scoring method called penalised logistic tree regression (PLTR), which uses information from decision trees to improve the performance of logistic regression. Formally, rules extracted from various short-depth decision trees built with the original predictive variables are used as predictors in a penalised logistic regression model. PLTR allows us to capture non-linear effects that can arise in credit scoring data while preserving the intrinsic interpretability of the logistic regression model. Monte Carlo simulations and empirical applications using four real credit default datasets show that PLTR predicts credit risk significantly more accurately than logistic regression and is competitive with the random forest method.
Keywords: Risk management, Credit scoring, Machine learning, Interpretability, Econometrics
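The idea of feeding tree-derived rules into a penalised logistic regression, as described in the abstract above, might look roughly like the following scikit-learn sketch. It is not the authors' implementation. The choices below are assumptions made for illustration: one depth-2 tree per pair of predictors, leaf-membership indicators as the rules, the original predictors kept alongside the rules, a plain L1 penalty with `C=1.0`, the synthetic data, and the helper names `fit_rule_trees` and `rule_features`.

```python
from itertools import combinations

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

def fit_rule_trees(X, y, max_depth=2):
    """Fit one shallow tree per pair of predictors (hypothetical helper)."""
    trees = []
    for pair in combinations(range(X.shape[1]), 2):
        cols = list(pair)
        tree = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
        tree.fit(X[:, cols], y)
        leaves = np.unique(tree.apply(X[:, cols]))  # leaf node ids seen in training
        trees.append((cols, tree, leaves))
    return trees

def rule_features(X, trees):
    """Encode each observation's leaf membership as binary rule indicators."""
    blocks = []
    for cols, tree, leaves in trees:
        leaf_of = tree.apply(X[:, cols])
        # Drop the first leaf of each tree so the dummies are not perfectly collinear.
        blocks.append((leaf_of[:, None] == leaves[None, 1:]).astype(float))
    return np.hstack(blocks)

# Synthetic "default" indicator (hypothetical) driven by an interaction of two predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
y = ((X[:, 0] > 0) & (X[:, 1] > 0)).astype(int)
X_train, X_test, y_train, y_test = X[:1500], X[1500:], y[:1500], y[1500:]

trees = fit_rule_trees(X_train, y_train)
Z_train = np.hstack([X_train, rule_features(X_train, trees)])
Z_test = np.hstack([X_test, rule_features(X_test, trees)])

# L1-penalised logistic regression on original predictors plus rule indicators.
model = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
model.fit(Z_train, y_train)
acc = model.score(Z_test, y_test)
print("held-out accuracy:", acc)
```

Because each rule is a simple threshold condition on at most two variables, every non-zero coefficient in `model.coef_` can still be read as the log-odds contribution of an explicit, human-readable rule.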