Ghattas | AMSE | Aix-Marseille School of Economics

Journal article

Deep learning-driven false-lumen volumes predict adverse remodeling better than diameter in patients with residual aortic dissection on CT

Joris Fournel, Mariangela de Masi, Charlotte Lu, Virgile Omnes, Baptiste Muselier, Badih Ghattas, Olivier Bouchot, Moundji Kafi, Alain Lalande, Marine Gaudry, Alexis Jacquier, Axel Bartoli, European Radiology, 11/2025

Abstract Objectives: 1. To develop a deep-learning segmentation model for automated measurement of maximal aortic diameter (D max) and volumes of aortic dissection components: true-lumen (TL), circulating false-lumen (CFL), and thrombus (Th) on CT angiography (CTA). 2. To assess the predictive value of these measures for adverse aortic remodeling in residual aortic dissection (RAD). Materials and methods: This retrospective study included 322 patients from two centers. The segmentation model was trained on 120 patients (Center 1) and tested on an internal dataset (30 patients, Center 1) and an external dataset (10 patients, Center 2) in terms of Dice Similarity Coefficient (DSC). The model extracted D max, global false-lumen volume (FL Glo = CFL + Th), and local false-lumen volume (FL Loc, measured 3 cm around the largest diameter). Clinical validation was performed on 83 patients from Center 1 (internal validation, 2-year follow-up) and 79 patients from Center 2 (external validation, 4.5-year follow-up). Results: The segmentation model achieved high accuracy (Center 1, DSC: 0.93 TL, 0.93 CFL, 0.87 Th; Center 2, DSC: 0.92 TL, 0.93 CFL, 0.84 Th) with strong agreement between automated and manual measurements. Aortic remodeling occurred in 39/83 patients (47%) from Center 1 and 33/79 patients (42%) from Center 2. FL Loc outperformed D max and FL Glo (Center 1: AUC = 0.83, 0.73, and 0.76; Center 2: AUC = 0.77, 0.64, and 0.70). At optimal thresholds, FL Loc showed good predictive performance (Center 1: Sensitivity = 0.87, Specificity = 0.68). Conclusion: Deep-learning segmentation provides accurate aortic measurements. Local false-lumen volumes predict adverse aortic remodeling in RAD better than diameter and global false-lumen volumes.
Key Points. Question: In residual aortic dissection (RAD) after type-A dissection, early identification of high-risk patients on initial CT angiography is crucial for endovascular treatment decisions. Findings: False-lumen local volumes (3 cm around aortic dissection maximal diameters), obtained with an automatic deep-learning method, predict adverse remodeling better than diameter or global false-lumen volumes. Clinical relevance: A deep-learning segmentation method for aortic dissection components on CTA, enabling automatic measurement of diameters and volumes, is feasible. It provides local false-lumen volumes, a better predictive marker of adverse aortic remodeling than the currently used diameters and global volumes.

Keywords Aortic dissection, Computed tomography angiography, Deep-learning, Prognosis, Computer-assisted image processing
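The Dice Similarity Coefficient used above to score the segmentations can be illustrated with a minimal sketch. It assumes the standard definition, DSC = 2|A ∩ B| / (|A| + |B|), applied to two binary masks flattened into lists of 0/1 values. The function name `dice_coefficient` and the toy masks are hypothetical and are not taken from the study.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice Similarity Coefficient between two binary masks (flat sequences of 0/1)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must contain the same number of voxels")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        # Convention assumed here: two empty masks agree perfectly.
        return 1.0
    return 2.0 * overlap / (size_a + size_b)


# Toy example (made-up voxels): 4 foreground voxels in each mask, 3 shared.
manual_mask = [0, 1, 1, 1, 0, 0, 1, 0]
automatic_mask = [0, 1, 1, 0, 0, 1, 1, 0]
dsc = dice_coefficient(manual_mask, automatic_mask)
print(dsc)  # 2 * 3 / (4 + 4) = 0.75
```

A DSC of 1 means the two masks coincide and 0 means they share no voxel, so the reported values of 0.84 to 0.93 indicate substantial overlap between automated and manual contours.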


Journal article

A hypothesis test for comparing two partitions obtained from the same dataset

Mathias Bourel, Badih Ghattas, Meliza González, Communications in Statistics - Simulation and Computation, pp. 1-23, 02/2025

Abstract We propose a nonparametric hypothesis test to compare two partitions of the same data set. The partitions may result from two different clustering approaches. The test may be carried out using any comparison index, but we focus in particular on the Matching Error (ME), which is related to the misclassification error in supervised learning. Some properties of the ME and, especially, its distribution function for the case of two different partitions are analyzed. Extensive simulations and experiments show the efficiency of the test.

Keywords Clustering, Comparing partitions, Hypothesis test, Matching error
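To make the Matching Error concrete, here is a minimal sketch. It assumes the usual definition: the smallest fraction of observations on which the two partitions disagree, taken over all one-to-one matchings of their cluster labels. The abstract does not spell out the formula, so this definition, the function name `matching_error`, and the brute-force search over permutations (practical only for a handful of clusters) are assumptions made for illustration.

```python
from itertools import permutations


def matching_error(labels_a, labels_b):
    """Smallest disagreement rate between two partitions over all label matchings."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("partitions must be non-empty and of equal length")
    n = len(labels_a)
    groups_a = list(dict.fromkeys(labels_a))  # distinct labels, first-seen order
    groups_b = list(dict.fromkeys(labels_b))
    # Pad the shorter side with dummy labels so a one-to-one matching always exists.
    k = max(len(groups_a), len(groups_b))
    groups_a += [("dummy_a", i) for i in range(k - len(groups_a))]
    groups_b += [("dummy_b", i) for i in range(k - len(groups_b))]
    # Contingency counts: how many observations carry label a in A and label b in B.
    counts = {}
    for a, b in zip(labels_a, labels_b):
        counts[(a, b)] = counts.get((a, b), 0) + 1
    best_agreement = max(
        sum(counts.get((a, b), 0) for a, b in zip(groups_a, perm))
        for perm in permutations(groups_b)
    )
    return 1.0 - best_agreement / n


# Toy example (made-up labels): the best matching leaves 1 of 8 points mismatched.
partition_a = [0, 0, 0, 1, 1, 1, 2, 2]
partition_b = [1, 1, 0, 0, 0, 0, 2, 2]
me = matching_error(partition_a, partition_b)
print(me)  # 0.125
```

For a larger number of clusters, the permutation search would normally be replaced by an assignment solver such as the Hungarian algorithm.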


Journal article

Clustering Approaches for Mixed‐Type Data: A Comparative Study

Badih Ghattas, Alvaro Sanchez San-Benito, Journal of Probability and Statistics, Vol. 2025, No. 1, 01/2025

Abstract Clustering is widely used in unsupervised learning to find homogeneous groups of observations within a dataset. However, clustering mixed-type data remains a challenge, as few existing approaches are suited for this task. This study presents the state of the art of these approaches and compares them using various simulation models. The compared methods include the distance-based approaches k-prototypes, PDQ, and convex k-means, and the probabilistic methods KAy-means for MIxed LArge data (KAMILA), the mixture of Bayesian networks (MBNs), and latent class model (LCM). The aim is to provide insights into the behavior of different methods across a wide range of scenarios by varying some experimental factors such as the number of clusters, cluster overlap, sample size, dimension, proportion of continuous variables in the dataset, and clusters' distribution. The degree of cluster overlap, the proportion of continuous variables in the dataset, and the sample size have a significant impact on the observed performances. When strong interactions exist between variables alongside an explicit dependence on cluster membership, none of the evaluated methods demonstrated satisfactory performance. In our experiments, KAMILA, LCM, and k-prototypes exhibited the best performance with respect to the adjusted Rand index (ARI). All the methods are available in R.

Keywords Bayesian networks, Clustering, KAMILA, LCM, Mixed-type data
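The adjusted Rand index used as the performance criterion can be sketched in a few lines. The study itself works in R; the Python function below is only an illustration of the standard Hubert and Arabie formula, and the name `adjusted_rand_index` and the toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same observations."""
    n = len(labels_true)
    if n != len(labels_pred) or n < 2:
        raise ValueError("need two labelings of equal length with at least 2 points")
    cells = Counter(zip(labels_true, labels_pred))  # contingency table
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        # Degenerate case (for example, a single cluster on both sides).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Toy example (made-up labels): the second cluster is split in two.
ari = adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2])
print(round(ari, 4))  # 4/7, about 0.5714
```

The index equals 1 for identical partitions up to relabeling and is close to 0 when the agreement is no better than chance, which makes it suitable for comparing methods across simulation scenarios with different numbers of clusters.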


Journal article

Fully automated epicardial adipose tissue volume quantification with deep learning and relationship with CAC score and micro/macrovascular complications in people living with type 2 diabetes: the multicenter EPIDIAB study

Bénédicte Gaborit, Jean Julla, Joris Fournel, Patricia Ancel, Astrid Soghomonian, Camille Deprade, Adèle Lasbleiz, Marie Houssays, Badih Ghattas, Pierre Gascon, Maud Righini, Frédéric Matonti, Nicolas Venteclef, Louis Potier, Jean Gautier, Noémie Resseguier, Axel Bartoli, Florian Mourre, Patrice Darmon, Alexis Jacquier, Anne Dutour, Cardiovascular Diabetology, Vol. 23, No. 1, pp. 328, 09/2024

Abstract Background: The aim of this study (EPIDIAB) was to assess the relationship between epicardial adipose tissue (EAT) and the micro- and macrovascular complications (MVC) of type 2 diabetes (T2D). Methods: EPIDIAB is a post hoc analysis of the AngioSafe T2D study, a multicenter study aimed at determining the safety of antihyperglycemic drugs on the retina, which included patients with T2D screened for diabetic retinopathy (DR) (n = 7200) and deeply phenotyped for MVC. Included patients who had undergone cardiac CT for CAC (Coronary Artery Calcium) scoring after inclusion (n = 1253) were analyzed with a validated deep learning segmentation pipeline for EAT volume quantification. Results: Median age of the study population was 61 years [54;67], with a majority of men (57%), a median disease duration of 11 years [5;18], and a mean HbA1c of 7.8 ± 1.4%. EAT was significantly associated with all traditional CV risk factors. EAT volume significantly increased with chronic kidney disease (CKD vs no CKD: 87.8 [63.5;118.6] vs 82.7 mL [58.8;110.8], p = 0.008), coronary artery disease (CAD vs no CAD: 112.2 [82.7;133.3] vs 83.8 mL [59.4;112.1], p = 0.0004), peripheral arterial disease (PAD vs no PAD: 107 [76.2;141] vs 84.6 mL [59.2;114], p = 0.0005), and elevated CAC score (> 100 vs < 100 AU: 96.8 mL [69.1;130] vs 77.9 mL [53.8;107.7], p < 0.0001). By contrast, EAT volume was associated neither with DR nor with peripheral neuropathy. We further identified a subgroup of patients with high EAT volume and a null CAC score. Interestingly, this group was more likely to be composed of young women with a high BMI, a shorter duration of T2D, a lower prevalence of microvascular complications, and a higher inflammatory profile. Conclusions: Fully automated EAT volume quantification could provide useful information about the risk of both renal and macrovascular complications in T2D patients.

Keywords CAC score, Cardiac computed tomography, Deep learning, Epicardial adipose tissue, Type 2 diabetes


Journal article

Textual data for electricity load forecasting

David Obst, Sandra Claudel, Jairo Cugliari, Badih Ghattas, Yannig Goude, Georges Oppenheim, Quality and Reliability Engineering International, 08/2024

Abstract Traditional mid-term electricity forecasting models rely on calendar and meteorological information such as temperature and wind speed to achieve high performance. However, depending on such variables has drawbacks, as they may not be informative enough during extreme weather. While ubiquitous, textual sources of information are rarely included in prediction algorithms for time series, despite the relevant information they may contain. In this work, we propose to leverage openly accessible weather reports for electricity demand and meteorological time series prediction problems. Our experiments on French and British load data show that the considered textual sources improve the overall accuracy of the reference model, particularly during extreme weather events such as storms or abnormal temperatures. Additionally, we apply our approach to the problem of imputation of missing values in meteorological time series, and we show that our text-based approach beats standard methods. Furthermore, the influence of words on the time series' predictions can be interpreted for the considered encoding schemes of the text, leading to greater confidence in our results.


Journal article

Left Ventricular Trabeculations at Cardiac MRI: Reference Ranges and Association with Cardiovascular Risk Factors in UK Biobank

Nay Aung, Axel Bartoli, Elisa Rauseo, Sébastien Cortaredona, Mihir Sanghvi, Joris Fournel, Badih Ghattas, Mohammed Khanji, Steffen Petersen, Alexis Jacquier, Radiology, Vol. 311, No. 1, 04/2024

Abstract In an analysis applying automated segmentation to UK Biobank MRI scans, hypertension, higher body mass index, and higher physical activity level were associated with increased left ventricular trabeculations in healthy middle-aged White adults.


Journal article

Subsampling under distributional constraints

Florian Combes, Ricardo Fraiman, Badih Ghattas, Statistical Analysis and Data Mining, Vol. 17, No. 1, 02/2024

Abstract Some complex models are frequently employed to describe physical and mechanical phenomena. In this setting, we have an input $X$ in a general space, and an output $Y = f(X)$ where $f$ is a very complicated function whose evaluation on every new input is computationally very costly and may also be very expensive. We are given two sets of observations of $X$, $S_1$ and $S_2$, of different sizes such that only $f(S_1)$ is available. We tackle the problem of selecting a subset $S_3 \subset S_2$ of smaller size on which to run the complex model $f$, and such that the empirical distribution of $f(S_3)$ is close to that of $f(S_1)$. We suggest three algorithms to solve this problem and show their efficiency using simulated datasets and the Airfoil self-noise data set.


Journal article

Finding the best trade-off between performance and interpretability in predicting hospital length of stay using structured and unstructured data

Franck Jaotombo, Luca Adorni, Badih Ghattas, Laurent Boyer, PLoS ONE, Vol. 18, No. 11, 22 pp., 11/2023

Abstract Objective: This study aims to develop high-performing Machine Learning and Deep Learning models for predicting hospital length of stay (LOS) while enhancing interpretability. We compare the performance and interpretability of models trained only on structured tabular data with models trained only on unstructured clinical text data, and on mixed data. Methods: The structured data was used to train fourteen classical Machine Learning models including advanced ensemble trees, neural networks, and k-nearest neighbors. The unstructured data was used to fine-tune a pre-trained Bio Clinical BERT Transformer Deep Learning model. The structured and unstructured data were then merged into a tabular dataset after vectorization of the clinical text and a dimensional reduction through Latent Dirichlet Allocation. The study used the free and publicly available Medical Information Mart for Intensive Care (MIMIC) III database, with the open AutoML library AutoGluon. Performance is evaluated with respect to two types of random classifiers, used as baselines. Results: The best model from structured data demonstrates high performance (ROC AUC = 0.944, PRC AUC = 0.655) with limited interpretability, where the most important predictors of prolonged LOS are the levels of blood urea nitrogen and of platelets. The Transformer model displays a good but lower performance (ROC AUC = 0.842, PRC AUC = 0.375) with a richer array of interpretability, providing more specific in-hospital factors including procedures, conditions, and medical history. The best model trained on mixed data achieves both a high level of performance (ROC AUC = 0.963, PRC AUC = 0.746) and a much larger scope in interpretability, including pathologies of the intestine, the colon, and the blood; infectious diseases; respiratory problems; procedures involving sedation and intubation; and vascular surgery. Conclusions: Our results outperform most of the state-of-the-art models in LOS prediction both in terms of performance and of interpretability. Data fusion between structured and unstructured text data may significantly improve performance and interpretability.

Keywords Clinical transformers, Structured and unstructured data, Data fusion, Explainable AI, Hospital length of stay


Journal article

Machine Learning Alternatives to Response Surface Models

Badih Ghattas, Diane Manzon, Mathematics, Vol. 11, No. 15, pp. 3406, 08/2023

Abstract In the Design of Experiments, we seek to relate response variables to explanatory factors. Response Surface Methodology (RSM) approximates the relation between output variables and a polynomial transform of the explanatory variables using a linear model. Some researchers have tried to fit other types of models, mainly nonlinear and nonparametric. We present a large panel of Machine Learning approaches that may be good alternatives to the classical RSM approximation. The state of the art of such approaches is given, including classification and regression trees, ensemble methods, support vector machines, neural networks, and also direct multi-output approaches. We survey the subject and illustrate the use of ten such approaches using simulations and a real use case. In our simulations, the underlying model is linear in the explanatory factors for one response and nonlinear for the others. We focus on the advantages and disadvantages of the different approaches and show how their hyperparameters may be tuned. Our simulations show that even when the underlying relation between the response and the explanatory variables is linear, the RSM approach is outperformed by the direct neural network multivariate model, for any sample size (

Keywords Design of Experiments, Multi-output regression, Hyperparameter tuning


Journal article

Looking for a hyper polyhedron within the multidimensional space of Design Space from the results of Designs of Experiments

Diane Manzon, Badih Ghattas, Magalie Claeys-Bruno, Sophie Declomesnil, Christophe Carité, Michelle Sergent, Chemometrics and Intelligent Laboratory Systems, Vol. 232, pp. 104712, 01/2023

Abstract In pharmaceutical studies, the Quality by Design (QbD) approach is increasingly being implemented to improve product development. Product quality is tested at each step of the manufacturing process, allowing better process understanding and better risk management, thus avoiding manufacturing defects. A key element of QbD is the construction of a Design Space (DS), i.e., a region in which the specifications on the output parameters should be met. Among the various possible construction methods, Designs of Experiments (DoE), and more precisely Response Surface Methodology, are particularly well suited. The DS obtained may have any geometrical shape; consequently, the acceptable variation range of an input may depend on the values of the other inputs. However, experimenters would like to know the variation range of each input directly, so that the variation domains are independent of one another. In this context, we developed a method to determine the “Proven Acceptable Independent Range” (PAIR). It consists of looking for all the hyper polyhedra included in the multidimensional DS and selecting one of them according to various strategies. We illustrate the performance of our method on different DoE cases.

Keywords Quality by Design (QbD), Design of Experiments (DoE), Response Surface Methodology (RSM), Design Space (DS), Proven Acceptable Independent Range (PAIR)


Journal article

Assessing variable importance in clustering: a new method based on unsupervised binary decision trees

Badih Ghattas, Pierre Michel, Laurent Boyer, Computational Statistics, Vol. 34, No. 1, pp. 301-321, 03/2019

Abstract We consider different approaches for assessing variable importance in clustering. We focus on clustering using binary decision trees (CUBT), which is a non-parametric top-down hierarchical clustering method designed for both continuous and nominal data. We suggest a measure of variable importance for this method similar to the one used in Breiman’s classification and regression trees. This score is useful to rank the variables in a dataset, to determine which variables are the most important, or to detect the irrelevant ones. We analyze both the stability and the efficiency of this score on different data simulation models in the presence of noise, and compare it to other classical variable importance measures. Our experiments show that variable importance based on CUBT is much more efficient than other approaches in a large variety of situations.

Keywords Variable importance, Deviance, CUBT, Unsupervised learning, Variables ranking
