Pierre Michel

Researchers

Aix-Marseille Université

Faculty of Economics and Management (FEG)

Econometrics, finance and mathematical methods

Status

Associate Professor (Maître de conférences)

Research field(s)

Econometrics

PhD thesis

2016

Aix-Marseille Université

Contact

pierre.michel[at]univ-amu.fr

Publications

Impact of socioeconomic determinants on the speed of epidemic diseases: a comparative analysis. Journal article. Gilles Dufrénot, Ewen Gallic, Pierre Michel, Norgile Midopkè Bonou, Ségui Gnaba and Iness Slaoui, Oxford Economic Papers, pp. gpae003, 2024

We study the impact of socioeconomic factors on two key parameters of epidemic dynamics. Specifically, we investigate a parameter capturing the rate of deceleration at the very start of an epidemic, and a parameter that reflects the pre-peak and post-peak dynamics at the turning point of an epidemic such as coronavirus disease 2019 (COVID-19). We find two important results. First, the policies to fight COVID-19 (such as social distancing and containment) have been effective in reducing the overall number of new infections, because they influence not only the epidemic peaks but also the speed at which the disease spreads in its early stages. Second, healthcare infrastructure is just as effective as anti-COVID policies, not only in preventing an epidemic from spreading too quickly at the outset, but also in creating the desired dynamic around peaks: slow spreading, then rapid disappearance.

Optimal lockdowns for COVID-19 pandemics: Analyzing the efficiency of sanitary policies in Europe. Journal article. Ewen Gallic, Michel Lubrano and Pierre Michel, Journal of Public Economic Theory, Volume 24, Issue 5, pp. 944-967, 2022

Two main nonpharmaceutical policy strategies have been used in Europe in response to the COVID-19 epidemic: one aimed at natural herd immunity and the other at avoiding saturation of hospital capacity by crushing the curve. The two strategies lead to different results in terms of the number of lives saved on the one hand and production loss on the other. Using a susceptible–infected–recovered–dead model, we investigate and compare these two strategies. As the results are sensitive to the initial reproduction number, we estimate the latter for 10 European countries for each wave from January 2020 to March 2021, using a double sigmoid statistical model and the Oxford COVID-19 Government Response Tracker data set. Our results show that Denmark, which opted for crushing the curve, managed to minimize both economic and human losses. Natural herd immunity, sought by Sweden and the Netherlands, does not appear to have been a particularly effective strategy, especially for Sweden, both in economic terms and in terms of lives saved. The results are more mixed for other countries, but with no evident trade-off between deaths and production losses.
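For readers unfamiliar with the susceptible-infected-recovered-dead (SIRD) model named in this abstract, the following is a minimal sketch of how such a model can be simulated. It is not the authors' implementation: the forward-Euler scheme, the function names, and all parameter values (beta, gamma, mu, the initial fractions) are illustrative assumptions, not estimates from the paper.

```python
def basic_reproduction_number(beta, gamma, mu):
    """R0 of a simple SIRD model: infection rate over total removal rate."""
    return beta / (gamma + mu)


def simulate_sird(beta, gamma, mu, s0, i0, days, dt=0.1):
    """Forward-Euler simulation of an SIRD model on population fractions.

    beta: infection rate, gamma: recovery rate, mu: death rate (per day).
    Returns one (S, I, R, D) tuple per day, starting with day 0.
    """
    s, i, r, d = s0, i0, 0.0, 0.0
    steps_per_day = round(1 / dt)
    path = [(s, i, r, d)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infected = beta * s * i * dt
            new_recovered = gamma * i * dt
            new_dead = mu * i * dt
            s -= new_infected
            i += new_infected - new_recovered - new_dead
            r += new_recovered
            d += new_dead
        path.append((s, i, r, d))
    return path


# Hypothetical parameters, chosen only to produce a visible epidemic wave.
path = simulate_sird(beta=0.3, gamma=0.1, mu=0.005, s0=0.999, i0=0.001, days=200)
final_deaths = path[-1][3]  # cumulative share of the population that died
```

Lowering `beta` partway through the simulation would be one simple way to mimic a lockdown in such a sketch; how the paper models policy responses is not specified in the abstract.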

Development and Calibration of the PREMIUM Item Bank for Measuring Respect and Dignity for Patients with Severe Mental Illness. Journal article. Sara Fernandes, Guillaume Fond, Xavier Zendjidjian, Pierre Michel, Karine Baumstarck, Christophe Lançon, Ludovic Samalin, Pierre-Michel Llorca, Magali Coldefy, Pascal Auquier, et al., Journal of Clinical Medicine, Volume 11, Issue 6, pp. 1644, 2022

Most patient-reported experience measures (PREMs) are paper-based, leading to a high burden for patients and care providers. The aim of this study was to (1) calibrate an item bank to measure patients’ experience of respect and dignity for adult patients with serious mental illnesses and (2) develop computerized adaptive testing (CAT) to improve the use of this PREM in routine practice. Patients with schizophrenia, bipolar disorder, and major depressive disorder were enrolled in this multicenter and cross-sectional study. Psychometric analyses were based on classical test and item response theories and included evaluations of unidimensionality, local independence, and monotonicity; calibration and evaluation of model fit; analyses of differential item functioning (DIF); testing of external validity; and finally, CAT development. A total of 458 patients participated in the study. Of the 24 items, 2 highly inter-correlated items were deleted. Factor analysis showed that the remaining items met the unidimensional assumption (RMSEA = 0.054, CFI = 0.988, TLI = 0.986). DIF analyses revealed no biases by sex, age, care setting, or diagnosis. External validity testing has generally supported our assumptions. CAT showed satisfactory accuracy and precision. This work provides a more accurate and flexible measure of patients’ experience of respect and dignity than that obtained from standard questionnaires.

A filter approach for feature selection in classification: application to automatic atrial fibrillation detection in electrocardiogram recordings. Journal article. Pierre Michel, Nicolas Ngo, Jean-François Pons, Stéphane Delliaux and Roch Giorgi, BMC Medical Informatics and Decision Making, Volume 21, Issue Suppl 4, pp. 130, 2021

BACKGROUND:
In high-dimensional data analysis, the complexity of predictive models can be reduced by selecting the most relevant features, which is crucial to reduce data noise and increase model accuracy and interpretability. Thus, in the field of clinical decision making, only the most relevant features from a set of medical descriptors should be considered when determining whether a patient is healthy or not. This statistical approach known as feature selection can be performed through regression or classification, in a supervised or unsupervised manner. Several feature selection approaches using different mathematical concepts have been described in the literature. In the field of classification, a new approach has recently been proposed that uses the γ-metric, an index measuring separability between different classes in heart rhythm characterization. The present study proposes a filter approach for feature selection in classification using this γ-metric, and evaluates its application to automatic atrial fibrillation detection.

METHODS:
The stability and prediction performance of the γ-metric feature selection approach was evaluated using the support vector machine model on two heart rhythm datasets, one extracted from the PhysioNet database and the other from the database of Marseille University Hospital Center, France (Timone Hospital). Both datasets contained electrocardiogram recordings grouped into two classes: normal sinus rhythm and atrial fibrillation. The performance of this feature selection approach was compared to that of three other approaches, with the first two based on the Random Forest technique and the other on receiver operating characteristic curve analysis.

RESULTS:
The γ-metric approach showed satisfactory results, especially for models with a smaller number of features. For the training dataset, all prediction indicators were higher for our approach (accuracy greater than 99% for models with 5 to 17 features), as was stability (greater than 0.925 regardless of the number of features included in the model). For the validation dataset, the features selected with the γ-metric approach differed from those selected with the other approaches; sensitivity was higher for our approach, but other indicators were similar.

CONCLUSION:
This filter approach for feature selection in classification opens up new methodological avenues for atrial fibrillation detection using short electrocardiogram recordings.
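To illustrate what a filter approach means in general, here is a minimal sketch in which each feature is scored independently of any classifier and only the top-ranked features are kept. The score used here is the classical univariate Fisher score, swapped in purely for illustration: it is not the separability index used in the study, whose definition is not given in this abstract. The function names and the toy data are assumptions.

```python
from statistics import mean, pvariance


def fisher_score(class0_values, class1_values):
    """Squared gap between class means divided by the summed class variances."""
    gap = (mean(class1_values) - mean(class0_values)) ** 2
    spread = pvariance(class0_values) + pvariance(class1_values)
    if spread > 0:
        return gap / spread
    # Zero within-class variance: perfectly separated if the means differ.
    return float("inf") if gap > 0 else 0.0


def select_features(rows, labels, k):
    """Rank features by Fisher score on binary labels (0/1); keep the top k indices."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        x0 = [row[j] for row, y in zip(rows, labels) if y == 0]
        x1 = [row[j] for row, y in zip(rows, labels) if y == 1]
        scores.append(fisher_score(x0, x1))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
rows = [[0.1, 5.0], [0.2, 3.0], [0.9, 4.0], [1.0, 4.5]]
labels = [0, 0, 1, 1]
selected = select_features(rows, labels, k=1)
```

The selected feature indices would then be passed to a downstream classifier, such as the support vector machine mentioned in the Methods, which is what distinguishes a filter from a wrapper or embedded method.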

Application of Functional Data Analysis to Identify Patterns of Malaria Incidence, to Guide Targeted Control Strategies. Journal article. Sokhna Dieng, Pierre Michel, Abdoulaye Guindo, Kankoe Sallah, El-Hadj Ba, Badara Cissé, Maria Patrizia Carrieri, Cheikh Sokhna, Paul Milligan and Jean Gaudart, International Journal of Environmental Research and Public Health, Volume 17, Issue 11, pp. 4168, 2020

We introduce an approach based on functional data analysis to identify patterns of malaria incidence to guide effective targeting of malaria control in a seasonal transmission area. Using a functional data method, a smooth function (functional data or curve) was fitted to the time series of observed malaria incidence for each of 575 villages in west-central Senegal from 2008 to 2012. These 575 smooth functions were classified using hierarchical clustering (Ward’s method) and several different dissimilarity measures. Validity indices were used to determine the number of distinct temporal patterns of malaria incidence. Epidemiological indicators characterizing the resulting malaria incidence patterns were determined from the velocity and acceleration of their incidences over time. We identified three distinct patterns of malaria incidence: high-, intermediate-, and low-incidence patterns in respectively 2% (12/575), 17% (97/575), and 81% (466/575) of villages. Epidemiological indicators characterizing the fluctuations in malaria incidence showed that seasonal outbreaks started later, and ended earlier, in the low-incidence pattern. Functional data analysis can be used to identify patterns of malaria incidence by considering their temporal dynamics. Epidemiological indicators derived from their velocities and accelerations may help target control measures according to these patterns.
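The velocity and acceleration mentioned in this abstract are the first and second derivatives of an incidence curve over time. As a rough sketch only: the study differentiates smooth fitted functions, whereas the example below approximates the derivatives with central finite differences on evenly spaced values. The function name and the toy curve are assumptions, not malaria data.

```python
def velocity_and_acceleration(values, step=1.0):
    """Central finite differences at the interior points of an evenly spaced curve.

    Returns two lists (first and second derivative estimates), each of
    length len(values) - 2.
    """
    if len(values) < 3:
        raise ValueError("need at least three points")
    interior = range(1, len(values) - 1)
    velocity = [(values[k + 1] - values[k - 1]) / (2 * step) for k in interior]
    acceleration = [
        (values[k + 1] - 2 * values[k] + values[k - 1]) / step**2 for k in interior
    ]
    return velocity, acceleration


# Toy curve f(t) = t^2 at t = 0..5, for which the derivatives are known exactly.
curve = [t**2 for t in range(6)]
velocity, acceleration = velocity_and_acceleration(curve)
# velocity     -> [2.0, 4.0, 6.0, 8.0]   (f'(t) = 2t at t = 1..4)
# acceleration -> [2.0, 2.0, 2.0, 2.0]   (f''(t) = 2)
```

On a seasonal incidence curve, a positive velocity would mark the growth phase of an outbreak and a sign change in velocity would mark its peak, which is the kind of timing information the abstract describes.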

Predicting musculoskeletal disorders risk using tree-based ensemble methods. Journal article. Alain Paraponaris, A. Ba, Ewen Gallic, Q. Liance and Pierre Michel, European Journal of Public Health, Volume 29, Issue Supplement_4, 2019

Background:
Musculoskeletal disorders (MSD) can cause short-term disorders and permanent disabilities which may all result in serious limitations in ac

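Analyse du discours médical sur Twitter®. Étude d’un corpus de tweets émis par des médecins généralistes entre juin 2012 et mars 2017 et contenant le hashtag #DocTocToc [Analysis of medical discourse on Twitter®: a study of a corpus of tweets posted by general practitioners between June 2012 and March 2017 and containing the hashtag #DocTocToc]. Journal article. Adrien Salles, Jean-Charles Dufour, P. Hassanaly, Pierre Michel, Chloé Cabot and Julien Grosjean, Revue d'Épidémiologie et de Santé Publique, Volume 67, Issue 3, pp. S152-S153, 2019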

Introduction:
Information and communication technologies gave rise to Web 2.0, characterized by the introduction and use of new collaborative communication tools such as blogs, wikis, RSS feeds and social networks. As these tools were adopted, a participatory medicine developed, based on the sharing of information and experience among professionals, patients and all other health stakeholders. Since June 2012, a medical community has been exchanging on Twitter using the hashtag #DocTocToc, contributing to the emergence of e-health on this social network. The aim of this study is to analyze the main themes of the requests made via the hashtag #DocTocToc by general practitioners between June 2012 and March 2017.

Methods:
Data were collected by web scraping to build a corpus of tweets. The authors of the tweets were identified manually so that the sample retained only tweets posted by general practitioners. A preprocessing step transformed forms that natural language processing software might not recognize. The corpus was examined using two approaches: a lexical approach using the Iramuteq® software, and terminological indexing using the multi-terminology concept extractor (ECMT) of the Catalogue et index des sites médicaux francophones (CISMeF).

Results:
Of the 12,716 tweets collected, 7,366 were written by general practitioners and were analyzed. The lexical approach identifies two broad lexical worlds, represented as a dendrogram: one relating to medico-administrative requests about practice management and the social care of the patient, the other relating to purely medical requests. The terminological indexing method highlights the medical specialties that generate tele-expertise requests (gynecology, neurology, infectious diseases, pediatrics, cardiology and dermatology) and allows them to be cross-tabulated with the purpose of the request (diagnostic or therapeutic).

Conclusion:
On Twitter®, general practitioners use the hashtag #DocTocToc as a space for informally sharing health information, and also for handling administrative and social problems. #DocTocToc acts as a large-scale practice exchange group in which physicians rely on the opinion of their peers. (Fig. 1)

Assessing variable importance in clustering: a new method based on unsupervised binary decision trees. Journal article. Badih Ghattas, Pierre Michel and Laurent Boyer, Computational Statistics, Volume 34, Issue 1, pp. 301-321, 2019

We consider different approaches for assessing variable importance in clustering. We focus on clustering using binary decision trees (CUBT), which is a non-parametric top-down hierarchical clustering method designed for both continuous and nominal data. We suggest a measure of variable importance for this method similar to the one used in Breiman’s classification and regression trees. This score is useful to rank the variables in a dataset, to determine which variables are the most important or to detect the irrelevant ones. We analyze both stability and efficiency of this score on different data simulation models in the presence of noise, and compare it to other classical variable importance measures. Our experiments show that variable importance based on CUBT is much more efficient than other approaches in a large variety of situations.

The Patient-Reported Experience Measure for Improving qUality of care in Mental health (PREMIUM) project in France: study protocol for the development and implementation strategy. Journal article. Sara Fernandes, Guillaume Fond, Xavier Zendjidjian, Pierre Michel, Karine Baumstarck, Christophe Lançon, Fabrice Berna, Franck Schurhoff, Bruno Aouizerate, Chantal Henry, et al., Patient Preference and Adherence, Volume 13, pp. 165-177, 2019

Background:
Measuring the quality and performance of health care is a major challenge in improving the efficiency of a health system. Patient experience is one important measure of the quality of health care, and the use of patient-reported experience measures (PREMs) is recommended. The aims of this project are 1) to develop item banks of PREMs that assess the quality of health care for adult patients with psychiatric disorders (schizophrenia, bipolar disorder, and depression) and to validate computerized adaptive testing (CAT) to support the routine use of PREMs; and 2) to analyze the implementation and acceptability of the CAT among patients, professionals, and health authorities.

Methods:
This multicenter and cross-sectional study is based on a mixed method approach, integrating qualitative and quantitative methodologies in two main phases: 1) item bank and CAT development based on a standardized procedure, including conceptual work and definition of the domain mapping, item selection, calibration of the item bank and CAT simulations to elaborate the administration algorithm, and CAT validation; and 2) a qualitative study exploring the implementation and acceptability of the CAT among patients, professionals, and health authorities.

Discussion:
The development of a set of PREMs on quality of care in mental health that overcomes the limitations of previous works (i.e., allowing national comparisons regardless of the characteristics of patients and care, and based on modern testing using item banks and CAT) could help health care professionals and health system policymakers to identify strategies to improve the quality and efficiency of mental health care.

Évaluation empirique d’une nouvelle méthode multivariée de sélection de variables en classification supervisée : la métrique γ [Empirical evaluation of a new multivariate variable selection method for supervised classification: the γ-metric]. Journal article. Pierre Michel, J.-F. Pons, R. Giorgi and Stéphane Delliaux, Revue d'Épidémiologie et de Santé Publique, Volume 66, Issue 3, pp. S137-S138, 2018

Introduction:
When analyzing massive health data, it is preferable to consider only the most important variables for a given model in order to reduce computation time. For example, to characterize a patient's physiological state from medical descriptors, only the most relevant variables should be retained in order to improve clinical decision support. This approach, called variable selection, can be applied in regression or classification, in a supervised or unsupervised manner. Many methods exist, relying on different approaches or metrics with specific mathematical properties. For supervised classification, a new variable selection method based on a separability index, the γ-metric, was recently proposed (Pons et al., 2017). The aim of this work is to study the performance of this method empirically.

Methods:
The γ-metric measures the separability between several classes of observations. It relies on computing the eigenvectors and eigenvalues of the covariance matrix of each class in order to select the subset of variables that maximizes between-class separability. We compared this metric with classical methods using cross-validation. All methods were applied to three benchmark medical datasets from the field of diagnosis prediction. For each dataset, we assessed the effectiveness of this method relative to its competitors in terms of classification performance indices and the number of variables selected.

Results:
Table 1 reports the mean performance indices obtained for each dataset. The cross-validation results show better performance for the γ-metric-based method on two of the three datasets used. For the data on cancer patients, this method always outperforms its competitors in terms of performance indices and improves on the model containing the initial variables.

Conclusion:
On these empirical data, which regularly serve as a benchmark, the γ-metric performed well. These preliminary results are of interest for the future implementation of automatic diagnosis strategies based on other types of massive data, for example from connected devices.