Julieta Peveri : julieta.peveri[at]univ-amu.fr
Bertille Picard : bertille.picard[at]univ-amu.fr
Mathias Silva : mathias.silva-vazquez[at]univ-amu.fr
With the explosion of the Global DataSphere, a multitude of new data sources has emerged. These large volumes of data, most of them coming from the internet, open new horizons for researchers. The web and social media, for instance, are flooded with textual data. This major source of information was long neglected in statistical studies, but scholars are increasingly aware of it and try to account for textual information in their work. This trend is still tentative, however, which is why I gathered key materials to share some insights into the exciting world of text processing and analysis, and to show how these tools can be used in statistical modelling. The applications in the materials are run with Python, one of the most widely used programming languages for these tasks.