Goals

This course provides the classical tools of mathematical statistics: choosing a probabilistic model, estimating it, and evaluating it. Particular attention is given to the linear model and its extensions in the context of high-dimensional statistical learning (LASSO, ridge, PCR, PLS), the logistic model, and tree-based models (CART, random forests, boosting, etc.). The course also provides training in data manipulation and in the practical implementation of the models studied. To this end, a substantial part of the course is devoted to implementing the different models in R through the study of a large number of examples.

Programme

  1. Linear regression. Validity and limitations of the method. Model selection.
  2. Design of experiments: screening and response surface
  3. Logistic regression
  4. Elements of statistical learning in high dimension
  5. Clustering

Practical activities

The three activities are devoted to learning regression modelling techniques in R. Numerous data sets will be studied.

Sustainable development

Sustainable Development Goals

Level 1: Activity contextualised through environmentally sustainable development and social responsibility and/or supported by examples, exercises, applications.

Study: 12h
Course: 16h

Course leaders

  • Céline HARTWEG-HELBERT
  • Alexandre SAIDI

Language

French

Keywords

Linear and logistic regression. Model selection. Design of experiments. L1/L2 penalized regression. Regression trees.