Goals

Within data classification, this course focuses on the family of Bayesian methods, which stands out for its optimality with respect to certain criteria, its low algorithmic cost, and the interpretability of its results. We will also study the solutions available to the data scientist when the training sample is small relative to the number of parameters to be learned, or when learning must be unsupervised. On the application side, we will focus on exploring a text corpus in order to, for example, identify new customers likely to buy a service or product, predict customer sentiment (opinions), or understand the behaviours that are predictive of fraud.

Programme

  • Bayesian decision (2h)
  • Gaussian mixture model (2h)
  • Hidden Markov chain (2h)
  • Practical work on Bayesian learning (4h)
  • Computational linguistics, NLP and practical text mining (8h)
  • Group presentation of a scientific paper (4h)

Assessment method

  • Grade = 50% knowledge + 50% know-how
  • Knowledge mark = 100% final exam
  • Know-how mark = 50% practical work + 50% scientific paper presentation

Bibliography

  • M. R. Gupta and Y. Chen, Theory and Use of the EM Algorithm, Foundations and Trends in Signal Processing, Vol. 4(3), pp. 223–296, 2011.
  • M. Watanabe and K. Yamaguchi, The EM Algorithm and Related Statistical Models, Statistics: Dekker series of textbooks and monographs, 2004.
  • M. W. Berry and J. Kogan, Text Mining: Applications and Theory, Wiley, 2010.

Study: 8h
Course: 12h

Code

24_I_G_S09_MSO_INFO_3_7

Course leaders

  • Alexandre SAIDI
  • Stéphane DERRODE

Language

French

Keywords

Bayesian decision theory, Unsupervised learning, Hidden Markov models, Text mining, Sentiment analysis, Chatbot, Natural Language Processing, Machine translation.