Goals

In many scientific fields, such as biology and environmental science, the rapid evolution of scientific instruments and the intensive use of computer simulation have led to a sharp increase in data production in recent years. Scientific applications now face new problems in storing and exploiting these large volumes of data. Much the same problem arises in managing the data collected by social networks, where the objective is commercial optimization.

This course introduces students to three major technologies emblematic of big-data processing (MongoDB, Hadoop and Spark), which are widely used by companies and institutions that have to manage such volumes of data.

Programme

  • 3 sessions of 2 hours each on MongoDB, Hadoop and Spark.
  • 4 practical sessions on MongoDB, Hadoop and Spark.
  • 1 practical session of 2 hours on Spark MLlib.

Study: 14h
Course: 6h

Course leaders

  • Alexandre SAIDI
  • Daniel MULLER
  • Mohsen ARDABILIAN
  • Stéphane DERRODE

Language

French

Keywords

Big Data, NoSQL, MongoDB, Hadoop, Spark, Python