Statistics for Big Data

A.Y. 2018/2019
Max ECTS
6
Overall hours
40
SSD
SECS-S/01
Language
Italian
Learning objectives
The course aims to introduce and illustrate specific statistical, computational, and data mining methodologies for the analysis of Big Data. These techniques will be implemented using the statistical software R. By the end of the course, students should have acquired adequate statistical and programming skills that enable them to master the statistical and computational tools needed to analyse data and to extract the information of interest from the data.
Expected learning outcomes
Undefined
Single course

This course cannot be attended as a single course. Please check our list of single courses to find the ones available for enrolment.

Course syllabus and organization

Single session

Lesson period
Third trimester
ATTENDING STUDENTS
Course syllabus
The course will be organized according to the following topics:

FIRST PART:

1) DATA MINING TECHNIQUES 1: supervised models
1.1 generalized linear models (logit, probit and tobit)
1.2 multilevel models

2) DATA MINING TECHNIQUES 2: unsupervised models
2.1 cluster analysis
2.2 principal component analysis
2.3 factor analysis
2.4 cross-validation
2.5 text mining

SECOND PART:

1) Introduction to programming in R and Python
2) Data mashup techniques
3) Cloud computing techniques
4) Web scraping techniques
5) Interaction with relational and non-relational databases
6) Big data analytics
NON-ATTENDING STUDENTS
Course syllabus
The course will be organized according to the following topics:

FIRST PART:

1) DATA MINING TECHNIQUES 1: supervised models
1.1 generalized linear models (logit, probit and tobit)
1.2 multilevel models

2) DATA MINING TECHNIQUES 2: unsupervised models
2.1 cluster analysis
2.2 principal component analysis
2.3 factor analysis
2.4 cross-validation
2.5 text mining

SECOND PART:

1) Introduction to programming in R and Python
2) Data mashup techniques
3) Cloud computing techniques
4) Web scraping techniques
5) Interaction with relational and non-relational databases
6) Big data analytics
SECS-S/01 - STATISTICS - University credits: 6
Lessons: 40 hours
Professor: Manzi Giancarlo