Advanced Mathematical Statistics
A.Y. 2024/2025
Learning objectives
Basic notions and theorems of multivariate and computational Mathematical Statistics, which the student will then be able to study in greater depth in both theoretical and applied settings. The student will also be able to apply these skills to the statistical analysis of multivariate or high-dimensional data.
Expected learning outcomes
Basic notions and theorems of Multivariate Mathematical and Computational Statistics.
The student will then be able to apply and broaden this knowledge in different areas of interest, in both theoretical and applied contexts, and to perform statistical analyses of both multivariate data and big data.
Lesson period: First semester
Single course
This course can be attended as a single course.
Course syllabus and organization
Single session
Course syllabus
The chapters planned for the course are listed below. The teachers may cover only a selection of them if time is short.
Part A. Statistical methods for small samples of high dimension (dimensionality reduction)
1. Ridge regression
2. Shrinkage methods to estimate the covariance matrix
3. Penalized regression methods: LASSO
4. Principal Components Analysis (PCA)
Part B. Statistical Methods for the analysis of Big Data
5. Locality Sensitive Hashing (LSH)
6. Finding Similar Items
7. Frequent Itemsets
8. Cluster analysis
9. Techniques for dimensionality reduction
10. Analysis of data streams
11. Analysis of social networks
12. Computer Lab
Data analysis with statistical software (Python and Spark)
Prerequisites for admission
Students should have taken an introductory course in Mathematical Statistics, with particular reference to statistical hypothesis testing and linear regression.
Teaching methods
Lectures and computer labs
Teaching Resources
Wessel N. van Wieringen, Lecture notes on ridge regression, https://arxiv.org/pdf/1509.09169.pdf
I. T. Jolliffe, Principal Component Analysis, 2nd edition, Springer, 2002
Jure Leskovec, Anand Rajaraman, Jeff Ullman, Mining of Massive Datasets, Cambridge University Press, 2014. Online version: http://www.mmds.org/
Teachers' lecture notes
Assessment methods and Criteria
The exam consists of a set of homework assignments given by the teachers during the course, covering both multivariate and high-dimensional data analysis and the guided development of methodologies for big data analysis.
The homework assignments are intended for students who follow the course while it is being taught, so attendance is highly recommended.
Non-attending students, and students who reject the grade resulting from the homework, must pass an oral exam on the entire course program.
The exam aims to verify that the objectives have been achieved in terms of knowledge and comprehension, and that students can solve problems of multivariate and big data analysis with suitable mathematical-statistical tools.
MAT/06 - PROBABILITY AND STATISTICS - University credits: 9
Laboratories: 36 hours
Lessons: 42 hours
Professors:
Aletti Giacomo, Micheletti Alessandra
Professor(s)
Reception:
by appointment
office 2099