Machine Learning and Statistical Learning
A.Y. 2022/2023
Learning objectives
Partner company: Ammagamma
This Lab is provided within the Data Science for Economics (DSE) degree program.
Only a small number of students can be admitted because of logistical constraints.
Students (either DSE or non-DSE) must apply for admission. Candidates are selected by the institutions and companies involved on the basis of their CV and motivation.
To apply, students must respond to a call posted on this website: https://dse.cdl.unimi.it/en/courses/laboratories
The call is typically published a few weeks before the Lab starts.
Understanding of key AI concepts in the financial area, with an in-depth analysis of time series forecasting and Natural Language Processing methods.
Expected learning outcomes
Upon completion of the course students will be able to:
1. understand the notion of overfitting and its role in controlling the statistical risk
2. describe some of the most important machine learning algorithms and explain how they avoid overfitting
3. run machine learning experiments using the correct statistical methodology
4. provide statistical interpretations of the results.
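As an illustration of the first outcome, the following is a minimal sketch of overfitting, not course material. It assumes a toy one-dimensional dataset with 20% label noise; the helper names (`noisy_sample`, `nearest_neighbour_predict`, `error_rate`) are hypothetical and chosen only for this example. A 1-Nearest Neighbour predictor memorises the training set, so its training error says little about its error on fresh data.

```python
import random

def nearest_neighbour_predict(train, x):
    """Return the label of the training point closest to x (1-NN)."""
    closest = min(train, key=lambda point: abs(point[0] - x))
    return closest[1]

def error_rate(train, sample):
    """Fraction of points in `sample` misclassified by 1-NN built on `train`."""
    mistakes = sum(nearest_neighbour_predict(train, x) != y for x, y in sample)
    return mistakes / len(sample)

def noisy_sample(n, rng, noise=0.2):
    """Draw n points in [0, 1]; the clean label is 1 if x > 0.5,
    and each label is flipped with probability `noise` (an assumed toy model)."""
    sample = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        sample.append((x, y))
    return sample

rng = random.Random(0)          # fixed seed so the run is reproducible
train = noisy_sample(200, rng)  # data used to build the predictor
test = noisy_sample(200, rng)   # fresh data from the same distribution

training_error = error_rate(train, train)
test_error = error_rate(train, test)
print(f"training error: {training_error:.2f}")
print(f"test error:     {test_error:.2f}")
```

The training error is zero because every training point is its own nearest neighbour, noisy label included, while the error on the fresh sample is typically well above zero. The gap between the two is why the training error is a poor estimate of the statistical risk.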
Lesson period: Third trimester
Assessment methods: Exam
Assessment result: grade recorded on a scale of thirty
Single course
This course cannot be attended as a single course. Please check our list of single courses to find the ones available for enrolment.
Course syllabus and organization
Single session
Responsible
Lesson period
Third trimester
Prerequisites for admission
The course requires basic knowledge of calculus, linear algebra, programming, and statistics.
Assessment methods and Criteria
For the Machine Learning module, the exam consists of writing a paper of about 10-15 pages containing either a report describing experimental results (experimental project) or an in-depth analysis of a theoretical topic (theory project). The final grade is computed by combining the project evaluation with the result of a written test on the syllabus covered in class. Depending on the number of students, the written test may be replaced by an oral discussion.
For the Statistical Learning module, the exam consists of preparing two individual projects in R, one on supervised learning and one on unsupervised learning. The projects are discussed in an oral examination, in which students are asked to explain and discuss their methodological choices, the code, and the results. Both communication skills and the ability to interpret the results critically are evaluated. The grade is computed by combining the evaluation of the projects with the oral examination.
The final grade is the mean of the grades obtained in each module.
Machine learning and Statistical Learning-Module Machine Learning
Course syllabus
1. Introduction
2. The Nearest Neighbour algorithm
3. Tree predictors
4. Statistical learning
5. Hyperparameter tuning and risk estimates
6. Risk analysis of Nearest Neighbour
7. Risk analysis of tree predictors
8. Consistency, surrogate functions, nonparametric algorithms
9. Linear predictors
10. Online gradient descent
11. From sequential risk to statistical risk
12. Kernel functions
13. Support Vector Machines
14. Stability bounds and risk control for SVM
15. Boosting and ensemble methods
16. Neural networks and deep learning
Teaching methods
Lectures
The goal of this course is to provide a methodological foundation for machine learning. The emphasis is on the design and analysis of learning algorithms with theoretical performance guarantees.
Teaching Resources
The main reference is the set of lecture notes available at ncesa-bianchismml.ariel.ctu.unimi.it/
A further reference is the textbook: Shai Shalev-Shwartz and Shai Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press, 2014.
Machine learning and Statistical Learning-Module Statistical Learning
Course syllabus
1. Introduction to Statistical Learning
2. Cross Validation and Bootstrap
3. Variable Selection, Ridge and Lasso Regression
4. Linear Models
5. Nonlinear Models
6. Logistic Regression and Classification Methods
7. Classification and Regression Trees, bagging, boosting and Random Forest
8. Unsupervised learning (Clustering, PCA)
9. Brief notes on neural networks (tentative)
10. Brief notes on association rules (tentative)
Teaching methods
Lectures and Lab sessions
The goal of this module is to provide a methodological and practical overview of statistical learning methods. The emphasis is on applications.
Optional group work will be offered to help students become familiar with the software and improve their practical skills.
Teaching Resources
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An introduction to statistical learning, Springer.
A further reference is the textbook:
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: data mining, inference, and prediction. Springer Science & Business Media.
Machine learning and Statistical Learning-Module Machine Learning
INF/01 - INFORMATICS - University credits: 6
Lessons: 40 hours
Professor:
Cesa Bianchi Nicolo' Antonio
Machine learning and Statistical Learning-Module Statistical Learning
SECS-S/01 - STATISTICS - University credits: 6
Lessons: 40 hours
Professor:
Salini Silvia
Professor(s)
Reception:
Office hours are held in person, by appointment, on Fridays from 09:30 to 11:00, and via Teams, by appointment, on Mondays from 15:00 to 16:30.
DEMM, room 30, 3rd floor