Laboratory "Reinforcement Learning"

A.Y. 2022/2023
Max ECTS: 3
Overall hours: 20
SSD: INF/01
Language: English
Learning objectives
This Lab is offered within the Data Science for Economics (DSE) degree program.
Only a small number of students can be admitted due to logistical constraints.
Students (either DSE or non-DSE) must apply for admission. Candidates will be selected by the institutions/companies involved, based on their CV and motivation.
To apply, students must respond to a call posted on this website: https://dse.cdl.unimi.it/en/courses/laboratories
The call is typically published a few weeks before the Lab starts.

This laboratory provides an introduction to Reinforcement Learning, the subfield of Machine Learning that studies adaptive agents which take actions and interact with an unknown environment. Reinforcement learning is a powerful paradigm for the study of autonomous AI systems and has been applied to a wide range of tasks, including self-driving cars, game playing, customer management, and healthcare.
Expected learning outcomes
Upon completion of the course students will be able to:
- understand Markov Decision Processes (MDPs),
- understand some basic learning algorithms for MDPs,
- run experiments in simulated environments.
These objectives are measured via a combination of two components: the project report and the oral discussion. The final grade is formed by assessing the project report and then using the oral discussion to fine-tune it.
Single course

This course cannot be attended as a single course. Please check our list of single courses to find the ones available for enrolment.

Course syllabus and organization

Single session

Lesson period
Second trimester
Course syllabus
1 Fundamentals
1.1 Markov Decision Processes and Bellman optimality equations
1.2 Value iteration and policy iteration
1.3 Linear programming formulation
1.4 Sample complexity
2 Exploration
2.1 Multi-armed bandits
2.2 Efficient exploration in tabular MDPs
2.3 Linear bandits
2.4 Efficient exploration in linearly parameterized MDPs
3 Policy optimization
3.1 Policy gradient methods
3.2 Regularized methods
Prerequisites for admission
The course requires basic knowledge of calculus, linear algebra, and statistics.
Knowledge of the Python programming language is also required.
Teaching methods
Lecture-style instruction with worked-out examples.
Teaching Resources
Shie Mannor, Yishay Mansour, and Aviv Tamar
Reinforcement Learning: Foundations
(Working Draft: https://sites.google.com/view/rlfoundations/home)

Lecture notes and Jupyter notebooks provided by the instructor.
Assessment methods and Criteria
Experimental project. The project will be evaluated through a discussion which will also include questions on the theory covered in the course. The final grade will take into account both the project and the oral examination.
INF/01 - INFORMATICS - University credits: 3
Lessons: 20 hours
Professor(s)
Reception: by appointment
Via Celoria 18, Room 7007