Laboratory "reinforcement learning" | Università degli Studi di Milano Statale

Academic year 2022/2023

Maximum credits

Total hours

Scientific-disciplinary sector (SSD)

INF/01

Language

English

Degree programmes using this course

Data Science and Economics (DSE), class LM-91: enrolled until the 2021/2022 academic year

Data Science for Economics, class LM-Data: enrolled from the 2022/23 academic year

Learning objectives

This Lab is provided within the Data Science for Economics (DSE) degree program.
Only a small number of students can be admitted because of logistical constraints.
Students (whether DSE or non-DSE) must apply for admission. Candidates will be selected by the institutions/companies involved, based on their CV and motivation.
To apply, students must respond to a call posted on this website: https://dse.cdl.unimi.it/en/courses/laboratories
The call is typically published a few weeks before the Lab starts.

This laboratory covers Reinforcement Learning, the subfield of Machine Learning that studies adaptive agents that take actions and interact with an unknown environment. Reinforcement learning is a powerful paradigm for the study of autonomous AI systems and has been applied to a wide range of tasks, including self-driving cars, game playing, customer management, and healthcare.

Expected learning outcomes

Upon completion of the course, students will be able to:
- understand Markov Decision Processes (MDPs),
- understand some basic learning algorithms for MDPs,
- run experiments in simulated environments.
These objectives are assessed through two components: the project report and the oral discussion. The final grade is based on the project report and is then fine-tuned using the oral discussion.

Period: Second trimester

Lesson timetable

Assessment method: Approval judgement
Assessment result: pass/fail

Exam calendar

Single course

This course cannot be taken as a single course. You can find the available courses in the single courses catalogue.

Syllabus and teaching organization

Single edition

Lecturer in charge

Cesa Bianchi Nicolo' Antonio

Period

Second trimester

Syllabus

1 Fundamentals
1.1 Markov Decision Processes and Bellman optimality equations
1.2 Value iteration and policy iteration
1.3 Linear programming formulation
1.4 Sample complexity
2 Exploration
2.1 Multi-armed bandits
2.2 Efficient exploration in tabular MDPs
2.3 Linear bandits
2.4 Efficient exploration in linearly parameterized MDPs
3 Policy optimization
3.1 Policy gradient methods
3.2 Regularized methods
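
As an illustration of items 1.1 and 1.2, here is a minimal sketch of value iteration, which repeatedly applies the Bellman optimality update until the value function stops changing. The two-state MDP, the array layout (`P` indexed as action, state, next state; `R` indexed as state, action), and the function name `value_iteration` are assumptions made for this example and are not taken from the course material.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Approximate the optimal values and a greedy policy of a finite MDP.

    P: array of shape (A, S, S), P[a, s, t] = probability of moving s -> t under a.
    R: array of shape (S, A), expected immediate reward for action a in state s.
    """
    V = np.zeros(R.shape[0])
    for _ in range(max_iter):
        # Bellman optimality update:
        # Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Greedy policy with respect to the final value estimate.
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    return V, Q.argmax(axis=1)


# Hypothetical toy MDP: action 0 stays in place, action 1 switches state.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
])
# Only staying in state 1 is rewarded.
R = np.array([
    [0.0, 0.0],  # state 0
    [1.0, 0.0],  # state 1
])

V, policy = value_iteration(P, R, gamma=0.9)
print(V, policy)
```

In this toy problem the greedy policy switches out of state 0 and then stays in state 1, so the values approach 9 and 10 respectively (a reward of 1 per step discounted at 0.9 gives 1 / (1 - 0.9) = 10).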

Prerequisites

The course requires basic knowledge of calculus, linear algebra, and statistics.
Knowledge of the Python programming language is also required.

Teaching methods

Lectures with worked examples.

Reference material

Shie Mannor, Yishay Mansour, and Aviv Tamar
Reinforcement Learning: Foundations
(Working Draft: https://sites.google.com/view/rlfoundations/home)

Lecture notes and Jupyter notebooks provided by the lecturer.

Assessment methods and evaluation criteria

Experimental project. The project will be assessed through a discussion that also covers theory topics presented in the course. The final grade will take into account both the project and the oral exam.

Teaching organization

INF/01 - COMPUTER SCIENCE - University credits (CFU): 3

Lectures: 20 hours

Lecturer: Cesa Bianchi Nicolo' Antonio

Lecturer(s)

Cesa Bianchi Nicolo' Antonio

Website

Office hours:

By appointment

Via Celoria 18, Room 7007