Methods and Languages for Data Management
A.Y. 2024/2025
Learning objectives
The course introduces methods and techniques for describing, summarizing, and finding structure in a data set, with particular attention to cultural heritage datasets. Both classical statistical methods and artificial intelligence methods are considered.
Expected learning outcomes
Students will know how to perform exploratory analysis, basic inference, and the most common statistical tests using statistical analysis software. They will also know the main machine learning techniques for both regression and classification problems, and will be aware of the main issues in machine learning.
Lesson period: Second semester
Assessment methods: Exam
Assessment result: grade recorded on a thirty-point scale
Single course
This course can be attended as a single course.
Course syllabus and organization
Single session
Course syllabus
PART 1: Review of exploratory data analysis
1.1 Basic concepts of exploratory analysis of univariate data
- Frequency distributions and their graphical representation
- Position indices
- Dispersion and heterogeneity indices
- Comparison between groups
- Data transformations
- "Normality" of the data and the Q-Q plot
1.2 Basic concepts of exploratory analysis of bivariate data
- Contingency tables
- Scatter plots
- Association and correlation indices
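As an illustration of the Part 1 topics, the following minimal sketch computes a few position and dispersion indices and a correlation index. It assumes Python with only the standard library, since the syllabus does not name the software environment used in the course; the data values and the helper names `summarize` and `pearson` are hypothetical.

```python
import statistics

# Hypothetical measurements (for example, object heights and widths in cm);
# these values are invented for illustration and are not course data.
heights = [12.0, 15.5, 14.0, 18.5, 20.0, 16.0]
widths = [6.0, 8.0, 7.5, 9.5, 10.5, 8.5]


def summarize(values):
    """Return common position and dispersion indices for one variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std_dev": statistics.stdev(values),
        "range": max(values) - min(values),
    }


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


summary = summarize(heights)
r = pearson(heights, widths)
print(summary)
print(f"Pearson r = {r:.3f}")
```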
PART 2: Probabilistic models
- Discrete models: uniform, Bernoulli, binomial, geometric, Poisson
- Continuous models: uniform, exponential, normal
- Normal approximations
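The normal approximation listed above can be illustrated by comparing an exact binomial probability with its normal counterpart. This is a minimal sketch, assuming Python with the standard library (not necessarily the course software); the values of `n`, `p`, and `k` are arbitrary, and the continuity correction is one common choice rather than the only one.

```python
import math
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k), with continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(k + 0.5)


n, p, k = 100, 0.3, 35  # illustrative values only
exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

The approximation is usually considered reasonable when both n·p and n·(1 − p) are not too small.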
PART 3: Inferential Statistics
- Point estimation: sample mean, sample variance
- Interval estimation: confidence intervals
- Hypothesis tests: tests for equality of means, normality tests, independence tests
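As a sketch of point and interval estimation, the example below computes a sample mean and a confidence interval for the population mean. It assumes Python with the standard library and uses a large-sample (z-based) interval; for small samples a Student's t quantile would normally replace the normal quantile. The simulated sample and the function name `mean_confidence_interval` are hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist


def mean_confidence_interval(sample, level=0.95):
    """Large-sample (z-based) confidence interval for the population mean."""
    n = len(sample)
    mean = statistics.mean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided normal quantile
    return mean - z * std_err, mean + z * std_err


# Simulated sample: 200 draws from a normal model with mean 50 and sd 10.
rng = random.Random(0)
sample = [rng.gauss(50.0, 10.0) for _ in range(200)]

sample_mean = statistics.mean(sample)
ci95 = mean_confidence_interval(sample, level=0.95)
ci99 = mean_confidence_interval(sample, level=0.99)
print(f"mean = {sample_mean:.2f}")
print(f"95% CI = ({ci95[0]:.2f}, {ci95[1]:.2f})")
print(f"99% CI = ({ci99[0]:.2f}, {ci99[1]:.2f})")
```

Raising the confidence level widens the interval, which is the trade-off between confidence and precision.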
PART 4: Machine learning
- What it means to "learn" from a data set
- Supervised and unsupervised learning
- Regression models
- Classification models
- Resampling techniques to learn robust models
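One common resampling technique is k-fold cross-validation, sketched here for a simple linear regression model. The syllabus does not say which resampling techniques or which software are covered, so the choice of k-fold cross-validation, of Python, of the synthetic data, and of the helper names `fit_line` and `k_fold_mse` are all assumptions made for illustration.

```python
import random


def fit_line(xs, ys):
    """Least-squares slope and intercept for the model y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def k_fold_mse(xs, ys, k=5, seed=0):
    """Average mean squared error on the held-out fold, over k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds
    errors = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        mse = sum((ys[i] - (slope * xs[i] + intercept)) ** 2 for i in fold) / len(fold)
        errors.append(mse)
    return sum(errors) / k


# Synthetic data: y = 2x + 1 plus Gaussian noise with standard deviation 1.
rng = random.Random(42)
xs = [float(i) for i in range(50)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]

slope, intercept = fit_line(xs, ys)
cv_mse = k_fold_mse(xs, ys, k=5)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, CV MSE = {cv_mse:.3f}")
```

Because each fold is scored by a model that never saw it, the cross-validated error estimates performance on new data rather than on the training set.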
Prerequisites for admission
Elementary statistics is required.
Teaching methods
The course combines lectures with laboratory sessions.
Teaching Resources
Slides presented in class and scripts to run in the software environment.
Assessment methods and Criteria
The exam consists of two phases. It starts with a written test in which the student is asked to analyze a case study using the software environment seen during the course; a student who passes the written test is admitted to the oral test. The grade is expressed on a thirty-point scale.
INF/01 - INFORMATICS - University credits: 6
Laboratories: 32 hours
Lessons: 32 hours
Professor:
Zanaboni Anna Maria
Professor(s)
Reception:
Wednesday 10:30-12:30 -- by appointment
via Celoria 18, 5th floor