Statistics
A.Y. 2022/2023
Learning objectives
The main objective of the course is to give students adequate knowledge and understanding of the tools needed to summarize one or more characters of interest arising in a wide range of fields (political, administrative, sociological, historical, legal, economic, etc.). Such a description can be obtained by aggregating the observed data in tables, representing them graphically, constructing appropriate indices of position and variability, and identifying the measures that best highlight the relationships among characters. When the survey covers only part of the population rather than all of it, statistical description must be accompanied by statistical induction; in this case, knowledge of the characters is no longer "certain" but only "probable", and its purpose is to provide indications about the entire reference population. The course therefore also covers the basic topics of Probability and Statistical inference. Knowledge and understanding of these tools require a strong ability to apply them. Students will have to develop marked independence of judgment, so that they can choose the most suitable techniques for solving the proposed problems, and they will have to demonstrate communication skills, which are essential for explaining the methodologies and reasoning used to answer the questions. Finally, they must acquire an increasingly refined learning ability, which will allow them to face new situations with a high degree of autonomy.
Expected learning outcomes
At the end of this course, the student is expected to know and use the main statistical tools necessary for the analysis of phenomena in different fields (social, economic, etc.) and in their various manifestations. The student will be able to organize the observed data on one or more phenomena of interest in a frequency or contingency table, to summarize their main features and relationships through appropriate univariate or bivariate indices, and to infer more general results about the population from the observed sample data.
Lesson period: Second trimester
Assessment methods: Exam
Assessment result: grade recorded on a thirty-point scale
Single course
This course cannot be attended as a single course. Please check our list of single courses to find the ones available for enrolment.
Course syllabus and organization
A-K
Responsible
Lesson period
Second trimester
For the training activity in the academic year 2022/23, more specific information will be provided in the coming months, based on the evolution of the public health situation.
Course syllabus
Statistics: definition and fields of application.
The classification of statistical phenomena and the concept of reference statistical population.
The description of statistical data.
Organization of data in frequency tables and graphical representation.
Position indices: mode, median, quantiles, arithmetic mean.
Dispersion indices: range, variance and standard deviation, coefficient of variation.
The concept of random event, the probability of an event and an introduction to random variables.
Bernoulli and Binomial random variables.
The Normal (or Gaussian) random variable and the use of statistical tables.
Sampling and sampling distributions.
Introduction to sample estimation.
Hypothesis testing for one or more samples.
Regression model in a deterministic and inferential context.
Multiple regression model.
ANOVA: analysis of variance
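As an illustration of the position and dispersion indices listed above, the following minimal sketch computes them with Python's standard library. The data values are hypothetical, and the population versions of the variance and standard deviation (dividing by n) are an assumption; the course may instead use the sample versions (dividing by n - 1).

```python
import statistics

# Hypothetical observations of a quantitative character.
data = [2, 4, 4, 5, 7, 9, 11]

# Position indices.
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
quartiles = statistics.quantiles(data, n=4)  # three cut points: Q1, Q2, Q3

# Dispersion indices (population versions, dividing by n: an assumption).
data_range = max(data) - min(data)
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)
coeff_variation = std_dev / mean  # meaningful only when the mean is positive

print(f"mean={mean}, median={median}, mode={mode}")
print(f"quartiles={quartiles}")
print(f"range={data_range}, variance={variance:.3f}, "
      f"std_dev={std_dev:.3f}, cv={coeff_variation:.3f}")
```

Replacing `pvariance` and `pstdev` with `statistics.variance` and `statistics.stdev` gives the sample versions.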
Prerequisites for admission
No prior knowledge is required
Teaching methods
Lectures and classroom exercises.
During the lectures the teacher uses both the blackboard and slides projected from a PC. Lectures focus on the more theoretical topics but are always accompanied by numerical examples.
During the exercise sessions the teacher, after recalling where necessary the theory seen in class, solves numerical exercises that require the use of a scientific calculator and the statistical tables.
Teaching Resources
Statistica, Iacus, McGraw-Hill.
Chapters
1. Il mondo aleatorio
2. Il mondo dei dati
3. Modelli probabilistici a fini previsivi
4. Inferenza statistica
5. Relazioni tra più fenomeni
Assessment methods and Criteria
Written exam, with an optional mid-term exam.
The complete written exam, lasting 90 minutes, consists of (4+4) multiple choice questions, each with 4 possible answers of which only one is correct, and (2+2) numerical exercises. The structure of the exam makes it possible to check the theoretical and practical skills acquired by the students during the lessons and exercises.
The optional mid-term test is a 45-minute written exam with the same structure as the full exam (4+2), covering the part of the program completed by the date of the test.
If the mid-term exam is passed, the student may take a final 45-minute test on the second part of the course in any of the 6 official rounds. If the second test is also passed, the student passes the exam (but may decide to retake the second test or the complete exam if he/she is not satisfied with the final grade).
If the mid-term exam is not passed, the student will have to take the complete written exam in one of the official rounds.
L-Z
Responsible
Lesson period
Second trimester
Course syllabus
Descriptive statistics
1) Classification of statistical phenomena (types of characters and scales of measurement) and frequency distributions (absolute, relative and cumulative frequencies).
2) Graphical representations: bar graph, stick graph, histogram.
3) Calculation of a mode, a median and a sample mean when the data are classified in a frequency table. Theorems and properties of the mean.
4) Some indices of variability and dispersion: range, interquartile range, variance and standard deviation. The coefficient of variation.
5) Contingency tables and bivariate analysis: definition of joint (absolute and relative), marginal and conditional frequency distributions; the Pearson index for independence; dependence in mean; covariance and the linear correlation coefficient.
Probability and random variables
1) Introduction to probability theory: classical, frequentist, subjective and axiomatic definitions of probability; elementary, compound and disjoint events; stochastic independence; Bayes' theorem; law of total probability; types of sampling (extractions with and without replacement).
2) Definition of discrete and continuous random variables: probability distribution, probability density, distribution function; expected value (or mean), mode, median, variance of a random variable. Definition of independence between random variables.
3) Central limit theorem and law of large numbers.
4) Bernoulli random variable, Normal random variable and Binomial random variable; Normal approximation to Binomial distribution.
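The Normal approximation to the Binomial distribution mentioned in the last item can be checked numerically. The sketch below compares an exact Binomial cumulative probability with its Normal approximation; the values of n, p and k are hypothetical, the helper `binomial_cdf` is introduced here for illustration only, and the continuity correction (adding 0.5) is an assumption that the course may or may not adopt.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Hypothetical example: 100 independent trials with success probability 0.5.
n, p, k = 100, 0.5, 55

exact = binomial_cdf(k, n, p)

# Normal with the same mean n*p and standard deviation sqrt(n*p*(1-p)).
normal = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
approx = normal.cdf(k + 0.5)  # continuity correction: an assumption

print(f"exact  P(X <= {k}) = {exact:.4f}")
print(f"approx P(X <= {k}) = {approx:.4f}")
```

The approximation generally improves as n grows and when p is not too close to 0 or 1.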
Inferential statistics
1) Point estimation: definition of unbiased estimator; the standard error as an accuracy measure of an estimator. The sample mean and variance; the sample proportion.
2) Confidence intervals for a mean (with Normal observations and known or unknown variance). Confidence intervals for a proportion.
3) General definition of statistical hypothesis testing: null and alternative hypotheses; type 1 and type 2 errors; rejection region; p-value. Hypothesis testing for a mean, with Normal observations and known or unknown variance; the t-test for the comparison between 2 means; the ANOVA test for comparison among multiple means.
4) Hypothesis testing for a proportion. Chi-square test for comparison among multiple proportions and to verify the independence between two variables.
Simple linear regression
1) Presentation of the statistical package R: how to install it; basic commands.
2) Definition of the linear regression model; estimation of the parameters (slope and intercept coefficients) with the least squares method; goodness of fit and coefficient of determination; confidence intervals for the coefficients of the linear regression model; hypothesis testing on the intercept and on the slope coefficients; point and interval prediction.
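The least squares estimates of the slope and intercept, together with the coefficient of determination, can be sketched as follows. The course presents the R package for this part; Python is used here only for illustration. The data and the function names `fit_line` and `r_squared` are hypothetical.

```python
def fit_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Return the least squares (slope, intercept) of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x: list[float], y: list[float],
              slope: float, intercept: float) -> float:
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data lying close to a straight line.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(x, y)
r2 = r_squared(x, y, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.4f}")

# Point prediction at a new value of x (hypothetical).
x_new = 6.0
print(f"predicted y at x={x_new}: {intercept + slope * x_new:.3f}")
```

Confidence intervals and tests on the coefficients require the standard errors as well, which this sketch does not compute.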
Prerequisites for admission
Standard high-school mathematics is sufficient to attend this course.
Teaching methods
For the theoretical part, the teacher works at the blackboard, essentially without slides; this makes the lesson more interactive and lets it adapt to the needs of the class. Students who cannot attend can find everything in the reference material (textbook and lecture notes on ARIEL).
After the introduction of any new concept, various numerical examples are presented to fully understand its meaning and to practice the calculations.
In addition to the theoretical lessons, classroom exercises are also carried out. The exercises carried out during the classes are available on the course web page (ARIEL) to facilitate non-attending students.
Comments and requests for clarification during the lessons / exercises by the students are always welcome, because they make the lessons more lively and certainly more useful for everyone.
Teaching Resources
I) Descriptive statistics: two sets of lecture notes will be available on the ARIEL page of the course.
II) Probability and random variables: Introduzione all'inferenza statistica. Authors: Ferrari, Nicolini and Tommasi, Giappichelli Editore - Turin (2009) - CHAPTERS: 1-2.
III) Inferential statistics: Introduzione all'inferenza statistica. Authors: Ferrari, Nicolini and Tommasi, Giappichelli Editore - Turin (2009) - CHAPTERS: 3-4-5
and the following supplementary notes:
1) "la stima puntuale"
2) "confronto tra due o più medie (ANOVA)"
3) "Il test del chi-quadrato per l'indipendenza e per il confronto tra più proporzioni. Il test Z per il confronto tra due proporzioni"
which will be available on the ARIEL page of the course.
IV) Simple linear regression: Introduzione all'inferenza statistica. Authors: Ferrari, Nicolini and Tommasi, Giappichelli Editore - Turin (2009) - CHAPTER 6.
Assessment methods and Criteria
The exam consists of a written test lasting an hour and a half, consisting of 3 exercises and 6 multiple choice questions, concerning the topics listed in the program. The exam is rated from 0 to 33 and is considered sufficient if a score of at least 18 is obtained.
The structure of the exam makes it possible to check the theoretical and practical skills acquired during the lessons and exercises.
Remark: students must bring a calculator to the written test.
Educational website(s)
Professor(s)
Reception:
Wednesday from 9:00 to 12:00
Via Conservatorio, III floor, Room n. 35