Data Analysis for Agriculture
A.Y. 2024/2025
Learning objectives
Provide the methodological and practical basis for the rigorous management of quantitative data in agriculture and for the design of sample surveys, field surveys and experimental tests.
Develop advanced skills in the use of spreadsheets and dedicated software for the collection and statistical analysis of the increasingly significant mass of management and biological data in agriculture.
Provide a solid theoretical and practical foundation for the reading, analysis, interpretation and presentation of data from industry databases, sample surveys, field measurements and experimental test results.
Expected learning outcomes
Ability to manage data from national and international databases of the sector and/or from sample surveys in agriculture.
Ability to carry out graphical and quantitative descriptive analyses, and to analyze trends, sources of variation and confounding factors using appropriate graphical and statistical techniques with spreadsheets and statistical software.
Ability to present, read and interpret data from both the field and the laboratory.
Lesson period: First semester
Assessment methods: Exam
Assessment result: grade recorded on a 30-point scale
Single course
This course can be attended as a single course.
Course syllabus and organization
Single session
Course syllabus
Short course description: After recalling the basic concepts of descriptive statistics, the course deals with the main data analysis techniques useful for evaluating quantitative and qualitative information in agriculture. The lessons consist of a theoretical section and an applied section carried out in a computer classroom.
In short, the course will address: description and representation of qualitative and quantitative data. Sample distributions and theoretical distributions. Relationships between quantitative variables: covariance, correlation and regression. Standardization. Hypothesis testing and statistical tests for quantitative and qualitative data. One-way and multi-way analysis of variance. The concept of interaction, fixed and random effects. Simple regression and an outline of multiple linear regression. Design of experiments and experimental schemes. Outline of non-parametric tests and multivariate analysis techniques.
Detailed course program:
Creation, management and preparation of quantitative datasets. 4 h
Review of descriptive statistics. Graphical representation of data: histograms, box plots, scatter plots. 4 h
Indices of position and dispersion. 4 h
Measures of relationship between variables: covariance. Correlation analysis and linear regression. The least squares method for parameter estimation. 4 h
Characteristics of populations and samples. Estimation of the parameters of a population: point and interval estimation. Bias, efficiency and consistency of an estimator. The statistical test: concepts of null hypothesis, two-sided and one-sided tests, significance level and its critical evaluation, type I and type II errors, power of the test. 3 h
Sample and theoretical distributions: binomial, Poisson, normal, standard normal, log-normal. 6 h
Qualitative variables: their graphical representation and significance tests. Chi-square tests for evaluating the goodness of fit of observed data versus theoretical distributions (equiprobable, binomial, Poisson) and for testing hypotheses of independence of qualitative variables. Yates correction. 6 h
Tests on a mean. Z score, confidence intervals, t test. Techniques for comparisons between two sample means. The t test for paired data and for independent samples. Assumptions. Issues of indirect comparisons. 6 h
Techniques for comparing multiple means: the analysis of variance. Assumptions of ANOVA (tests of normality and homogeneity of variances). Factorial analysis of variance and interaction: 2- and 3-way ANOVA and interpretation of the results. ANCOVA. 6 h
The hierarchical analysis of variance. General linear model. Fixed factor model and random factor model. Multiple comparison techniques between means (contrasts and post-hoc tests). 4 h
The analysis of variance of regression. Goodness of fit of the model. Assumptions for regression and related tests. The regression coefficient and its standard error. Predicted values and residuals, analysis of residuals. Analysis of outliers. Significance tests for regression coefficient and intercept. Confidence intervals around the regression line. The coefficient of determination. Statistical significance of correlation and regression. 4 h
Overview of data transformation, ranking and nonparametric tests. 2 h
Overview of multiple regression analysis. The choice of the optimal model (backward, forward and stepwise regression). 2 h
Design of experiments: estimating experiment size and effect size. Experimental schemes: randomized blocks, Latin square, split-plot. 2 h
Introduction to multivariate analysis. 2 h
Running, reading and interpreting the results of different methods with dedicated statistical software. 9 h
Prerequisites for admission
Basic knowledge of descriptive statistics. Ability to use electronic spreadsheets.
Prerequisites for non-attending students are the same as for attending students.
Teaching methods
In addition to theoretical classes (3.5 CFU, 28 hours), close attention will be paid to computer sessions (2.5 CFU, 28 hours). Biostatistics will be presented with a practical approach that emphasizes the rationale of statistical theory and methods rather than mathematical proofs and formalism. Each theoretical lecture will be combined with practical applications and exercises in which spreadsheets and statistical software are used to analyze data.
Assiduous attendance at theoretical and practical lectures is strongly recommended. The interactive mode of the course will enable students to better understand the areas of application of the various topics covered in the course.
Teaching Resources
Lecture slides and supporting material for the course (datasets, links to scientific articles and web pages of interest) will be made available to students on the course Ariel site. Reference texts will be given during the course and in the slides of the first lecture.
Assessment methods and Criteria
The exam consists of a written part and a practical part. The written part consists of 6 questions (1 multiple-choice question, 2 open theoretical questions, 2 practical problems and 1 statistical analysis to comment on), each worth 0-4 points. The practical part is a computer session with 1 dataset to analyze, worth 0-6 points. The exam will assess the ability to organize, summarize and represent biotechnological data by choosing the appropriate experimental methodologies and statistical tests. The final score will be the sum of the points obtained in the different parts, and is aimed at ascertaining:
(a) your knowledge and ability to understand the topics of the course as well as mastery of the specific language related to the use of statistical techniques and the ability to present topics in a clear and orderly manner
(b) your ability to apply knowledge and understanding through the analysis of a dataset or the execution and discussion of a statistical problem using Excel
(c) your ability to understand and interpret the results of statistical analysis by commenting on an analytical output of the statistical software used during the course, and your ability to organize an experimental plan.
The final evaluation is based on a total of 30 points.
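As an illustrative check of the scoring scheme described above, the following minimal sketch adds up the available points. The function and variable names are hypothetical; only the point ranges come from the text (six written questions at 0-4 points each and one computer session at 0-6 points).

```python
# Illustrative sketch only: all names are hypothetical; point ranges come from the text.
WRITTEN_QUESTIONS = 6      # questions in the written part
MAX_PER_QUESTION = 4       # each written question is scored 0-4
MAX_COMPUTER_SESSION = 6   # the computer session is scored 0-6


def final_score(written_points, computer_points):
    """Sum the written-question points and the computer-session points."""
    if len(written_points) != WRITTEN_QUESTIONS:
        raise ValueError("expected one score per written question")
    if any(not 0 <= p <= MAX_PER_QUESTION for p in written_points):
        raise ValueError("each written question is scored 0-4")
    if not 0 <= computer_points <= MAX_COMPUTER_SESSION:
        raise ValueError("the computer session is scored 0-6")
    return sum(written_points) + computer_points


# Maximum attainable score: 6 questions x 4 points + 6 points.
max_score = WRITTEN_QUESTIONS * MAX_PER_QUESTION + MAX_COMPUTER_SESSION
print(max_score)  # 30
```

The maximum of 6 x 4 + 6 = 30 matches the 30-point total stated for the final evaluation.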
Activities carried out by students during the course, such as commentaries on analyses, completion of exercises, a compilation of a dictionary of terms, etc. will also be taken into account when determining the final grade.
Students with SLD or disability certifications are kindly requested to contact the teacher at least 15 days before the date of the exam session to agree on individual exam requirements. In the email, please make sure to copy (cc) the relevant offices: [email protected] (for students with SLD) or [email protected] (for students with disability).
The learning verification methods for non-attending students are the same as for attending students.
AGR/17 - LIVESTOCK SYSTEMS, ANIMAL BREEDING AND GENETICS - University credits: 6
Computer room practicals: 32 hours
Lessons: 32 hours
Professor:
Crepaldi Paola
Professor(s)
Reception:
By appointment, arranged by e-mail
Sezione di Zootecnica Agraria, 1st floor, Via Celoria 2