Informatics and Statistics for Biotechnologies (common)
A.Y. 2024/2025
Learning objectives
This course consists of two tightly integrated units (modules). Its main learning objective is to enable students to design and perform statistical tests using a computer. To this end, the lessons of the Computer science and Statistics modules are organized in weekly blocks, each comprising a Computer science lesson, a Statistics lesson, and a lab session that puts the topics covered in the Statistics lesson into practice using the R language for statistical computing.
Expected learning outcomes
Students are expected to understand the pros and cons of the statistical methods presented during the course and to plan and carry out statistical tests using the R programming language. In addition, they are expected to present the results of these tests clearly using the graphics functionality offered by the R language.
Lesson period: Second semester
Assessment methods: Exam
Assessment result: grade recorded on a 30-point scale
Single course
This course can be attended as a single course.
Course syllabus and organization
Linea AK
Responsible
Lesson period
Second semester
Course syllabus
Computer science
This module begins with an introduction to how data processing systems work; the rest of the course is dedicated to the principles of imperative and modular programming, using the R language as a reference.
- Architecture of a computer
- Notes on the representation of information
- Main concepts of a generic programming language
- Programs and processes
- R Environment
- Variables and assignments
- Vector data structure. Numeric, character, logical vectors
- Selection and access, different types
- Generation of regular sequences
- Matrices: construction, concatenation, product. Selection and access
- table command
- Heterogeneous data management: lists, data frames, their types of access
- Flow control
- Conditional instructions and cycles
- Cycle efficiency in R
- Functions and Scripts
- R graphical environment
- Plot, barplot, hist, boxplot
- 2D and 3D pie charts
- Saving figures
- Generation of pseudorandom numeric samples
- Probability distributions
- qqnorm, qqline, qqplot functions to compare probability distributions
Statistics
This module is designed to provide the basics of statistics for biotechnology. Particular attention is given to applied statistics, with special emphasis on the interpretation of data and on the tests used in the analysis of biological data. The course includes numerous practical examples and gives students the basic knowledge needed to outline a logical workflow for choosing the most appropriate statistical approach to a specific biological problem.
- Introduction to statistics. Basic concepts, populations and samples, sampling, data types, types of variables, types of studies
- Data visualization. Frequency tables, charts and histograms
- Statistical indexes and data description
- Estimation and uncertainty
- Probability. Event, probability of an event. Probability of complex events
- Probability distributions
- Hypothesis testing
- Statistical tests for nominal variables. Goodness-of-fit chi square test. Contingency tables, Odds Ratio, chi square test for independence, Fisher's exact test
- Statistical tests for continuous and discrete variables
Prerequisites for admission
The exam for this course can be taken only after passing the Mathematics exam of the Degree course in Biotechnology (Class L-2); this does not prevent students from attending the course. The course will nonetheless provide the basic mathematical knowledge needed to follow the teaching, especially the topics covered in the Statistics module. Students are expected to use their personal laptops during lessons.
Teaching methods
Statistics: teaching is delivered through lectures. Computer science: teaching is delivered through lectures.
Teaching Resources
Computer science:
- R for Data Science. H. Wickham, G. Grolemund. Publisher: O'Reilly, ISBN-13: 978-1491910399
- Lesson slides and other teaching materials (ALL the items needed to pass the exams are covered)
- On-line Manuals:
- http://cran.r-project.org/doc/manuals/R-intro.pdf
- http://cran.r-project.org/doc/manuals/R-lang.pdf
- http://cran.r-project.org/doc/manuals/R-admin.pdf
- http://cran.r-project.org/doc/manuals/R-data.pdf
- http://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf
Statistics:
Main texts
- Analisi Statistica dei Dati Biologici. Whitlock MC, Schluter D. Zanichelli
- Handbook of Biological Statistics. John H. McDonald. Printed version and online
- Lesson slides
Other sources:
- Introductory Statistics. Ross SM. Elsevier AP - Third Edition (some concepts from chapters 5 and 6 about continuous and discrete distributions)
Assessment methods and Criteria
Each exam session consists of a single test covering both the Computer science and Statistics modules and takes place in computer labs. The exam contains questions on topics from both modules. The maximum grade is 30. It is NOT possible to take the exam for only one of the two modules. The exam is passed with a grade of 18 or higher. During the exam, students may use the material distributed during the course and their own notes.
FIS/07 - APPLIED PHYSICS - University credits: 1
INF/01 - INFORMATICS - University credits: 1
MAT/03 - GEOMETRY - University credits: 1
SECS-S/01 - STATISTICS - University credits: 1
SECS-S/02 - STATISTICS FOR EXPERIMENTAL AND TECHNOLOGICAL RESEARCH - University credits: 2
Lessons: 48 hours
Professor:
Re' Matteo
Linea LZ
Responsible
Lesson period
Second semester
Course syllabus
Informatics
- Overview of computer architectures
- Some notes on information representation
- Programming languages: main definitions
- Programs and processes
- R language environment
- Variables and assignments
- Vector data structure
- Accessing vector elements
- Generating numeric sequences
- Matrices
- table command
- Heterogeneous data: lists and data frames
- Conditional and loop statements
- Loop efficiency in R
- Functions and scripts in R
- R graphical environment
- plot, barplot, hist, boxplot commands
- Pie charts
- Saving figures
- Graphical approach: qqnorm, qqline and qqplot functions
Statistics
- Introduction to statistics. Basic concepts, populations and samples, sampling, data types, types of variables, types of studies
- Data visualization. Frequency tables, charts and histograms
- Statistical indexes and data description
- Estimation and uncertainty
- Probability. Event, probability of an event. Probability of complex events
- Probability distributions
- Hypothesis testing
- Statistical tests for nominal variables. Goodness-of-fit chi square test. Contingency tables, Odds Ratio, chi square test for independence, Fisher's exact test
- Statistical tests for continuous and discrete variables
Prerequisites for admission
To take the final exam, students must first pass the Mathematics exam of the Degree course in Biotechnology (Class L-2). Class attendance, however, is always possible.
Teaching methods
The course consists of lectures and practical exercises, individual or in groups, which take place in computer labs. Students may attend classes with their own laptops.
Teaching Resources
Informatics
The following on-line tutorials are suggested:
1. http://www.r-project.it/books/nozioniR.pdf
2. http://cran.r-project.org/doc/manuals/R-intro.pdf
3. http://cran.r-project.org/doc/manuals/R-lang.pdf
4. http://cran.r-project.org/doc/manuals/R-admin.pdf
5. http://cran.r-project.org/doc/manuals/R-data.pdf
6. http://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf
Statistics
1. Analisi Statistica dei Dati Biologici. Whitlock MC, Schluter D. Zanichelli
2. Handbook of Biological Statistics. John H. McDonald. Printed version and online
Other suggested books:
3. Intuitive Biostatistics: a non-mathematical guide to statistical thinking, Fourth Edition. Motulsky H. Oxford University Press.
4. Introductory Statistics. Ross SM. Elsevier AP - Third Edition (some concepts introduced in chapters 5 and 6 on discrete and continuous random variables)
Assessment methods and Criteria
The final exam, a single test covering both the Informatics and Statistics modules, consists of exercises with questions from both modules, to be solved in a computer lab on a PC.
Duration: 1h 30m.
Electronic devices (phones, tablets, etc.) are not allowed during the exam. All publicly available course material can be consulted during the exam, along with some personal student notes.
The overall evaluation is expressed as a grade from 1 to 30. To pass the exam, a score of at least 9 in each module is required.
FIS/07 - APPLIED PHYSICS - University credits: 1
INF/01 - INFORMATICS - University credits: 1
MAT/03 - GEOMETRY - University credits: 1
SECS-S/01 - STATISTICS - University credits: 1
SECS-S/02 - STATISTICS FOR EXPERIMENTAL AND TECHNOLOGICAL RESEARCH - University credits: 2
Lessons: 48 hours
Professor:
Frasca Marco