Bioinformatics and Computational Biology

A.Y. 2024/2025
Max ECTS: 6
Overall hours: 48
SSD: BIO/11, BIO/19
Language: Italian
Learning objectives
Recent methodological advances in molecular biology and genetics have greatly increased the volume of available data, enabling studies on biodiversity and evolution that were previously impossible. However, exploiting this wealth of information requires familiarity with some basic statistical and computational tools. This course provides an introduction to R, a widely used programming language for statistical computing and graphics, with a focus on simple statistical methods of immediate relevance to studies on biodiversity and evolution. Students will learn to install RStudio (a graphical development environment for R), perform basic statistical tests (comparison of means, linear regression, etc.) and prepare graphs. The course also includes a case study on the analysis of environmental DNA (eDNA) sequencing data to study microbial diversity.
Expected learning outcomes
The course aims to provide students with: i) an introduction to the application of bioinformatics in evolutionary and ecological studies; ii) the ability to manage and analyze data using R; iii) the skills needed to choose and carry out basic statistical analyses; iv) the ability to prepare high-quality graphs in R, such as heatmaps and Principal Component Analysis (PCA) plots. Furthermore, the course aims to encourage students to keep developing their computational and statistical skills throughout the rest of their university studies and beyond, by exploring the many R packages available.
Single course

This course can be attended as a single course.

Course syllabus and organization

Single session

Responsible
Lesson period
First semester
Course syllabus
The first part of the course introduces students to the principles, concepts and statistical methods commonly used for the analysis and interpretation of large-scale biological data.
Topics include:
Introduction to the R environment for the analysis of biological data - 1 CFU (8 hours)
How to import data into R
Basic data structures: data frames, vectors, matrices
Installation and management of software packages
Introduction to the R graphical environment
Use of statistical software, with examples in R
Techniques for data summarization and representation - 1 CFU (8 hours)
Dimensionality reduction and principal component analysis
Violin plots/boxplots and visual comparison of data distributions
Heatmaps and clustering of data
Markdown, formatting and generation of reports - 1 CFU (8 hours)
RMarkdown

This first part of the course will be followed by an introduction to the analysis of Next Generation Sequencing (NGS) data using R, with insights into the theoretical and practical principles underlying state-of-the-art methods for analyzing environmental DNA (eDNA) sequencing data to study microbial diversity.
Prerequisites for admission
Knowledge of basic molecular biology topics, with particular reference to nucleic acid sequencing and the structure of prokaryotic and eukaryotic genes and genomes, is highly recommended for attending the course.
Teaching methods
Teaching mode: classroom lectures supported by practicals on real or realistic datasets. Teachers will assign exercises at the end of most lessons to help students consolidate concepts between classes. Attendance is highly recommended.
Teaching Resources
Copies of the slides projected during the classes, as well as additional materials and datasets, will be made available through the course website on the ARIEL platform of the University of Milan. This material is intended as a support for lectures, and studying it cannot be considered a full alternative to regular attendance of classes. The material is made available only to registered students of the Degree Course in Molecular Biology of the Cell and should not be distributed to others without the express consent of the teachers.
Assessment methods and Criteria
Group project + oral exam
Students will be required to complete a small project consisting of the analysis of real data, using the techniques and methods learned throughout the course. The students will produce a report describing their results and submit it to the teachers.
The report must be submitted at least 48 hours before the selected exam session.
The exam will entail a critical discussion with the teachers of the main results and the techniques used. The exam is passed with a mark of 18/30 or higher.
BIO/11 - MOLECULAR BIOLOGY - University credits: 3
BIO/19 - MICROBIOLOGY - University credits: 3
Lessons: 48 hours