Genomic Big Data Management and Computing | Università degli Studi di Milano Statale

A.Y. 2025/2026

Max ECTS

Overall hours

SSD

BIO/11 ING-INF/05

Language

English

Included in the following degree programmes

Bioinformatics for Computational Genomics (Class LM-8 R) - Enrolled in the 2025/2026 Academic Year

Learning objectives

Many genomics projects rely on increasingly large data sets, for example the genomes of thousands of individuals affected by a particular disease. It is therefore essential to understand how large data sets can be managed and processed efficiently, and how next-generation sequencing pipelines and workflows can benefit such large-scale projects.

The objective of the course is to illustrate and discuss key aspects of the management, processing and analysis of big data for genomics (mainly data obtained by Next-Generation Sequencing), and to introduce some of the existing approaches, analysis systems and technologies. Practical applications will be illustrated using dedicated programming and query languages (PySpark, GMQL) as well as specific computational platforms and distributed systems (Galaxy, Apache Spark, Cloud Computing). Examples of "downstream" analysis will also be presented to underscore the need for big data management and computing in genomics.

Expected learning outcomes

Given the breadth of the topics presented, the ultimate goal of the course is not in-depth knowledge of specific data analysis approaches. Rather, students are expected to gain a broad overview of the available solutions, together with an understanding of the strengths and weaknesses of the different methodologies and computing environments for managing the scientific workflows used in big data analysis for genomics.

Lesson period: First semester

Assessment methods: Exam
Assessment result: grade recorded on a thirty-point scale

Single course

This course cannot be attended as a single course. Please check our list of single courses to find the ones available for enrolment.

Course syllabus and organization

Single session

Course currently not available

Course structure

BIO/11 - MOLECULAR BIOLOGY - University credits: 1
ING-INF/05 - INFORMATION PROCESSING SYSTEMS - University credits: 5

Lectures: 48 hours