Coding for data science and data management
Academic year 2022/2023
Learning objectives
The course aims to provide technical skills in coding and scripting for data analysis, and in managing persistent storage of the data sources and results involved in the analysis. On the one hand, the Python programming language and the R framework are presented, with the goal of covering the essential data structures and control structures of both. On the other hand, the course presents the core notions of relational databases, such as keys, integrity, and primary/foreign key constraints, as well as the SQL language for data definition, manipulation, and querying. Recent NoSQL solutions are also discussed, with a special focus on MongoDB, a document-oriented system.
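As an illustration of the relational notions named above (primary keys, foreign keys, and integrity constraints), here is a minimal sketch using Python's standard `sqlite3` module. The `student` and `exam` tables and their columns are hypothetical examples, not taken from the course material, and SQLite is chosen only because it ships with Python.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: each exam row must reference an existing student.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE exam (
        id INTEGER PRIMARY KEY,
        student_id INTEGER NOT NULL REFERENCES student(id),
        grade INTEGER NOT NULL
    )
""")

conn.execute("INSERT INTO student (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO exam (student_id, grade) VALUES (1, 30)")

# Referencing a student that does not exist violates the foreign key constraint.
try:
    conn.execute("INSERT INTO exam (student_id, grade) VALUES (99, 28)")
    fk_violation_rejected = False
except sqlite3.IntegrityError:
    fk_violation_rejected = True

# Query across the two tables through the key relationship.
rows = conn.execute(
    "SELECT s.name, e.grade FROM student s JOIN exam e ON e.student_id = s.id"
).fetchall()
```

The rejected insert shows referential integrity at work: the database itself, rather than application code, refuses a row whose foreign key points to no existing primary key.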
Expected learning outcomes
Upon completion of the course, students will be able to:
- manage data using R and RStudio;
- solve coding challenges using R libraries and functions;
- perform statistical inference and produce graphics using R;
- write code using the apply family of functions in R;
- understand the Python data model and the flow control statements;
- use the built-in Python data structures;
- perform basic linear algebra operations using NumPy;
- perform basic data set manipulations using pandas;
- perform simple machine learning experiments using scikit-learn;
- understand and apply the core notions of data modeling in relational databases;
- use the SQL language for creating and querying relational database structures;
- understand and apply the principles of data organization in NoSQL systems;
- use MongoDB for data retrieval and aggregation in a document-oriented NoSQL system.
Assessment method: Exam
Assessment result: grade recorded on a 30-point scale
Single course
This course cannot be taken as a single course. The available courses are listed in the single-course catalog.
Course syllabus and organization
Single session
Session not active
Modules or teaching units
Module Coding for Data Science
INF/01 - COMPUTER SCIENCE - CFU: 6
Lessons: 40 hours
Module Data Management
SECS-S/01 - STATISTICS - CFU: 6
Lessons: 40 hours