Digital Technologies for Organisations
A.Y. 2023/2024
Learning objectives
The aim of the course is twofold: 1) to familiarize students with widely used professional technologies for the organization, analysis, and visualization of structured data; 2) to introduce the logic and usage of sequences of commands and control constructs (scripting) for data analysis.
More specific objectives are:
1) Introduce students to data analysis for the Social Sciences and to open source technologies;
2) Learn the principles of data analysis with R: the R language, libraries, RStudio;
3) Become familiar with the principles of computational logic through command-line tools;
4) Learn the main phases of a data analysis: data tidying and data transformation operations;
5) Introduce data visualization and the main graph types (scatterplot, line plot, bar chart, histogram, boxplot, marginals with variants) by means of the ggplot2 library;
6) Introduce the use of open data through exercises with public domain datasets of medium-low complexity;
7) Use open formats, online books, and technical documentation in English.
Expected learning outcomes
A student should be able to recognize the meaning and the expected effects of command sequences for data organization, analysis, and visualization. She/he should also be able to write scripts that perform data selection, transformation, and visualization on predefined datasets. The ability to recognize and fix syntax and semantic errors arising in the use of the command language is also required. Finally, the student should be able to discuss how predefined datasets could be analyzed, together with the expected outcomes and possible applications of the technology considered.
Lesson period: First trimester
Assessment methods: Exam
Assessment result: grade recorded on a scale of thirty
Single course
This course cannot be attended as a single course. Please check our list of single courses to find the ones available for enrolment.
Course syllabus and organization
Single session
Responsible
Lesson period
First trimester
Course syllabus
1. Introduction to data science
2. Open Data, Open Access, Open Source
3. R language and RStudio
4. Data Wrangling operations
5. Data import and main transformation operations
6. Operations on dates, strings, and missing values
7. Groups and aggregation operations
8. Functions and multicolumn operations
9. Join operations between data frames
10. Operations on lists
The course focuses strongly on practice with case studies based on publicly available Open Data, in addition to more didactic exercises from the books and teaching material used for the course. Data and exercises from Open Data will be both discussed in class and assigned as independent homework. Completing numerous exercises is indispensable for the required preparation.
Prerequisites for admission
Basic knowledge of English (reading and comprehension).
Basic use of a personal computer and of the internet (e.g., creating and managing files and directories, file naming rules, installing programs, using a browser and online search, etc.).
Teaching methods
Classes are held in person; students are advised to bring a laptop in order to follow the examples and exercises discussed in class.
Some additional exercise sessions will be taught by the course tutor. Note that these are extra hours in addition to the 40 hours of the course; they will not introduce new content with respect to the official program and are therefore not mandatory for exam preparation. However, they are a useful learning support for several students.
Teaching Resources
TEXTBOOK
FONDAMENTI DI DATA SCIENCE - Python, R e OpenData
Marco Cremonini, Egea Editore, June 2023. ISBN/EAN: 9788823823501
https://www.egeaeditore.it/ita/prodotti/ict-e-sistemi-informativi/fondamenti-di-data-science.aspx
Of this book, we will use the sections dedicated to the R language.
Additional material will come from online sources concerning open data or technical documentation.
Assessment methods and Criteria
The exam is exclusively written and consists of practical exercises that require the use of a personal computer and the software employed during the course.
There are no intermediate exams.
The evaluation will consider the extent to which computational logic has been understood, the familiarity achieved with data analysis principles, and the use of the software employed in class.
Educational website(s)
Professor(s)