Information and Web Communication Science
A.Y. 2021/2022
Learning objectives
Understanding and using the techniques and tools that determine the visibility of web content is a cross-cutting and pervasive theme in Information and Communication Technology (ICT). It is of interest not only to professionals within the discipline, but also to communication experts and humanities scholars in general. The first part of the course provides basic notions on the structure of the web, search engines and techniques for information retrieval. The second part discusses techniques for improving the visibility of content published on the web, more commonly known as Search Engine Optimization (SEO) techniques. In addition, the second part presents the emerging theme of the "web of data", describing its main features and how it differs from the model traditionally known as the "web of information", and ultimately providing the knowledge needed to assess the impact that these new tools have on conventional SEO techniques.
Expected learning outcomes
The student is expected to acquire an adequate level of knowledge and skills.
Knowledge: the student will acquire knowledge about web information retrieval with a particular emphasis on search engines and related techniques for text transformation, indexing and query processing. They will be able to illustrate i) the basic features of the two main information retrieval models (boolean vs. vector) and ii) the techniques to evaluate information retrieval effectiveness through the metrics of precision, recall and f-measure. In addition, the student will have knowledge about website visibility in terms of search engine retrieval (the so-called SEO - Search Engine Optimization). To this end, the student will need to know the main constructs of the HTML5 language, link analysis techniques, as well as the definition of "web of data", the various types of microdata and the impact that these ideas have on both on-site and off-site SEO techniques.
Skills: The student will acquire the following skills:
· distinguish the features and the application issues of the Boolean retrieval model with respect to the Vector retrieval model;
· know how to apply text transformation, indexing and query processing techniques of an information retrieval system;
· know how to apply TF-IDF-based techniques in a vector retrieval system;
· know how to apply the evaluation techniques of an information retrieval system based on precision, recall and f-measure;
· know how to use HTML5 for webpage creation;
· know how to describe link analysis techniques and the main link analysis algorithms for ranking search engine results;
· distinguish the features of the main formats for the "web of data", such as microdata, RDFa and JSON-LD;
· know how to apply SEO techniques.
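As an illustration of the evaluation skill listed above, the following is a minimal sketch of how precision, recall and f-measure can be computed for a single query. The function name and the document identifiers are hypothetical, and the f-measure shown is the balanced variant (F1), which is an assumption, not necessarily the one used in the course.

```python
def precision_recall_f1(retrieved, relevant):
    """Return (precision, recall, F1) for one query.

    retrieved: document ids returned by the system (hypothetical ids)
    relevant:  document ids judged relevant for the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents actually retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical query: 4 documents retrieved, 3 relevant overall, 2 in common
p, r, f = precision_recall_f1(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
print(p, r, f)
```

In this example two of the four retrieved documents are relevant (precision 1/2) and two of the three relevant documents are retrieved (recall 2/3).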
Lesson period: Second semester
Assessment methods: Exam
Assessment result: grade recorded in thirtieths
Single course
This course cannot be attended as a single course. Please check our list of single courses to find the ones available for enrolment.
Course syllabus and organization
Single session
The following recommendations are valid only if teaching restrictions are applied by the Public Authority and the University (lockdown) due to health security reasons.
Teaching methods:
Lectures will be given through the Microsoft Teams platform according to the official lecture schedule (synchronous attendance). Lectures are also recorded and made available for streaming on the Microsoft Teams platform (asynchronous attendance). The lecture schedule and the latest news are published on the course website (Ariel platform).
Syllabus:
The syllabus as well as the bibliography are unchanged.
Assessment method and evaluation criteria:
Exams are held remotely, according to the guidelines provided by the University and published on the course website (Ariel platform). The exam structure and the evaluation criteria are unchanged.
Course syllabus
The course is organized in two parts.
Part A (20 hours, 3 CFU) covers the following topics:
· Information retrieval and search engines
· Retrieval models (boolean model vs. vector space model)
· Text analysis techniques
· Web content indexing
· Query processing
· Evaluation of an information retrieval system
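To give a flavour of the vector space model named among the topics above, the following is a minimal sketch of TF-IDF weighting and cosine similarity. The weighting formula (raw term frequency times log10(N/df)), the function names and the three toy documents are assumptions chosen for illustration; the course may adopt different variants.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Map each tokenized document to a {term: weight} dict.

    Weight = raw term frequency * log10(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log10(n_docs / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy collection, already tokenized by whitespace
docs = [
    "web search engine".split(),
    "search engine optimization".split(),
    "html5 page structure".split(),
]
vecs = tf_idf_vectors(docs)
print(cosine(vecs[0], vecs[1]))  # documents sharing "search" and "engine"
print(cosine(vecs[0], vecs[2]))  # documents with no terms in common
```

Terms that occur in fewer documents receive a higher weight, so the rare term "web" counts more than "search" when the first document is compared with others.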
Part B (20 hours, 3 CFU) covers the following topics:
· Link analysis techniques
· HTML5 notions
· Languages for the web of data
· SEO techniques (Search Engine Optimization)
· "on-site" SEO techniques
· "off-site" SEO techniques
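As an example of the link analysis techniques listed above, the following is a minimal sketch of PageRank computed by power iteration. PageRank is chosen here only as a well-known illustration; the graph, the damping factor of 0.85 and the number of iterations are assumptions, and the sketch assumes every linked page also appears as a key of the graph.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to; every target
    is assumed to be a key of links as well.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution
    for _ in range(iterations):
        # Every page receives the "random jump" share first
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            # A page without outlinks spreads its rank over all pages
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web graph
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(graph)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

Page C is linked by three other pages and obtains the highest score, while page D, which has no incoming links, keeps only the random-jump share.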
Prerequisites for admission
No prerequisites
Teaching methods
Lectures are delivered in person, supported by slides and handouts that are progressively published on the course website (Ariel platform). Throughout the lectures, real case studies are analyzed to illustrate how the theoretical course contents apply to concrete scenarios. Seminars with professional experts on web communication topics are also planned.
Lecture attendance is not mandatory, but it is strongly recommended.
Teaching Resources
The bibliography is the same for attending and non-attending students.
For part A, choose one of the following books:
· C.D. Manning, P. Raghavan, H. Schütze. Introduction to Information Retrieval. Cambridge University Press. 2008. Text in English, free download online (http://nlp.stanford.edu/IR-book/information-retrieval-book.html).
Chapters: 1 (excluding "query optimization" in 1.3), 2 (excluding 2.3 and 2.4.3), 3 (excluding 3.2.1 and 3.4), 4 (excluding "logarithmic merging" in 4.5 and 4.6), 6 (excluding 6.1, 6.3.3 and 6.4), 8 (from 8.4 only "precision at k" and "r-precision"; excluding 8.5 - 8.7).
· W.B. Croft, D. Metzler, T. Strohman. Search Engines, Information Retrieval in Practice. Pearson Education. 2015. Text in English, free download online (http://ciir.cs.umass.edu/downloads/SEIRiP.pdf).
Chapters: 1, 2, 3 (excluding 3.6 - 3.8), 4 (excluding 4.2.2, 4.6, 4.7), 5 (excluding 5.4 - 5.7), 6, 7.1, 8 (excluding 8.5 - 8.7).
For part B:
· C.D. Manning, P. Raghavan, H. Schütze. Introduction to Information Retrieval. Cambridge University Press. 2008. Text in English, free download online (http://nlp.stanford.edu/IR-book/information-retrieval-book.html).
Chapters: 19 (including 19.5 without math details, excluding 19.6), 20 (excluding 20.3, 20.4), 21 (up to, but not including, 21.2.1).
· Introduction to HTML5 (https://www.web-link.it/intro-html5.html). Focus on the following topics of the tutorial: 01. Introduction, 02. Document structure, 03. Semantic elements, 06. Video, 07. Audio, 08. Canvas, 09. Conclusions.
· Linked data and semantic web (http://www.bibliotecheoggi.it/pdf.php?filepdf=20120300701.pdf).
· Introduction to structured web data (https://developers.google.com/search/docs/guides/intro-structured-data).
· Introduction to microdata (https://schema.org/docs/gs.html, text in English).
In addition, choose one of the following books:
· M. Maltraversi. SEO e SEM. Guida avanzata al Web marketing (fourth edition). LSWR publishing. 2016. Text in Italian.
· E. Enge, S. Spencer, J.C. Stricchiola. The Art of SEO: Mastering Search Engine Optimization. O'Reilly publishing, third edition. 2015. Text in English. An Italian edition of this book, edited by Jacopo Matteuzzi and Flavio Mazzanti, is also available from Flacowski publishing.
Assessment methods and Criteria
Attending and non-attending students: the assessment consists of a written exam on the syllabus of the whole course and a practical case study. The written exam is composed of quizzes, open-ended questions, and exercises. The assessment criteria are the ability to present knowledge clearly, the completeness of the answers, and the correctness of the reasoning in carrying out the exercises.
The practical case study consists of the analysis of a case assigned by the teacher after the written exam has been passed. The assessment criteria are the ability to present fluently the acquired knowledge and the critical positions/claims regarding the case study, and the appropriate use of specialized terminology. The final grade is expressed in thirtieths and summarizes the results obtained in the written exam and the practical case study.
Incoming Erasmus students can take the exam in English according to a syllabus previously endorsed and confirmed by the teacher.
The assessment methods for students with disabilities and/or learning disabilities are defined case by case by the teacher with the support of the competent authority.
Teaching unit A
INF/01 - INFORMATICS - University credits: 3
Lessons: 20 hours
Teaching unit B
INF/01 - INFORMATICS - University credits: 3
Lessons: 20 hours
Professor(s)
Reception:
By appointment to be arranged by email
Room 7015, Dipartimento di Informatica "Giovanni degli Antoni", Via Celoria 18 - 20133 Milano