Multimedia Information Retrieval

This is a seminar course focused primarily on multimedia retrieval and closely related areas such as computer vision and deep learning. The two main activities are

(1) participating in the seminar discussions (presenting and asking questions) and
(2) doing a focussed final programming project.

For *class participation*, it is important to have a working webcam and microphone.

Students: In general I respond to emails within 2 workdays (Mon-Fri). If you do not get a response, please resend the email - Dr. Lew

It is necessary to register on the LML Course Manager:

LML Course Manager, lcm.liacs.nl

Joint Leiden University & TU Delft BioInformatics Course (IN4174)

Period: (tentative - may change) Wednesdays, 3 Feb 2021 - 12 May 2021
Time: 14:15 - 16:00
Place: Kaltura room - Lectures and Discussions (see lcm.liacs.nl notes for link)

Organizer:

Dr. Michael S. Lew, (Lecturer) email: lewmsk@gmail.com (Email to make an appointment)

Assistance and contributions given by:

Wei Chen, email: w.chen@liacs.leidenuniv.nl

Students: In emails, please use subject lines which start with MIR::

Admission requirements:

The student must be fluent in C and C++ programming (Python is also useful) and in image processing. There will be many programming assignments requiring C/C++ and using pixel-level analysis.

Goals:

Briefly, this course is a research seminar that covers the fundamentals of multimedia information retrieval using computer vision and deep learning techniques. In this course, the students will present important or recent research on multimedia understanding aimed at providing relevant information for search.

The state-of-the-art techniques for understanding multimedia data currently come from the fields of deep learning and computer vision. Therefore, this course discusses the strengths and weaknesses, the challenges and issues, and the future directions of computer vision and deep learning as methods of understanding diverse multimedia data.

At the end of the Multimedia Information Retrieval course, the student should be able to

- understand the fundamental principles of multimedia information retrieval
- understand the primary paradigms in computer vision and deep learning
- have insight into traditional and state-of-the-art multimedia features
- have insight into traditional and state-of-the-art multimedia learning algorithms
- analyze a multimedia information retrieval system with regard to strengths and weaknesses and potential areas for improvement
- explain the differences between modern search engines and database systems.
- have insight into scientifically evaluating an information retrieval system
- have insight into the integration of intelligent algorithms into the retrieval process
- develop and write scientific reports
- develop and give scientific presentations

Description:

Multimedia information exists in diverse forms, from personal images to movies to MRI and X-ray imagery. It is frequently combined with other types of media such as text (on the WWW) or audio and, more recently, location and range-sensing data (e.g. smartphones and self-driving automobiles), which is why we call it multimedia. Because multimedia sensors are everywhere, there is a worldwide need for algorithms and systems for finding and understanding multimedia data (e.g. all the images on the WWW, digital libraries, medical imagery, shop cameras, smartphones, etc.).

Work-forms

- lectures
- seminar
- student discussions
- presentations
- homework and software assignments

Examination (for 6 ECs):

- 50% for Paper Presentation(s)/Seminar (class participation & questions & homework).

- 50% for Software/Programming (including Final Project).
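
As a worked illustration of the 50/50 weighting above, the sketch below combines two component grades. The function name `final_grade`, the 1-10 grade scale, and the absence of any rounding or per-component minimum are assumptions made for illustration; the course description does not specify them.

```python
def final_grade(seminar: float, software: float) -> float:
    """Combine the two components with the 50/50 weights listed above.

    Assumption: both grades are on the same scale (e.g. 1-10) and no
    rounding or minimum-per-component rule applies.
    """
    return 0.5 * seminar + 0.5 * software


print(final_grade(8.0, 7.0))  # 7.5
```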

All programming assignments must work on Linux (Ubuntu 18 or 20).

For programming homework/assignments, the grading focuses on the software: source code, documentation, and how well the program works. Non-programming homework may be a blend of theoretical questions and/or scientific assessments of particular algorithms, where the main focus is not on programming, but on assessing and evaluating methods.

Late assignments: grade penalty of 1 point per 24 hours (1 day) late.
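
The late penalty amounts to a small piece of arithmetic, sketched below. The function name `late_penalty`, the example deadline, and in particular the rounding rule (every started 24-hour period counts as a full day) are assumptions; the rule above gives a penalty per 24 hours but does not say how partial days are handled.

```python
import math
from datetime import datetime


def late_penalty(deadline: datetime, submitted: datetime) -> int:
    """Grade points deducted for a late submission.

    Assumption: every started 24-hour period after the deadline costs
    one point. The course text states a penalty per 24 hours but gives
    no rounding rule, so this choice is a guess.
    """
    seconds_late = (submitted - deadline).total_seconds()
    if seconds_late <= 0:
        return 0
    return math.ceil(seconds_late / 86400)


# Hypothetical deadline, chosen only for the example.
deadline = datetime(2021, 3, 1, 12, 0)
print(late_penalty(deadline, datetime(2021, 3, 1, 10, 0)))  # 0 (on time)
print(late_penalty(deadline, datetime(2021, 3, 2, 21, 0)))  # 2 (33 hours late)
```

When in doubt about how a partial day is counted, ask the organizer rather than relying on this sketch.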

Source code for assignments must include instructions for compiling and running it on the machines in rooms 302 and 303. The class organizers need these instructions to grade and evaluate the work.

As this is a seminar, attendance is mandatory.

Leiden University students do the work (see Examination above), including a Final Project, for 6 ECTS.

Final Project

In principle, this is meant to be a scientifically toned comparison of at least two competing algorithms for the same task, such as VGG vs. ResNet in image classification. The goal is to gain insight into the strengths and weaknesses of each algorithm and into the challenges of performing fair quantitative comparisons.
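
One minimal way to keep such a comparison fair is to score both algorithms on exactly the same test items and to look at where they disagree, not only at overall accuracy. The sketch below assumes each model's predictions are already available as lists. The names `model_a` and `model_b` and the toy labels are hypothetical placeholders, not results from VGG, ResNet, or any real experiment.

```python
from collections import Counter


def accuracy(labels, predictions):
    """Fraction of items where the prediction matches the label."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def paired_outcomes(labels, preds_a, preds_b):
    """Count, per test item, which of the two models was correct."""
    counts = Counter()
    for y, a, b in zip(labels, preds_a, preds_b):
        counts[(a == y, b == y)] += 1
    return {
        "both_correct": counts[(True, True)],
        "only_a_correct": counts[(True, False)],
        "only_b_correct": counts[(False, True)],
        "both_wrong": counts[(False, False)],
    }


# Toy placeholder data: the same six test items scored by both models.
labels = ["cat", "dog", "cat", "bird", "dog", "bird"]
model_a = ["cat", "dog", "dog", "bird", "dog", "cat"]
model_b = ["cat", "cat", "cat", "bird", "dog", "cat"]

print(accuracy(labels, model_a))
print(accuracy(labels, model_b))
print(paired_outcomes(labels, model_a, model_b))
```

In this toy data both models get 4 of 6 items right, yet each is correct on one item the other misses. Overall accuracy alone would hide that difference, which is the kind of insight the final project is meant to surface.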

Recommended Reading: Principles of Visual Information Retrieval, M. S. Lew, Springer, 2001, ISBN: 978-1-85233-381-2

Suggestions for class presentations

- Introduce the problem and main idea
- Explain the motivation for the paper
- Cover the main points of the algorithm
- At the end, include a slide which clearly states the main contribution of the paper
- There should be a little humor in the presentation to make it interesting for the audience

Table of contents/Topics

- Introduction to Multimedia Retrieval using Computer Vision and Multimedia Deep Learning
- Major Computer Vision Paradigms
- Major Deep Learning Paradigms
- Challenges in information retrieval
- Color, texture, and shape features
- Multi-modal features and salient points
- Similarity and Ranking
- Building search engines
- Bag-of-words approaches
- Concept detection
- Interactive and intelligent search algorithms
- Advanced browsing paradigms
- High performance search systems

Schedule (TENTATIVE):

Week 1: 3 Feb - Kaltura - Course Organization, Paper Assignments, and Overview of Computer Vision and Deep Learning I

Week 2: 10 Feb - Kaltura - Overview of Computer Vision and Deep Learning II

Week 3: 17 Feb - Reading week for scientific papers (no class)

Week 4: 24 Feb - Kaltura - Question & Answer (Q&A) Session - Final Project and Presentations

Week 5: 3 Mar - Kaltura - Student Presentations on Planned Final Projects: Goals and Evaluation Methods

Week 6: 10 Mar - "Science Career Event" - see University Schedule (no class)

Week 7: 17 Mar - Kaltura - Student Presentations of Scientific Papers

Week 8: 24 Mar - Kaltura - Student Presentations of Scientific Papers

Week 9: 31 Mar - Kaltura - Final Project Advice and Discussion - in time slots per student

Week 10: 7 Apr - Kaltura - Student Presentations of Scientific Papers

Week 11: 14 Apr - Kaltura - Student Presentations of Scientific Papers

Week 12: 21 Apr - Kaltura - Final Project Advice and Discussion (optional)

Week 13: 28 Apr - Kaltura - Final Project Presentations

Week 14: 5 May - "Liberation Day" - see University Schedule (no class)

Week 15: 12 May - Final reports and projects due on LML Course Manager