Computer Vision and Artificial Intelligence for Autonomous Cars

Autumn 2024 · ETH Zurich · 6 ECTS

About the course

This course introduces the core computer vision techniques and algorithms that autonomous cars use to perceive the semantics and geometry of their driving environment, localize themselves in it, and predict its dynamic evolution. Emphasis is placed on techniques tailored for real-world settings, such as multi-modal fusion, domain-adaptive and outlier-aware architectures, and multi-agent methods.

Lecturer: Christos Sakaridis
When: Fridays; lectures 14:15–17:00, practical sessions 10:15–12:00
Where: HG D 5.2 (lectures, also streamed via Zoom); practical sessions held online via Zoom

Slides, exercise sheets, recordings, and exam materials are password-protected. Access credentials were distributed to enrolled students.

ETH course catalogue · Piazza forum

Lecture team

Lecturer

Christos Sakaridis

Teaching Assistants

Lectures

Date	Time	Room	Topic	Slides	Video
20.09.2024	14:15–17:00	HG D 5.2 / Zoom	Fundamentals of Autonomous Cars	PDF	Video
27.09.2024	14:15–17:00	HG D 5.2 / Zoom	Fundamental Computer Vision Architectures and Algorithms for Autonomous Cars	PDF	Video
04.10.2024	14:15–17:00	HG D 5.2 / Zoom	Fundamental Computer Vision Architectures and Algorithms for Autonomous Cars (continued)	PDF	Video
11.10.2024	14:15–17:00	HG D 5.2 / Zoom	Semantic Segmentation	PDF	Video
18.10.2024	14:15–17:00	HG D 5.2 / Zoom	Depth Estimation	PDF	Video
25.10.2024	14:15–17:00	HG D 5.2 / Zoom	Object Detection	PDF	Video
01.11.2024	14:15–17:00	HG D 5.2 / Zoom	Instance Segmentation and Panoptic Segmentation	PDF	Video
08.11.2024	14:15–17:00	HG D 5.2 / Zoom	Unimodal 3D Object Detection	PDF	Video
15.11.2024	—	—	No lecture - CVPR conference deadline	—	—
22.11.2024	14:15–17:00	HG D 5.2 / Zoom	3D Reconstruction and Localization	PDF	Video
29.11.2024	14:15–17:00	HG D 5.2 / Zoom	Domain Adaptation	PDF	Video
06.12.2024	14:15–17:00	HG D 5.2 / Zoom	Multi-modal 2D and 3D Object Detection	PDF	Video
13.12.2024	14:15–17:00	HG D 5.2 / Zoom	Visual Grounding, Anomaly Segmentation and Vehicle-to-Vehicle Communication	PDF	Video
20.12.2024	14:15–17:00	HG D 5.2 / Zoom	Multiple Object Tracking and Motion Prediction	PDF	Video

Practical sessions

Date	Time	Room	Topic	Slides	Video
20.09.2024	—	—	No practical session	—	—
27.09.2024	—	—	No practical session	—	—
04.10.2024	10:15–12:00	Online (Zoom)	Getting Started with Python and SLURM	PDF	Video
11.10.2024	10:15–12:00	Online (Zoom)	Project 1 Introduction: Semantic Segmentation and Depth Estimation	PDF	Video
18.10.2024	10:15–12:00	Online (Zoom)	Project 1: Attention Mechanisms and Transformers	PDF	Video
25.10.2024	10:15–12:00	Online (Zoom)	Project 1: Q&A	—	Video
01.11.2024	10:15–12:00	Online (Zoom)	Project 1: Q&A	—	Video
08.11.2024	10:15–12:00	Online (Zoom)	Project 1: Q&A	—	Video
15.11.2024	10:15–12:00	Online (Zoom)	Project 2 Introduction: 3D Detection from Point Clouds	PDF	Video
22.11.2024	10:15–12:00	Online (Zoom)	Project 2 Introduction: 3D Detection from Point Clouds	PDF	Video
29.11.2024	10:15–12:00	Online (Zoom)	Project 2: Q&A	—	Video
06.12.2024	10:15–12:00	Online (Zoom)	Project 2: Q&A	—	Video
13.12.2024	10:15–12:00	Online (Zoom)	Project 2: Q&A	—	Video
20.12.2024	10:15–12:00	Online (Zoom)	Project 2: Q&A	—	—

Projects

Project 1: Semantic Segmentation and Depth Estimation

Implement complex computer vision architectures and algorithms and apply them to real-world, multi-modal driving datasets: develop models and algorithms for semantic segmentation and depth estimation.
Handout (PDF)

Project 2: 3D Object Detection using LiDARs

Implement complex computer vision architectures and algorithms and apply them to real-world, multi-modal driving datasets: develop models and algorithms for 3D object detection using LiDARs (3D detection from point clouds).
Handout (PDF)

Prerequisites

Solid basic knowledge of linear algebra, multivariate calculus, and probability theory
Basic background in computer vision and machine learning
Solid background in programming; practical projects are based on Python and libraries such as PyTorch, scikit-learn and scikit-image

Exam & grading

Exam

Examiner: Christos Sakaridis
Format: Written session examination
Duration: 120 minutes
Language: English
Permitted: One A4 sheet of paper and a simple non-programmable calculator

Mock exam · Mock exam with solutions

The performance assessment is only offered in the session after the course unit. Repetition is only possible after re-enrolling.
A short mock exam with representative sample multiple-choice and true/false questions is available for practice, both with and without solutions; it is shorter than the actual exam and does not reflect the actual exam's length.
Questions on the solutions of the mock exam were discussed in the lecture of 06.12.2024.

Grading

Projects 50% · Exam 50%

The final grade is calculated from the session examination grade and the overall projects grade, each weighted at 50%.
The projects are an integral part of the course; they are group-based, and their completion is compulsory.
Receiving a failing overall projects grade results in a failing final grade for the course.
Students who do not pass the projects are required to de-register from the exam.
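
The grading rules above can be illustrated with a minimal Python sketch. It rests on assumptions that this page does not state: a pass mark of 4.0, no rounding of the combined grade, and the hypothetical names `PASS_MARK`, `CourseResult`, and `combine_grades`. The page also does not say which numeric final grade is recorded when the projects are failed, so the sketch leaves that value empty.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: pass mark of 4.0; this value is not stated on the course page.
PASS_MARK = 4.0


@dataclass
class CourseResult:
    final_grade: Optional[float]  # None when no numeric grade can be derived
    passed: bool


def combine_grades(exam_grade: float, projects_grade: float,
                   pass_mark: float = PASS_MARK) -> CourseResult:
    """Combine exam and overall projects grades with equal 50% weights."""
    if projects_grade < pass_mark:
        # A failing overall projects grade means a failing final grade.
        # The numeric value recorded in that case is not specified here.
        return CourseResult(final_grade=None, passed=False)
    grade = 0.5 * exam_grade + 0.5 * projects_grade
    # Rounding rules are not given on the page, so the grade is left unrounded.
    return CourseResult(final_grade=grade, passed=grade >= pass_mark)


if __name__ == "__main__":
    print(combine_grades(exam_grade=5.0, projects_grade=4.5))
    print(combine_grades(exam_grade=5.5, projects_grade=3.5))
```

Under these assumptions, a strong exam result cannot compensate for failed projects: the second call reports a fail regardless of the exam grade.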

Learning objectives

Understand the operating principles of visual sensors in autonomous cars
Differentiate between the core architectural paradigms and components of modern visual perception models and describe their logic and the role of their parameters
Systematically categorize the main visual tasks related to automated driving and understand the primary representations and algorithms which are used for solving them
Critically analyze and evaluate current research in the area of computer vision for autonomous cars
Practically reproduce state-of-the-art computer vision methods in automated driving
Independently develop new models for visual perception