Computer Vision and Artificial Intelligence for Autonomous Cars (CVAIAC)

Autumn 2024 · ETH Zurich · 6 ECTS

About the course

This course introduces the core computer vision techniques and algorithms that autonomous cars use to perceive the semantics and geometry of their driving environment, localize themselves in it, and predict its dynamic evolution. Emphasis is placed on techniques tailored for real-world settings, such as multi-modal fusion, domain-adaptive and outlier-aware architectures, and multi-agent methods.

Lecturer: Christos Sakaridis
When: Lectures 14:15–17:00; practical sessions 10:15–12:00
Where: HG D 5.2 (lectures, also streamed via Zoom); practical sessions held online via Zoom

Slides, exercise sheets, recordings, and exam materials are password-protected. Access credentials were distributed to enrolled students.

Lectures

Date        Time         Room             Topic
20.09.2024  14:15–17:00  HG D 5.2 / Zoom  Fundamentals of Autonomous Cars
27.09.2024  14:15–17:00  HG D 5.2 / Zoom  Fundamental Computer Vision Architectures and Algorithms for Autonomous Cars
04.10.2024  14:15–17:00  HG D 5.2 / Zoom  Fundamental Computer Vision Architectures and Algorithms for Autonomous Cars (continued)
11.10.2024  14:15–17:00  HG D 5.2 / Zoom  Semantic Segmentation
18.10.2024  14:15–17:00  HG D 5.2 / Zoom  Depth Estimation
25.10.2024  14:15–17:00  HG D 5.2 / Zoom  Object Detection
01.11.2024  14:15–17:00  HG D 5.2 / Zoom  Instance Segmentation and Panoptic Segmentation
08.11.2024  14:15–17:00  HG D 5.2 / Zoom  Unimodal 3D Object Detection
15.11.2024  No lecture (CVPR conference deadline)
22.11.2024  14:15–17:00  HG D 5.2 / Zoom  3D Reconstruction and Localization
29.11.2024  14:15–17:00  HG D 5.2 / Zoom  Domain Adaptation
06.12.2024  14:15–17:00  HG D 5.2 / Zoom  Multi-modal 2D and 3D Object Detection
13.12.2024  14:15–17:00  HG D 5.2 / Zoom  Visual Grounding, Anomaly Segmentation and Vehicle-to-Vehicle Communication
20.12.2024  14:15–17:00  HG D 5.2 / Zoom  Multiple Object Tracking and Motion Prediction

Practical sessions

Date        Time         Room           Topic
20.09.2024  No practical session
27.09.2024  No practical session
04.10.2024  10:15–12:00  Online (Zoom)  Getting Started with Python and SLURM
11.10.2024  10:15–12:00  Online (Zoom)  Project 1 Introduction: Semantic Segmentation and Depth Estimation
18.10.2024  10:15–12:00  Online (Zoom)  Project 1: Attention Mechanisms and Transformers
25.10.2024  10:15–12:00  Online (Zoom)  Project 1: Q&A
01.11.2024  10:15–12:00  Online (Zoom)  Project 1: Q&A
08.11.2024  10:15–12:00  Online (Zoom)  Project 1: Q&A
15.11.2024  10:15–12:00  Online (Zoom)  Project 2 Introduction: 3D Detection from Point Clouds
22.11.2024  10:15–12:00  Online (Zoom)  Project 2 Introduction: 3D Detection from Point Clouds
29.11.2024  10:15–12:00  Online (Zoom)  Project 2: Q&A
06.12.2024  10:15–12:00  Online (Zoom)  Project 2: Q&A
13.12.2024  10:15–12:00  Online (Zoom)  Project 2: Q&A
20.12.2024  10:15–12:00  Online (Zoom)  Project 2: Q&A

Projects

  1. Project 1: Semantic Segmentation and Depth Estimation

    Implement complex computer vision architectures and algorithms and apply them to real-world, multi-modal driving datasets: develop models and algorithms for semantic segmentation and depth estimation.

    Handout (PDF)
  2. Project 2: 3D Object Detection using LiDARs

    Implement complex computer vision architectures and algorithms and apply them to real-world, multi-modal driving datasets: develop models and algorithms for 3D object detection using LiDARs (3D detection from point clouds).

    Handout (PDF)

Prerequisites

  • Solid basic knowledge of linear algebra, multivariate calculus, and probability theory
  • Basic background in computer vision and machine learning
  • Solid background in programming; practical projects are based on Python and libraries such as PyTorch, scikit-learn and scikit-image

Exam & grading

Exam

Examiner: Christos Sakaridis
Format: Written session examination
Duration: 120 minutes
Language: English
Permitted: One A4 sheet of paper and a simple non-programmable calculator
  • The performance assessment is only offered in the session after the course unit. Repetition is only possible after re-enrolling.
  • A short mock exam with representative sample multiple-choice and true-false questions is available for practice, both with and without solutions; it is shorter than the actual exam, so its length is not representative.
  • Questions on the solutions of the mock exam were discussed in the lecture of 06.12.2024.

Grading

Projects 50% · Exam 50%

  • The final grade is calculated from the session examination grade and the overall projects grade, each weighted at 50%.
  • The projects are an integral part of the course; they are group-based, and completing them is compulsory.
  • Receiving a failing overall projects grade results in a failing final grade for the course.
  • Students who do not pass the projects are required to de-register from the exam.
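
The grading rules above can be sketched in a few lines of Python. This is only an illustration, not the official computation: the function names are hypothetical, the pass mark of 4.0 on a 1.0 to 6.0 scale is an assumption not stated on this page, and no rounding is applied.

```python
PASS_MARK = 4.0  # assumption: 1.0 to 6.0 grading scale with 4.0 as the pass mark


def final_grade(exam_grade: float, projects_grade: float) -> float:
    """Weighted average of the two components, 50% each (no rounding applied)."""
    return 0.5 * exam_grade + 0.5 * projects_grade


def course_passed(exam_grade: float, projects_grade: float) -> bool:
    """Return True only if the projects are passed and the average reaches the pass mark."""
    if projects_grade < PASS_MARK:
        # A failing overall projects grade means a failing final grade,
        # regardless of the exam result.
        return False
    return final_grade(exam_grade, projects_grade) >= PASS_MARK


if __name__ == "__main__":
    print(final_grade(5.0, 4.5))
    print(course_passed(6.0, 3.5))
```

In this sketch, an exam grade of 6.0 with a projects grade of 3.5 still counts as a fail even though the plain average is above the assumed pass mark, which mirrors the rule that the projects must be passed on their own.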

Learning objectives

  • Understand the operating principles of visual sensors in autonomous cars
  • Differentiate between the core architectural paradigms and components of modern visual perception models and describe their logic and the role of their parameters
  • Systematically categorize the main visual tasks related to automated driving and understand the primary representations and algorithms that are used to solve them
  • Critically analyze and evaluate current research in the area of computer vision for autonomous cars
  • Practically reproduce state-of-the-art computer vision methods in automated driving
  • Independently develop new models for visual perception