Computer Vision and Artificial Intelligence for Autonomous Cars (CVAIAC)
Autumn 2024 · ETH Zurich · 6 ECTS
About the course
This course introduces the core computer vision techniques and algorithms that autonomous cars use to perceive the semantics and geometry of their driving environment, localize themselves in it, and predict its dynamic evolution. Emphasis is placed on techniques tailored for real-world settings, such as multi-modal fusion, domain-adaptive and outlier-aware architectures, and multi-agent methods.
- Lecturer: Christos Sakaridis
- When: Lectures 14:15–17:00; practical sessions 10:15–12:00
- Where: HG D 5.2 (lectures, also streamed via Zoom); practical sessions held online via Zoom
Slides, exercise sheets, recordings, and exam materials are password-protected. Access credentials were distributed to enrolled students.
Lectures
| Date | Time | Room | Topic | Slides | Video |
|---|---|---|---|---|---|
| 20.09.2024 | 14:15–17:00 | HG D 5.2 / Zoom | Fundamentals of Autonomous Cars | | Video |
| 27.09.2024 | 14:15–17:00 | HG D 5.2 / Zoom | Fundamental Computer Vision Architectures and Algorithms for Autonomous Cars | | Video |
| 04.10.2024 | 14:15–17:00 | HG D 5.2 / Zoom | Fundamental Computer Vision Architectures and Algorithms for Autonomous Cars (continued) | | Video |
| 11.10.2024 | 14:15–17:00 | HG D 5.2 / Zoom | Semantic Segmentation | | Video |
| 18.10.2024 | 14:15–17:00 | HG D 5.2 / Zoom | Depth Estimation | | Video |
| 25.10.2024 | 14:15–17:00 | HG D 5.2 / Zoom | Object Detection | | Video |
| 01.11.2024 | 14:15–17:00 | HG D 5.2 / Zoom | Instance Segmentation and Panoptic Segmentation | | Video |
| 08.11.2024 | 14:15–17:00 | HG D 5.2 / Zoom | Unimodal 3D Object Detection | | Video |
| 15.11.2024 | — | — | No lecture - CVPR conference deadline | — | — |
| 22.11.2024 | 14:15–17:00 | HG D 5.2 / Zoom | 3D Reconstruction and Localization | | Video |
| 29.11.2024 | 14:15–17:00 | HG D 5.2 / Zoom | Domain Adaptation | | Video |
| 06.12.2024 | 14:15–17:00 | HG D 5.2 / Zoom | Multi-modal 2D and 3D Object Detection | | Video |
| 13.12.2024 | 14:15–17:00 | HG D 5.2 / Zoom | Visual Grounding, Anomaly Segmentation and Vehicle-to-Vehicle Communication | | Video |
| 20.12.2024 | 14:15–17:00 | HG D 5.2 / Zoom | Multiple Object Tracking and Motion Prediction | | Video |
Practical sessions
| Date | Time | Room | Topic | Slides | Video |
|---|---|---|---|---|---|
| 20.09.2024 | — | — | No practical session | — | — |
| 27.09.2024 | — | — | No practical session | — | — |
| 04.10.2024 | 10:15–12:00 | Online (Zoom) | Getting Started with Python and SLURM | | Video |
| 11.10.2024 | 10:15–12:00 | Online (Zoom) | Project 1 Introduction: Semantic Segmentation and Depth Estimation | | Video |
| 18.10.2024 | 10:15–12:00 | Online (Zoom) | Project 1: Attention Mechanisms and Transformers | | Video |
| 25.10.2024 | 10:15–12:00 | Online (Zoom) | Project 1: Q&A | — | Video |
| 01.11.2024 | 10:15–12:00 | Online (Zoom) | Project 1: Q&A | — | Video |
| 08.11.2024 | 10:15–12:00 | Online (Zoom) | Project 1: Q&A | — | Video |
| 15.11.2024 | 10:15–12:00 | Online (Zoom) | Project 2 Introduction: 3D Detection from Point Clouds | | Video |
| 22.11.2024 | 10:15–12:00 | Online (Zoom) | Project 2 Introduction: 3D Detection from Point Clouds | | Video |
| 29.11.2024 | 10:15–12:00 | Online (Zoom) | Project 2: Q&A | — | Video |
| 06.12.2024 | 10:15–12:00 | Online (Zoom) | Project 2: Q&A | — | Video |
| 13.12.2024 | 10:15–12:00 | Online (Zoom) | Project 2: Q&A | — | Video |
| 20.12.2024 | 10:15–12:00 | Online (Zoom) | Project 2: Q&A | — | — |
Projects
- Project 1: Semantic Segmentation and Depth Estimation
  Implement complex computer vision architectures and algorithms and apply them to real-world, multi-modal driving datasets: develop models and algorithms for semantic segmentation and depth estimation.
  Handout (PDF)
- Project 2: 3D Object Detection using LiDARs
  Implement complex computer vision architectures and algorithms and apply them to real-world, multi-modal driving datasets: develop models and algorithms for 3D object detection using LiDARs (3D detection from point clouds).
  Handout (PDF)
Prerequisites
- Solid basic knowledge of linear algebra, multivariate calculus, and probability theory
- Basic background in computer vision and machine learning
- Solid background in programming; practical projects are based on Python and libraries such as PyTorch, scikit-learn and scikit-image
Exam & grading
Exam
- Examiner: Christos Sakaridis
- Format: Written session examination
- Duration: 120 minutes
- Language: English
- Permitted: One A4 sheet of paper and a simple non-programmable calculator
- The performance assessment is only offered in the session after the course unit. Repetition is only possible after re-enrolling.
- A short mock exam with representative sample multiple-choice and true-false questions is available for practice, both with and without solutions. It is shorter than the actual exam, so its length is not representative.
- Questions about the mock exam solutions were discussed in the lecture on 06.12.2024.
Grading
Projects 50% · Exam 50%
- The final grade is calculated from the session examination grade and the overall projects grade, each weighted at 50%.
- The projects are an integral part of the course. They are group-based and their completion is compulsory.
- Receiving a failing overall projects grade results in a failing final grade for the course.
- Students who do not pass the projects are required to de-register from the exam.
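The grading rules above can be sketched as a small calculation. This is only an illustration, not the official grading procedure: the page does not state the grade scale, the pass mark, or any rounding, so the 4.0 pass mark (`PASS_MARK`), the rule that the weighted average must reach it, and the names `CourseResult` and `final_result` are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumption: a pass mark of 4.0 on a 1-6 scale; the course page does not state it.
PASS_MARK = 4.0


@dataclass(frozen=True)
class CourseResult:
    weighted_average: float  # 50/50 combination of exam and projects grades
    passed: bool             # False whenever the overall projects grade is failing


def final_result(exam_grade: float, projects_grade: float,
                 pass_mark: float = PASS_MARK) -> CourseResult:
    """Combine the two components with equal weight (hypothetical helper)."""
    weighted = 0.5 * exam_grade + 0.5 * projects_grade
    # A failing overall projects grade fails the course regardless of the exam.
    projects_passed = projects_grade >= pass_mark
    # Assumption: otherwise the course is passed if the average reaches the pass mark.
    passed = projects_passed and weighted >= pass_mark
    return CourseResult(weighted_average=weighted, passed=passed)


if __name__ == "__main__":
    print(final_result(5.0, 4.5))  # both components passing
    print(final_result(6.0, 3.5))  # same average, but projects failing
```

The second call shows why the projects rule matters: the average matches the first case, yet the result is a fail because the projects grade is below the assumed pass mark.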
Learning objectives
- Understand the operating principles of visual sensors in autonomous cars
- Differentiate between the core architectural paradigms and components of modern visual perception models and describe their logic and the role of their parameters
- Systematically categorize the main visual tasks related to automated driving and understand the primary representations and algorithms which are used for solving them
- Critically analyze and evaluate current research in the area of computer vision for autonomous cars
- Practically reproduce state-of-the-art computer vision methods in automated driving
- Independently develop new models for visual perception