Teaching CVAIAC
Computer Vision and Artificial Intelligence for Autonomous Cars
Autumn 2023 · ETH Zurich · 6 ECTS
About the course
This course introduces the core computer vision techniques and algorithms that autonomous cars use to perceive the semantics and geometry of their driving environment, localize themselves in it, and predict its dynamic evolution. Emphasis is placed on techniques tailored for real-world settings, such as multi-modal fusion, domain-adaptive and outlier-aware architectures, and multi-agent methods.
- Lecturer: Christos Sakaridis
- When: Lectures 14:15–17:00; practical sessions 10:15–12:00 (11:00–12:00 on 08.12.2023)
- Where: Lectures in HG D 5.2; practical sessions online via Zoom
Slides, exercise sheets, recordings, and exam materials are password-protected. Access credentials were distributed to enrolled students.
Lectures
| Date | Time | Room | Topic | Slides |
|---|---|---|---|---|
| 22.09.2023 | 14:15–17:00 | HG D 5.2 | Fundamentals of Autonomous Cars | |
| 29.09.2023 | 14:15–17:00 | HG D 5.2 | Fundamental Computer Vision Architectures and Algorithms for Autonomous Cars | |
| 06.10.2023 | 14:15–17:00 | HG D 5.2 | Fundamental Computer Vision Architectures and Algorithms for Autonomous Cars (continued) | |
| 13.10.2023 | 14:15–17:00 | HG D 5.2 | Semantic Segmentation | |
| 20.10.2023 | 14:15–17:00 | HG D 5.2 | Depth Estimation | |
| 27.10.2023 | 14:15–17:00 | HG D 5.2 | Object Detection | |
| 03.11.2023 | 14:15–17:00 | HG D 5.2 | Instance Segmentation and Panoptic Segmentation | |
| 10.11.2023 | 14:15–17:00 | HG D 5.2 | Unimodal 3D Object Detection | |
| 17.11.2023 | — | — | No lecture - CVPR conference deadline | — |
| 24.11.2023 | 14:15–17:00 | HG D 5.2 | 3D Reconstruction and Localization | |
| 01.12.2023 | 14:15–17:00 | HG D 5.2 | Domain Adaptation | |
| 08.12.2023 | 14:15–17:00 | HG D 5.2 | Multi-modal 2D and 3D Object Detection | Last updated 14.12.2023 |
| 15.12.2023 | 14:15–17:00 | HG D 5.2 | Visual Grounding, Anomaly Segmentation and Vehicle-to-Vehicle Communication | |
| 22.12.2023 | 14:15–17:00 | HG D 5.2 | Multiple Object Tracking and Motion Prediction | |
Practical sessions
| Date | Time | Room | Topic | Slides |
|---|---|---|---|---|
| 22.09.2023 | — | — | No practical session | — |
| 29.09.2023 | — | — | No practical session | — |
| 06.10.2023 | 10:15–12:00 | Online (Zoom) | Getting Started with Python and SLURM | |
| 13.10.2023 | 10:15–12:00 | Online (Zoom) | Project 1: Semantic Segmentation and Depth Estimation (Introduction) | |
| 20.10.2023 | 10:15–12:00 | Online (Zoom) | Project 1: Semantic Segmentation and Depth Estimation (Attention) | |
| 27.10.2023 | 10:15–12:00 | Online (Zoom) | Project 1: Q&A | — |
| 03.11.2023 | 10:15–12:00 | Online (Zoom) | Project 1: Q&A | — |
| 10.11.2023 | 10:15–12:00 | Online (Zoom) | Project 1: Q&A | — |
| 17.11.2023 | 10:15–12:00 | Online (Zoom) | Project 1: Hand-in | — |
| 24.11.2023 | 10:15–12:00 | Online (Zoom) | Project 2: Introduction | |
| 01.12.2023 | 10:15–12:00 | Online (Zoom) | Project 2: Q&A | — |
| 08.12.2023 | 11:00–12:00 | Online (Zoom) | Project 2: Q&A | — |
| 15.12.2023 | 10:15–12:00 | Online (Zoom) | Project 2: Q&A | — |
| 22.12.2023 | 10:15–12:00 | Online (Zoom) | Project 2: Hand-in | — |
Projects
- Project 1: Semantic Segmentation and Depth Estimation
  - Starts: 13.10.2023
  - Due: 17.11.2023
  - Develop models and algorithms for semantic segmentation and depth estimation, applied to real-world, multi-modal driving datasets. Group-based and compulsory.
  - Handout (PDF)
- Project 2: 3D Object Detection using LiDARs
  - Starts: 24.11.2023
  - Due: 22.12.2023
  - Develop models and algorithms for 3D object detection using LiDARs, applied to real-world, multi-modal driving datasets. Group-based and compulsory.
  - Handout (PDF)
Prerequisites
- Solid basic knowledge of linear algebra, multivariate calculus, and probability theory
- Basic background in computer vision and machine learning
- Solid programming background for the practical projects, which use Python and libraries such as PyTorch, scikit-learn, and scikit-image
Exam & grading
Exam
- Examiner: Christos Sakaridis
- Format: Written session examination
- Duration: 120 minutes
- Language: English
- Permitted: Two A4 pages (i.e. one A4 sheet of paper), either handwritten or typed in a font size of at least 11 points; a simple non-programmable calculator.
- The performance assessment is only offered in the session after the course unit; repetition is only possible after re-enrolling.
- No mock exam or mock exam solutions are linked on this page.
Grading
Projects 50% · Exam 50%
- Final grade = 50% session examination grade + 50% overall projects grade.
- Projects are an integral part of the course, group-based, and their completion is compulsory.
- A failing overall projects grade results in a failing final grade for the course; students who do not pass the projects are required to de-register from the exam.
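The grading rules above can be illustrated with a minimal sketch. The function names and the `PASS_MARK` constant are hypothetical, and the grading scale (1 to 6, with 4.0 as the passing grade) is an assumption that is not stated on this page; any official rounding of the final grade is also left out.

```python
# Assumption (not stated on this page): grades lie on a 1-6 scale
# and 4.0 is the lowest passing grade.
PASS_MARK = 4.0


def final_grade(exam: float, projects: float) -> float:
    """Weighted average: 50% session examination + 50% overall projects."""
    return 0.5 * exam + 0.5 * projects


def passes_course(exam: float, projects: float) -> bool:
    """Apply the compulsory-projects rule before the weighted average."""
    # A failing overall projects grade fails the course,
    # no matter how high the exam grade would be.
    if projects < PASS_MARK:
        return False
    return final_grade(exam, projects) >= PASS_MARK


if __name__ == "__main__":
    print(final_grade(5.0, 4.5))    # weighted average of both parts
    print(passes_course(6.0, 3.5))  # projects failed, so the course is failed
```

Checking the projects grade first mirrors the rule that students who do not pass the projects must de-register from the exam, so the weighted average only matters once the projects are passed.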
Learning objectives
- Understand the operating principles of visual sensors in autonomous cars
- Differentiate between the core architectural paradigms and components of modern visual perception models and describe their logic and the role of their parameters
- Systematically categorize the main visual tasks related to automated driving and understand the primary representations and algorithms used to solve them
- Critically analyze and evaluate current research in the area of computer vision for autonomous cars
- Practically reproduce state-of-the-art computer vision methods in automated driving
- Independently develop new models for visual perception