TOPICS

Motivation

Why Class-Incremental Semantic Segmentation for Autonomous Driving?

Self-driving vehicles require comprehensive scene understanding to navigate safely in dynamic environments. Semantic segmentation predicts an object category for every pixel of a camera image, which lets the vehicle navigate with respect to the annotated objects and terrain in a scene. Commonly, an image model is trained for this task on a static dataset with a fixed set of pixel-wise labels. However, such vehicles operate in an open-world scenario where training data with new object classes appears over time. Class-Incremental Learning (CIL) aims to update the model with new classes at periodic timesteps.

State-of-the-art memory-free Class-Incremental Semantic Segmentation (CISS) methods constrain the features of the new model to imitate those of the prior model, either through direct feature distillation or by freezing old class weights. We find that these restrictions significantly hinder the plasticity of the model, as old class features cannot evolve. Additionally, all CISS methods assume that new classes originate from the prior background. This scenario is unrealistic for automated driving, as a change in navigation requirements could also entail bifurcations of previously observed classes, that is, known classes being split into finer ones. Our proposed TOPICS approach demonstrates exceptional performance for incremental classes that originate from either known classes or the background. Find out more about our novel CISS framework in the approach section!

Technical Approach

TOPICS Architecture

Figure: During base training of TOPICS, features are mapped onto the Poincaré ball before the class hierarchy is explicitly enforced with \(L_{hier}\). In incremental steps, the old model is used to generate pseudo-labels of old classes and to regularize the last layer’s weights with \(L_{rel}\) and feature radii with \(L_{dist}\).

We leverage the class taxonomy and implicit relations between prior classes to avoid catastrophic forgetting in incremental learning steps. We first train the model on the base dataset. The class hierarchy is explicitly enforced in the final network layer, which is mapped into hyperbolic space. This geometric space ensures that classes are equidistant to each other irrespective of their hierarchy level, which facilitates learning a tree-like class hierarchy. During the incremental steps, we use the old model to create pseudo-labels for old classes in the background and employ scarcity and relation regularization losses to maintain important relations of old classes while learning the novel classes in a supervised manner.
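
Two mechanical steps from the description above, projecting features into the Poincaré ball and merging old-model pseudo-labels into the new ground truth, can be illustrated with a minimal sketch. This is not the TOPICS implementation: the function names, the curvature parameter `c`, the flat per-pixel lists, and in particular the confidence `threshold` are assumptions made for illustration. The exponential map and distance formulas are the standard ones for the Poincaré ball model.

```python
import math


def expmap0(v, c=1.0, eps=1e-5):
    """Map a Euclidean feature vector onto the Poincaré ball of curvature -c
    using the exponential map at the origin."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    sqrt_c = math.sqrt(c)
    # tanh keeps the result strictly inside the ball of radius 1/sqrt(c);
    # the clamp guards against tanh saturating to exactly 1.0.
    radius = min(math.tanh(sqrt_c * norm), 1.0 - eps) / sqrt_c
    return [radius * x / norm for x in v]


def poincare_distance(x, y, c=1.0):
    """Geodesic distance between two points inside the Poincaré ball."""
    diff_sq = sum((a - b) ** 2 for a, b in zip(x, y))
    x_sq = sum(a * a for a in x)
    y_sq = sum(b * b for b in y)
    arg = 1.0 + 2.0 * c * diff_sq / ((1.0 - c * x_sq) * (1.0 - c * y_sq))
    return math.acosh(arg) / math.sqrt(c)


def merge_pseudo_labels(new_labels, old_pred, old_conf,
                        background=0, threshold=0.5):
    """Fill background pixels of the new annotation with old-model predictions.

    new_labels: per-pixel labels of the incremental step (old classes appear
                as background here).
    old_pred:   per-pixel class predicted by the old model.
    old_conf:   per-pixel confidence of that prediction.
    threshold:  hypothetical cut-off, an assumption of this sketch.
    """
    merged = []
    for gt, pred, conf in zip(new_labels, old_pred, old_conf):
        if gt == background and pred != background and conf >= threshold:
            merged.append(pred)   # trust the old model on unlabeled pixels
        else:
            merged.append(gt)     # keep new annotations and true background
    return merged


if __name__ == "__main__":
    feature = expmap0([0.3, 0.4])
    print(poincare_distance([0.0, 0.0], feature))
    print(merge_pseudo_labels([0, 0, 5, 0], [2, 3, 2, 0], [0.9, 0.2, 0.9, 0.9]))
```

In this sketch, a point's distance from the origin of the ball grows with the norm of the original feature, which is the quantity the figure caption refers to as the feature radius. The pseudo-labeling step only ever overwrites background pixels, so annotations for the novel classes are left untouched.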