PolarMOT: How Far Can Geometric Relations Take Us in 3D Multi-Object Tracking?

Aleksandr Kim Guillem Brasó Aljoša Ošep Laura Leal-Taixé

TLDR: State-of-the-art, generalizable multi-object tracking posed as edge classification on a continuously evolving temporal multiplex graph whose initial features are only pairwise geometric relationships (temporal and spatial) between objects.
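To make "tracking as edge classification" concrete, here is a minimal sketch of how per-edge scores could be turned into associations between consecutive frames. It assumes a classifier has already assigned each candidate edge a probability. The function name `link_by_edge_scores`, the 0.5 threshold, and the greedy one-to-one matching are illustrative assumptions, not the procedure described in the paper.

```python
def link_by_edge_scores(edge_scores, threshold=0.5):
    """Greedily pick one-to-one links from scored candidate edges.

    edge_scores maps (previous_id, next_id) to the probability that both
    detections belong to the same object. Hypothetical helper for illustration.
    """
    used_prev, used_next = set(), set()
    links = {}
    # Visit edges from most to least confident.
    ranked = sorted(edge_scores.items(), key=lambda kv: kv[1], reverse=True)
    for (prev_id, next_id), score in ranked:
        if score < threshold:
            break  # every remaining edge is classified as negative
        if prev_id in used_prev or next_id in used_next:
            continue  # each detection takes part in at most one link
        links[next_id] = prev_id
        used_prev.add(prev_id)
        used_next.add(next_id)
    return links


scores = {
    ("a", "x"): 0.9,
    ("a", "y"): 0.6,
    ("b", "y"): 0.8,
    ("b", "x"): 0.2,
    ("c", "z"): 0.3,
}
links = link_by_edge_scores(scores)
```

In this toy input, detection `z` receives no link because its only candidate edge scores below the threshold, so a tracker would start a new track for it.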

Focuses on object interactions and influences, without using per-object information such as appearance.

A lightweight graph neural network built from 3- and 4-layer MLPs with fully connected layers (only 71k parameters).
It is capable of near-real-time inference: 33 FPS on nuScenes and 170 FPS on KITTI.
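For a sense of how small fully connected networks are, the sketch below counts the weights and biases of an MLP from its layer widths. The widths used here are hypothetical and chosen only for illustration. They are not the model's actual configuration.

```python
def mlp_param_count(layer_sizes):
    """Count weights and biases of a fully connected MLP.

    layer_sizes lists the input width, the hidden widths, and the output width.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out  # weight matrix
        total += fan_out           # bias vector
    return total


# Hypothetical 3-layer MLP: 4 inputs, two hidden layers of 16 units, 1 output.
small_mlp = mlp_param_count([4, 16, 16, 1])
```

Because the count grows with the product of adjacent layer widths, a handful of narrow MLPs stays in the tens of thousands of parameters.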

Polar parametrization of features enables better generalization across datasets and cities without re-training.
It also allows the model to perform well when trained on only 1% of the data.
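The following sketch illustrates why a polar, relative encoding can transfer across scenes: the features depend only on how two objects are arranged relative to each other, not on where the scene sits in world coordinates or how it is rotated. The `Detection` fields and the three features are assumptions made for illustration and may differ from the paper's exact definitions.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    x: float    # position in a ground-plane frame, metres
    y: float
    yaw: float  # heading angle, radians


def wrap_angle(angle):
    """Map an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def polar_edge_features(a, b):
    """Relative geometry of b as seen from a (illustrative feature set)."""
    dx, dy = b.x - a.x, b.y - a.y
    distance = math.hypot(dx, dy)
    # Direction towards b, measured relative to a's own heading.
    polar_angle = wrap_angle(math.atan2(dy, dx) - a.yaw)
    # Difference between the two headings.
    yaw_difference = wrap_angle(b.yaw - a.yaw)
    return distance, polar_angle, yaw_difference


# The same pair of objects, once as-is and once with the whole scene
# rotated by 90 degrees and shifted: the features stay the same.
original = polar_edge_features(Detection(0.0, 0.0, 0.0),
                               Detection(1.0, 1.0, 0.0))
moved = polar_edge_features(Detection(10.0, 5.0, math.pi / 2),
                            Detection(9.0, 6.0, math.pi / 2))
```

Absolute Cartesian offsets such as `(dx, dy)` would change under the same rotation, which is one intuition for why a relative polar encoding is less tied to a particular map or city layout.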

Features are time-normalized to help handle occlusions and gaps.
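One way to read time normalization: dividing a displacement by the elapsed time gives a speed-like value that looks the same whether two detections are one frame apart or separated by a longer gap. The sketch below assumes 2D positions and timestamps in seconds. The function name is hypothetical.

```python
import math


def time_normalized_distance(p_prev, p_next, t_prev, t_next):
    """Displacement between two detections divided by the elapsed time."""
    dt = t_next - t_prev
    if dt <= 0:
        raise ValueError("t_next must be later than t_prev")
    return math.hypot(p_next[0] - p_prev[0], p_next[1] - p_prev[1]) / dt


# An object moving at a constant 2 m/s, observed across a short and a long gap.
adjacent = time_normalized_distance((0.0, 0.0), (1.0, 0.0), 0.0, 0.5)
after_gap = time_normalized_distance((0.0, 0.0), (3.0, 0.0), 0.0, 1.5)
```

The raw distances differ by a factor of three, but the normalized values match, so an edge that spans an occlusion presents the network with the same kind of input as an edge between adjacent frames.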

Paper Code

BibTeX

@InProceedings{polarmot,
    author={Kim, Aleksandr and Bras{\'o}, Guillem and O{\v{s}}ep, Aljo{\v{s}}a and Leal-Taix{\'e}, Laura},
    title={PolarMOT: How Far Can Geometric Relations Take Us in 3D Multi-Object Tracking?},
    booktitle={European Conference on Computer Vision (ECCV) 2022},
    publisher={Springer Nature Switzerland},
    address={Cham},
    doi={10.1007/978-3-031-20047-2_3},
    year={2022},
    pages={41--58},
    organization={Springer},
    isbn={978-3-031-20047-2},
}
