The KITTI Vision Benchmark Suite

Method

Motion-DeepLab [on] [Motion-DeepLab]
https://github.com/google-research/deeplab2/

Submitted on 2 Aug. 2021 11:34 by
Mark Weber (Technical University Munich (TUM))

Running time:		1 s
Environment:		GPU @ 2.5 GHz (Python)

Method Description:

Motion-DeepLab is a unified model for video panoptic segmentation, a task that requires segmenting and tracking every pixel. It is built on top of Panoptic-DeepLab and uses an additional branch to regress each pixel to its center location in the previous frame. Instead of a single RGB image, the network takes two consecutive frames as input, i.e., the current and the previous frame, together with the center heatmap from the previous frame. The output is used to assign consistent track IDs to all instances throughout a video sequence.
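
The track ID assignment described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the DeepLab2 implementation: the function name `assign_track_ids`, the greedy nearest-center matching, and the `max_distance` threshold are all hypothetical. Each current instance is summarized by the previous-frame center predicted for it by the regression branch; it inherits the track ID of the closest unclaimed previous center, or receives a new ID if no center is close enough.

```python
import math


def assign_track_ids(predicted_prev_centers, prev_tracks, next_track_id,
                     max_distance=20.0):
    """Greedily match current instances to previous-frame centers.

    predicted_prev_centers: list of (y, x), one per current instance, giving
        the location the regression branch predicts for that instance's
        center in the previous frame.
    prev_tracks: dict mapping track_id -> (y, x) center from the previous frame.
    next_track_id: first unused ID, given to instances that match nothing.
    max_distance: hypothetical gating threshold in pixels (an assumption).
    Returns (track_ids, next_track_id).
    """
    unclaimed = dict(prev_tracks)  # copy so the caller's dict is not modified
    track_ids = []
    for py, px in predicted_prev_centers:
        best_id, best_dist = None, max_distance
        for tid, (cy, cx) in unclaimed.items():
            dist = math.hypot(py - cy, px - cx)
            if dist <= best_dist:
                best_id, best_dist = tid, dist
        if best_id is None:
            # No previous center within range: start a new track.
            best_id = next_track_id
            next_track_id += 1
        else:
            # Each previous center can be claimed by one instance only.
            del unclaimed[best_id]
        track_ids.append(best_id)
    return track_ids, next_track_id


# Toy example: two known tracks, three instances in the current frame.
prev_tracks = {1: (50.0, 40.0), 2: (120.0, 200.0)}
predicted = [(121.0, 198.0), (52.0, 41.0), (300.0, 300.0)]
ids, next_id = assign_track_ids(predicted, prev_tracks, next_track_id=3)
print(ids, next_id)
```

In this toy example the first two instances continue the existing tracks and the third, far from any previous center, starts a new one.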

Parameters:

n/a, runtime not measured.

LaTeX BibTeX:

@article{step_2021,
author={Mark Weber and Jun Xie and Maxwell Collins and Yukun Zhu and Paul Voigtlaender and Hartwig Adam and Bradley Green and Andreas Geiger and Bastian Leibe and Daniel Cremers and Aljosa Osep and Laura Leal-Taixe and Liang-Chieh Chen},
title={{STEP}: Segmenting and Tracking Every Pixel},
journal={arXiv:2102.11859},
year={2021}
}

Detailed Results

From all 29 test sequences, our benchmark computes the STQ segmentation and tracking metric (STQ, AQ, SQ (IoU)). The tables below show all of these metrics.

Benchmark	STQ	AQ	SQ (IoU)
KITTI-STEP	52.19 %	45.55 %	59.81 %
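
The three columns are related: in the STEP paper, STQ is defined as the geometric mean of the association quality (AQ) and the segmentation quality (SQ). The short check below recomputes STQ from the rounded AQ and SQ values in the table; the function name `stq_from_components` is introduced here for illustration only, and a small deviation from the reported value is expected because the inputs are already rounded to two decimals.

```python
import math


def stq_from_components(aq, sq):
    """Geometric mean of association quality and segmentation quality."""
    return math.sqrt(aq * sq)


# Values copied from the KITTI-STEP row above (in percent).
aq, sq, reported_stq = 45.55, 59.81, 52.19

stq = stq_from_components(aq, sq)
print(f"recomputed STQ: {stq:.2f} %, reported STQ: {reported_stq:.2f} %")
```

The recomputed value agrees with the reported STQ to within rounding of the inputs.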

A project of Karlsruhe Institute of Technology and Toyota Technological Institute at Chicago
