Method

ViP-DeepLab [ViP-DeepLab]


Submitted on 9 Nov. 2020 23:55 by
Siyuan Qiao (Johns Hopkins University)

Running time: 0.1 s
Environment: 1 core @ 2.5 GHz (C/C++)

Method Description:
tbd
Parameters:
tbd
Latex Bibtex:
@article{vip_deeplab,
  title={ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation},
  author={Siyuan Qiao and Yukun Zhu and Hartwig Adam and Alan Yuille and Liang-Chieh Chen},
  journal={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2021},
}

Detailed Results

From all 29 test sequences, our benchmark computes the commonly used tracking metrics (adapted for the segmentation case): the CLEAR MOT metrics, MT/PT/ML, identity switches, and fragmentations [1,2]. The tables below show all of these metrics; a short consistency check against the standard formulas follows the tables.


Benchmark    sMOTSA   MOTSA    MOTSP    MODSA    MODSP
CAR          81.00 %  90.70 %  89.90 %  91.80 %  92.20 %
PEDESTRIAN   68.70 %  84.50 %  82.30 %  85.50 %  93.90 %

Benchmark    recall   precision  F1       TP     FP    FN    FAR      #objects  #trajectories
CAR          95.90 %  95.90 %    95.90 %  35237  1498  1523  13.50 %  54397     1250
PEDESTRIAN   89.00 %  96.20 %    92.50 %  18426  737   2271  6.60 %   27830     671

Benchmark    MT       PT       ML      IDS  FRAG
CAR          92.20 %  7.20 %   0.60 %  392  580
PEDESTRIAN   73.30 %  24.10 %  2.60 %  209  443
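As a rough consistency check, the main numbers in the CAR row above follow from the standard CLEAR MOT definitions [1] and their segmentation (MOTS) adaptation, in which true positives are weighted by mask IoU. The following Python sketch is illustrative only and assumes these textbook formulas rather than the official evaluation code; all input counts are taken from the CAR row of the tables above.

# Counts from the CAR row above.
tp, fp, fn, ids = 35237, 1498, 1523, 392
motsp = 0.899                        # reported MOTSP = mean mask IoU over true positives

gt = tp + fn                         # number of ground-truth masks
soft_tp = motsp * tp                 # sum of mask IoUs over true positives (soft TP count)

recall    = tp / (tp + fn)           # ~0.959
precision = tp / (tp + fp)           # ~0.959
modsa  = 1.0 - (fp + fn) / gt        # ~0.918, detection-only accuracy
motsa  = 1.0 - (fp + fn + ids) / gt  # ~0.907, additionally penalizes identity switches
smotsa = (soft_tp - fp - ids) / gt   # ~0.810, soft (IoU-weighted) variant

print(f"recall={recall:.3f} precision={precision:.3f} MODSA={modsa:.3f} "
      f"MOTSA={motsa:.3f} sMOTSA={smotsa:.3f}")

The PEDESTRIAN row can be checked in the same way by substituting its counts.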



[1] K. Bernardin, R. Stiefelhagen: Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics. JIVP 2008.
[2] Y. Li, C. Huang, R. Nevatia: Learning to associate: HybridBoosted multi-target tracker for crowded scene. CVPR 2009.

