U3D-MOLTS: Unified 3D Monocular Object Localization, Tracking and Segmentation [UW_IPL/ETRI_AIRL]

Submitted on 9 Oct. 2021 08:46 by
Haotian Zhang (University of Washington)

Running time: 0.21 s
Environment: GPU @ 2.5 GHz (Python)

Method Description:
For the first stage, we propose a unified
monocular, 3D-based framework that tracks detected
moving objects over time and estimates their 3D
localization as well as their instance
segmentation masks from a sequence of 2D images
captured by a dash camera on a moving vehicle. Our
system contains an RCNN-based Localization for
Tracking Network (Loc4Trk-Net). Object association
leverages deep pairwise contrastive learning to
identify objects across poses and viewpoints using
appearance cues. A straightforward combination of
a 3D Kalman filter and the Hungarian algorithm is
then used for robust instance association via both
feature similarity and 3D localization
information. For the second stage, we adopt the
existing DeepLabV3+ for semantic segmentation and
further enhance its performance with data
augmentation using label propagation.
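
The association step described above can be illustrated with a minimal sketch. It is not the submitted implementation: the dictionary keys (`feat`, `pred_xyz`, `xyz`), the weights `w_app` and `w_loc`, the distance normalizer `max_dist`, and the threshold `gate` are all hypothetical choices made for illustration. `pred_xyz` stands in for the 3D position predicted by the Kalman filter, and a brute-force search over permutations stands in for the Hungarian algorithm; it returns the same minimum-cost assignment but is only practical for a handful of objects. In practice a solver such as `scipy.optimize.linear_sum_assignment` would likely be used instead.

```python
import math
from itertools import permutations


def cosine_distance(a, b):
    """1 - cosine similarity of two non-zero appearance embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def association_cost(track, det, w_app=0.5, w_loc=0.5, max_dist=10.0):
    """Blend appearance and 3D-location cues (weights are assumptions)."""
    app = cosine_distance(track["feat"], det["feat"])
    # Euclidean distance in metres, clipped and scaled to [0, 1].
    loc = min(math.dist(track["pred_xyz"], det["xyz"]) / max_dist, 1.0)
    return w_app * app + w_loc * loc


def associate(tracks, dets, gate=0.7):
    """Return (track_idx, det_idx) pairs of the minimum-cost assignment.

    Brute force over permutations; a stand-in for the Hungarian algorithm.
    Pairs whose cost exceeds `gate` are left unmatched.
    """
    cost = [[association_cost(t, d) for d in dets] for t in tracks]
    n, m = len(tracks), len(dets)
    best, best_total = [], math.inf
    if n <= m:
        for perm in permutations(range(m), n):
            total = sum(cost[i][perm[i]] for i in range(n))
            if total < best_total:
                best, best_total = list(enumerate(perm)), total
    else:
        for perm in permutations(range(n), m):
            total = sum(cost[perm[j]][j] for j in range(m))
            if total < best_total:
                best, best_total = [(perm[j], j) for j in range(m)], total
    return [(i, j) for i, j in best if cost[i][j] <= gate]


# Toy data: two tracks, three detections (all values are made up).
tracks = [
    {"feat": [1.0, 0.0], "pred_xyz": (0.0, 0.0, 10.0)},
    {"feat": [0.0, 1.0], "pred_xyz": (5.0, 0.0, 20.0)},
]
dets = [
    {"feat": [0.1, 0.9], "xyz": (5.2, 0.0, 20.5)},
    {"feat": [0.9, 0.1], "xyz": (0.3, 0.0, 10.4)},
    {"feat": [0.5, 0.5], "xyz": (30.0, 0.0, 50.0)},
]
matches = associate(tracks, dets)
print(matches)  # [(0, 1), (1, 0)]
```

The third detection stays unmatched and would, under this sketch, start a new track.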

Detailed Results

Over all 29 test sequences, the benchmark computes the Segmentation and Tracking Quality metric (STQ) together with its two components, Association Quality (AQ) and Segmentation Quality (SQ, measured as IoU). The table below lists these metrics.

Benchmark    STQ       AQ        SQ (IoU)
KITTI-STEP   67.55 %   71.26 %   64.04 %
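
As a quick consistency check on the figures above: STQ is defined as the geometric mean of AQ and SQ, so the reported STQ can be recomputed from the other two columns. The function name `stq` is only an illustrative choice.

```python
import math


def stq(aq, sq):
    """Geometric mean of association quality and segmentation quality."""
    return math.sqrt(aq * sq)


# Values from the KITTI-STEP row, expressed as fractions.
aq, sq = 0.7126, 0.6404
value = stq(aq, sq)
print(f"{100 * value:.2f} %")  # 67.55 %
```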
