The KITTI Vision Benchmark Suite

Method

Cascade TWiX [on] [C-TWiX]
https://github.com/Guepardow/TWiX/

Submitted on 16 Nov. 2024 16:18 by
Mehdi Miah (Polytechnique Montréal)

Running time:		0.01 s
Environment:		8 cores @ >3.5 GHz (Python)

Method Description:

No camera motion compensation (CMC), no visual appearance, no ReID, no BEV, no LiDAR, no depth estimation; only spatio-temporal coordinates (xmin, ymin, xmax, ymax, t).

Detections: Permatrack
Pipeline: online cascade matching (cf. C-BIoU)

We focused on the association step by training a Transformer-based network to predict the similarity matrix between past tracks and current detections.

Parameters:

Maximum temporal gap at STA/LTA: 0.0 s / 0.8 s
Past window size: 0.4 s
Future window size: 1/fps s (one frame)
Matching association thresholds at STA/LTA: 0.4 / -0.6
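
The two-stage cascade and its thresholds can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that STA and LTA denote a short-term and a long-term association stage, that each stage has its own similarity matrix (rows are tracks, columns are detections), and that higher values mean more similar. A simple greedy matcher stands in for whatever assignment solver the real pipeline uses, and all function and variable names are hypothetical.

```python
def greedy_match(sim, rows, cols, threshold):
    """Pair rows with cols by descending similarity, keeping only pairs above threshold.

    Greedy stand-in for an optimal assignment solver (an assumption of this sketch).
    """
    candidates = sorted(
        ((sim[r][c], r, c) for r in rows for c in cols if sim[r][c] > threshold),
        reverse=True,
    )
    used_rows, used_cols, pairs = set(), set(), []
    for _, r, c in candidates:
        if r not in used_rows and c not in used_cols:
            pairs.append((r, c))
            used_rows.add(r)
            used_cols.add(c)
    return pairs


def cascade_match(sim_sta, sim_lta, recent_tracks, all_tracks, n_dets,
                  thr_sta=0.4, thr_lta=-0.6):
    """Stage 1: recently updated tracks vs. all detections (STA threshold).
    Stage 2: every still-unmatched track vs. leftover detections (LTA threshold)."""
    dets = list(range(n_dets))
    first = greedy_match(sim_sta, recent_tracks, dets, thr_sta)
    matched_tracks = {t for t, _ in first}
    matched_dets = {d for _, d in first}
    left_tracks = [t for t in all_tracks if t not in matched_tracks]
    left_dets = [d for d in dets if d not in matched_dets]
    second = greedy_match(sim_lta, left_tracks, left_dets, thr_lta)
    return first + second


# Toy example: tracks 0 and 1 were updated in the previous frame, track 2 was lost earlier.
sim_sta = [[0.9, -0.5, -0.9],
           [-0.2, 0.3, -0.7],
           [0.0, 0.0, 0.0]]   # row 2 is unused in stage 1
sim_lta = [[0.0, 0.0, 0.0],   # row 0 is already matched when stage 2 runs
           [-0.9, -0.1, -0.8],
           [-0.9, -0.7, -0.3]]
matches = cascade_match(sim_sta, sim_lta, recent_tracks=[0, 1],
                        all_tracks=[0, 1, 2], n_dets=3)
print(matches)  # [(0, 0), (1, 1), (2, 2)]
```

In the toy example, track 1 fails the stricter first-stage threshold (0.3 is below 0.4) and is recovered in the second stage, where the threshold is more permissive.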

Tracking maximum age: 0.8 s
Tracking minimum score: 50 %
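
Since the temporal parameters are given in seconds, a short sketch shows how they might translate to frame counts. It assumes a frame rate of 10 fps for the KITTI tracking sequences. The dictionary keys and the `is_expired` helper are hypothetical names, and the exact expiry rule (strictly greater than the maximum age) is an assumption of this sketch.

```python
FPS = 10  # assumption: KITTI tracking sequences are captured at 10 frames per second


def seconds_to_frames(seconds, fps=FPS):
    """Convert a duration in seconds to a whole number of frames."""
    return round(seconds * fps)


# Values copied from the parameter list; the key names are hypothetical.
params_frames = {
    "max_gap_sta": seconds_to_frames(0.0),
    "max_gap_lta": seconds_to_frames(0.8),
    "past_window": seconds_to_frames(0.4),
    "future_window": seconds_to_frames(1 / FPS),
    "max_age": seconds_to_frames(0.8),
}


def is_expired(last_seen_frame, current_frame, max_age=params_frames["max_age"]):
    """Drop a track once it has gone unmatched for more than max_age frames."""
    return current_frame - last_seen_frame > max_age


print(params_frames)
```

Under the 10 fps assumption, the long-term gap and the maximum age both correspond to 8 frames, the past window to 4 frames, and the future window to a single frame.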

LaTeX BibTeX:

@article{miah2024learningdata,
title = {Learning data association for multi-object tracking using only coordinates},
journal = {Pattern Recognition},
volume = {160},
pages = {111169},
year = {2025},
issn = {0031-3203},
doi = {10.1016/j.patcog.2024.111169},
url = {https://www.sciencedirect.com/science/article/pii/S0031320324009208},
author = {Mehdi Miah and Guillaume-Alexandre Bilodeau and Nicolas Saunier},
keywords = {Tracking, Transformer, Data association, Motion, Multi-object tracking}
}

Detailed Results

From all 29 test sequences, our benchmark computes the HOTA tracking metrics (HOTA, DetA, AssA, DetRe, DetPr, AssRe, AssPr, LocA) [1] as well as the CLEAR MOT, MT/PT/ML, identity switch, and fragmentation metrics [2,3]. The tables below show all of these metrics.

Benchmark	HOTA	DetA	AssA	DetRe	DetPr	AssRe	AssPr	LocA
CAR	77.58 %	76.97 %	78.84 %	80.25 %	86.43 %	81.90 %	88.35 %	86.95 %
PEDESTRIAN	52.44 %	50.83 %	54.35 %	55.26 %	72.37 %	59.45 %	72.89 %	79.15 %

Benchmark	TP	FN	FP
CAR	31578	2814	355
PEDESTRIAN	16518	6632	1160

Benchmark	MOTA	MOTP	MODA	IDSW	sMOTA
CAR	89.68 %	85.50 %	90.79 %	381	76.36 %
PEDESTRIAN	64.95 %	75.28 %	66.34 %	322	47.31 %
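
The count table and the CLEAR MOT table can be cross-checked with a few lines of arithmetic. The sketch below assumes the standard definitions MODA = 1 - (FN + FP) / GT and MOTA = 1 - (FN + FP + IDSW) / GT, with GT = TP + FN. It takes 2814 misses and 355 false positives for CAR (6632 and 1160 for PEDESTRIAN), which is the reading under which TP plus false positives equals the # Dets figures (31933 and 17678). The function name `clear_mot` is illustrative and not part of the benchmark toolkit.

```python
def clear_mot(tp, fn, fp, idsw):
    """Return (MODA, MOTA) in percent, rounded to two decimals, from raw counts."""
    gt = tp + fn  # ground-truth boxes = matched + missed
    moda = 100.0 * (1 - (fn + fp) / gt)
    mota = 100.0 * (1 - (fn + fp + idsw) / gt)
    return round(moda, 2), round(mota, 2)


car = clear_mot(tp=31578, fn=2814, fp=355, idsw=381)
ped = clear_mot(tp=16518, fn=6632, fp=1160, idsw=322)
print(car)  # (90.79, 89.68)
print(ped)  # (66.34, 64.95)
```

Both classes reproduce the MODA and MOTA values reported in the table above.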

Benchmark	MT rate	PT rate	ML rate	FRAG
CAR	81.39 %	15.54 %	3.08 %	335
PEDESTRIAN	42.61 %	39.17 %	18.21 %	684

Benchmark	# Dets	# Tracks
CAR	31933	773
PEDESTRIAN	17678	380

[1] J. Luiten, A. Ošep, P. Dendorfer, P. Torr, A. Geiger, L. Leal-Taixé, B. Leibe: HOTA: A Higher Order Metric for Evaluating Multi-object Tracking. IJCV 2020.
[2] K. Bernardin, R. Stiefelhagen: Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics. JIVP 2008.
[3] Y. Li, C. Huang, R. Nevatia: Learning to associate: HybridBoosted multi-target tracker for crowded scene. CVPR 2009.

A project of Karlsruhe Institute of Technology and Toyota Technological Institute at Chicago
