The KITTI Vision Benchmark Suite

Method

A Multi-Modal Fusion-Based 3D Multi-Object Tracking Framework with Joint Detection [YONTD-MOTv2]
https://github.com/wangxiyang2022/YONTD-MOT

Submitted on 15 Jul. 2024 16:41 by
Xiyang Wang (Chongqing University (CQU SLAMMOT Team))

Running time:		0.1 s
Environment:		GPU @ >3.5 GHz (Python)

Method Description:

First, this paper proposes a new multi-object tracking framework based on multi-modal fusion. By integrating object detection and multi-object tracking into a single model, the framework avoids the complex data association step of the classical tracking-by-detection (TBD) paradigm and requires no additional training. Second, the confidence of historical trajectory regression is explored: the possible states of a trajectory in the current frame (weak object or strong object) are analyzed, and a confidence fusion module is designed to guide non-maximum suppression of trajectories and detections for ordered association. Finally, extensive experiments are conducted on the KITTI and Waymo datasets. The results show that the proposed method achieves robust tracking using only two modal detectors and is more accurate than many recent TBD-based multi-modal tracking methods. The source code is available at https://github.com/wangxiyang2022/YONTD-
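
The idea of running non-maximum suppression over trajectories and detections together can be illustrated with a minimal sketch. This is not the authors' implementation: the Candidate class, the joint_nms function, the 2D axis-aligned boxes, and the 0.5 IoU threshold are all hypothetical, and the fused confidence is assumed to be already computed. It only shows how a single score-ordered pass over pooled boxes lets a surviving trajectory box keep its ID while an unmatched detection survives as a new object.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical simplification: 2D axis-aligned boxes (x1, y1, x2, y2).
Box = Tuple[float, float, float, float]


@dataclass
class Candidate:
    box: Box
    score: float             # fused confidence, assumed to be given
    track_id: Optional[int]  # None marks a fresh detection


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def joint_nms(candidates: List[Candidate], iou_thresh: float = 0.5) -> List[Candidate]:
    """Greedy NMS over pooled trajectory predictions and detections."""
    kept: List[Candidate] = []
    # Visit candidates from highest to lowest confidence.
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        # Keep a candidate only if it does not overlap an already kept box.
        if all(iou(cand.box, k.box) < iou_thresh for k in kept):
            kept.append(cand)
    return kept


kept = joint_nms([
    Candidate((0, 0, 10, 10), 0.9, track_id=7),       # box regressed from track 7
    Candidate((1, 0, 11, 10), 0.8, track_id=None),    # detection of the same object
    Candidate((50, 50, 60, 60), 0.7, track_id=None),  # detection of an unseen object
])
for c in kept:
    print(c.track_id, c.box)
```

In this toy run the overlapping detection is suppressed by the track 7 box, and the distant detection survives with no ID, which a tracker could treat as a new trajectory. How to handle a detection that outscores a trajectory box is left open here.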

Parameters:

TBD

Latex Bibtex:

@article{wang2024multi,
  title={A Multi-Modal Fusion-Based 3D Multi-Object Tracking Framework with Joint Detection},
  author={Wang, Xiyang and Fu, Chunyun and He, Jiawei and Huang, Mingguang and Meng, Ting and Zhang, Siyu and Zhou, Hangning and Xu, Ziyao and Zhang, Chi},
  journal={IEEE Robotics and Automation Letters},
  year={2024},
  publisher={IEEE}
}

Detailed Results

From all 29 test sequences, our benchmark computes the commonly used tracking metrics: CLEAR MOT, MT/PT/ML, identity switches, and fragmentations [1,2]. The tables below show all of these metrics.

Benchmark	MOTA	MOTP	MODA	MODP
CAR	88.17 %	86.27 %	88.25 %	88.86 %

Benchmark	recall	precision	F1	TP	FP	FN	FAR	#objects	#trajectories
CAR	94.01 %	95.42 %	94.71 %	36168	1737	2303	15.61 %	44560	1126
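
The recall, precision, and F1 values in the CAR row can be reproduced from its TP, FP, and FN counts with the standard definitions. The short check below is a sketch under that assumption; the variable names are illustrative, and it does not attempt to reproduce MOTA or the other metrics, which depend on counting rules not stated on this page.

```python
# Counts copied from the CAR row of the table.
tp, fp, fn = 36168, 1737, 2303

recall = tp / (tp + fn)        # fraction of ground-truth objects that were found
precision = tp / (tp + fp)     # fraction of reported objects that were correct
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(f"recall    {100 * recall:.2f} %")     # 94.01 %
print(f"precision {100 * precision:.2f} %")  # 95.42 %
print(f"F1        {100 * f1:.2f} %")         # 94.71 %
```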

Benchmark	MT	PT	ML	IDS	FRAG
CAR	80.31 %	17.08 %	2.62 %	30	327

[1] K. Bernardin, R. Stiefelhagen: Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics. JIVP 2008.
[2] Y. Li, C. Huang, R. Nevatia: Learning to associate: HybridBoosted multi-target tracker for crowded scene. CVPR 2009.

A project of Karlsruhe Institute of Technology and Toyota Technological Institute at Chicago
