Method

SAM2-based Multi-object Tracking and Segmentation using Zero-shot Learning [Seg2Track-SAM2]
[Anonymous Submission]

Submitted on 9 Sep. 2025 18:15 by [Anonymous Submission]

Running time: 1 s
Environment: GPU @ 1.5 GHz (Python)

Method Description:
This method extends SAM2 to multi-object tracking and segmentation in a zero-shot setting. Objects are initialized from detector outputs and refined over time through object reinforcement, which keeps masks consistent across frames without additional training.
Parameters:
detection_threshold=0.5
removal_threshold=0.1
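
The description and parameters above suggest a simple track lifecycle, which can be sketched as follows. This is only an illustration under stated assumptions, not the submitted implementation: it assumes that `detection_threshold` gates which unmatched detections start new tracks and that `removal_threshold` drops tracks whose propagated mask confidence falls too low. The `Track` class, the `step` function, and the per-frame score inputs are hypothetical names, and no SAM2 or detector calls are made.

```python
from dataclasses import dataclass

DETECTION_THRESHOLD = 0.5  # value from the Parameters list
REMOVAL_THRESHOLD = 0.1    # value from the Parameters list


@dataclass
class Track:
    track_id: int
    score: float  # hypothetical confidence of the propagated mask


def step(tracks, propagated_scores, unmatched_detection_scores, next_id):
    """Advance the track set by one frame (assumed lifecycle, not the authors' code).

    tracks: dict mapping track_id -> Track
    propagated_scores: dict mapping track_id -> mask confidence in this frame
    unmatched_detection_scores: detector scores not matched to any existing track
    next_id: first unused track id
    """
    kept = {}
    for track_id in tracks:
        score = propagated_scores.get(track_id, 0.0)
        # Assumption: a track is removed once its mask confidence
        # drops below the removal threshold.
        if score >= REMOVAL_THRESHOLD:
            kept[track_id] = Track(track_id, score)
    for det_score in unmatched_detection_scores:
        # Assumption: only sufficiently confident detections start new tracks.
        if det_score >= DETECTION_THRESHOLD:
            kept[next_id] = Track(next_id, det_score)
            next_id += 1
    return kept, next_id


# Frame 1: two unmatched detections, only the confident one starts a track.
tracks, next_id = step({}, {}, [0.9, 0.3], next_id=0)
# Frame 2: track 0 fades below the removal threshold; a new object appears.
tracks, next_id = step(tracks, {0: 0.05}, [0.7], next_id)
print(sorted(tracks), next_id)
```

In this sketch a removed track id is never reused, so a reappearing object would receive a new id; whether the actual method behaves this way is not stated in the description.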

Detailed Results

Across all 29 test sequences, our benchmark computes the commonly used tracking metrics: CLEAR MOT, MT/PT/ML (mostly tracked, partially tracked, mostly lost), identity switches, and fragmentations [1,2]. The tables below report all of these metrics.


Benchmark    MOTA      MOTP      MODA      MODP
CAR          61.52 %   76.66 %   62.23 %   81.01 %
PEDESTRIAN   37.48 %   69.41 %   39.10 %   90.45 %

Benchmark    recall    precision   F1        TP      FP     FN     FAR       #objects   #trajectories
CAR          79.62 %   85.75 %     82.58 %   30780   5113   7877   45.96 %   47510      989
PEDESTRIAN   65.80 %   71.60 %     68.58 %   15385   6103   7995   54.86 %   27345      417

Benchmark    MT        PT        ML        IDS   FRAG
CAR          59.54 %   34.46 %   6.00 %    244   849
PEDESTRIAN   36.08 %   45.02 %   18.90 %   376   1271
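
The recall, precision, and F1 columns above follow from the TP, FP, and FN counts under the standard definitions. The short sketch below recomputes them as a consistency check. The function name `detection_metrics` is illustrative and not part of the benchmark's evaluation code, which may apply additional rules that this sketch does not model.

```python
def detection_metrics(tp, fp, fn):
    """Return (recall, precision, F1) from raw detection counts."""
    recall = tp / (tp + fn)        # share of ground-truth objects that were found
    precision = tp / (tp + fp)     # share of reported objects that were correct
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return recall, precision, f1


# TP, FP, FN counts copied from the table above.
results = {
    "CAR": detection_metrics(tp=30780, fp=5113, fn=7877),
    "PEDESTRIAN": detection_metrics(tp=15385, fp=6103, fn=7995),
}

for name, (recall, precision, f1) in results.items():
    print(f"{name}: recall {100 * recall:.2f} %, "
          f"precision {100 * precision:.2f} %, F1 {100 * f1:.2f} %")
# CAR: recall 79.62 %, precision 85.75 %, F1 82.58 %
# PEDESTRIAN: recall 65.80 %, precision 71.60 %, F1 68.58 %
```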


[1] K. Bernardin, R. Stiefelhagen: Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics. JIVP 2008.
[2] Y. Li, C. Huang, R. Nevatia: Learning to associate: HybridBoosted multi-target tracker for crowded scene. CVPR 2009.

