Instance Segmentation Benchmark

This is the KITTI instance segmentation benchmark. It consists of 200 semantically annotated training images as well as 200 test images. The data format and metrics conform to the Cityscapes Dataset.

The data can be downloaded here:

The instance segmentation task focuses on detecting, segmenting and classifying object instances. To assess instance-level performance, we compute the average precision on the region level (AP) for each class and average it across a range of overlap thresholds to avoid a bias towards a specific value. As described in the Cityscapes Dataset, we use 10 different overlap thresholds ranging from 0.5 to 0.95 in steps of 0.05. The overlap is computed at the region level, making it equivalent to the IoU of a single instance. Multiple predictions of the same ground truth instance are penalized as false positives. To obtain a single, easy-to-compare compound score, we report the mean average precision AP, obtained by also averaging over the class label set. As a minor score, we add AP50% for an overlap threshold of 50%.

  • AP: Average precision as described above.
  • AP50%: Average precision with 50% overlap.
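The metric described above can be sketched in a few lines. This is a simplified illustration, not the official Cityscapes evaluation code (which integrates precision over the full recall range and handles ignore regions); the function names and the greedy one-to-one matching are our own assumptions, but the 10 overlap thresholds from 0.5 to 0.95 and the false-positive penalty for duplicate predictions follow the description above.

```python
import numpy as np

def mask_iou(pred, gt):
    """Region-level IoU between two binary instance masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union > 0 else 0.0

def ap_at_threshold(preds, gts, thr):
    """Greedily match predictions (assumed sorted by confidence) to
    ground truth; each GT may be matched once, so duplicate predictions
    of the same instance count as false positives."""
    matched = set()
    tp = 0
    for p in preds:
        best_iou, best_j = 0.0, None
        for j, g in enumerate(gts):
            if j in matched:
                continue
            iou = mask_iou(p, g)
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j is not None and best_iou >= thr:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp
    fn = len(gts) - tp
    # Simplified precision-style score; the official metric integrates
    # precision over the recall range instead.
    return tp / (tp + fp + fn) if (tp + fp + fn) else 1.0

def mean_ap(preds, gts):
    """Average over the 10 overlap thresholds 0.50, 0.55, ..., 0.95."""
    thresholds = np.linspace(0.5, 0.95, 10)
    return float(np.mean([ap_at_threshold(preds, gts, t) for t in thresholds]))
```

AP50% corresponds to calling `ap_at_threshold` with `thr=0.5` alone; the compound AP additionally averages the per-class scores over the label set.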

Additional information used by the methods
  • Laser Points: Method uses point clouds from Velodyne laser scanner
  • Depth: Method uses depth from stereo.
  • Video: Method uses 2 or more temporally adjacent images
  • Additional training data: Use of additional data sources for training (see details)

  Method          Setting  Code  AP    AP50%  Runtime  Environment
1 MaskRCNN_ROB                   8.75  22.35  1 s      1 core @ 2.5 GHz (C/C++)
2 Test sub 2                     5.73  12.02  1 s      GPU @ 2.5 GHz (Python)
3 Test_sub                       5.44  11.08  0.1 s    1 core @ 2.5 GHz (C/C++)

Related Datasets

  • The Cityscapes Dataset: The Cityscapes Dataset was recorded in 50 German cities and offers high-quality pixel-level annotations of 5 000 frames in addition to a larger set of 20 000 weakly annotated frames.
  • Wilddash: Wilddash is a benchmark for semantic and instance segmentation. It aims to improve the expressiveness of performance evaluation for computer vision algorithms with regard to their robustness under real-world conditions.


When using this dataset in your research, we will be happy if you cite us:
@inproceedings{Alhaija2017BMVC,
  author = {Hassan Abu Alhaija and Siva Karthik Mustikovela and Lars Mescheder and Andreas Geiger and Carsten Rother},
  title = {Augmented Reality Meets Deep Learning for Car Instance Segmentation in Urban Scenes},
  booktitle = {British Machine Vision Conference (BMVC)},
  year = {2017}
}
