KITTI-360

Method

Learning-based Encoder-Decoder Architecture [EncDec]


Submitted on 5 Jan. 2022 15:17 by Yiyi Liao (MPI)

Running time:
Environment: NVIDIA V100

Method Description:
A baseline for semantic scene completion. The encoder first learns point-wise features from the input point cloud and then aggregates them into voxels, so that a 3D U-Net can predict a volumetric semantic reconstruction. The network is trained with a cross-entropy loss against the ground-truth point cloud, which is likewise discretized into a volume. The output point cloud is sampled uniformly and densely from each occupied voxel.
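As an illustration of this description, a minimal PyTorch sketch follows; all module names, the grid size, and the voxelization scheme are assumptions for illustration, not the authors' implementation.

    # Minimal sketch of the described pipeline in PyTorch. All module names,
    # grid sizes and the voxelization scheme are illustrative assumptions,
    # not the authors' implementation.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class PointEncoder(nn.Module):
        """Per-point MLP that lifts (x, y, z) coordinates to a feature vector."""
        def __init__(self, dim=32):
            super().__init__()
            self.mlp = nn.Sequential(nn.Linear(3, dim), nn.ReLU(),
                                     nn.Linear(dim, dim), nn.ReLU())

        def forward(self, pts):              # pts: (N, 3)
            return self.mlp(pts)             # (N, dim)

    def points_to_voxels(pts, feats, grid=(32, 32, 32), extent=51.2):
        """Average point-wise features that fall into the same voxel."""
        gx, gy, gz = grid
        g = torch.tensor(grid)
        # Map coordinates in [-extent/2, extent/2] to integer voxel indices.
        idx = ((pts / extent + 0.5) * g).long()
        idx = torch.minimum(idx.clamp(min=0), g - 1)
        flat = (idx[:, 0] * gy + idx[:, 1]) * gz + idx[:, 2]           # (N,)
        vol = torch.zeros(gx * gy * gz, feats.shape[1]).index_add_(0, flat, feats)
        cnt = torch.zeros(gx * gy * gz).index_add_(0, flat, torch.ones(len(pts)))
        vol = vol / cnt.clamp(min=1).unsqueeze(1)                      # mean pooling
        return vol.T.reshape(1, feats.shape[1], gx, gy, gz)

    class UNet3D(nn.Module):
        """Stand-in for the 3D U-Net; a real one adds down/upsampling and skips."""
        def __init__(self, dim=32, n_classes=19):
            super().__init__()
            self.net = nn.Sequential(nn.Conv3d(dim, dim, 3, padding=1), nn.ReLU(),
                                     nn.Conv3d(dim, n_classes, 1))

        def forward(self, x):
            return self.net(x)               # (1, n_classes, X, Y, Z) logits

    pts = torch.rand(1000, 3) * 51.2 - 25.6  # dummy input scan
    logits = UNet3D()(points_to_voxels(pts, PointEncoder()(pts)))
    # Ground truth is discretized into the same grid; train with cross-entropy.
    target = torch.randint(0, 19, (1, 32, 32, 32))
    loss = F.cross_entropy(logits, target)
    # At test time, points would be sampled densely from each occupied voxel.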
Parameters:
LaTeX BibTeX:
@inproceedings{Liao2021Arxiv,
  author = {Yiyi Liao and Jun Xie and Andreas Geiger},
  title = {KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D},
  booktitle = {ARXIV},
  year = {2021},
}

Detailed Results

This page provides detailed results for the method(s) selected. For the first 5 test point clouds, we display the original image, the color-coded result, and an error image. The error image uses four colors, weighted by the confidence of the pseudo-ground truth (a sketch of this coloring follows the list):
red: the pixel has the wrong label and the wrong category
yellow: the pixel has the wrong label but the correct category
green: the pixel has the correct label
black: the ground-truth label is not used for evaluation
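As an illustration, the coloring above could be computed as follows. This is a sketch under assumptions: the label-to-category mapping and all names are hypothetical, and the per-pixel confidence weighting of the pseudo-ground truth is omitted for brevity.

    # Hypothetical sketch of the four-color error coding; the label-to-category
    # mapping and all names are illustrative, not the benchmark's code, and the
    # confidence weighting of the pseudo-ground truth is omitted.
    import numpy as np

    COLORS = {
        "wrong_category": (255, 0, 0),    # red: wrong label and wrong category
        "wrong_label":    (255, 255, 0),  # yellow: wrong label, correct category
        "correct":        (0, 255, 0),    # green: correct label
        "ignored":        (0, 0, 0),      # black: not used for evaluation
    }

    def error_image(pred, gt, label_to_cat, ignore_label=255):
        """Color-code errors given predicted / ground-truth label maps (H, W).

        label_to_cat: 1-D int array mapping every label id (including
        ignore_label) to a category id.
        """
        out = np.empty(pred.shape + (3,), dtype=np.uint8)
        out[:] = COLORS["wrong_category"]                       # default case
        out[label_to_cat[pred] == label_to_cat[gt]] = COLORS["wrong_label"]
        out[pred == gt] = COLORS["correct"]                     # overrides yellow
        out[gt == ignore_label] = COLORS["ignored"]             # overrides all
        return out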

Test Set Average

Accuracy    Completeness    F1       mIoU (class)
41.36       41.23           41.29    9.07
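For reference, the reported F1 is the harmonic mean of Accuracy and Completeness (reading them as precision and recall, as is standard for reconstruction F-scores; this interpretation is an assumption, but the numbers are consistent with it):

    F_1 = \frac{2 \cdot \mathrm{Accuracy} \cdot \mathrm{Completeness}}{\mathrm{Accuracy} + \mathrm{Completeness}} = \frac{2 \cdot 41.36 \cdot 41.23}{41.36 + 41.23} \approx 41.29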

Test Images 0-4

[For each of the first five test examples: original image, color-coded result, and error image.]


