Method

MonoUNI: A Unified Vehicle and Infrastructure-side Monocular 3D Object Detection Network [MonoUNI]

Submitted on 23 Apr. 2023 11:55 by
Z. Chen Cu (None)

Running time: 0.04 s
Environment: 1 core @ 2.5 GHz (Python)

Method Description:
Vehicle-side and infrastructure-side monocular 3D object detection are two important topics in autonomous driving. Because sensor installations and focal lengths differ between the two settings, algorithms have typically been built on different prior knowledge for each side. In this paper, by taking the diversity of pitch angles and focal lengths into account, we propose a unified optimization target, normalized depth, which unifies the 3D detection problem across both sides. Furthermore, to improve monocular 3D detection accuracy, we develop the 3D normalized cube depth of an obstacle to promote the learning of depth information. We posit that the richness of depth clues is a pivotal factor in detection performance on both the vehicle and infrastructure sides: richer depth clues help the model learn better spatial knowledge, and the 3D normalized cube depth supplies sufficient depth clues.
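
To make the unified target concrete, the sketch below illustrates focal-length-based depth normalization under the standard pinhole relation h = f * H / d: rescaling metric depth to a reference focal length removes the camera-dependent scale, so a single regression head can serve both vehicle-side and infrastructure-side images. This is only an illustrative sketch, not the paper's exact formulation (the full normalization also accounts for pitch angle), and the reference focal length F_REF below is a hypothetical value.

# Minimal sketch of focal-length depth normalization (illustrative only).
# Pinhole assumption: an object of physical height H at depth d projects to
# h = f * H / d pixels, so for a fixed apparent size the depth scales with f.
# Normalizing to a reference focal length makes the regression target
# comparable across cameras with different focal lengths.

F_REF = 707.0  # hypothetical reference focal length in pixels (KITTI-like)

def normalize_depth(depth_m, focal_px, f_ref=F_REF):
    # Map metric depth to the normalized training target.
    return depth_m * f_ref / focal_px

def denormalize_depth(norm_depth, focal_px, f_ref=F_REF):
    # Recover metric depth from the network's normalized prediction.
    return norm_depth * focal_px / f_ref

if __name__ == "__main__":
    # The same object at 40 m looks very different through a 707 px camera
    # and a 2000 px roadside camera; the normalized targets reflect that.
    print(normalize_depth(40.0, 707.0))   # 40.0
    print(normalize_depth(40.0, 2000.0))  # ~14.1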
Parameters:
None
Latex Bibtex:
@InProceedings{MonoUNI,
  author    = {Jia, Jinrang and Li, Zhenjia and Shi, Yifeng},
  title     = {MonoUNI: A Unified Vehicle and Infrastructure-side Monocular 3D Object Detection Network with Sufficient Depth Clues},
  booktitle = {Thirty-seventh Conference on Neural Information Processing Systems},
  month     = {October},
  year      = {2023},
}

Detailed Results

Object detection and orientation estimation results. Results for object detection are given in terms of average precision (AP) and results for joint object detection and orientation estimation are provided in terms of average orientation similarity (AOS).
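
For reference, the snippet below sketches how the orientation rows relate to the detection rows: AOS replaces the per-detection precision contribution of AP with an orientation-similarity term s = (1 + cos(delta_theta)) / 2 for matched detections (false positives contribute 0), then averages the interpolated values over sampled recall levels. This is a schematic illustration of the standard KITTI definition, not the benchmark's evaluation code.

import math

def orientation_similarity(delta_theta, matched):
    # KITTI-style per-detection orientation similarity in [0, 1]:
    # delta_theta is the difference between predicted and ground-truth
    # observation angles (radians); unmatched detections (false positives)
    # contribute 0.
    return (1.0 + math.cos(delta_theta)) / 2.0 if matched else 0.0

def average_orientation_similarity(sim_at_recall):
    # Schematic AOS: for each sampled recall level r, take the best mean
    # similarity achievable at recall >= r (mirroring interpolated AP),
    # then average over the recall levels.
    n = len(sim_at_recall)
    return sum(max(sim_at_recall[i:]) for i in range(n)) / n

# A detector whose matched boxes have small heading errors keeps AOS close to
# its AP, as in the Car rows below (94.30 % detection AP vs 94.10 % AOS).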


Benchmark                      Easy      Moderate  Hard
Car (Detection)                94.30 %   88.96 %   78.95 %
Car (Orientation)              94.10 %   88.50 %   78.35 %
Car (3D Detection)             24.75 %   16.73 %   13.49 %
Car (Bird's Eye View)          33.28 %   23.05 %   19.39 %
Pedestrian (Detection)         76.17 %   58.97 %   53.99 %
Pedestrian (Orientation)       69.15 %   52.62 %   47.89 %
Pedestrian (3D Detection)      15.78 %   10.34 %   8.74 %
Pedestrian (Bird's Eye View)   16.54 %   10.90 %   9.17 %
Cyclist (Detection)            71.68 %   53.71 %   45.26 %
Cyclist (Orientation)          62.21 %   45.21 %   38.28 %
Cyclist (3D Detection)         7.34 %    4.28 %    3.78 %
Cyclist (Bird's Eye View)      8.25 %    5.03 %    4.50 %


Figures: 2D object detection results, orientation estimation results, 3D object detection results, and bird's eye view results (one set of precision-recall curves per object class).



