Method

Attention Stereo based 3D Object Detection [ASOD]


Submitted on 11 Dec. 2019 13:11 by
Xianshun Wang (SIMIT)

Running time: 0.28 s
Environment: GPU @ 2.5 GHz (Python)

Method Description:
Our method, called ASOD, extends Faster R-CNN to stereo inputs to detect objects in images. In particular, we first employ an attention module in the feature-extraction convolutional neural network (CNN), which exploits key information to jointly regress 2D proposals and object classes. We then add an extra branch after the Region Proposal Network (RPN) that uses the depth map and the 2D proposals to recover 3D proposals. More specifically, we propose an attention-based data fusion scheme that employs high-level features from the RGB data to guide the expression of point cloud features; the attention weights are used to further refine the structural features of the 3D shape.
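
As a rough illustration of the attention-based fusion idea described above, the following minimal sketch gates each point cloud feature vector with a scalar weight computed from the RGB feature vector at the same location. It is not the authors' architecture: the function name attention_fuse, the sigmoid gate, and the projection parameters w and b are all assumptions made for illustration only.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real score to a weight in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def attention_fuse(rgb_feats, point_feats, w, b=0.0):
    """Scale point cloud features by attention weights derived from RGB features.

    rgb_feats, point_feats: lists of equal length; each entry is a list of floats.
    w, b: hypothetical learned projection (vector and bias) that turns an RGB
          feature vector into a scalar attention score. Not from the source.
    Returns (fused_features, attention_weights).
    """
    if len(rgb_feats) != len(point_feats):
        raise ValueError("rgb_feats and point_feats must have the same length")
    fused, weights = [], []
    for rgb, pts in zip(rgb_feats, point_feats):
        score = sum(wi * ri for wi, ri in zip(w, rgb)) + b  # linear projection
        a = sigmoid(score)                                   # attention weight
        weights.append(a)
        fused.append([a * p for p in pts])                   # gated point feature
    return fused, weights


# Toy usage: the first location gets a high weight, the second a low one.
rgb = [[1.0, 0.0], [0.0, 1.0]]
pts = [[2.0, 4.0], [2.0, 4.0]]
fused, weights = attention_fuse(rgb, pts, w=[3.0, -3.0])
```

In a real network the gate would typically be a learned layer operating on feature maps, but the same principle applies: RGB features decide how strongly each structural feature is expressed.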
Parameters:
During training, we keep 1 blob of data and 128 sampled RoIs in each mini-batch. We train the network using SGD with a weight decay of 0.0005 and a momentum of 0.9. The learning rate is initially set to 0.001 and is reduced by a factor of 0.1 every 5 epochs. We train for 20 epochs, which takes 2 days in total.
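
The learning-rate schedule above can be sketched as a simple step decay. This assumes "reduced by 0.1" means the rate is multiplied by 0.1 at each 5-epoch boundary; the function name step_lr and the SGD_CONFIG dictionary are hypothetical and only restate the listed hyperparameters.

```python
# Hyperparameters as listed in the Parameters field (dictionary name is hypothetical).
SGD_CONFIG = {"momentum": 0.9, "weight_decay": 0.0005, "rois_per_batch": 128}


def step_lr(epoch, base_lr=0.001, gamma=0.1, step=5):
    """Learning rate for a 0-indexed epoch under step decay.

    Assumes the rate is multiplied by gamma once every `step` epochs.
    """
    return base_lr * gamma ** (epoch // step)


# Learning rate for each of the 20 training epochs.
schedule = [step_lr(epoch) for epoch in range(20)]
```

Under this reading, epochs 0 to 4 use 0.001, epochs 5 to 9 use 0.0001, and so on, with each block of 5 epochs one order of magnitude lower than the previous one.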

Detailed Results

Object detection and orientation estimation results. Results for object detection are given in terms of average precision (AP) and results for joint object detection and orientation estimation are provided in terms of average orientation similarity (AOS).
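
For intuition about the orientation metric, the sketch below computes the normalized cosine similarity term that, as far as the KITTI benchmark definition goes, underlies AOS at a single recall level. The averaging over recall levels that produces the final AOS value is not shown, and the function name and argument layout are assumptions for illustration.

```python
import math


def orientation_similarity(angle_errors, is_true_positive):
    """Mean orientation similarity over a set of detections.

    angle_errors: difference in radians between estimated and ground-truth
                  orientation for each detection.
    is_true_positive: flags marking detections matched to a ground-truth box;
                      unmatched detections contribute zero.
    Each matched detection contributes (1 + cos(delta)) / 2, which is 1 for a
    perfect estimate and 0 for an estimate that is off by 180 degrees.
    """
    if not angle_errors:
        return 0.0
    total = 0.0
    for delta, matched in zip(angle_errors, is_true_positive):
        if matched:
            total += (1.0 + math.cos(delta)) / 2.0
    return total / len(angle_errors)
```

Because unmatched detections add nothing to the sum but still count in the denominator, this term can never exceed the detection precision at the same recall level, which is consistent with the orientation scores in the table being slightly below the detection scores.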


Benchmark               Easy      Moderate  Hard
Car (Detection)         94.09 %   83.52 %   68.68 %
Car (Orientation)       93.56 %   82.13 %   67.32 %
Car (3D Detection)      38.42 %   22.37 %   17.01 %
Car (Bird's Eye View)   54.61 %   33.63 %   26.76 %


2D object detection results.

Orientation estimation results.

3D object detection results.

Bird's eye view results.