Method

mmFUSION: Multimodal Fusion for 3D Objects Detection [mmFUSION]
https://javi-99.github.io/

Submitted on 4 Sep. 2023 11:12 by
Javed Ahmad (Italian Institute of Technology)

Running time: 1 s
Environment: 1 core @ 2.5 GHz (Python)

Method Description:
We introduce mmFUSION, a novel multi-modal fusion scheme for 3D object detection. The scheme transforms modality-specific information into lower-volume 3D representations through dedicated encoders, then fuses them using cross-modality and multi-modality attention modules. Existing early and late fusion schemes often fail to align diverse modalities, which leads to semantic ambiguities; our intermediate-level transformation instead preserves the geometric and semantic information of 3D objects.
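The idea of letting features from one modality attend to features from another can be sketched with generic scaled dot-product cross-attention. This is not the authors' module: the function names, the toy feature vectors, and the labels `lidar_feats` and `image_feats` are all assumptions made for illustration, and learned projections, multiple heads, and the 3D encoders are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (illustrative only).

    queries: feature vectors from one modality.
    keys, values: feature vectors from the other modality.
    Returns one fused vector per query, each a weighted mix of `values`.
    """
    d = len(queries[0])
    dim_v = len(values[0])
    fused = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        fused.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim_v)]
        )
    return fused


# Hypothetical toy features: 2 cells from one modality attend to 3 from another.
lidar_feats = [[1.0, 0.0], [0.0, 1.0]]
image_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(lidar_feats, image_feats, image_feats)
```

Each output vector is a convex combination of the value vectors, so a query is pulled toward the features of the other modality that it matches most closely.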
Parameters:
TBD
LaTeX BibTeX:
@article{ahmad2023mmfusion,
  title={mmFUSION: Multimodal Fusion for 3D Objects Detection},
  author={Ahmad, Javed and Del Bue, Alessio},
  journal={arXiv preprint arXiv:2311.04058},
  year={2023}
}

Detailed Results

Object detection and orientation estimation results. Results for object detection are given in terms of average precision (AP) and results for joint object detection and orientation estimation are provided in terms of average orientation similarity (AOS).
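As a rough illustration of the two metrics, the sketch below computes an interpolated average over equally spaced recall thresholds (the usual form of AP) and a cosine-based orientation similarity, where each detection contributes (1 + cos Δθ) / 2 and false positives contribute 0. The function names, the default of 11 recall thresholds, and the toy inputs are assumptions; the benchmark's official evaluation code may sample recall differently and should be treated as authoritative.

```python
import math


def interpolated_average(recalls, values, num_points=11):
    """Average over recall thresholds of the best value at recall >= threshold.

    With precision as `values` this gives interpolated AP; with orientation
    similarity as `values` it gives an AOS-style score.
    `num_points` is an assumed default, not taken from the benchmark.
    """
    thresholds = [i / (num_points - 1) for i in range(num_points)]
    total = 0.0
    for r in thresholds:
        candidates = [v for rec, v in zip(recalls, values) if rec >= r]
        total += max(candidates) if candidates else 0.0
    return total / num_points


def orientation_similarity(angle_errors, is_true_positive):
    """Mean of (1 + cos(delta)) / 2 over detections; false positives count as 0."""
    if not angle_errors:
        return 0.0
    terms = [
        (1.0 + math.cos(delta)) / 2.0 if tp else 0.0
        for delta, tp in zip(angle_errors, is_true_positive)
    ]
    return sum(terms) / len(terms)


# Hypothetical two-point precision/recall curve.
ap = interpolated_average([0.5, 1.0], [1.0, 0.5])

# One perfectly oriented detection and one that is off by 180 degrees.
sim = orientation_similarity([0.0, math.pi], [True, True])
```

Because the similarity term never exceeds 1 and false positives score 0, the orientation score at a given recall cannot exceed the precision there, which matches the table: each AOS value is slightly below the corresponding detection AP.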


Benchmark               Easy      Moderate  Hard
Car (Detection)         95.69 %   91.84 %   87.05 %
Car (Orientation)       95.47 %   91.30 %   86.33 %
Car (3D Detection)      85.24 %   74.38 %   69.43 %
Car (Bird's Eye View)   90.35 %   84.60 %   79.82 %


Figure: 2D object detection results.

Figure: Orientation estimation results.

Figure: 3D object detection results.

Figure: Bird's eye view results.



