Method

All-in-One: Transferring Vision Foundation Models into Stereo Matching [AIO-Stereo]


Submitted on 11 Mar. 2025 12:09 by
Haoyu Zhang (Fudan University)

Running time: 0.23 s
Environment: GPU @ 2.5 GHz (Python)

Method Description:
As a fundamental vision task, stereo matching has
made remarkable progress. While recent iterative
optimization-based methods have achieved promising
performance, their feature extraction capabilities
still have room for improvement. Inspired by the
ability of vision foundation models (VFMs) to
extract general representations, in this work,
we propose AIO-Stereo, which can flexibly select
and transfer knowledge from multiple heterogeneous
VFMs to a single stereo matching model. To better
reconcile features between heterogeneous VFMs and
the stereo matching model and fully exploit prior
knowledge from VFMs, we propose a dual-level
feature utilization mechanism that aligns
heterogeneous features and transfers multi-level
knowledge. Based on the mechanism, a dual-level
selective knowledge transfer module is designed to
selectively transfer knowledge and integrate the
advantages of multiple VFMs. Experimental results
show that AIO-Stereo achieves state-of-the-art
performance.
Parameters:
lr=1e-4
Latex Bibtex:
@article{zhou2024all,
title={All-in-One: Transferring Vision
Foundation Models into Stereo Matching},
author={Zhou, Jingyi and Zhang, Haoyu and Yuan,
Jiakang and Ye, Peng and Chen, Tao and Jiang, Hao
and Chen, Meiya and Zhang, Yangyang},
journal={arXiv preprint arXiv:2412.09912},
year={2024}
}

Detailed Results

This page provides detailed results for the method(s) selected. For the first 20 test images, the percentage of erroneous pixels is depicted in the table. We use the error metric described in Object Scene Flow for Autonomous Vehicles (CVPR 2015), which considers a pixel to be correctly estimated if the disparity or flow end-point error is <3px or <5% (for scene flow this criterion needs to be fulfilled for both disparity maps and the flow map). Underneath, the left input image, the estimated results and the error maps are shown (for disp_0/disp_1/flow/scene_flow, respectively). The error map uses the log-color scale described in Object Scene Flow for Autonomous Vehicles (CVPR 2015), depicting correct estimates (<3px or <5% error) in blue and wrong estimates in red color tones. Dark regions in the error images denote the occluded pixels which fall outside the image boundaries. The false color maps of the results are scaled to the largest ground truth disparity values / flow magnitudes.
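
The correctness criterion above can be sketched in a few lines of NumPy. This is a minimal illustration, not the benchmark's evaluation code: the function name `d1_error`, the convention that a ground-truth value of 0 marks a pixel without ground truth, and the toy arrays are all assumptions made for the example. It applies the rule exactly as stated in the text, counting a pixel as correct when its disparity error is below 3 px or below 5% of the ground-truth disparity.

```python
import numpy as np

def d1_error(disp_est, disp_gt, abs_thresh=3.0, rel_thresh=0.05):
    """Percentage of valid pixels whose disparity estimate counts as wrong.

    A pixel is correct if its absolute error is < abs_thresh pixels
    or < rel_thresh times the ground-truth disparity.
    """
    disp_est = np.asarray(disp_est, dtype=np.float64)
    disp_gt = np.asarray(disp_gt, dtype=np.float64)

    # Assumption: a ground-truth disparity of 0 means "no ground truth here".
    valid = disp_gt > 0
    if not valid.any():
        return float("nan")

    err = np.abs(disp_est[valid] - disp_gt[valid])
    correct = (err < abs_thresh) | (err < rel_thresh * disp_gt[valid])
    return 100.0 * np.mean(~correct)

# Toy data (hypothetical values, not taken from the benchmark).
gt = np.array([[10.0, 100.0, 0.0], [50.0, 20.0, 80.0]])
est = np.array([[12.0, 104.0, 7.0], [54.0, 20.5, 70.0]])

error_pct = d1_error(est, gt)
print(f"D1 error: {error_pct:.2f}%")  # 2 of the 5 valid pixels are wrong
```

Restricting the `valid` mask further, for example to background, foreground, or non-occluded pixels, would give quantities analogous to the D1-bg, D1-fg, and Noc rows in the tables below, assuming the corresponding masks are available.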

Test Set Average

Error D1-bg D1-fg D1-all
All / All 1.34 2.57 1.54
All / Est 1.34 2.57 1.54
Noc / All 1.22 2.51 1.43
Noc / Est 1.22 2.51 1.43

Test Image 0

Error D1-bg D1-fg D1-all
All / All 1.51 1.29 1.48
All / Est 1.51 1.29 1.48
Noc / All 1.52 1.29 1.49
Noc / Est 1.52 1.29 1.49

Input Image

D1 Result

D1 Error


Test Image 1

Error D1-bg D1-fg D1-all
All / All 1.36 4.03 1.66
All / Est 1.36 4.03 1.66
Noc / All 1.29 4.03 1.60
Noc / Est 1.29 4.03 1.60

Input Image

D1 Result

D1 Error


Test Image 2

Error D1-bg D1-fg D1-all
All / All 1.86 6.48 2.09
All / Est 1.86 6.48 2.09
Noc / All 1.81 6.48 2.04
Noc / Est 1.81 6.48 2.04

Input Image

D1 Result

D1 Error


Test Image 3

Error D1-bg D1-fg D1-all
All / All 1.56 3.04 1.70
All / Est 1.56 3.04 1.70
Noc / All 1.53 3.04 1.68
Noc / Est 1.53 3.04 1.68

Input Image

D1 Result

D1 Error


Test Image 4

Error D1-bg D1-fg D1-all
All / All 0.50 0.84 0.56
All / Est 0.50 0.84 0.56
Noc / All 0.49 0.84 0.55
Noc / Est 0.49 0.84 0.55

Input Image

D1 Result

D1 Error


Test Image 5

Error D1-bg D1-fg D1-all
All / All 1.93 2.27 1.96
All / Est 1.93 2.27 1.96
Noc / All 1.86 2.27 1.89
Noc / Est 1.86 2.27 1.89

Input Image

D1 Result

D1 Error


Test Image 6

Error D1-bg D1-fg D1-all
All / All 2.49 1.53 2.39
All / Est 2.49 1.53 2.39
Noc / All 2.54 1.53 2.43
Noc / Est 2.54 1.53 2.43

Input Image

D1 Result

D1 Error


Test Image 7

Error D1-bg D1-fg D1-all
All / All 0.29 3.33 0.89
All / Est 0.29 3.33 0.89
Noc / All 0.29 3.33 0.90
Noc / Est 0.29 3.33 0.90

Input Image

D1 Result

D1 Error


Test Image 8

Error D1-bg D1-fg D1-all
All / All 0.25 2.83 0.73
All / Est 0.25 2.83 0.73
Noc / All 0.24 2.83 0.72
Noc / Est 0.24 2.83 0.72

Input Image

D1 Result

D1 Error


Test Image 9

Error D1-bg D1-fg D1-all
All / All 0.28 1.77 0.66
All / Est 0.28 1.77 0.66
Noc / All 0.28 1.86 0.67
Noc / Est 0.28 1.86 0.67

Input Image

D1 Result

D1 Error


Test Image 10

Error D1-bg D1-fg D1-all
All / All 1.09 2.76 1.47
All / Est 1.09 2.76 1.47
Noc / All 1.10 2.76 1.49
Noc / Est 1.10 2.76 1.49

Input Image

D1 Result

D1 Error


Test Image 11

Error D1-bg D1-fg D1-all
All / All 0.83 0.63 0.79
All / Est 0.83 0.63 0.79
Noc / All 0.83 0.63 0.80
Noc / Est 0.83 0.63 0.80

Input Image

D1 Result

D1 Error


Test Image 12

Error D1-bg D1-fg D1-all
All / All 0.66 1.04 0.68
All / Est 0.66 1.04 0.68
Noc / All 0.51 1.04 0.55
Noc / Est 0.51 1.04 0.55

Input Image

D1 Result

D1 Error


Test Image 13

Error D1-bg D1-fg D1-all
All / All 0.51 0.14 0.46
All / Est 0.51 0.14 0.46
Noc / All 0.50 0.14 0.45
Noc / Est 0.50 0.14 0.45

Input Image

D1 Result

D1 Error


Test Image 14

Error D1-bg D1-fg D1-all
All / All 1.48 0.00 1.46
All / Est 1.48 0.00 1.46
Noc / All 1.25 0.00 1.23
Noc / Est 1.25 0.00 1.23

Input Image

D1 Result

D1 Error


Test Image 15

Error D1-bg D1-fg D1-all
All / All 2.36 0.20 2.17
All / Est 2.36 0.20 2.17
Noc / All 2.41 0.20 2.21
Noc / Est 2.41 0.20 2.21

Input Image

D1 Result

D1 Error


Test Image 16

Error D1-bg D1-fg D1-all
All / All 3.33 0.17 2.87
All / Est 3.33 0.17 2.87
Noc / All 3.14 0.17 2.70
Noc / Est 3.14 0.17 2.70

Input Image

D1 Result

D1 Error


Test Image 17

Error D1-bg D1-fg D1-all
All / All 0.84 0.20 0.77
All / Est 0.84 0.20 0.77
Noc / All 0.83 0.20 0.76
Noc / Est 0.83 0.20 0.76

Input Image

D1 Result

D1 Error


Test Image 18

Error D1-bg D1-fg D1-all
All / All 4.57 1.09 2.92
All / Est 4.57 1.09 2.92
Noc / All 4.49 1.09 2.86
Noc / Est 4.49 1.09 2.86

Input Image

D1 Result

D1 Error


Test Image 19

Error D1-bg D1-fg D1-all
All / All 0.76 0.17 0.69
All / Est 0.76 0.17 0.69
Noc / All 0.76 0.17 0.70
Noc / Est 0.76 0.17 0.70

Input Image

D1 Result

D1 Error



