IronDepth: Iterative Refinement of Single-View Depth using Surface Normal and its Uncertainty

BMVC 2022

Gwangbin Bae, Ignas Budvytis, Roberto Cipolla

TL;DR

  • We use surface normals to propagate depth between pixels.
  • We formulate depth refinement and upsampling as a classification problem: choosing which neighboring pixel to propagate from.

Demo

Motivation

In our previous work, we estimated the aleatoric uncertainty in surface normal estimation, and used it to improve the quality of prediction for small structures and object boundaries. We believe that the estimated surface normal (and its uncertainty) can be useful in various computer vision tasks. In our recent work, we showed that it can be used to align CAD models to the objects in the image. In this work, we show that it can improve monocular depth estimation.

Method

The depth of each pixel can be propagated to a query pixel, using the predicted surface normal as guidance. We thus formulate depth refinement as a classification problem: choosing which neighboring pixel to propagate from.
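One way to read the propagation step geometrically: the source pixel, its depth and its normal define a tangent plane, and the propagated depth is where the query pixel's viewing ray meets that plane. The snippet below is a minimal sketch under that assumption, not the released implementation. The pinhole intrinsics `K` and the names `backproject` and `propagate_depth` are hypothetical and chosen for illustration only.

```python
import numpy as np

def backproject(pixel, K):
    """Return the viewing ray K^-1 [u, v, 1]^T (z-component equal to 1)."""
    u, v = pixel
    return np.linalg.inv(K) @ np.array([u, v, 1.0])

def propagate_depth(depth_i, normal_i, pixel_i, pixel_j, K):
    """Propagate depth from pixel i to pixel j along i's tangent plane.

    The plane passes through X_i = depth_i * ray_i with normal normal_i.
    Intersecting the ray of pixel j with that plane gives
        depth_j = depth_i * (n_i . ray_i) / (n_i . ray_j).
    The result is unstable when n_i . ray_j is close to zero, i.e. when
    the query ray is nearly parallel to the plane.
    """
    ray_i = backproject(pixel_i, K)
    ray_j = backproject(pixel_j, K)
    return depth_i * (normal_i @ ray_i) / (normal_i @ ray_j)

# Illustrative intrinsics (assumed values, not from the source).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

# On a fronto-parallel plane, every pixel has the same depth.
d_query = propagate_depth(2.0, np.array([0.0, 0.0, 1.0]), (100, 120), (101, 120), K)
```

If both pixels lie on the same planar surface and the normal is correct, the propagated depth is exact. This is why selecting a good source pixel can be treated as a classification over neighbors.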

To maintain computational efficiency, we perform the refinement at a coarse resolution (H/8 × W/8). We then upsample the refined low-resolution depth map, again using normal-guided propagation. The depth of each pixel in the upsampled depth map is given by the weighted sum of the depths propagated from its coarse-resolution neighbors.
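The weighted-sum upsampling for a single fine-resolution pixel can be sketched as follows. This is an illustrative sketch under stated assumptions: the candidates are obtained by intersecting the query ray with each coarse neighbor's tangent plane, and the weights come from a softmax over per-neighbor scores. The function name `upsample_pixel` and the `logits` input are hypothetical; how the scores are actually predicted is not shown here.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def upsample_pixel(rays_coarse, depths_coarse, normals_coarse, ray_query, logits):
    """Depth of one fine pixel as a weighted sum of propagated depths.

    rays_coarse:    (N, 3) viewing rays of the N coarse neighbors (z = 1)
    depths_coarse:  (N,)   refined coarse depths
    normals_coarse: (N, 3) surface normals at the coarse neighbors
    ray_query:      (3,)   viewing ray of the fine-resolution pixel
    logits:         (N,)   unnormalized neighbor scores (assumed input)
    """
    # n_i . X_i, where X_i = depth_i * ray_i is the 3D point of neighbor i.
    plane_offsets = np.sum(normals_coarse * rays_coarse, axis=1) * depths_coarse
    # Depth at which the query ray meets each neighbor's tangent plane.
    candidates = plane_offsets / (normals_coarse @ ray_query)
    weights = softmax(logits)
    return float(weights @ candidates)
```

When all neighbors lie on one plane, every candidate agrees and the weights have no effect. The weights matter near depth discontinuities, where only some neighbors belong to the same surface as the query pixel.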

Results

Compared to existing depth estimation methods, the improvement in depth accuracy is small. However, when surface normals are computed from the predicted depth map and their accuracy is measured, our method is significantly more accurate than the other methods.
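To make the second evaluation concrete, the sketch below shows one common way to derive surface normals from a depth map and score them by angular error. It is an assumption for illustration, not necessarily the protocol used for the reported numbers. It back-projects the depth map with hypothetical pinhole intrinsics `K`, takes the cross product of forward differences, and orients the normals toward the camera.

```python
import numpy as np

def normals_from_depth(depth, K):
    """Estimate normals for an (H, W) depth map; returns (H-1, W-1, 3)."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).astype(float)
    rays = pixels @ np.linalg.inv(K).T           # (H, W, 3), z = 1
    points = depth[..., None] * rays             # back-projected 3D points
    dx = points[:-1, 1:] - points[:-1, :-1]      # forward difference along u
    dy = points[1:, :-1] - points[:-1, :-1]      # forward difference along v
    n = np.cross(dx, dy)
    n /= np.linalg.norm(n, axis=-1, keepdims=True)
    # Flip normals so that they face the camera (n . ray < 0).
    facing_away = np.sum(n * rays[:-1, :-1], axis=-1, keepdims=True) > 0
    return np.where(facing_away, -n, n)

def angular_error_deg(n_pred, n_gt):
    """Per-pixel angle in degrees between two unit-normal maps."""
    cos = np.clip(np.sum(n_pred * n_gt, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))
```

Because the normals depend on depth differences between adjacent pixels, small local depth errors that barely change a depth metric can still produce large angular errors.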

IronDepth can also be used as a post-processing tool to improve the accuracy of the existing depth estimation methods.

Since we refine the predicted depth map by propagating information between pixels, our method applies seamlessly to scenarios where sparse depth measurements are available (i.e., the depth completion setup). Given sparse depth measurements, we add anchor points by fixing the depth of the pixels that have a measurement. The information provided at the anchor points (i.e., the measured depth) is then propagated to neighboring pixels, making the overall prediction more accurate.
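The anchoring idea can be sketched as a loop that overwrites the measured pixels after every refinement step, so that later steps propagate from the fixed values. This is a minimal sketch, assuming the refinement is available as a callable. The names `apply_anchors`, `refine_with_anchors` and `refine_step` are hypothetical, and the toy step in the usage example is a plain shift-and-average that stands in for the learned normal-guided propagation.

```python
import numpy as np

def apply_anchors(depth, sparse_depth, valid_mask):
    """Fix the depth at pixels that have a measurement."""
    return np.where(valid_mask, sparse_depth, depth)

def refine_with_anchors(depth_init, sparse_depth, valid_mask, refine_step, num_iters=3):
    """Alternate a refinement step with re-imposing the anchor depths.

    refine_step: callable mapping an (H, W) depth map to a refined one
                 (placeholder for the actual propagation step).
    """
    depth = apply_anchors(depth_init, sparse_depth, valid_mask)
    for _ in range(num_iters):
        depth = refine_step(depth)
        depth = apply_anchors(depth, sparse_depth, valid_mask)
    return depth

# Usage with a toy step: each pixel averages itself with its left neighbor.
toy_step = lambda d: 0.5 * (d + np.roll(d, 1, axis=1))
depth_init = np.full((4, 4), 2.0)
sparse = np.zeros((4, 4))
mask = np.zeros((4, 4), dtype=bool)
sparse[1, 1], mask[1, 1] = 3.0, True
refined = refine_with_anchors(depth_init, sparse, mask, toy_step)
```

In the usage example the anchored pixel keeps its measured value, while the pixel to its right moves toward it over the iterations.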

BibTeX

@inproceedings{bae2022irondepth,
    title={IronDepth: Iterative Refinement of Single-View Depth using Surface Normal and its Uncertainty},
    author={Bae, Gwangbin and Budvytis, Ignas and Cipolla, Roberto},
    booktitle={British Machine Vision Conference (BMVC)},
    year={2022}
}