The Edge of Depth: Explicit Constraints between Segmentation and Depth

In this work we study the mutual benefits of two common computer vision tasks: self-supervised depth estimation and semantic segmentation from images. For example, to help unsupervised monocular depth estimation, constraints from semantic segmentation have been explored implicitly, such as by sharing and transforming features. In contrast, we propose to explicitly measure the border consistency between segmentation and depth and to improve it in a greedy manner by iteratively supervising the network towards a locally optimal solution. This is partially motivated by our observation that semantic segmentation, even when trained with limited ground truth (200 images of KITTI), can offer more accurate borders than any (monocular or stereo) image-based depth estimation. Through extensive experiments, our proposed approach advances the state of the art in unsupervised monocular depth estimation on the KITTI dataset.
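
To make the idea of an explicit border measure concrete, here is a minimal sketch, not the authors' implementation. It assumes that depth and segmentation borders are found by thresholding neighbouring-pixel differences, and that consistency is scored as the mean distance from each depth border pixel to its nearest segmentation border pixel. The names edge_mask and border_distance, and the threshold values, are hypothetical.

```python
import numpy as np

def edge_mask(img, thresh):
    """Mark pixels whose right or bottom neighbour differs by more than thresh."""
    img = np.asarray(img, dtype=float)
    mask = np.zeros(img.shape, dtype=bool)
    mask[:, :-1] |= np.abs(img[:, 1:] - img[:, :-1]) > thresh
    mask[:-1, :] |= np.abs(img[1:, :] - img[:-1, :]) > thresh
    return mask

def border_distance(depth, seg, depth_thresh=1.0):
    """Mean pixel distance from depth border points to the nearest segmentation border.

    Assumption: this simple nearest-neighbour score stands in for the
    border consistency measure described in the text. Lower is better.
    """
    p = np.argwhere(edge_mask(depth, depth_thresh))  # depth border points
    q = np.argwhere(edge_mask(seg, 0.5))             # segmentation border points
    if len(p) == 0 or len(q) == 0:
        return 0.0
    # Pairwise Euclidean distances; fine for small images, quadratic in memory.
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)
    return float(d.min(axis=1).mean())

# Toy scene: the segmentation border lies between columns 3 and 4.
seg = np.zeros((6, 8))
seg[:, 4:] = 1
depth_bad = np.full((6, 8), 10.0)
depth_bad[:, 6:] = 2.0    # depth border is two pixels off
depth_good = np.full((6, 8), 10.0)
depth_good[:, 4:] = 2.0   # depth border matches the segmentation border
```

On this toy scene the misaligned depth map scores worse than the aligned one, which is the quantity a training signal would try to reduce.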

Figure 1: Qualitative comparison against the baseline Watson19-ResNet50

Figure 2: We explicitly regularize the depth border to be consistent with the segmentation border. A "better" depth I* is created through morphing according to distilled point pairs pq. By penalizing its difference from the original prediction I at each training step, we gradually achieve a more consistent border. The morph is applied to every distilled pair, but only one pair is illustrated due to limited space.
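
The morph-and-penalize step in Figure 2 can be illustrated with a one-dimensional sketch. This is a simplification under stated assumptions, not the authors' method: it works on a single image row, takes one pair (p, q) of column indices where p is the depth border and q the segmentation border, and uses a piecewise-linear stretch as the morph. The names morph_row and morph_loss are hypothetical.

```python
import numpy as np

def morph_row(depth_row, p, q):
    """Warp a 1-D depth row so the content at column p lands at column q.

    Assumes 0 < q < len(depth_row) - 1 so the coordinate map is well defined.
    """
    depth_row = np.asarray(depth_row, dtype=float)
    w = depth_row.size
    x_out = np.arange(w, dtype=float)
    # Piecewise-linear map from output to source columns: 0 -> 0, q -> p, w-1 -> w-1.
    x_src = np.interp(x_out, [0.0, float(q), w - 1.0], [0.0, float(p), w - 1.0])
    # Sample the original row at the source columns (linear interpolation).
    return np.interp(x_src, x_out, depth_row)

def morph_loss(depth_row, p, q):
    """Mean absolute difference between the morphed target I* and the prediction I."""
    target = morph_row(depth_row, p, q)
    return float(np.mean(np.abs(target - np.asarray(depth_row, dtype=float))))

# Depth jumps at column 6, but the segmentation border is at column 4.
row = np.array([10, 10, 10, 10, 10, 10, 2, 2], dtype=float)
morphed = morph_row(row, p=6, q=4)
```

In this toy row the morphed target moves the depth jump from column 6 to column 4, and the loss is zero when p equals q. In a training loop the target would presumably be treated as a constant so that only the prediction receives gradients.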

Trained model, data and code

Welcome to our GitHub page: http://github.com/TWJianNuo/EdgeDepth-Release


Publications