Sunday, February 7, 2016

Depth Map Automatic Generator 5c (DMAG5c)

DMAG5c is a variant of Depth Map Automatic Generator 5 (DMAG5). The core of the method is still based on Fast Cost-Volume Filtering for Visual Correspondence and Beyond by Christoph Rhemann, Asmaa Hosni, Michael Bleyer, Carsten Rother, and Margrit Gelautz. Where it differs from DMAG5 is in the choice of the raw matching cost: DMAG5c uses a SIFT-like descriptor to determine it. SIFT is a robust method to detect features in images and match them. It is explained in detail in Distinctive Image Features from Scale-Invariant Keypoints by David G. Lowe. The advantage of using a SIFT-like descriptor to determine the raw matching cost is that it's largely invariant to illumination changes because it focuses on gradient orientations rather than actual colors.
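To make the cost-volume idea concrete, here is a minimal sketch in Python with NumPy. It is not the DMAG5c code. It assumes grayscale images, uses a plain absolute difference as the raw matching cost, and uses a box filter as a simple stand-in for the guided image filter of the paper. The names box_filter and winner_take_all are made up for this illustration, and so is the convention that a left pixel at column x matches the right pixel at column x - d.

```python
import numpy as np

def box_filter(img, radius):
    """Mean over a (2*radius+1)^2 window, computed with an integral image."""
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    # Integral image with a leading row/column of zeros.
    integral = np.zeros((h + 2 * radius + 1, w + 2 * radius + 1))
    integral[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    k = 2 * radius + 1
    # Four lookups per pixel, whatever the radius.
    total = (integral[k:, k:] - integral[:-k, k:]
             - integral[k:, :-k] + integral[:-k, :-k])
    return total / (k * k)

def winner_take_all(left, right, min_disp, max_disp, radius):
    """Build a cost volume, smooth each disparity slice, keep the cheapest disparity."""
    left = np.asarray(left, dtype=np.float64)
    right = np.asarray(right, dtype=np.float64)
    h, w = left.shape
    disparities = np.arange(min_disp, max_disp + 1)
    volume = np.empty((len(disparities), h, w))
    x = np.arange(w)
    for i, d in enumerate(disparities):
        # Assumed convention: left pixel x matches right pixel x - d.
        xr = np.clip(x - d, 0, w - 1)
        raw = np.abs(left - right[:, xr])       # raw matching cost (stand-in)
        volume[i] = box_filter(raw, radius)     # smoothing (stand-in for guided filter)
    return disparities[np.argmin(volume, axis=0)]
```

Because the box filter relies on an integral image, the work per pixel is the same for any radius. The guided filter is built from box filters of this kind, with the added benefit that it respects the edges of the reference image.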

The SIFT descriptor used here is vastly simplified since: (i) it has a fixed radius (equal to 2), (ii) it uses a single gradient histogram, in other words, there is only one bin in image space, (iii) the gradient magnitudes are not weighted, and (iv) the stereo pair is assumed to be rectified, so there is no need to rotate the descriptor window. Because occlusion handling happens when the raw matching cost is smoothed (for each possible disparity), it is a good idea to keep the radius of the SIFT descriptor small. Because the radius is small, there is really no need to use more than one bin in image space or to weight the gradient magnitudes.
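A descriptor along those lines could look like the sketch below, again in Python with NumPy. It follows the four simplifications listed above, but the details are assumptions made for illustration and not taken from DMAG5c: the number of orientation bins (8), the unit-length normalization, the L1 distance used as matching cost, the border cost, and the function names are all hypothetical.

```python
import numpy as np

def orientation_histogram_descriptor(gray, radius=2, n_bins=8):
    """One gradient-orientation histogram per pixel over a (2*radius+1)^2 window."""
    gray = np.asarray(gray, dtype=np.float64)
    gy, gx = np.gradient(gray)
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)
    # Orientation bin of every pixel (clamped in case ang rounds up to 2*pi).
    bins = np.minimum((ang / (2 * np.pi) * n_bins).astype(int), n_bins - 1)
    h, w = gray.shape
    rows, cols = np.indices((h, w))
    per_bin = np.zeros((h, w, n_bins))
    per_bin[rows, cols, bins] = mag             # unweighted gradient magnitude
    # Sum the per-pixel contributions over the window (single spatial bin).
    padded = np.pad(per_bin, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    desc = np.zeros_like(per_bin)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            desc += padded[dy:dy + h, dx:dx + w, :]
    # Assumed normalization: unit length, which removes a global gain change.
    norm = np.linalg.norm(desc, axis=2, keepdims=True)
    return desc / np.maximum(norm, 1e-12)

def raw_matching_cost(desc_left, desc_right, disparity, border_cost=2.0):
    """L1 distance between the left descriptor at x and the right one at x - disparity."""
    h, w, _ = desc_left.shape
    cost = np.full((h, w), border_cost)         # used where x - disparity falls outside
    x = np.arange(w)
    xr = x - disparity
    valid = (xr >= 0) & (xr < w)
    diff = desc_left[:, x[valid]] - desc_right[:, xr[valid]]
    cost[:, x[valid]] = np.abs(diff).sum(axis=2)
    return cost
```

Since gradients ignore a constant brightness offset and the normalization cancels a constant gain, two views of the same patch under such an illumination change end up with nearly the same descriptor, which is the point of using gradient orientations instead of colors.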

A quick word about the parameters:
- min and max disparity. Those can be obtained with zero effort by using Epipolar Rectification 9b (ER9b).
- radius. This is the radius of the guided image filter. The larger the better but up to a certain point. Note that the speed of DMAG5c doesn't depend on the size of the filter, which is kind of a good thing.
- epsilon. This controls the smoothness of the depth map. The lower the epsilon, the smoother the depth map. I think that 4 is a pretty good value but you can certainly try 3, 2, 1, and even 0.
- disparity tolerance. Controls how tight you want the consistency between left and right depth maps to be. Pixels that have non-consistent disparities (those are shown in black in the occlusion maps) have their disparities recomputed using some sort of averaging between neighboring pixels. That averaging is controlled by the radius to smooth occlusions, sigma space and sigma color. The default values should be more than ok in most cases.
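
The left-right consistency check behind the disparity tolerance can be sketched as follows (Python with NumPy). This is only an illustration under assumed conventions, not the DMAG5c implementation: both disparity maps are taken to store the same positive disparity d, a left pixel at column x is taken to match the right pixel at column x - d, and the function name consistency_mask is made up.

```python
import numpy as np

def consistency_mask(disp_left, disp_right, tolerance=0):
    """True where left and right disparities agree within `tolerance`."""
    disp_left = np.asarray(disp_left, dtype=np.float64)
    disp_right = np.asarray(disp_right, dtype=np.float64)
    h, w = disp_left.shape
    rows = np.arange(h)[:, None]
    x = np.arange(w)[None, :]
    # Column of the matching pixel in the right image (assumed convention).
    xr = x - np.rint(disp_left).astype(int)
    inside = (xr >= 0) & (xr < w)
    xr_clipped = np.clip(xr, 0, w - 1)
    diff = np.abs(disp_left - disp_right[rows, xr_clipped])
    # Pixels that map outside the right image count as inconsistent.
    return inside & (diff <= tolerance)
```

Pixels where the mask is False correspond to the black pixels of the occlusion map. They are the ones whose disparities would then be recomputed from their neighbors.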

Here's how it behaves on Tsukuba which, by the way, doesn't suffer at all from illumination changes:

Left image.

Right image.

Depth map obtained by DMAG5c using radius = 12.

Depth map obtained by DMAG5c using radius = 24.

In practice, DMAG5 should be tried first. If the depth map is not satisfactory no matter the choice of parameters, then it's probably a good idea to switch over to DMAG5c. This all assumes that the stereo pair has been properly rectified by, let's say, ER9b.

The Windows executable (guaranteed to be virus free) is available for free via the 3D Software Page.