Saturday, May 24, 2014

Depth Map Automatic Generator 5 (DMAG5)

DMAG5 is a rather faithful implementation of "Fast Cost-Volume Filtering for Visual Correspondence and Beyond" by Christoph Rhemann, Asmaa Hosni, Michael Bleyer, Carsten Rother, and Margrit Gelautz, which is discussed on this very blog in "Fast Cost Volume Filtering for Stereo Matching". A big thanks to "Stereo Disparity through Cost Aggregation with Guided Filter" by Pauline Tan and Pascal Monasse for the default parameter values.

DMAG5 is similar in spirit to Depth Map Automatic Generator 2 (DMAG2). Just like DMAG2, it is local and filter-based. The difference is that DMAG2 uses a joint bilateral filter while DMAG5 uses a guided filter. The advantage of the guided filter over the joint bilateral filter is that its running time is independent of the filter size (window radius). While DMAG2 is slow (painfully slow if the window radius is not small), DMAG5 is fast. I believe DMAG5 is a real improvement over DMAG2, although one could argue that DMAG2 should produce better (more accurate) depth maps than DMAG5 (after all, DMAG5 uses an approximation of the bilateral filter). My (limited) experience tells me that DMAG5 produces depth maps that are about as good as those produced by DMAG2.
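
To see why the window radius does not affect the running time, recall that the guided filter is built out of box (mean) filters, and a box filter can be computed from running sums. Below is a minimal 1D sketch of that idea in Python. It is not DMAG5's code, and the function name is made up for illustration.

```python
def box_filter_1d(values, radius):
    """Mean of values over [i - radius, i + radius], clipped at the borders."""
    n = len(values)
    # prefix[i] holds the sum of values[0:i], built in one pass.
    prefix = [0.0] * (n + 1)
    for i, v in enumerate(values):
        prefix[i + 1] = prefix[i] + v
    out = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n - 1, i + radius)
        # One subtraction and one division per pixel, whatever the radius.
        out.append((prefix[hi + 1] - prefix[lo]) / (hi - lo + 1))
    return out


print(box_filter_1d([1, 2, 3, 4, 5], radius=1))
print(box_filter_1d([1, 2, 3, 4, 5], radius=10))
```

The loop does the same amount of work per pixel for radius 1 and radius 10. A brute-force bilateral filter, by contrast, has to visit every pixel in the window.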

The cost-volume filtering is performed twice (the first run has the left image as the reference image while the second run has the right image as the reference image) in order to detect occluded pixels. Here, an occluded pixel is a pixel for which the disparity is unreliable. For each occluded pixel found, the disparity is obtained by scanning to its left and to its right and keeping the smaller of the two disparities encountered. To reduce the streaks in the depth map, occluded pixels are smoothed.
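
The left-right check and the filling step can be sketched on a single scanline as follows. This is only an illustration under assumptions that are mine, not taken from DMAG5: disparities are integers, a left pixel at column x with disparity d matches the right pixel at column x - d, and the right depth map stores the same signed value there. The function names are hypothetical, and the final smoothing of the occluded pixels is left out.

```python
def find_occluded(left_row, right_row, tolerance=0):
    """Flag pixels whose left and right disparities disagree by more than tolerance."""
    n = len(left_row)
    occluded = []
    for x, d in enumerate(left_row):
        xr = x - d  # assumed matching column in the right image
        if xr < 0 or xr >= n:
            occluded.append(True)  # the match falls outside the image
        else:
            occluded.append(abs(d - right_row[xr]) > tolerance)
    return occluded


def fill_occluded(left_row, occluded):
    """Give each occluded pixel the smaller of the nearest reliable disparities."""
    n = len(left_row)
    filled = list(left_row)
    for x in range(n):
        if not occluded[x]:
            continue
        candidates = []
        for xl in range(x - 1, -1, -1):  # scan to the left
            if not occluded[xl]:
                candidates.append(left_row[xl])
                break
        for xr in range(x + 1, n):  # scan to the right
            if not occluded[xr]:
                candidates.append(left_row[xr])
                break
        if candidates:
            filled[x] = min(candidates)  # smaller disparity = further back
    return filled


left = [0, 0, 3, 2, 2, 2]
right = [0, 0, 2, 2, 0, 0]
mask = find_occluded(left, right, tolerance=0)
print(mask)
print(fill_occluded(left, mask))
```

Picking the smaller disparity fills the hole with background rather than foreground, which is usually what is hidden behind an object edge. Because each row is filled independently, the result tends to show horizontal streaks, hence the smoothing pass.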

Let's go over the parameters that control DMAG5's behavior:

- Minimum disparity is the disparity corresponding to the furthest point in the background.
- Maximum disparity is the disparity corresponding to the closest point in the foreground.
I suggest using Disparity Finder 2 (DF2) to get the minimum and maximum disparity. Better yet, you can use Epipolar Rectification 9b (ER9b) to rectify/align the two images (crucial for good quality depth map generation) and get the min and max disparities automatically.
- Window radius is the guided filter size. The larger the radius, the more accurate the matches are supposed to be (to a certain extent). If the window radius becomes too large, errors are likely to appear at object boundaries. Note that the running time is not dependent on the window radius, which is a really good thing.
- Alpha is the term that balances the color matching cost and the gradient matching cost. The closer alpha is to 0, the more importance is given to the color. The closer alpha is to 1, the more importance is given to the gradient. In theory, a higher alpha works better when there's quite a bit of texture in the image while a lower alpha works better when the image is relatively flat color-wise.
- Truncation value (color) limits the value the color matching cost can take. It reduces the effects of occluded pixels (pixels that appear in only one image).
- Truncation value (gradient) limits the value the gradient matching cost can take. It reduces the effects of occluded pixels (pixels that appear in only one image).
Pauline Tan et al. think that the default truncation values given by Christoph Rhemann et al. (7 for the color truncation and 2 for the gradient truncation) are too small. They suggest 20 and 10 for the color and gradient truncation values, respectively.
- Epsilon controls the smoothness of the depth map. As epsilon is lowered (4, 3, 2, 1, 0, -1, -2, -3, -4, etc), the depth map gets smoother.
- Disparity tolerance (occlusion detection). The larger the value, the more mismatch is allowed (between left and right depth maps) before declaring that the disparity computed at a pixel is unreliable.
- Window radius (occlusion smoothing).
- Sigma space (occlusion smoothing).
- Sigma color (occlusion smoothing).
The parameters that relate to occlusion detection and smoothing should probably be left alone since they only have an effect on the "occluded" pixels, that is, the pixels that show up in black in the occlusion maps.
- Downsampling factor. This parameter enables DMAG5 to run faster by downsampling the images prior to computing the depth maps. If set to 1, the images are used as is and there's no speedup. If set to 2, the images are resized by reducing each dimension by a factor of 2 and DMAG5 should go 4 times faster. The more downsampling is requested, the faster DMAG5 will go, but the more pixelated the depth maps will look upon completion (as they are upsampled). If downsampling is turned on, the parameters that are spatial, that is, min and max disparity, window radius, window radius (occlusion smoothing), and sigma space (occlusion smoothing) are automatically adjusted to adapt to the level of downsampling that is requested. In other words, you don't have to wonder if you should change those parameters when switching, for example, from downsampling factor = 1 to downsampling factor = 2 as DMAG5 does it automatically for you.
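
To make the roles of alpha and the two truncation values more concrete, here is a small Python sketch of how a per-pixel matching cost can be blended and truncated, following the description above. The function name is hypothetical, the defaults are simply the values from the example further down, and how DMAG5 actually measures the color and gradient differences is not shown here.

```python
def matching_cost(color_diff, gradient_diff,
                  alpha=0.9, trunc_color=20.0, trunc_gradient=10.0):
    """Blend truncated color and gradient differences for one pixel/disparity pair."""
    # Truncation caps each term so that pixels with no true match
    # (occlusions) cannot produce arbitrarily large costs.
    color_term = min(color_diff, trunc_color)
    gradient_term = min(gradient_diff, trunc_gradient)
    # alpha = 0 uses color only, alpha = 1 uses gradient only.
    return (1.0 - alpha) * color_term + alpha * gradient_term


print(matching_cost(5.0, 1.0))      # both terms below their caps
print(matching_cost(100.0, 100.0))  # both terms capped
```

With larger truncation values, big differences keep contributing to the cost for longer before being capped.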

The parameters that have the greatest impact on the depth maps are the radius of the guided filter (window radius) and epsilon. So, if you want to experiment, those are the ones to play with.

Here's an example:

Left image (after rectification).

Right image (after rectification).

Left depth map obtained by DMAG5.

Input for DMAG5:

min disparity = -22
max disparity = 19
radius = 16
alpha = 0.9
truncation cost (color) = 20.0
truncation cost (gradient) = 10.0
epsilon = 4
disparity tolerance = 0
radius (occlusion smoothing) = 9
sigma space (occlusion smoothing) = 9.0
sigma color (occlusion smoothing) = 25.5
downsampling factor = 1

More examples (that compare DMAG6 with DMAG5):
3D Photos - Stevenson tombstone
3D Photos - Civil War reenactors
3D Photos - Looking down at the tombstones

The Windows executable (guaranteed to be virus free) is available for free via the 3D Software Page. Please refer to the 'Help->About' page in the actual program for how to use it.


  1. I'm having issues getting it to work with Windows 8. Is there something I can do? Thanks.

    1. How big are your images? Try using smaller size images.

  2. Hi. I am a B.Tech student. I am doing my project on signature creation of stereoscopic 3D images. I need the code for this depth signature creation. I am doing it in NetBeans (Java). Please help.

  3. Could this imageset be used to generate a 3d model?