Saturday, May 24, 2014

Depth Map Automatic Generator 5 (DMAG5)

DMAG5 is a fairly faithful implementation of "Fast Cost-Volume Filtering for Visual Correspondence and Beyond" by Christoph Rhemann, Asmaa Hosni, Michael Bleyer, Carsten Rother, and Margrit Gelautz, which is discussed on this blog in "Fast Cost Volume Filtering for Stereo Matching". Many thanks to "Stereo Disparity through Cost Aggregation with Guided Filter" by Pauline Tan and Pascal Monasse for the default parameter values.

DMAG5 relies heavily on the so-called "guided filter", which is an approximation of the (joint) bilateral filter. How to compute the guided filter is explained in great detail in Guided Image Filtering.

DMAG5 is similar in spirit to Depth Map Automatic Generator 2 (DMAG2). Just like DMAG2, it is local and filter-based. The difference is that DMAG2 uses a joint bilateral filter while DMAG5 uses a guided filter. The advantage of the guided filter over the joint bilateral filter is that its running time is independent of the filter size (window radius). While DMAG2 is slow (painfully slow if the window radius is not small), DMAG5 is fast. I believe DMAG5 is a real improvement over DMAG2, although one could argue that DMAG2 should produce better (more accurate) depth maps than DMAG5 (DMAG5 uses an approximation of the bilateral filter, after all). My (limited) experience tells me that DMAG5 produces depth maps that are about as good as those produced by DMAG2.
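
To illustrate why the guided filter's running time does not depend on the window radius, here is a minimal NumPy sketch. It is not DMAG5's code: it assumes a single-channel (grayscale) guide image, whereas the papers use a color guide, and the function names box_mean and guided_filter are made up for this example. The trick is that every step is a box mean, and a box mean can be read off a summed-area table in constant time per pixel.

```python
import numpy as np

def box_mean(img, r):
    """Mean over a (2r+1)x(2r+1) window, clipped at the image border.

    Uses a summed-area table, so the work per pixel is constant
    whatever the radius r.
    """
    h, w = img.shape
    # Summed-area table with a leading row and column of zeros.
    sat = np.zeros((h + 1, w + 1), dtype=np.float64)
    sat[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    # Window bounds (exclusive upper bound), clipped to the image.
    y0 = np.clip(np.arange(h) - r, 0, h)
    y1 = np.clip(np.arange(h) + r + 1, 0, h)
    x0 = np.clip(np.arange(w) - r, 0, w)
    x1 = np.clip(np.arange(w) + r + 1, 0, w)
    total = (sat[y1][:, x1] - sat[y0][:, x1]
             - sat[y1][:, x0] + sat[y0][:, x0])
    area = (y1 - y0)[:, None] * (x1 - x0)[None, :]
    return total / area

def guided_filter(guide, src, r, eps):
    """Grayscale guided filter (simplified sketch, not DMAG5's version)."""
    mean_i = box_mean(guide, r)
    mean_p = box_mean(src, r)
    cov_ip = box_mean(guide * src, r) - mean_i * mean_p
    var_i = box_mean(guide * guide, r) - mean_i * mean_i
    # Local linear model: output = a * guide + b inside each window.
    a = cov_ip / (var_i + eps)
    b = mean_p - a * mean_i
    return box_mean(a, r) * guide + box_mean(b, r)
```

In a cost-volume setting, src would be one disparity slice of the cost volume and guide the reference image. Going from r = 4 to r = 32 changes only the window bounds looked up in the table, not the amount of work.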

The cost-volume filtering is performed twice (the first run has the left image as the reference image while the second run has the right image as the reference image) in order to detect occluded pixels. Here, an occluded pixel is a pixel for which the disparity is unreliable. For any found occluded pixel, the disparity is obtained by considering the smallest disparity scanning to its left and to its right. To reduce the streaks in the depth map, occluded pixels are smoothed.
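
The left-right check and the occlusion filling described above can be sketched as follows. This is only an illustration under assumptions of mine: integer disparity maps, a convention where left pixel x matches right pixel x - d, and "smallest disparity" read as the smaller of the nearest reliable disparities found to the left and to the right on the same row. The names occlusion_mask and fill_occlusions are hypothetical, and DMAG5's actual code may differ.

```python
import numpy as np

def occlusion_mask(disp_left, disp_right, tol=0):
    """True where the left disparity fails the left-right check.

    Assumed convention: left pixel (y, x) matches right pixel
    (y, x - d), with d = disp_left[y, x].
    """
    h, w = disp_left.shape
    xs = np.arange(w)[None, :].repeat(h, axis=0)
    ys = np.arange(h)[:, None].repeat(w, axis=1)
    xr = xs - disp_left                   # matching column in the right image
    inside = (xr >= 0) & (xr < w)
    xr_c = np.clip(xr, 0, w - 1)          # clamp so indexing stays valid
    diff = np.abs(disp_left - disp_right[ys, xr_c])
    return ~inside | (diff > tol)

def fill_occlusions(disp, occluded):
    """Replace each occluded disparity by the smaller of the nearest
    reliable disparities to its left and right on the same row."""
    filled = disp.copy()
    h, w = disp.shape
    for y in range(h):
        valid = np.flatnonzero(~occluded[y])
        if valid.size == 0:
            continue                      # nothing reliable on this row
        for x in np.flatnonzero(occluded[y]):
            left = valid[valid < x]
            right = valid[valid > x]
            candidates = []
            if left.size:
                candidates.append(disp[y, left[-1]])
            if right.size:
                candidates.append(disp[y, right[0]])
            filled[y, x] = min(candidates)
    return filled
```

Filling row by row like this is what produces horizontal streaks, which is why a smoothing pass over the occluded pixels follows.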

Let's go over the parameters that control DMAG5's behavior:

- Minimum disparity is the disparity corresponding to the furthest point in the background.
- Maximum disparity is the disparity corresponding to the closest point in the foreground.
I suggest using Disparity Finder 2 (DF2) to get the minimum and maximum disparity. Better yet, you can use Epipolar Rectification 9b (ER9b) to rectify/align the two images (crucial for good quality depth map generation) and get the min and max disparities automatically.
- Window radius is the guided filter size. The larger the radius, the more accurate the matches are supposed to be (to a certain extent). If the window radius becomes too large, errors are likely to appear at object boundaries. It is usually a good idea to try radii of various values, like 4, 8, 16, 32, etc. Note that the running time is not dependent on the window radius, which is a really good thing. Please note: The larger the image, the larger the radius should be.
- Alpha is the term that balances the color matching cost and the gradient matching cost. The closer alpha is to 0, the more importance is given to the color. The closer alpha is to 1, the more importance is given to the gradient. In theory, a higher alpha works better when there's quite a bit of texture in the image, while a lower alpha works better when the image is relatively flat color-wise.
- Truncation value (color) limits the value the color matching cost can take. It reduces the effects of occluded pixels (pixels that appear in only one image).
- Truncation value (gradient) limits the value the gradient matching cost can take. It reduces the effects of occluded pixels (pixels that appear in only one image).
Pauline Tan et al. think that the default truncation values given by Christoph Rhemann et al. (7 for the color truncation and 2 for the gradient truncation) are too small. They suggest 20 and 10 for the color and gradient truncation values, respectively.
- Epsilon controls the smoothness of the depth map. As epsilon is lowered (4, 3, 2, 1, 0, -1, -2, -3, -4, etc), the depth map gets smoother.
- Disparity tolerance (occlusion detection). The larger the value, the more mismatch is allowed (between left and right depth maps) before declaring that the disparity computed at a pixel is unreliable.
- Window radius (occlusion smoothing).
- Sigma space (occlusion smoothing).
- Sigma color (occlusion smoothing).
The parameters that relate to occlusion detection and smoothing should probably be left alone since they only have an effect on the "occluded" pixels, that is, the pixels that show up in black in the occlusion maps.
- Downsampling factor. This parameter enables DMAG5 to run faster by downsampling the images prior to computing the depth maps. If set to 1, the images are used as is and there's no speedup. If set to 2, the images are resized by reducing each dimension by a factor of 2 and DMAG5 should go 4 times faster. The more downsampling is requested, the faster DMAG5 will go, but the more pixelated the depth maps will look upon completion (as they are upsampled). If downsampling is turned on, the parameters that are spatial, that is, min and max disparity, window radius, window radius (occlusion smoothing), and sigma space (occlusion smoothing) are automatically adjusted to adapt to the level of downsampling that is requested. In other words, you don't have to wonder if you should change those parameters when switching, for example, from downsampling factor = 1 to downsampling factor = 2 as DMAG5 does it automatically for you.
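
To make the roles of alpha and the two truncation values concrete, here is a rough sketch of a per-pixel matching cost for one disparity. It follows the description above (alpha near 0 favors color, alpha near 1 favors gradient), but the details are my assumptions, not DMAG5's code: the color term is the mean absolute difference over the three channels, the gradient is the horizontal gradient of the channel average, left pixel x is compared with right pixel x - d, and columns that fall outside the image are clamped. The name matching_cost is hypothetical.

```python
import numpy as np

def matching_cost(left, right, d, alpha=0.9,
                  trunc_color=20.0, trunc_grad=10.0):
    """Cost of assigning disparity d to every pixel of the left image.

    left, right: float arrays of shape (H, W, 3) with values in 0..255.
    Assumed convention: left pixel x matches right pixel x - d.
    """
    h, w, _ = left.shape
    xs = np.clip(np.arange(w) - d, 0, w - 1)     # clamp at the border
    shifted = right[:, xs]
    # Color term: mean absolute difference over the channels (assumption).
    color = np.abs(left - shifted).mean(axis=2)
    # Gradient term: horizontal gradient of the gray image (assumption).
    grad_l = np.gradient(left.mean(axis=2), axis=1)
    grad_r = np.gradient(right.mean(axis=2), axis=1)[:, xs]
    grad = np.abs(grad_l - grad_r)
    # Truncation caps each term, limiting the influence of occluded pixels.
    return ((1.0 - alpha) * np.minimum(color, trunc_color)
            + alpha * np.minimum(grad, trunc_grad))
```

Stacking this cost for every disparity between the minimum and maximum gives the cost volume. Each slice is then smoothed with the guided filter, and the disparity with the lowest filtered cost is kept at each pixel.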

The parameters that have the greatest impact on the depth maps are the radius of the guided filter (window radius) and epsilon. So, if you want to experiment, those are the ones you should play with.

Here's an example:


Left image (after rectification).


Right image (after rectification).


Left depth map obtained by DMAG5.

Input for DMAG5:

min disparity = -22
max disparity = 19
radius = 16
alpha = 0.9
truncation cost (color) = 20.0
truncation cost (gradient) = 10.0
epsilon = 4
disparity tolerance = 0
radius (occlusion smoothing) = 9
sigma space (occlusion smoothing) = 9.0
sigma color (occlusion smoothing) = 25.5
downsampling factor = 1
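
As an illustration of the automatic adjustment of spatial parameters under downsampling, here is how the values above might be rescaled for a downsampling factor of 2. The helper scale_spatial_params and its dictionary keys are hypothetical, and the rounding rules (floor for the minimum disparity, ceiling for the maximum disparity and the radii) are my guess at a conservative choice. DMAG5 may round differently.

```python
import math

def scale_spatial_params(params, factor):
    """Divide the spatial parameters by the downsampling factor.

    Hypothetical helper: the rounding rules are an assumption,
    not taken from the DMAG5 source.
    """
    scaled = dict(params)  # non-spatial parameters are copied unchanged
    scaled["min_disparity"] = math.floor(params["min_disparity"] / factor)
    scaled["max_disparity"] = math.ceil(params["max_disparity"] / factor)
    scaled["radius"] = max(1, math.ceil(params["radius"] / factor))
    scaled["occ_radius"] = max(1, math.ceil(params["occ_radius"] / factor))
    scaled["occ_sigma_space"] = params["occ_sigma_space"] / factor
    return scaled

params = {
    "min_disparity": -22,
    "max_disparity": 19,
    "radius": 16,
    "alpha": 0.9,
    "occ_radius": 9,
    "occ_sigma_space": 9.0,
}
half = scale_spatial_params(params, 2)
print(half["min_disparity"], half["max_disparity"], half["radius"])
# prints: -11 10 8
```

Alpha, the truncation values, epsilon, and sigma color are not spatial, so they stay as they are.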

More examples (that compare DMAG6 with DMAG5):
3D Photos - Stevenson tombstone
3D Photos - Civil War reenactors
3D Photos - Looking down at the tombstones

Here is a video tutorial for DMAG5:


The Windows executable (guaranteed to be virus free) is available for free via the 3D Software Page. Please refer to the 'Help->About' page in the actual program for how to use it.

Source code: DMAG5 on github.

10 comments:

  1. I'm having issues getting it to work with Windows 8. Is there something I can do? Thanks.

    1. How big are your images? Try using smaller size images.

  2. Hi. I am a B.Tech student. I am doing my project on signature creation for stereoscopic 3D images. I need to know the code for this depth signature creation. I am doing it in NetBeans (Java). Please help.

  3. Could this imageset be used to generate a 3d model?
    http://www.hayabusa2.jaxa.jp/topics/20180621je/index_e.html

  4. I would like to help you make a UI that makes this as easy as picking two images, and the rest is automated. Let me know if we can collaborate :) jonathan --- at --- leadersandco.com

    1. Thanks for your offer Jonathan but dmag5 is already gui driven. There are actually 2 versions: one is gui based, the other is not.

  5. Replies
    1. Please click on the "3d software" tab and you will find links to the archives that contain dmag5. There's a gui 32 bit archive (deprecated), a gui 64 bit archive, and a no gui 64 bit archive.

  6. To learn how to work with DMAG5, I used the images presented here and the recommended parameters. However, the depth maps I obtained have low contrast. How can I increase the contrast of the depth maps (3D effect)?

    1. Hi:

      You should get the same depth map I got, I think.

      Are you sure the left and right images are the same? Your depth map is 400x234 while the left and right images and depth map in the post are 1200x703.

      If you reduce the left and right image size, then you also have to change the min and max disparities as the parallax range is much lower. If you reduce the image by a factor of 2, you would have to use -11, 10 for min, max disparity instead of -22, 19.

      In general, if your depth map doesn't range from pure white to pure black, it means the min and max disparity you gave are too conservative.

      Hope this helps.
