Friday, June 26, 2015

Depth Map Automatic Generator 8 (DMAG8)

Depth Map Automatic Generator 8 (DMAG8) is a multi-view stereo automatic depth map generator. Its input is a set (any number) of non-rectified (non-aligned) images, typically extracted from a video taken with a single lens camera. In a nutshell, it's kinda like the Google Lens Blur smartphone app but more complicated. For more info on the DMAG8 internals, you're invited to check out "Multi-View Stereo" and "Fast Bilateral-Space Stereo for Synthetic Defocus".

Because multi-view stereo relies on Structure from Motion (and Bundle Adjustment), which I haven't implemented yet, DMAG8 needs the output (the .nvm file) of a third-party program called VisualSFM, which creates a sparse 3d reconstruction from a set of images. So, to be able to use DMAG8, you will need to install VisualSFM first and run it on your set of images to get the .nvm file that contains the sparse 3d reconstruction (DMAG8 is actually only interested in the camera poses and the focal lengths). No worries as it's free. VisualSFM is quite cool on its own as it can also create a dense 3d reconstruction via Yasutaka Furukawa's PMVS/CMVS tool chain.
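To give an idea of what "camera poses and focal lengths" means in practice, here is a rough Python sketch that pulls them out of an .nvm file. This is not DMAG8's code. It assumes the NVM_V3 layout as I understand it from the VisualSFM documentation: one camera per line, made of a file name, a focal length, a quaternion (w, x, y, z), a camera center (x, y, z), a radial distortion coefficient, and a trailing 0. The function names are made up for the illustration.

```python
import math


def parse_nvm_cameras(text):
    """Return the cameras listed in the first model of an NVM_V3 file.

    Assumed camera line layout (not taken from DMAG8 itself):
    <file name> <focal> <qw qx qy qz> <cx cy cz> <radial distortion> 0
    """
    # Blank lines only separate sections, so drop them.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("NVM_V3"):
        raise ValueError("not an NVM_V3 file")
    count = int(lines[1])  # number of cameras in the first model
    cameras = []
    for ln in lines[2:2 + count]:
        fields = ln.split()
        # The last 10 fields are numbers; whatever precedes is the file name.
        nums = [float(v) for v in fields[-10:]]
        cameras.append({
            "name": " ".join(fields[:-10]),
            "focal": nums[0],
            "quat_wxyz": tuple(nums[1:5]),
            "center": tuple(nums[5:8]),
            "radial": nums[8],
        })
    return cameras


def quat_to_rotation(q):
    """Convert a (w, x, y, z) quaternion to a 3x3 rotation matrix."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]
```

Calling parse_nvm_cameras(open("scene.nvm").read()) (the file name is just an example) gives one dictionary per image, and quat_to_rotation turns each quaternion into the rotation matrix of that camera.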

Even if you can get a sparse 3d reconstruction from VisualSFM, it doesn't mean it's good enough for DMAG8. Depth Finder 8 (DF8) can help you determine whether the nvm file is ok by drawing, on the target image, the epipolar line that corresponds to a given pixel in the reference image. If the matching pixel in the target image lies on (or very near) that line, the sparse 3d reconstruction is good; if it sits well off the line, it's bad. Of course, it doesn't have to be perfect, but a poor reconstruction may explain deficiencies in the depth map produced by DMAG8 (no use wasting time adjusting parameters then). Usually, with Structure from Motion (SfM), which is what VisualSFM does, you want pictures that, when paired together, have a rather wide baseline because the 3d structure will then be more accurate (although matching will be harder). So, if VisualSFM complains that it can't find a pair of images to initialize the 3d reconstruction process, it's usually because the baseline is not wide enough. No worries as you can always choose the initial pair yourself and VisualSFM can still produce the sparse 3d reconstruction.
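For the curious, the epipolar line check can be sketched in a few lines of Python. This is not how DF8 is written; it is a minimal illustration that assumes a simple pinhole camera with no lens distortion, where a world point X maps to camera coordinates as R (X - C). The camera dictionary keys (focal, cx, cy, R, C) and the function names are made up for the example. The idea: back-project the reference pixel at two different depths, project both 3d points into the target image, and take the line through the two projections.

```python
import math


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def backproject(cam, pixel, depth):
    """World point seen at `pixel`, at the given depth along the camera z axis."""
    u, v = pixel
    ray_cam = [(u - cam["cx"]) / cam["focal"], (v - cam["cy"]) / cam["focal"], 1.0]
    ray_world = mat_vec(transpose(cam["R"]), ray_cam)
    return [cam["C"][i] + depth * ray_world[i] for i in range(3)]


def project(cam, point):
    """Homogeneous pixel coordinates of a world point."""
    p = mat_vec(cam["R"], [point[i] - cam["C"][i] for i in range(3)])
    return [cam["focal"] * p[0] + cam["cx"] * p[2],
            cam["focal"] * p[1] + cam["cy"] * p[2],
            p[2]]


def epipolar_line(ref, tgt, pixel, depths=(1.0, 2.0)):
    """Line (a, b, c), with a*u + b*v + c = 0, in the target image.

    Degenerate (all zeros) if the two camera centers coincide.
    """
    first = project(tgt, backproject(ref, pixel, depths[0]))
    second = project(tgt, backproject(ref, pixel, depths[1]))
    return cross(first, second)


def distance_to_line(line, pixel):
    """Distance in pixels from `pixel` to the line (a, b, c)."""
    a, b, c = line
    return abs(a * pixel[0] + b * pixel[1] + c) / math.hypot(a, b)
```

If the matching pixel you pick by eye in the target image is only a pixel or two away from the line according to distance_to_line, the camera poses are probably fine. A distance of tens of pixels points to a bad reconstruction.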

The manual for DMAG8 (dmag8_manual.pdf) is in the directory where you decompressed ugosoft3d-8-x64.rar. It has a whole section on how to get the best out of VisualSFM.

Update: Instead of using VisualSFM to get the camera positions and orientations (and the sparse reconstruction), use Structure from Motion 10 (SfM10). It's, in my opinion, much better and simpler to use.

The following is a sample depth map produced by DMAG8 using a set of two non-rectified images taken with a single lens camera. Doing multi-view stereo on a pair of images is not too exciting, but the key point here is that the two images are not aligned in any way and therefore can't be given to a classic stereo depth map generator.


Image 0 (reference image).


Image 1.


Sparse 3d reconstruction in VisualSFM.


Just for fun, this is the dense 3d reconstruction using Yasutaka Furukawa's CMVS tool within VisualSFM.


Depth map produced by DMAG8 using near plane depth = 0, far plane depth = 0, number of planes = 0, spatial sample rate = 8, range sample rate = 32, radius = 12, lambda = 0.01, max iterations = 1000, and hash table size = 10000.

The following is a sample depth map produced by DMAG8 using a set of three non-rectified images taken with a single lens camera.


Image 0.


Image 1.


Image 2.


Depth map produced by DMAG8 using near plane depth = 0, far plane depth = 0, number of planes = 0, spatial sample rate = 16, range sample rate = 16, radius = 12, lambda = 0.1, max iterations = 1000, and hash table size = 10000.

Here's another example of a depth map produced by DMAG8! This time I used a video shot with a single lens camera. I extracted 8 images from the video using ImageGrab and fed them to VisualSFM to get the sparse 3d reconstruction (input to DMAG8):


Video and depth map produced by DMAG8 using near plane depth = 9, far plane depth = 80, number of planes = 0, spatial sample rate = 8, range sample rate = 32, radius = 12, lambda = 0.1, max iterations = 1000, and hash table size = 10000.

Although I haven't done it here, I strongly suggest using Edge Preserving Smoothing 7 (EPS7) to smooth out the depth maps produced by DMAG8.

The Windows executable (guaranteed to be virus free) is available for free via the 3D Software Page.
