Friday, June 24, 2016

3D Photos - Looking down at the tombstones


Left image (after rectification).


Right image (after rectification).

I took the pictures with my Fuji W3, opened the mpo in StereoPhoto Maker, reduced the size to 1200 (in width), and rectified the reduced-size stereo pair with ER9b. Note that ER9b not only rectifies the image pair, it also outputs the min and max disparities for the rectified stereo pair.

Time to get the depth/disparity map. First, I am gonna use DMAG5.


Left depth map obtained by DMAG5.

Input for DMAG5:

min disparity = -10
max disparity = 26
radius = 16
alpha = 0.9
truncation cost (color) = 20.0
truncation cost (gradient) = 10.0
epsilon = 4
disparity tolerance = 0
radius (occlusion smoothing) = 9
sigma space (occlusion smoothing) = 9.0
sigma color (occlusion smoothing) = 25.5
downsampling factor = 1

I guess one could play around with the radius to get different depth maps but this one looks pretty good to me out of the box. There's no need to play with any of the other parameters. If one wants to experiment and see what the other parameters do, I suggest setting the downsampling factor to 2 as it will make DMAG5 go much faster. Once satisfied with a set of parameters, the downsampling factor can be put back to 1 for better accuracy.
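For anyone curious about what alpha and the two truncation costs might be doing, here is a minimal sketch of a per-pixel matching cost that blends a truncated color difference with a truncated gradient difference. This is only an assumption about the general shape of such a cost (it is a form commonly used in cost-volume stereo matching), not DMAG5's actual code. The function name, the choice of putting alpha on the gradient term, and the sample values are all made up for illustration.

```python
def matching_cost(color_diff, grad_diff, alpha=0.9,
                  trunc_color=20.0, trunc_grad=10.0):
    """Blend a truncated color term and a truncated gradient term.

    Hypothetical helper: the weighting convention (alpha on the
    gradient term) is an assumption, not taken from DMAG5.
    """
    color_term = min(abs(color_diff), trunc_color)
    grad_term = min(abs(grad_diff), trunc_grad)
    return (1.0 - alpha) * color_term + alpha * grad_term


# A small mismatch is charged in full: 0.1 * 5 + 0.9 * 2 = 2.3
small = matching_cost(color_diff=5.0, grad_diff=2.0)

# A large mismatch (occlusion, specular highlight) is capped by the
# truncation costs: 0.1 * 20 + 0.9 * 10 = 11
huge = matching_cost(color_diff=200.0, grad_diff=80.0)

# Making the mismatch even larger does not change the cost any further.
cap = matching_cost(color_diff=1e9, grad_diff=1e9)
```

Under this reading, the truncation costs keep a few badly mismatched pixels from dominating a window, and alpha decides how much the match leans on gradients rather than raw colors.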

Now, let's get the depth/disparity map with DMAG6.


Left depth map obtained by DMAG6.

Input for DMAG6:

min disparity = -10
max disparity = 26
alpha = 0.9
truncation cost (color) = 20.
truncation cost (gradient) = 10.
truncation cost (discontinuity) = 10000.
iteration number = 5
level number = 5
data cost weight = 0.5
disparity tolerance = 0
radius (occlusion smoothing) = 9
sigma space (occlusion smoothing) = 9.0
sigma color (occlusion smoothing) = 25.5
downsampling factor = 1

I guess one could play with the data cost weight to try to get a better depth map, although this one looks pretty good. There's no need to play with any of the other parameters.

Not a whole lot of difference between the depth map produced by DMAG5 and the one produced by DMAG6. That's relatively amazing given that DMAG5 is a local window-based method while DMAG6 is a global method based on Belief Propagation (BP).


Animated gif courtesy of "Wiggle Maker".


Shallow depth-of-field simulation (aka miniature faking, diorama effect, and tilt-shift).

Thursday, June 23, 2016

3D Photos - Civil War reenactors


Left image (after rectification).


Right image (after rectification).

I took the pictures with my Fuji W3, opened the mpo in StereoPhoto Maker, reduced the size to 1200 (in width), and rectified the reduced-size stereo pair with ER9b. I think I could have gone without the rectification process but ER9b gave me the min and max disparities, which is nice (no need to use DF2 to find those out manually).

Time to get the depth/disparity map. First, I am gonna use DMAG5.


Left depth map obtained by DMAG5.

Input for DMAG5:

min disparity = -22
max disparity = 19
radius = 16
alpha = 0.9
truncation cost (color) = 20.0
truncation cost (gradient) = 10.0
epsilon = 4
disparity tolerance = 0
radius (occlusion smoothing) = 9
sigma space (occlusion smoothing) = 9.0
sigma color (occlusion smoothing) = 25.5
downsampling factor = 1

I guess one could play around with the radius to get different depth maps but this one looks pretty good to me out of the box. There's no need to play with any of the other parameters.

Now, let's get the depth/disparity map with DMAG6.


Left depth map obtained by DMAG6.

Input for DMAG6:

min disparity = -22
max disparity = 19
alpha = 0.9
truncation cost (color) = 20.
truncation cost (gradient) = 10.
truncation cost (discontinuity) = 10000.
iteration number = 5
level number = 5
data cost weight = 0.5
disparity tolerance = 0
radius (occlusion smoothing) = 9
sigma space (occlusion smoothing) = 9.0
sigma color (occlusion smoothing) = 25.5
downsampling factor = 1

I guess one could play with the data cost weight to try to get a better depth map, although this one looks pretty good. When the data cost weight is reduced, the depth map becomes smoother but there is a danger that it may become too smooth and not accurate enough. When the data cost weight is increased, the depth map becomes more accurate but less smooth (it might be a good idea to post-process the depth map with EPS7, EPS9 or DMAG9b to regain some smoothness). There's no need to play with any of the other parameters.
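The smooth-versus-accurate trade-off can be seen on a toy problem. The sketch below minimizes, along a single scanline, an energy made of a weighted data term plus a (truncated) penalty on label jumps between neighbors. It uses exact dynamic programming instead of Belief Propagation, and it is not DMAG6's code: the function name, the absolute-difference costs and the toy data are assumptions picked to keep the example short.

```python
def smooth_scanline(observed, num_labels, data_weight, trunc=10000.0):
    """Minimize sum(data_weight * |d_i - observed_i|)
    + sum(min(|d_i - d_(i+1)|, trunc)) exactly with dynamic programming."""
    n = len(observed)
    # cost[i][d]: lowest energy of pixels 0..i when pixel i takes label d
    cost = [[0.0] * num_labels for _ in range(n)]
    back = [[0] * num_labels for _ in range(n)]
    for d in range(num_labels):
        cost[0][d] = data_weight * abs(d - observed[0])
    for i in range(1, n):
        for d in range(num_labels):
            best_prev = min(
                range(num_labels),
                key=lambda p: cost[i - 1][p] + min(abs(d - p), trunc),
            )
            back[i][d] = best_prev
            cost[i][d] = (cost[i - 1][best_prev]
                          + min(abs(d - best_prev), trunc)
                          + data_weight * abs(d - observed[i]))
    # Backtrack from the cheapest final label.
    labels = [0] * n
    labels[-1] = min(range(num_labels), key=lambda d: cost[-1][d])
    for i in range(n - 1, 0, -1):
        labels[i - 1] = back[i][labels[i]]
    return labels


observed = [3, 3, 3, 7, 3, 3]   # one isolated spike at index 3

low_weight = smooth_scanline(observed, num_labels=10, data_weight=0.5)
high_weight = smooth_scanline(observed, num_labels=10, data_weight=4.0)
# low_weight  -> [3, 3, 3, 3, 3, 3]  (spike flattened: smoother)
# high_weight -> [3, 3, 3, 7, 3, 3]  (spike kept: follows the data)
```

With a low weight, the smoothness term wins and the spike disappears, which is great if it was noise and bad if it was a real thin object. With a high weight, the result sticks to the data, noise included.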

Not a whole lot of difference between the depth map produced by DMAG5 and the one produced by DMAG6, so I used the depth map produced by DMAG5 in what follows. That's relatively amazing given that DMAG5 is a local window-based method while DMAG6 is a global method based on Belief Propagation (BP).


Animated gif courtesy of "Wiggle Maker".

Wednesday, June 22, 2016

3D Photos - Stevenson tombstone


Left image after rectification.


Right image after rectification.

I took the pictures with my Fuji W3, opened the mpo in StereoPhoto Maker, reduced the size to 1200 (in width), and rectified the reduced-size stereo pair with ER9b. I think I could have gone without the rectification process but ER9b gave me the min and max disparities, which is nice (no need to use DF2).

Time to get the depth/disparity map. First, I am gonna use DMAG5.


Left depth map obtained by DMAG5.

Input for DMAG5:

min disparity = -29
max disparity = 25
radius = 16
alpha = 0.9
truncation cost (color) = 20.0
truncation cost (gradient) = 10.0
epsilon = 4
disparity tolerance = 0
radius (occlusion smoothing) = 9
sigma space (occlusion smoothing) = 9.0
sigma color (occlusion smoothing) = 25.5
downsampling factor = 1

I guess one could play around with the radius to get different depth maps. There's no need to play with any of the other parameters.

Now, let's get the depth/disparity map with DMAG6.


Left depth map obtained by DMAG6.

Input for DMAG6:

min disparity = -29
max disparity = 25
alpha = 0.9
truncation cost (color) = 20.
truncation cost (gradient) = 10.
truncation cost (discontinuity) = 10000.
level number = 5
iteration number = 5
data cost weight = 0.5
disparity tolerance = 0
radius (occlusion smoothing) = 9
sigma space (occlusion smoothing) = 9.0
sigma color (occlusion smoothing) = 25.5
downsampling factor = 1

I guess one could play with the data cost weight to get different depth maps. There's no need to play with any of the other parameters.

Not a whole lot of difference between the depth map produced by DMAG5 and the one produced by DMAG6, so I used the depth map produced by DMAG5 in what follows.


Animated gif courtesy of "Wiggle Maker".

I think the animated gifs are cool but there's another swell thing one can do with depth maps: shallow depth-of-field effect for miniature faking (tilt-shift).


Shallow depth-of-field effect (miniature faking).
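The idea behind the shallow depth-of-field effect can be sketched in a few lines: blur each pixel by an amount that grows with how far its depth is from the depth chosen to be in focus. The sketch below works on a single row of grayscale pixels with a simple box blur. It is only an illustration, not the tool used for the picture above; the function name, the box blur, and the convention that white (255) means near in the depth map are assumptions.

```python
def fake_shallow_dof(row, depth, focus_depth, max_radius=3):
    """Blur a row of pixels with a per-pixel box blur whose radius grows
    with the distance between the pixel's depth and the in-focus depth."""
    out = []
    n = len(row)
    for x in range(n):
        # Depth values are assumed to be 8-bit (0..255).
        radius = round(max_radius * abs(depth[x] - focus_depth) / 255.0)
        lo, hi = max(0, x - radius), min(n - 1, x + radius)
        window = row[lo:hi + 1]
        out.append(sum(window) / len(window))
    return out


row = [0, 255, 0, 255, 0, 255, 0, 255]    # high-contrast stripes
depth = [255, 255, 255, 255, 0, 0, 0, 0]  # left half near, right half far

# Focus on the near half: it stays sharp while the far half turns to gray.
result = fake_shallow_dof(row, depth, focus_depth=255)
```

A real implementation would work in 2D and use a nicer blur kernel, but the depth-driven blur radius is the part that needs a depth map.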

Tuesday, June 21, 2016

Structure from Motion (SfM) + Multi View Stereo (MVS) vs Stereo


Set of 11 images extracted from a video taken with an iPhone 4s. Dimensions are 1080x1920 pixels.

We are gonna use Multi View Stereo 10 (MVS10) to reconstruct the dense 3D scene (Multi View Stereo). Before that, we ran Structure from Motion 10 (SfM10) to get the camera poses and feature matches between the views (Structure from Motion).

Input to MVS10 ("mvs10_input.txt"):

duh.nvm = name of nvm file (generated by SfM10)
100 = minimum number of matches (camera pair selection)
0.5 = minimum average separation angle (camera pair selection)
32 = window radius
0.9 = alpha
20.0 = truncation cost (color)
10.0 = truncation cost (gradient)
4 = epsilon
0 = disparity tolerance
4 = downsampling factor
1 = sampling step
0.5 = minimum separation angle (removal of low-confidence 3d points)
3 = minimum number of image points (removal of low-confidence 3d points)
2.0 = maximum reprojection error (removal of low-confidence 3d points)
1 = radius (animated gif files)
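To give a feel for why a minimum separation angle matters, here is a small sketch that computes the angle between the two viewing rays that go from a 3D point to two camera centers. When that angle is tiny, the rays are nearly parallel and the depth of the triangulated point is poorly constrained. This is plain geometry, not MVS10's code: the function name and the sample coordinates are made up, and reading the 0.5 threshold as degrees is an assumption.

```python
import math


def separation_angle_deg(point, cam_a, cam_b):
    """Angle (in degrees) between the rays going from a 3D point
    to two camera centers."""
    ray_a = [c - p for c, p in zip(cam_a, point)]
    ray_b = [c - p for c, p in zip(cam_b, point)]
    dot = sum(a * b for a, b in zip(ray_a, ray_b))
    norm_a = math.sqrt(sum(a * a for a in ray_a))
    norm_b = math.sqrt(sum(b * b for b in ray_b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_angle))


point = (0.0, 0.0, 10.0)

# Cameras 0.05 apart looking at a point 10 away: about 0.29 degrees.
narrow = separation_angle_deg(point, (0.0, 0.0, 0.0), (0.05, 0.0, 0.0))

# Cameras 1.0 apart looking at the same point: about 5.7 degrees.
wide = separation_angle_deg(point, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))

MIN_ANGLE = 0.5  # assumed to be in degrees
keep_narrow = narrow >= MIN_ANGLE   # False: too little parallax
keep_wide = wide >= MIN_ANGLE       # True
```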


Animated gif of 3d dense reconstruction.

Using Structure from Motion (SfM) and Multi View Stereo (MVS) may seem kind of overkill just to produce an animated gif of the reconstructed 3D scene (recall that MVS10 primarily outputs a ply file of the 3D scene which can be used for whatever purpose). So, it might be a good idea to see if we can produce a good animated gif from just two frames (say, the first two of the sequence) using Epipolar Rectification 9b (ER9b), Depth Map Automatic Generator 5 (DMAG5) or Depth Map Automatic Generator 6 (DMAG6) (we'll try both), and finally Wiggle Maker.

Prior to using DMAG5 or DMAG6 on the first two frames of the sequence, they must be rectified (aligned). Rectification is used to limit the finding of matches to the x direction (along the width).
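Here is a bare-bones sketch of what "limiting the search to the x direction" buys: once the pair is rectified, the match for a pixel in the left image lies on the same row of the right image, so the search is a 1D loop over candidate disparities. The code uses a plain sum of absolute differences over a small window. It is not what DMAG5 or DMAG6 do internally; the function name and the sign convention (left pixel x matches right pixel x - d) are assumptions.

```python
def best_disparity(left_row, right_row, x, min_disp, max_disp, radius=1):
    """Return the disparity d in [min_disp, max_disp] that minimizes the
    sum of absolute differences between a window centered on left_row[x]
    and the window centered on right_row[x - d]."""
    n = len(left_row)
    best_d, best_cost = None, float("inf")
    for d in range(min_disp, max_disp + 1):
        cost = 0.0
        valid = True
        for k in range(-radius, radius + 1):
            xl, xr = x + k, x + k - d
            if not (0 <= xl < n and 0 <= xr < n):
                valid = False   # window falls outside the row
                break
            cost += abs(left_row[xl] - right_row[xr])
        if valid and cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# The left row is the right row shifted by 2 pixels (first two values are filler).
right_row = [10, 20, 90, 40, 50, 60, 70, 80]
left_row = [0, 0, 10, 20, 90, 40, 50, 60]

d = best_disparity(left_row, right_row, x=4, min_disp=-1, max_disp=3)
# d -> 2
```

Without rectification, the same search would have to cover a 2D neighborhood for every pixel, which is both slower and more ambiguous.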


Rectified image 00 in the 11 image sequence (used as left image in stereo pair).


Rectified image 01 in the 11 image sequence (used as right image in stereo pair).

Input to DMAG5:

-89 = min disparity
68 = max disparity
16 = window radius
0.9 = alpha
20.0 = truncation cost (color)
10.0 = truncation cost (gradient)
4 = epsilon
0 = disparity tolerance
9 = occlusion smoothing radius
9.0 = occlusion smoothing sigma space
25.5 = occlusion smoothing sigma color
4 = downsampling ratio


Depth/disparity map produced by DMAG5.


Depth/disparity map smoothed out by EPS9.


Animated gif produced by "Wiggle Maker".

Input to DMAG6:

-89 = min disparity
68 = max disparity
0.9 = alpha
20. = truncation cost (color)
10. = truncation cost (gradient)
10000. = truncation cost (discontinuity)
5 = level number
5 = iteration number
0.5 = data cost weight
0 = disparity tolerance
9 = occlusion smoothing radius
9.0 = occlusion smoothing sigma space
25.5 = occlusion smoothing sigma color
4 = downsampling ratio


Depth/disparity map produced by DMAG6.


Depth/disparity map smoothed out by EPS9.


Animated gif produced by "Wiggle Maker".

Using a downsampling ratio of 4, it takes seconds (instead of minutes) to generate depth maps with DMAG5 or DMAG6 and the depth map quality is quite acceptable. In "Wiggle Maker", I used "Inpainting Method = None" so that the animation look is similar to that produced by MVS10. Try not to pay any attention to what's happening at the borders (I know it's distracting and I should have cropped the animated gifs). Clearly, there's not that much of a difference between the depth maps produced by DMAG5 and DMAG6 (maybe DMAG6 is a bit better, especially after the smoothing step), so I don't think it makes a whole lot of difference which automatic depth map generator is used. The animation produced from the output of MVS10 (3d dense reconstruction) is much better than the animation produced from the output of either DMAG5 or DMAG6. Hmmm, it better be since MVS10 uses 11 views while DMAG5 or DMAG6 only use 2 and it takes a boatload more time to run SfM10+MVS10 than DMAG5 or DMAG6 (something like 1 hour vs seconds).
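A rough back-of-the-envelope count helps explain the "seconds instead of minutes". If the images are downsampled by 4 in each direction and the disparity search range shrinks by the same factor (an assumption about how the downsampling is applied, not something checked against the DMAG5/DMAG6 code), the number of (pixel, disparity) pairs to evaluate drops by roughly 4 x 4 x 4. The sketch below uses the original frame dimensions and the disparity range given above; the rectified images may have slightly different dimensions.

```python
def cost_volume_cells(width, height, min_disp, max_disp, ratio):
    """Number of (pixel, disparity) pairs to evaluate after downsampling
    the images and the disparity range by an integer ratio."""
    w, h = width // ratio, height // ratio
    levels = (max_disp - min_disp) // ratio + 1
    return w * h * levels


full = cost_volume_cells(1080, 1920, -89, 68, ratio=1)     # 327,628,800
quarter = cost_volume_cells(1080, 1920, -89, 68, ratio=4)  # 5,184,000
reduction = full / quarter                                 # 63.2
```

A reduction of about 60x in raw work is in the right ballpark for going from minutes to seconds, although actual run time also depends on the window radius and other details.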

Monday, June 20, 2016

Multi View Stereo - Vose tombstone


Set of 8 images extracted from a video taken with an iPhone 4s. Dimensions are 1080x1920 pixels.

We are gonna use Multi View Stereo 10 (MVS10) to reconstruct the dense 3D scene (Multi View Stereo). Before that, we ran Structure from Motion 10 (SfM10) to get the camera poses and feature matches between the views (Structure from Motion).

Input to MVS10 ("mvs10_input.txt"):

duh.nvm = name of nvm file (generated by SfM10)
100 = minimum number of matches (camera pair selection)
0.5 = minimum average separation angle (camera pair selection)
32 = window radius
0.9 = alpha
20.0 = truncation cost (color)
10.0 = truncation cost (gradient)
4 = epsilon
0 = disparity tolerance
4 = downsampling factor
1 = sampling step
0.5 = minimum separation angle (removal of low-confidence 3d points)
3 = minimum number of image points (removal of low-confidence 3d points)
2.0 = maximum reprojection error (removal of low-confidence 3d points)
1 = radius (animated gif files)


Animated gif of 3d dense reconstruction.

Friday, June 17, 2016

Using Edge Preserving Smoothing 9 (EPS9) to smooth out depth maps

In this post, I would like to show what happens when Edge Preserving Smoothing 9 (EPS9) is applied to a depth/disparity map in the hopes of improving its quality.

I used these parameters in EPS9 for all depth maps:
radius = 9
sigma space = 9.0
sigma color = 25.5

The depth maps were obtained with Depth Map Automatic Generator 2 (DMAG2). See "Depth Map Automatic Generator 2 (DMAG2) vs Depth Map Automatic Generator 5 (DMAG5)" for more information.

Depth map 1:


Left image after epipolar rectification.


Left depth map produced by DMAG2.


Left depth map smoothed by EPS9.

Depth map 2:


Left image after epipolar rectification.


Left depth map produced by DMAG2.


Left depth map smoothed by EPS9.

Depth map 3:


Left image after epipolar rectification.


Left depth map produced by DMAG2.


Left depth map smoothed by EPS9.

Conclusion:

Looks like EPS9 does a great job at removing tiny outliers in depth maps. Because it is edge preserving, it also maintains object boundaries.
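To illustrate how a smoother can kill small outliers and still respect object boundaries, here is a 1D sketch of a joint bilateral filter: the depth values are averaged, but the weights come from pixel distance (sigma space) and from color similarity in the reference image (sigma color). This assumes that EPS9's radius, sigma space and sigma color play the same role as in a bilateral-type filter; EPS9's actual algorithm may well be different, and the function name and toy data are made up.

```python
import math


def joint_bilateral_1d(depth, guide, radius, sigma_space, sigma_color):
    """Smooth depth with weights driven by pixel distance and by
    similarity in the guide (reference image) intensities."""
    out = []
    n = len(depth)
    for i in range(n):
        num = den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            w_space = math.exp(-((i - j) ** 2) / (2.0 * sigma_space ** 2))
            w_color = math.exp(-((guide[i] - guide[j]) ** 2)
                               / (2.0 * sigma_color ** 2))
            num += w_space * w_color * depth[j]
            den += w_space * w_color
        out.append(num / den)
    return out


# Two objects: a dark one (intensity 10) and a bright one (intensity 200).
guide = [10] * 6 + [200] * 6
# Depth is 50 on the first object (with one outlier of 90) and 150 on the second.
depth = [50, 50, 90, 50, 50, 50] + [150] * 6

smoothed = joint_bilateral_1d(depth, guide, radius=2,
                              sigma_space=2.0, sigma_color=25.5)
# The outlier at index 2 is pulled from 90 down to about 60, while the
# depths on either side of the object boundary (indices 5 and 6) stay
# at 50 and 150.
```

Because the weights across the dark/bright boundary are essentially zero, depths never get averaged across objects, which is what keeps the edges crisp.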

Thursday, June 16, 2016

Depth Map Automatic Generator 2 (DMAG2) vs Depth Map Automatic Generator 5 (DMAG5)

In this post, I would like to compare the quality of depth maps generated by Depth Map Automatic Generator 2 (DMAG2) (the contender) and Depth Map Automatic Generator 5 (DMAG5) (the champion) for three stereo pairs taken by a Fuji W3.

For DMAG2, I used these parameters for all stereo pairs:

radius = 32
alpha = 0.9
truncation value (color) = 30
truncation value (gradient) = 10
gamma_p = 32.0
gamma_c = 12.0
disparity tolerance = 0
downsampling ratio = 2

Because DMAG2 is slow, you can't use the images as is (downsampling ratio = 1); you have to downsample them a bit (downsampling ratio > 1). To give you an idea, for these 1200x703 images, it takes under 5 minutes to get the depth map when the downsampling ratio is set to 2 but it takes more than one hour to get the depth map when the downsampling ratio is set to 1 (no downsampling).

For DMAG5, I used these parameters for all stereo pairs:

window radius = 32
alpha = 0.9
truncation value (color) = 30.0
truncation value (gradient) = 10.0
epsilon = 4
disparity tolerance = 0
window radius (occlusion smoothing) = 9
sigma space (occlusion smoothing) = 9.0
sigma color (occlusion smoothing) = 25.5
downsampling ratio = 1

Stereo pair 1 (min disparity = -26 and max disparity = 19):


Left image after epipolar rectification.


Right image after epipolar rectification.


Left depth map produced by DMAG5.


Left depth map produced by DMAG2.

Stereo pair 2 (min disparity = -62 and max disparity = 13):


Left image after epipolar rectification.


Right image after epipolar rectification.


Left depth map produced by DMAG5.


Left depth map produced by DMAG2.

Stereo pair 3 (min disparity = -18 and max disparity = 31):


Left image after epipolar rectification.


Right image after epipolar rectification.


Left depth map produced by DMAG5.


Left depth map produced by DMAG2.

Conclusion:

Looks like DMAG2 can be about as good as DMAG5 (better in some cases and worse in others). The only parameter that really matters is the window radius, so it should be chosen carefully. Note that gamma_p should not be smaller than the window radius. So, if you increase the window radius, gamma_p should follow and be set equal to the radius (or larger).
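One way to see why gamma_p should keep up with the window radius: in adaptive support-weight methods, the contribution of a pixel in the window typically falls off with its distance to the window center as exp(-distance / gamma_p). Assuming DMAG2's gamma_p works along those lines (an assumption, as is the helper name below), a gamma_p much smaller than the radius means the outer part of the window barely counts, so the large window is mostly wasted.

```python
import math


def proximity_weight(distance, gamma_p):
    """Spatial falloff of a pixel's support weight with its distance
    to the window center (assumed exponential form)."""
    return math.exp(-distance / gamma_p)


radius = 32

# gamma_p equal to the radius: pixels at the window edge still count.
w_matched = proximity_weight(radius, gamma_p=32.0)   # about 0.37

# gamma_p much smaller than the radius: the window edge is almost ignored.
w_small = proximity_weight(radius, gamma_p=8.0)      # about 0.018
```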