Wednesday, November 30, 2016

3D Photos - Pumpkins

In this post, I want to show that you can use my line of software to generate tweeners for lenticular imaging even if the input stereo pair is large (3591x2432 pixels). We're gonna create left and right (full-size) depth maps with dmag5, improve their quality with dmag9b, and then generate the intermediate frames (the tweeners) with fsg6 using the full-size images. Here, I am using dmag5 as the automatic depth map generator, but I could have used dmag2 or dmag6 without much difference.


Left image of a stereo pair taken with a Fuji W1 by Mike.


Right image of a stereo pair (also taken with a Fuji W1 by Mike).

This original stereo pair was aligned with SPM (StereoPhotoMaker), but if you load the 2 images in Gimp as layers and switch between the two, you can see that corresponding features do not move purely horizontally. This is not good for automatic depth map generation, which relies on matches lying along horizontal lines (scanlines). As a matter of principle, I always rectify my stereo pairs prior to generating a depth map. Here, I am using a modified version of er9b which is not as aggressive (this version will be released soon, hopefully).
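If you'd rather have a numerical check than blink layers in Gimp, here is a minimal OpenCV sketch (file names are assumptions) that matches features between the two views and reports their vertical offsets; on a properly rectified pair, the median vertical offset should be close to zero.

import cv2
import numpy as np

# Measure vertical misalignment between the two views of a stereo pair.
left = cv2.imread("image_l.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("image_r.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(2000)
kp_l, des_l = orb.detectAndCompute(left, None)
kp_r, des_r = orb.detectAndCompute(right, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_l, des_r), key=lambda m: m.distance)[:300]

# Vertical offset of each match; near zero everywhere means the pair is rectified.
dy = np.array([kp_r[m.trainIdx].pt[1] - kp_l[m.queryIdx].pt[1] for m in matches])
print("median |dy| =", np.median(np.abs(dy)), "pixels")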


Rectified left image.


Rectified right image.



Left and right depth maps generated by dmag5 using the following parameters:

image 1 = ../image_l.png
image 2 = ../image_r.png
min disparity for image 1 = -70
max disparity for image 1 = 6
disparity map for image 1 = depthmap_l.jpg
disparity map for image 2 = depthmap_r.jpg
occluded pixel map for image 1 = occmap_l.jpg
occluded pixel map for image 2 = occmap_r.jpg
radius = 32
alpha = 0.9
truncation (color) = 30
truncation (gradient) = 10
epsilon = 255^2*10^-4
disparity tolerance = 0
radius to smooth occlusions = 9
sigma_space = 9
sigma_color = 25.5
downsampling factor = 2

Notice the downsampling factor of 2 (it gives a large CPU-time gain without much loss in accuracy) and the relatively large radius (the larger the images, the larger the radius should be).
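To make the reasoning behind the downsampling factor concrete, here is a rough sketch (assumed file names; this is the general idea, not dmag5's actual code): match at half resolution with halved disparity bounds, then bring the disparity map back to full size and double its values. That cuts the matching work by roughly a factor of 8 (a quarter of the pixels, half the disparity range).

import cv2

left = cv2.imread("image_l.png")
right = cv2.imread("image_r.png")

factor = 2
small_l = cv2.resize(left, None, fx=1.0 / factor, fy=1.0 / factor, interpolation=cv2.INTER_AREA)
small_r = cv2.resize(right, None, fx=1.0 / factor, fy=1.0 / factor, interpolation=cv2.INTER_AREA)

# Disparity bounds shrink with the images: -70..6 becomes roughly -35..3 at half size.
min_disp, max_disp = -70, 6
small_min, small_max = min_disp // factor, max_disp // factor

# ... run the stereo matcher on (small_l, small_r) with bounds (small_min, small_max),
# then resize the resulting disparity map to full size and multiply its values by factor:
# full_disp = factor * cv2.resize(small_disp, (left.shape[1], left.shape[0]),
#                                 interpolation=cv2.INTER_NEAREST)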

I think we can improve on the depth maps, especially as far as the object boundaries are concerned. We are gonna use dmag9b to improve the depth maps. Note that we could also have used eps2 (or the other edge preserving smoothers) but I like the "destructive" nature of dmag9b.



Left and right depth maps generated by dmag9b using the following parameters (for the right image):

reference image = ../../image_r.png
input disparity map = ../depthmap_r.jpg
sample_rate_spatial = 32
sample_rate_range = 8
lambda = 0.25
hash_table_size = 100000
nbr of iterations (linear solver) = 25
sigma_gm = 1.4142
nbr of iterations (irls) = 32
radius (confidence map) = 12
gamma proximity (confidence map) = 12
gamma color similarity (confidence map) = 12
sigma (confidence map) = 2
output depth map image = depthmap_r_dmag9b.jpg

Now that we are somewhat satisfied with the depth maps, it's time to generate the in-between frames for lenticular creation. We're gonna use fsg6 which uses both left and right depth maps. Be careful though as the right depth map must be inverted.


Inverted right depth map (needed by fsg6).
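Inverting the depth map just flips its gray levels (white becomes black and vice versa); here is a minimal Pillow/NumPy sketch, assuming an 8-bit grayscale depth map and these file names. Gimp's Colors -> Invert does the same thing.

import numpy as np
from PIL import Image

# Invert an 8-bit grayscale depth map for fsg6 (near becomes far and vice versa).
depth = np.array(Image.open("depthmap_r_dmag9b.jpg").convert("L"))
Image.fromarray(255 - depth).save("depthmap_r_dmag9b_inv.tiff")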


This is an animated gif showing 12 frames in between (and including) the left and right images generated by fsg6 using the following parameters:

Left image = ../../../image_l.tiff
Right image = ../../../image_r.tiff
Left disparity map = ../depthmap_l_dmag9b.tiff
Right disparity map = ../depthmap_r_dmag9b_inv.tiff
Minimum disparity = -70
Maximum disparity = 6
Number of frames = 12

The frames were reduced to a more manageable size of 600x406 pixels to produce the animated gif.
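If you want to do the downsizing and the gif assembly in one shot, here is a small Pillow sketch (the frame file names and timing are assumptions):

from PIL import Image

# Resize the fsg6 frames and assemble them into a back-and-forth animated gif.
frames = [Image.open("frame_%02d.png" % i) for i in range(12)]
small = [f.resize((600, 406), Image.LANCZOS) for f in frames]

# Ping-pong the sequence so the animation swings back and forth.
sequence = small + small[-2:0:-1]
sequence[0].save("pumpkins_wiggle.gif", save_all=True,
                 append_images=sequence[1:], duration=100, loop=0)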

It should be noted that one could have used a frame sequence generator like fsg4 to produce the frames (on the left and right side of the left image, using only the left depth map). The problem with this type of frame sequence generator, which makes use of a single depth map, is that the inpainting in the generated frames looks heavily blurred. The only advantage is that the parallax (3d effect) can be artificially increased as much as desired.
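To see where that blur comes from, here is a bare-bones sketch of forward-warping the left image by a fraction of the disparity (assumptions: 8-bit depth map, gray levels mapped linearly onto the -70..6 disparity range used above, no occlusion ordering). The pixels that end up unfilled are exactly the holes that a single-depth-map frame generator has to inpaint.

import numpy as np
from PIL import Image

# Naive forward warp of the left view using only the left depth map (a sketch,
# not what fsg4 or fsg6 actually does). Unfilled pixels are the holes to inpaint.
left = np.array(Image.open("image_l.png").convert("RGB"))
depth = np.array(Image.open("depthmap_l_dmag9b.jpg").convert("L")).astype(np.float32)

# Assumed mapping: gray 0..255 -> disparity -70..6 (the bounds used in the dmag5 run).
min_disp, max_disp = -70.0, 6.0
disp = min_disp + (max_disp - min_disp) * depth / 255.0

t = 0.5  # position of the in-between frame (0 = left view, 1 = right view)
h, w, _ = left.shape
frame = np.zeros_like(left)
filled = np.zeros((h, w), dtype=bool)

ys, xs = np.mgrid[0:h, 0:w]
new_x = np.clip(np.round(xs + t * disp).astype(int), 0, w - 1)
frame[ys, new_x] = left[ys, xs]
filled[ys, new_x] = True

print("holes to inpaint:", (~filled).sum(), "pixels")
Image.fromarray(frame).save("tween_050.png")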

Tuesday, November 29, 2016

3D Photos - Reindeer


This is a stereo pair taken with my HTC EVO 3D cell phone. The image size is 1920x1080 pixels.


This is the stereo pair as rectified/aligned by er9b.


This is the depth map obtained by dmag5.


This is the depth map obtained by dmag5 shown here on top of the left image using 70% transparency.
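For reference, these overlays are just an alpha blend of the depth map over the left image; here is a quick Pillow sketch (file names assumed):

from PIL import Image

# Blend the depth map over the left image (the 0.7 weight is my approximation
# of the "70% transparency" used for the overlays in this post).
left = Image.open("image_l.png").convert("RGB")
depth = Image.open("depthmap_l.jpg").convert("RGB").resize(left.size)
Image.blend(left, depth, alpha=0.7).save("depthmap_l_overlay.png")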

Parameters used for dmag5:

image 1 = ../image_l.png
image 2 = ../image_r.png
min disparity for image 1 = -22
max disparity for image 1 = 29
disparity map for image 1 = depthmap_l.jpg
disparity map for image 2 = depthmap_r.jpg
occluded pixel map for image 1 = occmap_l.jpg
occluded pixel map for image 2 = occmap_r.jpg
radius = 32
alpha = 0.9
truncation (color) = 30
truncation (gradient) = 10
epsilon = 255^2*10^-4
disparity tolerance = 0
radius to smooth occlusions = 9
sigma_space = 9
sigma_color = 25.5
downsampling factor = 2

We are gonna try to see if we can improve the depth map quality using eps2, eps5, eps7, eps9, and dmag9b. Eps2 is a bilateral filter. Eps5 and eps7 are approximations of the bilateral filter. Eps9 is an edge preserving median filter. Dmag9b is powered by the Fast Bilateral Solver of Barron et al.
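To give an idea of what these smoothers do, here is a slow but straightforward NumPy sketch of a joint (cross) bilateral filter in the spirit of eps2 (not its actual implementation): each depth value is replaced by a weighted average of its neighbors, where the weights combine spatial proximity and color similarity in the reference image.

import numpy as np
from PIL import Image

# Minimal joint bilateral filter on a depth map, guided by the reference image.
# Illustrative only (and slow at full resolution) -- not the eps2 code.
ref = np.array(Image.open("image_l.png").convert("RGB")).astype(np.float32)
depth = np.array(Image.open("depthmap_l.jpg").convert("L")).astype(np.float32)

radius, gamma_prox, gamma_col = 4, 4.0, 8.0
h, w = depth.shape
out = np.zeros_like(depth)

for y in range(h):
    y0, y1 = max(0, y - radius), min(h, y + radius + 1)
    for x in range(w):
        x0, x1 = max(0, x - radius), min(w, x + radius + 1)
        yy, xx = np.mgrid[y0:y1, x0:x1]
        # Spatial weight: closer pixels count more.
        w_s = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * gamma_prox ** 2))
        # Range weight: pixels with a similar color in the reference image count more.
        dc = ref[y0:y1, x0:x1] - ref[y, x]
        w_c = np.exp(-np.sum(dc ** 2, axis=2) / (2 * gamma_col ** 2))
        weights = w_s * w_c
        out[y, x] = np.sum(weights * depth[y0:y1, x0:x1]) / np.sum(weights)

Image.fromarray(np.clip(out, 0, 255).astype(np.uint8)).save("depthmap_l_jbf.png")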


This is the depth map obtained by eps2.


This is the depth map obtained by eps2 shown here on top of the left image using 70% transparency.

Parameters used for eps2:

reference image = ../../image_l.tiff
disparity map = ../depthmap_l.tiff
radius = 32
gamma proximity = 32
gamma color similarity = 8
smoothed disparities = depthmap_l_eps2.tiff


This is the depth map obtained by eps5.


This is the depth map obtained by eps5 shown here on top of the left image using 70% transparency.

Parameters used for eps5:

reference image = ../../image_l.tiff
disparity map = ../depthmap_l.tiff
smoothed disparity map = depthmap_l_eps5.tiff
radius = 16
epsilon = 255^2*10^-4


This is the depth map obtained by eps7.


This is the depth map obtained by eps7 shown here on top of the left image using 70% transparency.

Parameters used for eps7:

image to filter = ../depthmap_l.jpg
joint image = ../../image_l.png
sigma_s = 1000
sigma_r = 100
num_iterations = 3
filtered image = depthmap_l_eps7.jpg


This is the depth map obtained by eps9.


This is the depth map obtained by eps9 shown here on top of the left image using 70% transparency.

Parameters used for eps9:

reference image = ../../image_l.png
disparity map = ../depthmap_l.jpg
radius = 16
sigma_space = 16
sigma_color = 25.5
smoothed disparity map = depthmap_l_eps9.jpg


This is the depth map obtained by dmag9b.


This is the depth map obtained by dmag9b shown here on top of the left image using 70% transparency.

Parameters used for dmag9b:

reference image = ../../image_l.png
input disparity map = ../depthmap_l.jpg
sample_rate_spatial = 32
sample_rate_range = 8
lambda = 0.25
hash_table_size = 100000
nbr of iterations (linear solver) = 25
sigma_gm = 1.4142
nbr of iterations (irls) = 32
radius (confidence map) = 12
gamma proximity (confidence map) = 12
gamma color similarity (confidence map) = 12
sigma (confidence map) = 2
output depth map image = depthmap_l_dmag9b.jpg

Conclusion: the best tools to improve this particular depth map seem to be eps2 and dmag9b. Interesting!

Now that we have a depth map we are satisfied with, it's time to generate a 3d wobble for everybody's entertainment ...


This is the 3d wobble created by wigglemaker.


Special bonus: This is the animated gif created by depthmapviewer.

Monday, November 28, 2016

3D Photos - Fire Hydrant Wheel


This is the initial stereo pair as taken with my HTC EVO 3D Android phone. The image size is 1920x1080 pixels.


This is the stereo pair after epipolar rectification (by er9b).


This is the depth map produced by dmag5 using the following parameters:

image 1 = ../image_l.png
image 2 = ../image_r.png
min disparity for image 1 = -27
max disparity for image 1 = 14
disparity map for image 1 = depthmap_l.jpg
disparity map for image 2 = depthmap_r.jpg
occluded pixel map for image 1 = occmap_l.jpg
occluded pixel map for image 2 = occmap_r.jpg
radius = 16
alpha = 0.9
truncation (color) = 30
truncation (gradient) = 10
epsilon = 255^2*10^-4
disparity tolerance = 0
radius to smooth occlusions = 9
sigma_space = 9
sigma_color = 25.5
downsampling factor = 1

That's a pretty decent depth map we've got here. I think we can improve it a tiny bit by using an edge preserving smoother like eps7.


This is the depth map produced by eps7 using the following parameters:

image to filter = ../depthmap_l.jpg
joint image = ../../image_l.png
sigma_s = 1000
sigma_r = 100
num_iterations = 3
filtered image = depthmap_l_eps7.jpg


This is the 3d wobble gif produced by wigglemaker.

Wednesday, November 23, 2016

3D Photos - The Bronze General


This is a stereo pair taken with a Fuji W3 in the vertical position (by mediavr). Our goal is to create a decent depth map for it and then a 3d wigglegram.


This is the rectified stereo pair (rectification by er9b).


This is the depth map obtained by dmag5.

Input to dmag5:

image 1 = ../image_l.png
image 2 = ../image_r.png
min disparity for image 1 = -82
max disparity for image 1 = 72
disparity map for image 1 = depthmap_l.jpg
disparity map for image 2 = depthmap_r.jpg
occluded pixel map for image 1 = occmap_l.jpg
occluded pixel map for image 2 = occmap_r.jpg
radius = 32
alpha = 0.9
truncation (color) = 30
truncation (gradient) = 10
epsilon = 255^2*10^-4
disparity tolerance = 0
radius to smooth occlusions = 9
sigma_space = 9
sigma_color = 25.5
downsampling factor = 2

We are gonna use dmag11 to (semi-manually) improve the quality of the depth map. Note that we could also have used dmag4 or dmag9 instead of dmag11. The input to dmag11, besides the reference image (the left view of the stereo pair), is a "sparse" depth map. In our case, the sparse depth map is gonna be the depth map produced by dmag5 minus the areas where the depth is either wrong or a bit approximate (along the boundary of our general). Note that dmag4, dmag9, and dmag11 are usually used for 2d to 3d image conversion, but they can certainly be used to improve depth map quality in a semi-automatic way.


This is the input to dmag11. The white areas are actually transparent pixels, but they show up as white here. All I did was use the eraser tool along the general's boundary and over the base of the bust.


Here's another view of the input to dmag11, the famous "sparse" depth map. The transparent pixels are the result of the application of the eraser tool. Of course, it's not technically sparse but that's the designation used in dmag4, dmag9, and dmag11.
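If you want to check which pixels are actually "unknown" after using the eraser, the transparency lives in the alpha channel; a quick sketch, assuming the sparse map was exported as an RGBA grayscale PNG with this file name:

import numpy as np
from PIL import Image

# Inspect the "sparse" depth map: erased pixels have alpha = 0 in the RGBA PNG.
sparse = np.array(Image.open("depthmap_l.png").convert("RGBA"))
known = sparse[:, :, 3] > 0        # alpha > 0 -> the depth value was kept
values = sparse[:, :, 0]           # gray level stored in the red channel

print("known pixels:", int(known.sum()), "out of", known.size)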


This is the output of dmag11.

Parameters used for dmag11:

reference image = ../../image_l.png
input depth map = depthmap_l.png
output depth map = depthmap_l_dmag11.jpg
radius = 12
gamma proximity = 10000
gamma color similarity = 1
maximum number iterations = 1000
scale number = 1


This is the 3d wiggle produced by wigglemaker.

Tuesday, November 22, 2016

3D Photos - At the station (2)


This is the original stereo pair taken by mediavr with a W3. Image size is 1824x1368 pixels. Generating a depth map is going to be difficult because of the low texture in certain parts of the image and the lack of clear separation (color-wise) between foreground and background in some places.


This is the rectified stereo pair (by er9b). The advantage of using er9b, besides rectifying, is that it gives you the min and max disparities as a bonus.


This is the depth map obtained by dmag5.

Parameters used for dmag5:

image 1 = ../image_l.png
image 2 = ../image_r.png
min disparity for image 1 = -82
max disparity for image 1 = -30
disparity map for image 1 = depthmap_l.jpg
disparity map for image 2 = depthmap_r.jpg
occluded pixel map for image 1 = occmap_l.jpg
occluded pixel map for image 2 = occmap_r.jpg
radius = 32
alpha = 0.9
truncation (color) = 30
truncation (gradient) = 10
epsilon = 255^2*10^-4
disparity tolerance = 0
radius to smooth occlusions = 9
sigma_space = 9
sigma_color = 25.5
downsampling factor = 2

It's fair to say that it is not the greatest depth map, and it would be nice to improve upon it with some post-processing (it has to be automatic though). We are gonna use dmag9b to improve the quality of the depth map auto-magically.


This is the depth map obtained by dmag9b.

Input to dmag9b:

reference image = ../../image_l.png
input disparity map = ../depthmap_l.jpg
sample_rate_spatial = 32
sample_rate_range = 8
lambda = 0.25
hash_table_size = 100000
nbr of iterations (linear solver) = 25
sigma_gm = 1.4142
nbr of iterations (irls) = 32
radius (confidence map) = 12
gamma proximity (confidence map) = 12
gamma color similarity (confidence map) = 12
sigma (confidence map) = 0.25
output depth map image = depthmap_l_dmag9b.jpg

Let's change the "sample rate range" to 4 leaving everything else the same ...


This is the depth map obtained by dmag9b.

Input to dmag9b:

reference image = ../../image_l.png
input disparity map = ../depthmap_l.jpg
sample_rate_spatial = 32
sample_rate_range = 4
lambda = 0.25
hash_table_size = 100000
nbr of iterations (linear solver) = 25
sigma_gm = 1.4142
nbr of iterations (irls) = 32
radius (confidence map) = 12
gamma proximity (confidence map) = 12
gamma color similarity (confidence map) = 12
sigma (confidence map) = 0.25
output depth map image = depthmap_l_dmag9b.jpg

In my opinion, a better depth map than the previous one. Let's change the "sample rate spatial" to 4 leaving everything else the same ...


This is the depth map obtained by dmag9b.

Input to dmag9b:

reference image = ../../image_l.png
input disparity map = ../depthmap_l.jpg
sample_rate_spatial = 4
sample_rate_range = 4
lambda = 0.25
hash_table_size = 100000
nbr of iterations (linear solver) = 25
sigma_gm = 1.4142
nbr of iterations (irls) = 32
radius (confidence map) = 12
gamma proximity (confidence map) = 12
gamma color similarity (confidence map) = 12
sigma (confidence map) = 0.25
output depth map image = depthmap_l_dmag9b.jpg

In my opinion, a worse depth map than the previous two because it is much too similar to the depth map produced by dmag5.


This is the confidence map that was used by dmag9b. Of course, it is generated automatically by dmag9b. The black areas correspond to areas of low confidence in the depth values. They basically correspond to abrupt changes in depth values.
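As a rough intuition (this is not the confidence measure dmag9b actually computes), you can get a similar-looking map by flagging pixels where the depth varies a lot within a small neighborhood:

import numpy as np
from PIL import Image

# Crude confidence proxy: small local depth variation = high confidence.
# A sketch only -- not dmag9b's confidence map.
depth = np.array(Image.open("depthmap_l.jpg").convert("L")).astype(np.float32)

r = 6
pad = np.pad(depth, r, mode="edge")
local_range = np.zeros_like(depth)
for dy in range(-r, r + 1):
    for dx in range(-r, r + 1):
        shifted = pad[r + dy:r + dy + depth.shape[0], r + dx:r + dx + depth.shape[1]]
        local_range = np.maximum(local_range, np.abs(shifted - depth))

confidence = 255.0 * np.exp(-local_range / 32.0)   # dark = abrupt change in depth
Image.fromarray(confidence.astype(np.uint8)).save("confidence_proxy.png")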

3D Photos - At the station


This is the original pair (provided by mediavr) after a little bit of downsampling. Image size is 1824x1368 pixels.


This is the stereo pair after rectification by er9b. The white borders will be removed later. They have no effect on depth map generation.

Let's get a depth map first using dmag5. Other automatic depth map creators could have been used, like dmag2, dmag6, or dmag7, but I kinda like using dmag5. Note that this kind of scene is the worst nightmare of any automatic depth map creator because of the presence of areas with low texture (walls and floors are the worst offenders).
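A quick way to visualize the trouble spots before even running the matcher is to look at local contrast; the flat walls and floors show up immediately. A minimal OpenCV sketch (the window size and threshold are arbitrary choices of mine):

import cv2
import numpy as np

# Flag low-texture regions via the local standard deviation of the grayscale image.
# These are the areas where stereo matching is most likely to struggle.
gray = cv2.imread("image_l.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

ksize = (15, 15)
mean = cv2.blur(gray, ksize)
mean_sq = cv2.blur(gray * gray, ksize)
std = np.sqrt(np.maximum(mean_sq - mean * mean, 0.0))

low_texture = (std < 4.0).astype(np.uint8) * 255   # white = low texture (arbitrary threshold)
cv2.imwrite("low_texture_mask.png", low_texture)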


This is the (left) depth map generated by dmag5.

Input parameters for dmag5:

image 1 = ../image_l.png
image 2 = ../image_r.png
min disparity for image 1 = -85
max disparity for image 1 = -43
disparity map for image 1 = depthmap_l.jpg
disparity map for image 2 = depthmap_r.jpg
occluded pixel map for image 1 = occmap_l.jpg
occluded pixel map for image 2 = occmap_r.jpg
radius = 32
alpha = 0.9
truncation (color) = 30
truncation (gradient) = 10
epsilon = 255^2*10^-4
disparity tolerance = 0
radius to smooth occlusions = 9
sigma_space = 9
sigma_color = 25.5
downsampling factor = 2

Worth noting are the radius equal to 32 (which is quite large) and the downsampling factor equal to 2 (I used 2 instead of 1 mostly for convenience since it is much faster). I guess I could have recreated the depth map using a downsampling factor of 1, but I did not.

At this point, we have a not so great depth map. What to do to improve things automatically? First option is to use an edge preserving smoother like eps7, eps9, eps2, or eps5.


This is the depth map obtained with eps7. I think we are gonna need something a little bit more drastic. Note that it would have been a good idea to use another edge preserving smoother like eps9, eps2, or eps5 to see if it made a difference. Also, the parameters could have been played with a little.

Input parameters for eps7:

image to filter = ../depthmap_l.jpg
joint image = ../../image_l.png
sigma_s = 1000
sigma_r = 100
num_iterations = 3
filtered image = depthmap_l_eps7.jpg

The other option is to use the more aggressive dmag9b or dmag11b.


This is the depth map generated by dmag9b. Dmag9b is based upon the Fast Bilateral Solver by Jonathan Barron et al.

Input for dmag9b:

reference image = ../../image_l.png
input disparity map = ../depthmap_l.jpg
sample_rate_spatial = 32
sample_rate_range = 8
lambda = 0.25
hash_table_size = 100000
nbr of iterations (linear solver) = 25
sigma_gm = 1.4142
nbr of iterations (irls) = 32
radius (confidence map) = 12
gamma proximity (confidence map) = 12
gamma color similarity (confidence map) = 12
sigma (confidence map) = 0.25
output depth map image = depthmap_l_dmag9b.jpg

Let's see what dmag11b produces to put things in perspective ...


This is the depth map generated by dmag11b. Dmag11b removes depths with low confidence and inpaints the "holes" using a bilateral filter. It is very similar to what dmag4 and dmag11 do when densifying a sparse depth map.
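The overall recipe is easy to sketch (a toy version under simplifying assumptions, not the dmag11b code): threshold the confidence map, throw away the unreliable depths, and grow the remaining values back into the holes. The confidence map file name below is hypothetical.

import numpy as np
from PIL import Image

# Toy version of the "remove low-confidence depths, then inpaint" idea.
# Not the actual dmag11b inpainting (which is based on a bilateral filter).
depth = np.array(Image.open("depthmap_l.jpg").convert("L")).astype(np.float32)
conf = np.array(Image.open("confidence_map.png").convert("L")).astype(np.float32)

mask = (conf > 64).astype(np.float32)      # arbitrary confidence threshold
filled = depth * mask                      # unreliable depths are zeroed out

for _ in range(500):                       # diffuse known values into the holes
    num = (np.roll(filled, 1, 0) + np.roll(filled, -1, 0) +
           np.roll(filled, 1, 1) + np.roll(filled, -1, 1))
    den = (np.roll(mask, 1, 0) + np.roll(mask, -1, 0) +
           np.roll(mask, 1, 1) + np.roll(mask, -1, 1))
    fillable = (mask == 0) & (den > 0)
    if not fillable.any():
        break
    filled[fillable] = num[fillable] / den[fillable]
    mask[fillable] = 1.0

Image.fromarray(np.clip(filled, 0, 255).astype(np.uint8)).save("depthmap_l_filled.png")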

Input for dmag11b:

reference image = ../../image_l.png
input depth map = ../depthmap_l.jpg
output depth map = depthmap_l_dmag11b.jpg
radius (confidence map) = 12
gamma proximity (confidence map) = 12
gamma color similarity (confidence map) = 12
sigma (confidence map) = 0.25
radius (inpainting) = 1
gamma proximity (inpainting) = 10000
gamma color similarity (inpainting) = 1
maximum number iterations (inpainting) = 1000
scale number (inpainting) = 1

Both dmag9b and dmag11b use a depth confidence map to determine whether recorded depths are reliable or not.


This is the depth confidence map used by both dmag9b and dmag11b. The darker the pixel, the less reliable the depth at that pixel is.


This is the 3d animated wiggle gif obtained with wigglemaker using the depth map generated by dmag9b.


This is the 3d animated wiggle gif obtained with wigglemaker using the depth map generated by dmag11b.