Sunday, December 20, 2015

3D Photos - Bas-relief of a guy with an axe

The stereo pair came from Photosculpt, a company that makes software which takes a front image and a right image (offset by twenty degrees) and produces a 3d texture that can be used for 3d modeling in programs like Blender.


Left image. They (Photosculpt) call it the front image.


Right image.

First, the two images need to be properly rectified. I am gonna use Epipolar Rectification 9 (ER9) to rectify the images.


Rectified left image.


Rectified right image.

I am gonna use Depth Map Automatic Generator 5b (DMAG5b) because there's not a lot of depth in the scene, and therefore not a whole lot of occlusions. DMAG5b needs a min and max disparity which you can get by carefully reading the output of the epipolar rectifier ER9. The output of ER9 gives min disparity = -12 and max disparity = 102. Those need to be multiplied by -1, which also swaps which one is the min and which one is the max. So, for DMAG5b, min disparity = -102 and max disparity = 12.
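The sign flip is easy to get backwards, so here is a tiny sketch of the conversion. The function name is made up for illustration; it is not part of ER9 or DMAG5b, it just does the arithmetic described above.

```python
def er9_to_dmag_disparities(er9_min, er9_max):
    """Negate the ER9 disparity bounds and swap them.

    Multiplying by -1 turns the old max into the new min and vice versa.
    (Hypothetical helper, not an ER9 or DMAG5b function.)
    """
    return -er9_max, -er9_min


# Values reported by ER9 for this stereo pair.
dmag_min, dmag_max = er9_to_dmag_disparities(-12, 102)
print(dmag_min, dmag_max)  # -102 12
```

The same rule applies whatever the signs of the ER9 bounds are: negate both, then the smaller of the two is the min disparity.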


Depth map generated by DMAG5b using alpha = 0.9, truncation (color) value = 7.0, truncation (gradient) value = 2.0.
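To give a feel for what alpha and the two truncation values do, here is a minimal sketch of a truncated color-plus-gradient matching cost. This is an assumption on my part: the post does not spell out DMAG5b's cost function, and the formula below is just the common form where alpha weights the gradient term and each difference is capped at its truncation value. The function and argument names are hypothetical.

```python
def matching_cost(color_diff, grad_diff,
                  alpha=0.9, trunc_color=7.0, trunc_grad=2.0):
    """Blend a truncated color difference with a truncated gradient difference.

    Assumed form (not confirmed to be DMAG5b's):
        (1 - alpha) * min(color_diff, trunc_color)
            + alpha * min(grad_diff, trunc_grad)
    Truncation keeps a single badly mismatched pixel from dominating.
    """
    color_term = min(color_diff, trunc_color)
    grad_term = min(grad_diff, trunc_grad)
    return (1.0 - alpha) * color_term + alpha * grad_term


# A huge mismatch costs no more than the caps allow.
print(matching_cost(100.0, 100.0))
# A perfect match costs nothing.
print(matching_cost(0.0, 0.0))
```

Under this assumed form, alpha = 0.9 means the gradient term carries most of the weight, which helps when the two images differ slightly in brightness.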


Corresponding occlusion map.

As advertised, there are not a lot of occlusions, which justifies the use of DMAG5b. If the 3d scene had more depth, DMAG5b would probably not be the best choice as it would "fatten" object boundaries.


Animated gif of the corresponding 3d scene thanks to Depth Map Viewer.

3D Photos - Old stone wall

The stereo pair came from Photosculpt, a company that makes software which takes a front image and a right image (offset by twenty degrees) and produces a 3d texture that can be used for 3d modeling in programs like Blender.


Front image. For us, it's the left image.


Right image. For us, it's also the right image.

As always, before even dreaming of getting a depth map, the two images need to be rectified so that the epipolar lines are horizontal, a requirement for most automatic depth map generators. I am gonna use Epipolar Rectification 9 (ER9) to rectify the stereo pair.


Rectified left image.


Rectified right image.

This kind of stereo pair, where there's not much depth and few occlusions, is perfect for Depth Map Automatic Generator 5b (DMAG5b), a window-based depth map generator that's not edge-aware. If you look closely at the output given by the epipolar rectifier ER9, you can see that it gives a min and a max disparity, in this case, -202 and -25. Well, these min and max disparities can be used as input to DMAG5b (and all the other automatic depth map generators) but they have to be negated and swapped, in other words, the min disparity is 25 and the max disparity is 202.


Depth map generated by DMAG5b using radius = 17, alpha = 0.9, truncation value (color) = 7.0, truncation value (gradient) = 2.0.


Corresponding occlusion map.

As you can see, there are not a whole lot of occluded pixels (in black).


Depth map generated by DMAG5b using radius = 34.


Depth map generated by DMAG5b using radius = 68.

These could be smoothed out by an edge-preserving smoother like Edge Preserving Smoothing 9 (EPS9) or by a Gaussian filter.
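For the Gaussian option, here is a minimal pure-Python sketch of a separable Gaussian blur applied to a depth map stored as a list of rows. It is only meant to show the idea; the function names and the 3-sigma kernel cutoff are my own choices, and in practice you would use an image editor or an image library instead.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian weights, cut off at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_row(row, kernel):
    """Convolve one row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    n = len(row)
    out = []
    for x in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            xi = min(max(x + k - radius, 0), n - 1)
            acc += w * row[xi]
        out.append(acc)
    return out


def gaussian_blur(depth, sigma):
    """Blur horizontally, then vertically (the Gaussian is separable)."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_row(r, kernel) for r in depth]
    cols = [blur_row(list(c), kernel) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


# A single bright spike gets spread out over its neighbors.
depth = [[0.0] * 7 for _ in range(7)]
depth[3][3] = 255.0
smoothed = gaussian_blur(depth, 1.0)
print(round(smoothed[3][3], 2), round(smoothed[3][4], 2))
```

Keep in mind that a plain Gaussian smooths across object boundaries too, which is the reason one might prefer an edge-preserving smoother.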

Of course, one could have also used our favorite general purpose automatic depth map generator, Depth Map Automatic Generator 7 (DMAG7).


Depth map produced by DMAG7 using spatial sample rate = 16, color sample rate = 16, radius = 12, lambda = 0.01.


Depth map produced by DMAG7 using spatial sample rate = 16, color sample rate = 16, radius = 12, lambda = 0.1.


Depth map produced by DMAG7 using spatial sample rate = 16, color sample rate = 16, radius = 12, lambda = 1.


Depth map produced by DMAG7 using spatial sample rate = 32, color sample rate = 16, radius = 12, lambda = 0.1.


Depth map produced by DMAG7 using spatial sample rate = 16, color sample rate = 32, radius = 12, lambda = 0.1.

These depth maps (obtained by DMAG7) could probably be smoothed by Edge Preserving Smoothing 7 (EPS7), the so-called recursive domain filter.

Friday, December 18, 2015

3D Photos - Middlebury's bicycle

This is to illustrate the use of Depth Map Automatic Generator 9b (DMAG9b) (implementation of Jon Barron's Fast Bilateral Solver) to improve the quality of a given depth map (obtained by whatever means).


Left image of Bicycle2 stereo pair. This is the quarter version of the full size image.


Right image of Bicycle2 stereo pair.

Here, we are gonna use Depth Map Automatic Generator 2 (DMAG2) to get an initial depth map.


Depth map obtained by DMAG2 using radius = 5.

Note that this is a rather small radius. Now, we want DMAG9b to improve that depth map.


Depth map obtained by DMAG9b using 4 for the spatial bandwidth, 16 for the color bandwidth, and 0.5 for lambda.


Depth map obtained by DMAG9b using 8 for the spatial bandwidth, 16 for the color bandwidth, and 0.5 for lambda.


Depth map obtained by DMAG9b using 16 for the spatial bandwidth, 16 for the color bandwidth, and 0.5 for lambda.


Depth map obtained by DMAG9b using 8 for the spatial bandwidth, 16 for the color bandwidth, and 5.0 for lambda.


Depth map obtained by DMAG9b using 8 for the spatial bandwidth, 16 for the color bandwidth, and 50.0 for lambda.

Well, you get the idea. You can also play with the color bandwidth aka the range (color) sample rate. Note that the spatial bandwidth is also known as the spatial sample rate. DMAG9b can drastically improve the depth map quality, especially at object boundaries.

Of course, one could also have used Depth Map Automatic Generator 7 (DMAG7) (implementation of Jon Barron's Fast Bilateral Space Stereo) right from the start.


Depth map obtained by DMAG7 using 16 for the spatial sample rate, 16 for the color sample rate, 2 for the radius, and 0.1 for lambda.

Again, note the low radius used.


Depth map obtained by DMAG7 using 16 for the spatial sample rate, 16 for the color sample rate, 2 for the radius, and 0.01 for lambda.

You can check how this stacks up against competing automatic depth map generators at Middlebury Stereo Evaluation - Version 3.

Wednesday, December 16, 2015

2D to 3D Conversion - Nathan Fillion as Firefly's Malcolm Reynolds

This 2d to 3d image conversion highlights the use of Depth Map Automatic Generator 9 (DMAG9) to "densify" a sparse depth map while being edge-aware.


Reference image with scribbled depths on top.


Dense depth map produced by DMAG9.

In my dmag9_input.txt, I used:
Spatial sample rate = 8
Range (color) sample rate = 16
Lambda = 0.5
Hash table size = 10000
Number of PCG (Preconditioned Conjugate Gradient) iterations = 1000
Scale parameter of Geman-McClure function = 1.41
Number of IRLS (Iteratively Reweighted Least Squares) iterations = 1

In general, if DMAG9 leaves out black pixels in the depth map, you can either slightly blur the reference image (Gaussian blur is fine) or increase the spatial sample rate.


Animated 3d wiggle gif made by WiggleMaker.

2D to 3D Conversion - Harrison Ford in Blade Runner

I started with a sparse depth map scribbled in Gimp on top of the "Blade Runner" movie still:


Then, I called upon Depth Map Automatic Generator 4 (DMAG4) to generate the dense depth map:


In DMAG4, I used 5000 for the number of iterations and 1 for the number of scales. In other words, it's plain Random Walks.

Finally, I went on WiggleMaker to create a 3d wiggle animated gif:

Tuesday, December 15, 2015

2D to 3D Conversion - Cary Grant

This is to compare Depth Map Automatic Generator 4 (DMAG4) which is based upon "Image segmentation using Scale-Space Random Walks" by R. Rzeszutek et al. and Depth Map Automatic Generator 9 (DMAG9) which is based upon "The Fast Bilateral Solver" by J. Barron et al. It's in the context of 2d to 3d image conversion, in other words, edge-aware densification (propagation) of a sparse depth map.


Sparse depth map scribbled on top of Cary Grant sitting on a chair.


Dense depth map obtained by DMAG4 using max number of iterations = 1000, number of scales = 1, level of graph connection within a scale = 2, and level of graph connection across scales = 1.

Because there are some areas that are totally black when they should not be, it would be a good idea to crank up the number of iterations.
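To see why too few iterations leave black areas, here is a toy one-row sketch of random-walks-style propagation. This is not DMAG4's code; it assumes the propagation boils down to repeatedly replacing each unscribbled pixel by a weighted average of its neighbors, with weights that drop across intensity edges. The function name and the beta parameter are made up for illustration.

```python
import math


def propagate(intensity, scribbles, iterations, beta=0.01):
    """Jacobi-style relaxation on a single row of pixels.

    intensity: gray levels of the reference image row
    scribbles: {pixel index: depth} for the scribbled pixels (kept fixed)
    Unscribbled pixels start at 0 (black).
    """
    n = len(intensity)
    depth = [scribbles.get(i, 0.0) for i in range(n)]
    for _ in range(iterations):
        new = depth[:]
        for i in range(n):
            if i in scribbles:
                continue  # scribbled depths never change
            num = den = 0.0
            for j in (i - 1, i + 1):
                if 0 <= j < n:
                    # Similar intensities -> weight near 1; an edge -> near 0.
                    w = math.exp(-beta * (intensity[i] - intensity[j]) ** 2)
                    num += w * depth[j]
                    den += w
            new[i] = num / den
        depth = new
    return depth


# Dark object on the left, bright background on the right, one scribble each.
row = [50] * 10 + [250] * 10
scribbles = {0: 200.0, 19: 50.0}
few = propagate(row, scribbles, 3)
many = propagate(row, scribbles, 5000)
print(few[10])  # 0.0, the scribbles have not reached this pixel yet
print(round(many[9]), round(many[10]))
```

Each pass only moves depth information one pixel further from the scribbles, so pixels far from any scribble stay black until enough iterations have gone by. Once converged, the depths stay put on each side of the intensity edge.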


Dense depth map obtained by DMAG4 using max number of iterations = 2000, number of scales = 1, level of graph connection within a scale = 2, and level of graph connection across scales = 1.


Dense depth map obtained by DMAG4 using max number of iterations = 3000, number of scales = 1, level of graph connection within a scale = 2, and level of graph connection across scales = 1.


Dense depth map obtained by DMAG4 using max number of iterations = 4000, number of scales = 1, level of graph connection within a scale = 2, and level of graph connection across scales = 1.


Dense depth map obtained by DMAG4 using max number of iterations = 5000, number of scales = 1, level of graph connection within a scale = 2, and level of graph connection across scales = 1.


Corresponding 3d wiggle gif created with Wiggle Maker.

Clearly, things get better as the number of iterations increases. Now, let's bring the number of iterations back down and increase the number of scales.


Dense depth map obtained by DMAG4 using max number of iterations = 1000, number of scales = 2, level of graph connection within a scale = 2, and level of graph connection across scales = 1.


Dense depth map obtained by DMAG4 using max number of iterations = 1000, number of scales = 3, level of graph connection within a scale = 2, and level of graph connection across scales = 1.

Clearly, as the number of scales increases, DMAG4 is less susceptible to noise and creates smoother gradations in the presence of noise/texture (back of the chair). Unfortunately, some object boundaries become blurrier (Cary's hair bleeds into the background).

As a finale, let's use a high number of iterations and high number of scales.


Dense depth map obtained by DMAG4 using max number of iterations = 5000, number of scales = 3, level of graph connection within a scale = 2, and level of graph connection across scales = 1.


Corresponding 3d wiggle gif created with Wiggle Maker.

Not much of a difference with the previous depth map, which means it's probably quite safe to reduce the number of iterations when increasing the number of scales.

Now, let's see what the challenger, DMAG9, can do.


Dense depth map obtained by DMAG9 using Spatial sample rate = 4, Range (color) sample rate = 16, Lambda = 0.5, Hash table size = 10000, Number of PCG (Preconditioned Conjugate Gradient) iterations = 1000, Scale parameter of Geman-McClure function = 1.41, and Number of IRLS (Iteratively Reweighted Least Squares) iterations = 1.

Let's increase the range (color) sample rate to 32 in order to reduce the number of colors to (256/32)^3 = 512. This should reduce the number of un-propagated black pixels.
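The color count above is simple arithmetic, sketched here so you can try other sample rates. It assumes, as the formula above does, 256 levels per channel and 3 color channels; the function name is made up for illustration.

```python
def color_bins(range_sample_rate, levels=256, channels=3):
    """Number of distinct quantized colors: (levels / rate) ** channels."""
    return (levels // range_sample_rate) ** channels


for rate in (16, 32):
    print(rate, color_bins(rate))
```

Going from 16 to 32 divides the number of possible colors by 8, so more pixels end up sharing a color with a scribbled pixel.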


Dense depth map obtained by DMAG9 using Spatial sample rate = 4, Range (color) sample rate = 32, Lambda = 0.5, Hash table size = 10000, Number of PCG (Preconditioned Conjugate Gradient) iterations = 1000, Scale parameter of Geman-McClure function = 1.41, and Number of IRLS (Iteratively Reweighted Least Squares) iterations = 1.

Some smoothness can be gained by increasing lambda.


Dense depth map obtained by DMAG9 using Spatial sample rate = 4, Range (color) sample rate = 32, Lambda = 100, Hash table size = 10000, Number of PCG (Preconditioned Conjugate Gradient) iterations = 1000, Scale parameter of Geman-McClure function = 1.41, and Number of IRLS (Iteratively Reweighted Least Squares) iterations = 1.

To improve the behavior of DMAG9, it's usually a good idea to blur the reference image.


Blurred version of the reference image. I used Gimp's Gaussian Blur (bandwidth = 5 pixels in both directions).

Alright, let's do another round of DMAG9 passes on this blurred reference image.


Dense depth map obtained by DMAG9 using Spatial sample rate = 4, Range (color) sample rate = 16, Lambda = 0.5, Hash table size = 10000, Number of PCG (Preconditioned Conjugate Gradient) iterations = 1000, Scale parameter of Geman-McClure function = 1.41, and Number of IRLS (Iteratively Reweighted Least Squares) iterations = 1.


Dense depth map obtained by DMAG9 using Spatial sample rate = 4, Range (color) sample rate = 32, Lambda = 0.5, Hash table size = 10000, Number of PCG (Preconditioned Conjugate Gradient) iterations = 1000, Scale parameter of Geman-McClure function = 1.41, and Number of IRLS (Iteratively Reweighted Least Squares) iterations = 1.


Dense depth map obtained by DMAG9 using Spatial sample rate = 4, Range (color) sample rate = 32, Lambda = 100, Hash table size = 10000, Number of PCG (Preconditioned Conjugate Gradient) iterations = 1000, Scale parameter of Geman-McClure function = 1.41, and Number of IRLS (Iteratively Reweighted Least Squares) iterations = 1.


Corresponding 3d wiggle gif created with Wiggle Maker.

At this point in time, it looks like DMAG4 (Scale Space Random Walks) has the edge over DMAG9 (Fast Bilateral Solver) in the context of 2d to 3d image conversion. As far as DMAG4 is concerned, I am not at all convinced that Scale Space Random Walks (number of scales > 1) is that much better than plain Random Walks (number of scales = 1) but it's there as an option.

Update (12/16/2015):

After publishing this blog post, I realized that I had not played around with the spatial sample rate. So, let's go back to our "base" experiment before increasing the spatial sample rate in an effort to get rid of those "black" pixels. Note that it's a slightly different sparse depth map (modified around Cary's magnificent hair).


Dense depth map obtained by DMAG9 using Spatial sample rate = 4, Range (color) sample rate = 16, Lambda = 0.5, Hash table size = 10000, Number of PCG (Preconditioned Conjugate Gradient) iterations = 1000, Scale parameter of Geman-McClure function = 1.41, and Number of IRLS (Iteratively Reweighted Least Squares) iterations = 1.


Dense depth map obtained by DMAG9 using Spatial sample rate = 8, Range (color) sample rate = 16, Lambda = 0.5, Hash table size = 10000, Number of PCG (Preconditioned Conjugate Gradient) iterations = 1000, Scale parameter of Geman-McClure function = 1.41, and Number of IRLS (Iteratively Reweighted Least Squares) iterations = 1.


Dense depth map obtained by DMAG9 using Spatial sample rate = 16, Range (color) sample rate = 16, Lambda = 0.5, Hash table size = 10000, Number of PCG (Preconditioned Conjugate Gradient) iterations = 1000, Scale parameter of Geman-McClure function = 1.41, and Number of IRLS (Iteratively Reweighted Least Squares) iterations = 1.

If I had the patience, I would then proceed to apply a bit of edge-aware smoothing, for instance, Edge Preserving Smoothing 7 (EPS7). In the end, DMAG9 appears to be doing as good a job as DMAG4 as long as you're using the right parameters.