Tuesday, May 21, 2019

A quick guide on using StereoPhoto Maker (SPM) to generate depth maps

This post describes how to automatically generate depth maps with StereoPhoto Maker. Masuji Suto, the author of StereoPhoto Maker, has integrated two of my tools, DMAG5 and DMAG9b, into StereoPhoto Maker so that it can generate depth maps. If you don't want to use StereoPhoto Maker at all to generate depth maps, you can certainly use my tools directly: Epipolar Rectification 9b (ER9b) or Epipolar Rectification 9c (ER9c) to rectify/align the stereo pair, Depth Map Automatic Generator 5 (DMAG5) or Depth Map Automatic Generator 5b (DMAG5b) to generate the (initial) depth map, and Depth Map Automatic Generator 9b (DMAG9b) to improve the depth map. All those programs can be downloaded through the 3D Software Page.

I recommend using my stereo tools directly but I understand the convenience StereoPhoto Maker might offer, which is why I wrote this post. The issue I have with StereoPhoto Maker is the rectifier which not only aligns the stereo pair but also can provide the min and max disparities needed by the automatic depth map generator. I have no control over that part in StereoPhoto Maker and that bothers me a little bit. Just be aware that, to get good depth maps, it is quite important to have a properly rectified stereo pair and correct min and max disparities.

Be extremely careful about accepting the min and max disparity values (aka the background and foreground values) that StereoPhoto Maker can automatically provide! I strongly recommend you do that part manually. If the values differ a lot (the ones found automatically and the ones found manually), I would be quite suspicious about the proper rectification/alignment of the stereo pair. In that case, I would suggest using ER9b or ER9c to rectify, or if you are really keen on using StereoPhoto Maker, checking the "Better Precision (slow)" box under "Edit->Preferences->Adjustment" to get better rectification. Also, be extremely careful about StereoPhoto Maker resizing your stereo pair because of what's in "maximum image width"! I would not let StereoPhoto Maker resize the stereo pair and instead make sure that the stereo pair you are processing is under what's in "maximum image width".

Alrighty then, let's get back to the business at hand, that is, generating depth maps with StereoPhoto Maker. I am assuming that you have installed StereoPhoto Maker and the combo DMAG5/DMAG9b on your computer. If you haven't done so yet, follow the instructions in How to make Facebook 3D Photo from Stereo pair. I am assuming you have downloaded and extracted the DMAG5/DMAG9b stuff in a directory called dmag5_9. That directory should look like so:


Contents of the dmag5_9 directory.

Because alignment of the stereo pair is of prime importance in depth map generation, I recommend going into "Edit->Preferences->Adjustment" and checking the box that says "Better Precision (slow)". As I don't particularly like large images, the first thing I do after loading the stereo pair is to resize the images to something smaller. In this guide, the stereo pair is an mpo coming from a Fuji W3 camera. The initial dimensions are 3441x2016. You could generate the depth maps using those dimensions but everything is gonna take longer. Clicking on "Edit->Resize", I change the width to 1200. I guess I could have resized to something larger but I wouldn't resize to anything larger than 3000 pixels. Once the stereo pair has been resized, I click on "Adjust->Auto-alignment" to align/rectify the images. As an alternative to SPM's alignment tool, you can use my rectification tools: Epipolar Rectification 9b (ER9b) or Epipolar Rectification 9c (ER9c).

To generate the depth map, I click on "Edit->Depth Map->Create depth map from stereo pair" where I am presented with this window:


Default "Create depth map from stereo pair" window.

What I recommend doing is getting the background and foreground disparity values manually first. To get the background value, use the arrow keys so that the two red/cyan views come into focus (merge) for a background point. Same idea for the foreground value. I keep track of those two values and then click on "Get Values (automatic)". If the automatic values make sense when compared to the ones obtained manually, I just leave them be. If they don't, I edit them and put back the values obtained manually.

Since my image width is less than 3000 pixels, I don't bother with the "maximum image width" box. I don't like the idea of SPM resizing the images automatically so I always make sure that my image width is less than what is in the "maximum image width" box. The reason I don't like the idea of SPM resizing my images is that, if you change the image dimensions, you are supposed to also change the min and max disparities, and I am not sure SPM does it. So, beware! As an alternative to having SPM compute the min and max disparities, you can use my rectification tools: Epipolar Rectification 9b (ER9b) or Epipolar Rectification 9c (ER9c). They both give you the min and max disparities in their output.

I only want the left depth map so I keep the "Create left/right depth maps" box unchecked. Since I want white to be for the foreground and black for the background (I always use white for the foreground in my depth maps), I click on "Create depth map (front: white, back: black)". Note that because I clicked on "Create depth map (front: white, back: black)", when it is time to save the depth map by going into "Edit->Depth map->Save as Facebook 3d photo (2d photo+depth)", I will need to click on "white" for the "Displayed depth map front side->white" so that the depth map does not get inverted when saved by SPM. If you don't mind a depth map where the foreground is black, then leave the "Create depth map (front: white, back: black)" radio button alone and don't click on it.
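To make the point about resizing concrete: disparities are horizontal pixel offsets, so they scale with the image width. Here is a minimal Python sketch of that rescaling. The function name and the sample disparity values are made up for illustration, and rounding outward is my own choice to keep the range from becoming too tight. This is not what SPM does internally, just the arithmetic you would do by hand.

```python
import math


def scale_disparities(min_disp, max_disp, old_width, new_width):
    """Rescale a (min, max) disparity range after resizing the stereo pair.

    Disparities are measured in pixels along the horizontal axis, so they
    scale by the same factor as the image width.
    """
    factor = new_width / old_width
    # Round outward (floor the min, ceil the max) so that the scaled range
    # still brackets the true disparities. This is an assumption, not a rule.
    return math.floor(min_disp * factor), math.ceil(max_disp * factor)


# Hypothetical values: a pair measured at width 3441, then resized to 1200.
print(scale_disparities(-120, 30, 3441, 1200))
```

If the numbers you get this way are far from what the tool reports after a resize, that is a good reason to measure the disparities again on the resized pair.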

I recommend using the default parameters in DMAG5 (on the left) and DMAG9b (on the right) to get the first depth map (you may need to tweak the parameters afterwards to get the best possible depth map). To be sure I have the default settings (SPM stores the latest used settings), I click on "Default settings". To get the depth map, I click on "Create depth map". This is the result:


Left image and depth map produced by SPM.

To get the left image on the left and the depth map on the right, click on the "Side-by-side" icon in the taskbar. For more info on DMAG5, check Depth Map Automatic Generator 5 (DMAG5). For more info on DMAG9b, check Depth Map Automatic Generator 9b (DMAG9b) and/or have a look at dmag9b_manual.pdf in the dmag5_9 directory.

Now, if you go into the dmag5_9 directory, you will find some very interesting intermediate images:
- 000_l.tif. That's the left image as used by DMAG5.
- 000_r.tif. That's the right image as used by DMAG5.
- con_map.tiff. That's the confidence map computed by DMAG9b. White means high confidence, black means low confidence. If you clicked on "Create left/right depth maps", the confidence map gets overwritten when DMAG9b is called to optimize the right depth map.
- dps_l.tif. That's the left depth map produced by DMAG5.
- dps_r.tif. That's the right depth map produced by DMAG5. Even if the "Create left/right depth maps" box is unchecked, the right depth map is always generated by DMAG5 in order to detect left/right inconsistencies in the left depth map.
- out.tif. That's the depth map produced by DMAG9b. DMAG9b uses the depth map generated by DMAG5 as input and improves it. If you clicked on "Create left/right depth maps", there should be out.tif (left depth map) and out_r.tif (right depth map).
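If you want to quickly check which of those intermediate files are present after a run, something like the following Python sketch would do. The file names are the ones listed above; the function name is made up, and the path to the dmag5_9 directory depends on where you extracted it.

```python
from pathlib import Path

# Intermediate files listed above (left depth map only).
EXPECTED = [
    "000_l.tif",
    "000_r.tif",
    "con_map.tiff",
    "dps_l.tif",
    "dps_r.tif",
    "out.tif",
]


def report_intermediates(directory):
    """Return a dict mapping each expected file name to True if it exists."""
    folder = Path(directory)
    return {name: (folder / name).is_file() for name in EXPECTED}


# "dmag5_9" is a placeholder; use the actual location on your machine.
for name, present in report_intermediates("dmag5_9").items():
    print(name, "found" if present else "missing")
```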

The depth map "out.tif" is basically the same as the depth map that you get by clicking on "Edit->Depth map->Save as Facebook 3D photo (image+depth)".

It should be noted that once you have created a depth map, you cannot re-click on "Edit->Depth map->Create depth map from stereo pair", possibly change parameters, and create another depth map. If you do that, the new depth map is basically going to be garbage because the right image has been replaced by the previously created depth map (take a look at 000_r.tif in the dmag5_9 directory to convince yourself). To re-generate the depth map, you need to undo or reload the stereo pair.

At this point, you may want to play with the DMAG5/DMAG9b parameters to see if you can improve the depth map. I think it's a good idea but be aware there may be areas in your image where the depth map cannot be improved upon. For instance, if you have an area that has no texture (think of a blue sky or a white wall) or an area that has a repeated texture, it is quite likely the depth map is gonna be wrong no matter what you do. So, improving the depth map is not always easy as, usually, some areas will get better while others will get worse. Some parameters like DMAG5 radius or DMAG9b sample rate spatial depend on the image dimensions and should kind of be tailored to the image width. I think it's best to change one parameter at a time and see the effect it has on the depth map. If things get better, keep changing that parameter (in the same direction) until things get worse. Then, you tweak the next parameter. Of course, it is up to you whether or not you want to spend the time tinkering with parameters. If you always shoot with the same camera setup, this tinkering may only need to be done once.

Let's see which parameters used by DMAG5 and DMAG9b are worth tinkering with in the quest for a better depth map. I believe the first thing to try when improving the depth map is lowering the sample rate spatial used by DMAG9b from the default 32 to 16 or even 8. I think the default 32 is probably too large, especially if the image width is not that large (like here for our test case).

What DMAG5 parameters (on the left side of the window) give you the most bang for your buck when trying to improve the depth map quality?

- radius. The larger the radius, the smoother the depth map generated by DMAG5 is going to be but the less accurate. As the radius goes down, more noise gets introduced into the depth map.
- downsampling factor. The larger the downsampling factor, the faster DMAG5 will run but the less accurate. Running DMAG5 using a downsampling factor of 2 is four times faster than running DMAG5 using a downsampling factor of 1 (no downsampling).
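The "four times faster" figure is consistent with the run time being roughly proportional to the number of pixels processed, since a downsampling factor of 2 halves both dimensions. The Python sketch below just does that pixel arithmetic. The proportionality and the integer-division rounding are assumptions on my part, not a description of DMAG5's internals, and the 1200x703 size is only an approximation of the resized test pair.

```python
def pixel_count(width, height, factor=1):
    """Number of pixels left after shrinking both dimensions by `factor`.

    Integer division is used for the shrunk dimensions; the actual rounding
    used by the tool may differ.
    """
    return (width // factor) * (height // factor)


full = pixel_count(1200, 703, 1)
half = pixel_count(1200, 703, 2)
# Ratio of pixels processed without and with downsampling by 2.
print(full / half)
```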

Note that those observations concern DMAG5 only. So, you should look at the dps_l.tif file to see the effects of changing DMAG5 parameters. Also, you may find that sometimes the depth map generated by DMAG5 is actually better than the one generated by the combo DMAG5/DMAG9b. Note that DMAG9b can be so aggressive that variations in the depth map produced by DMAG5 do not matter much.


Left depth map generated by DMAG5 (dps_l.tif) using default values.

What happens to the left depth map generated by DMAG5 if you change the radius from 16 to 8 (every other parameter set to default)? Let's find out!


"Create depth map from stereo pair" window. Changed DMAG5 radius from 16 to 8 (every other parameter set to default).


Left depth map generated by DMAG5 (dps_l.tif). Changed DMAG5 radius from 16 to 8 (every other parameter set to default).

What happens to the left depth map generated by DMAG5 if you change the downsampling factor from 2 to 1 (every other parameter set to default)? Let's find out!


"Create depth map from stereo pair" window. Changed DMAG5 downsampling factor from 2 to 1 (every other parameter set to default).


Left depth map generated by DMAG5 (dps_l.tif). Changed DMAG5 downsampling factor from 2 to 1 (every other parameter set to default).

What DMAG9b parameters (on the right side of the window) give you the most bang for your buck when trying to improve the depth map quality?

- sample rate spatial. The larger the sample rate spatial, the more aggressive DMAG9b will be. I recommend going from 32 (default value) to 16, 8, and even 4. If you can clearly see "blocks" in your depth map, the sample rate spatial is probably too large and should be reduced (by a factor of 2).
- sample rate range. The larger the sample rate range, the more aggressive DMAG9b will be. The default value is 8 but you can try 4 or 16 and see if it's any better.
- lambda. The larger the lambda, the smoother the depth map is going to be (in other words, the more aggressive DMAG9b will be). The default value is 0.25 but you can certainly try larger or smaller values. If you don't want the output depth map to be too different from the depth map generated by DMAG5 (dps_l.tif), use smaller values for lambda (you will probably also need to use smaller values for the sample rate spatial and the sample rate range).


Depth map generated by DMAG9b (out.tif) using default values.

What happens to the depth map generated by DMAG9b if you change the sample rate spatial from 32 to 16 (every other parameter set to default)? Let's find out!


"Create depth map from stereo pair" window. Changed DMAG9b sample rate spatial from 32 to 16 (every other parameter set to default).


Depth map generated by DMAG9b (out.tif). Changed DMAG9b sample rate spatial from 32 to 16 (every other parameter set to default).

What happens to the depth map generated by DMAG9b if you change the sample rate range from 8 to 4 (every other parameter set to default)? Let's find out!


"Create depth map from stereo pair" window. Changed DMAG9b sample rate range from 8 to 4 (every other parameter set to default).


Depth map generated by DMAG9b (out.tif). Changed DMAG9b sample rate range from 8 to 4 (every other parameter set to default).

What happens to the depth map generated by DMAG9b if you change lambda from 0.25 to 0.5 (every other parameter set to default)? Let's find out!


"Create depth map from stereo pair" window. Changed DMAG9b lambda from 0.25 to 0.5 (every other parameter set to default).


Depth map generated by DMAG9b (out.tif). Changed DMAG9b lambda from 0.25 to 0.5 (every other parameter set to default).

Now, if you want to manually edit the generated depth map, you can do so in SPM by clicking on "Edit->Depth map->Correct depth map". If you want to edit the generated depth map semi-automatically, you can use the techniques centered around DMAG11 or DMAG4 that are described in Case Study - How to get depth maps from old stereocards using ER9c, DMAG5, DMAG9b, and DMAG11 and Case Study - How to improve depth map quality with DMAG9b and DMAG4.

For the ultimate experience in editing depth maps semi-automatically, I recommend using 2d to 3d Image Conversion Software - The 3d Converter: load up the left image and the depth map (add an alpha channel if it does not have one), use the eraser tool to delete parts in the depth map you do not like (that becomes sparse_depthmap_rgba_image.png needed by the3dconverter), create a new layer and trace where you don't want the depths to bleed through (that becomes edge_rgba_image.png needed by the3dconverter), and run the3dconverter to get the new depth map called dense_depthmap_image.png. You do not need to worry or care about gimp_paths.svg, ignored_gradient_rgba_image.png, and emphasized_gradient_rgba_image.png as they are not needed.

Sunday, May 12, 2019

Case Study - DMAG5/DMAG9b vs DMAG5b/DMAG9b

This post kinda compares a depth map produced by the combo DMAG5/DMAG9b vs the combo DMAG5b/DMAG9b. Thanks to my good friend Katsuhiko Inoue for providing the stereo pair (taken in portrait mode with an iphone X).


Left image of stereo pair after rectification by ER9b.


Right image of stereo pair after rectification by ER9b.

I do not know how the right image was obtained. It certainly was not obtained from a portrait mode stereo photo using the dual lens as it's not possible to extract the right image from an iphone X stereo photo. Even if you could extract the right image, it would not be the same focal length as the left image, meaning you would need specialized depth map generation software to get the depth map. Here, I am talking about the dual lens iphone X (back-facing camera system), not the TrueDepth sensor (front-facing camera system). I think the depth map produced by the iphone X was used here to create a synthetic right image using 3dsteroid pro or stereophoto maker. Basically, what I am gonna be doing here is see if I can recover the original depth map from the left image and a synthetic right image.

The dimensions of the original stereo pair are 3024x4032. I reduced the dimensions to 1800x2400 so that DMAG9b would run faster. The only reason I ran ER9b was to get the min and max disparities. It looks like the original stereo pair was very well aligned. Note that because the baseline is so small, you don't want to reduce the image size too much otherwise you are going to get a depth map with few depth levels (shades of gray) as far as DMAG5 and DMAG5b are concerned. Note that the number of depth levels is equal to the difference between the max and min disparities plus one (both ends count). So, for example, if the min disparity is -44 and the max disparity is 10, you are gonna get 10 - (-44) + 1 = 55 depth levels (shades of gray) in the depth map produced by DMAG5 or DMAG5b. Something to consider.
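The depth level count is easy to check in a couple of lines of Python. The function name is made up; the count is simply the number of integer disparities from min to max, both ends included. The second call shows what would happen if the image were shrunk by half again, assuming the disparities shrink by the same factor.

```python
def depth_levels(min_disp, max_disp):
    """Number of integer disparities between min and max, both included."""
    return max_disp - min_disp + 1


print(depth_levels(-44, 10))  # 55, the example from the text
print(depth_levels(-22, 5))   # 28, if both disparities were halved
```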

Now, let's run DMAG5 using the following input file:

image 1 = ../er9b/image_l.png
image 2 = ../er9b/image_r.png
min disparity for image 1 = -44
max disparity for image 1 = 10
disparity map for image 1 = depthmap_l.png
disparity map for image 2 = depthmap_r.png
occluded pixel map for image 1 = occmap_l.png
occluded pixel map for image 2 = occmap_r.png
radius = 16
alpha = 0.9
truncation (color) = 30
truncation (gradient) = 10
epsilon = 255^2*10^-4
disparity tolerance = 0
radius to smooth occlusions = 9
sigma_space = 9
sigma_color = 25.5
downsampling factor = 2

I believe those are the default values in StereoPhoto Maker.
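If you plan on trying several radius or downsampling values, it can be handy to generate the input file from a script. The Python sketch below assumes the input file is nothing more than the "key = value" lines shown above, in that order. The helper names are made up, and how you launch DMAG5 on the resulting file is left out on purpose.

```python
# Parameters copied from the DMAG5 input file above, in the same order.
DMAG5_PARAMS = [
    ("image 1", "../er9b/image_l.png"),
    ("image 2", "../er9b/image_r.png"),
    ("min disparity for image 1", "-44"),
    ("max disparity for image 1", "10"),
    ("disparity map for image 1", "depthmap_l.png"),
    ("disparity map for image 2", "depthmap_r.png"),
    ("occluded pixel map for image 1", "occmap_l.png"),
    ("occluded pixel map for image 2", "occmap_r.png"),
    ("radius", "16"),
    ("alpha", "0.9"),
    ("truncation (color)", "30"),
    ("truncation (gradient)", "10"),
    ("epsilon", "255^2*10^-4"),
    ("disparity tolerance", "0"),
    ("radius to smooth occlusions", "9"),
    ("sigma_space", "9"),
    ("sigma_color", "25.5"),
    ("downsampling factor", "2"),
]


def render_input(params):
    """Return the text of an input file, one 'key = value' line per entry."""
    return "\n".join(f"{key} = {value}" for key, value in params) + "\n"


def with_override(params, key, value):
    """Return a copy of params with a single entry changed."""
    if all(k != key for k, _ in params):
        raise KeyError(key)
    return [(k, str(value) if k == key else v) for k, v in params]


# Example: same file but with radius = 8 (hypothetical experiment).
print(render_input(with_override(DMAG5_PARAMS, "radius", 8)))
```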


Left depth map obtained by DMAG5.

If you want to experiment, you could change the value for the radius. Maybe try 8 or 32 instead of 16 and see what happens. Also, you may want to change the downsampling factor to 1 instead of 2. It will take longer but you will get more levels of depth in the depth map (shades of gray).

Let's run DMAG9b using the following input file:

reference image = ../../er9b/image_l.png
input disparity map = ../depthmap_l.png
sample_rate_spatial = 32
sample_rate_range = 8
lambda = 0.25
hash_table_size = 100000
nbr of iterations (linear solver) = 25
sigma_gm = 1
nbr of iterations (irls) = 32
radius (confidence map) = 12
gamma proximity (confidence map) = 12
gamma color similarity (confidence map) = 12
sigma (confidence map) = 2
output depth map image = depthmap_l_dmag9b.png

I believe those are the default in StereoPhoto Maker except for sigma. Here, I am using sigma = 2.0, SPM uses 32.0. I don't think it matters much to be honest. Recall that the lower the sigma, the less confidence is given to the depth in the input depth map.


Confidence map. White means very confident in input depth, black means little confidence. Since sigma is relatively low, the black streaks (poor confidence) are quite prominent.


Depth map produced by DMAG9b.

Let's change sigma from 2.0 to 32.0 and run DMAG9b using the following input file:

reference image = ../../er9b/image_l.png
input disparity map = ../depthmap_l.png
sample_rate_spatial = 32
sample_rate_range = 8
lambda = 0.25
hash_table_size = 100000
nbr of iterations (linear solver) = 25
sigma_gm = 1
nbr of iterations (irls) = 32
radius (confidence map) = 12
gamma proximity (confidence map) = 12
gamma color similarity (confidence map) = 12
sigma (confidence map) = 32
output depth map image = depthmap_l_dmag9b.png


Confidence map. Since sigma is relatively high, the black streaks (poor confidence) are pretty narrow.


Depth map produced by DMAG9b.

Not a whole lot of difference so I am gonna continue with sigma = 2.0. Let's change sample_rate_spatial from 32 to 16 and run DMAG9b using the following input file:

reference image = ../../er9b/image_l.png
input disparity map = ../depthmap_l.png
sample_rate_spatial = 16
sample_rate_range = 8
lambda = 0.25
hash_table_size = 100000
nbr of iterations (linear solver) = 25
sigma_gm = 1
nbr of iterations (irls) = 32
radius (confidence map) = 12
gamma proximity (confidence map) = 12
gamma color similarity (confidence map) = 12
sigma (confidence map) = 2
output depth map image = depthmap_l_dmag9b.png


Depth map produced by DMAG9b.

I think it's a bit better so let's continue the trend and change sample_rate_spatial from 16 to 8. Let's run DMAG9b using the following input file:

reference image = ../../er9b/image_l.png
input disparity map = ../depthmap_l.png
sample_rate_spatial = 8
sample_rate_range = 8
lambda = 0.25
hash_table_size = 100000
nbr of iterations (linear solver) = 25
sigma_gm = 1
nbr of iterations (irls) = 32
radius (confidence map) = 12
gamma proximity (confidence map) = 12
gamma color similarity (confidence map) = 12
sigma (confidence map) = 2
output depth map image = depthmap_l_dmag9b.png


Depth map produced by DMAG9b.

I think I hit the sweet spot so I am gonna stop here. Note that as sample_rate_spatial goes down, the cpu time for DMAG9b goes up.
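The one-parameter-at-a-time runs above (sample_rate_spatial = 32, 16, 8) can be prepared with a small Python script. This is only a sketch: it assumes the DMAG9b input file is the plain "key = value" text shown above, the helper names are made up, and the per-run output file names are my own addition so that the runs don't overwrite each other (in the post, every run wrote to depthmap_l_dmag9b.png).

```python
# Parameters copied from the DMAG9b input file above (sigma = 2).
DMAG9B_PARAMS = {
    "reference image": "../../er9b/image_l.png",
    "input disparity map": "../depthmap_l.png",
    "sample_rate_spatial": 32,
    "sample_rate_range": 8,
    "lambda": 0.25,
    "hash_table_size": 100000,
    "nbr of iterations (linear solver)": 25,
    "sigma_gm": 1,
    "nbr of iterations (irls)": 32,
    "radius (confidence map)": 12,
    "gamma proximity (confidence map)": 12,
    "gamma color similarity (confidence map)": 12,
    "sigma (confidence map)": 2,
    "output depth map image": "depthmap_l_dmag9b.png",
}


def render_input(params):
    """Return the input file text; dicts keep their insertion order."""
    return "\n".join(f"{key} = {value}" for key, value in params.items()) + "\n"


def sweep(base, key, values):
    """Build one input file text per value of a single parameter."""
    variants = {}
    for value in values:
        params = dict(base)  # copy so the base settings stay untouched
        params[key] = value
        # Hypothetical naming scheme to keep each run's depth map.
        params["output depth map image"] = f"depthmap_l_dmag9b_{key}_{value}.png"
        variants[value] = render_input(params)
    return variants


for value, text in sweep(DMAG9B_PARAMS, "sample_rate_spatial", [32, 16, 8]).items():
    print(f"--- sample_rate_spatial = {value} ---")
    print(text)
```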

Because the interocular distance is small, it can be worthwhile to use DMAG5b instead of DMAG5 to get the initial depth map. DMAG5b is a very simple algorithm but it will not perform well at object boundaries if the baseline used to take the stereo pair was (relatively) large. Here, it should perform ok since the pair was taken with an iphone with dual cameras.

Let's run DMAG5b using the following input file:


Depth map produced by DMAG5b.

The depth map produced by DMAG5b is actually better (I think) than the depth map produced by DMAG5, at least in this particular case. Personally, I would stop here and not even bother with DMAG9b but let's see how the best DMAG5/DMAG9b combo (as seen right above) compares with DMAG5b/DMAG9b.

Let's try to improve this depth map using DMAG9b and the following input file (same as the one right above):

reference image = ../../er9b/image_l.png
input disparity map = ../depthmap_l.png
sample_rate_spatial = 8
sample_rate_range = 8
lambda = 0.25
hash_table_size = 100000
nbr of iterations (linear solver) = 25
sigma_gm = 1
nbr of iterations (irls) = 32
radius (confidence map) = 12
gamma proximity (confidence map) = 12
gamma color similarity (confidence map) = 12
sigma (confidence map) = 2
output depth map image = depthmap_l_dmag9b.png


Depth map produced by DMAG9b.

Here, it does not really matter how the initial depth map was obtained as DMAG9b is quite aggressive. To make DMAG9b less aggressive, lambda is probably the parameter to change. The lower lambda is, the less aggressive DMAG9b is going to be.

Thursday, May 9, 2019

2d to 3d conversion - Great White

This post is an example that shows how to use 2d to 3d Image Conversion - The 3d Converter to create a depth map semi-automatically. I have uploaded on dropbox the gimp file which contains all the layers and the paths: great_white.xcf. See 2d to 3d Image Conversion - The 3d Converter for how to use "The 3d Converter".

This is what the3dconverter_input.txt looks like:
reference_rgb_image.png
sparse_depthmap_rgba_image.png
dense_depthmap_image.png
gimp_paths.svg
ignored_gradient_rgba_image.png
emphasized_gradient_rgba_image.png
edge_rgba_image.png
0.0

All you need is a reference image (layer reference_rgb_image in great_white.xcf saved as reference_rgb_image.png), an "edge image" (layer edge_rgba_image in great_white.xcf saved as edge_rgba_image.png), a sparse depth map (layer sparse_depthmap_rgba_image in great_white.xcf saved as sparse_depthmap_rgba_image.png), and a bunch of equal_depth and relative_depth paths (saved as gimp_paths.svg). This is all done in Gimp and you can get to those by downloading great_white.xcf.

Don't worry about ignored_gradient_rgba_image.png and emphasized_gradient_rgba_image.png as those are not used and therefore don't need to exist.
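Before running The 3d Converter, it can help to write the input file and check that the files you actually need are there. The Python sketch below copies the lines of the3dconverter_input.txt shown above as they are (including the trailing 0.0, which I am not interpreting here). The helper names are made up, and the list of required files follows the paragraph above.

```python
from pathlib import Path

# Lines of the3dconverter_input.txt, in the order shown above.
INPUT_LINES = [
    "reference_rgb_image.png",
    "sparse_depthmap_rgba_image.png",
    "dense_depthmap_image.png",
    "gimp_paths.svg",
    "ignored_gradient_rgba_image.png",
    "emphasized_gradient_rgba_image.png",
    "edge_rgba_image.png",
    "0.0",
]

# Files that need to exist before the run, per the text above.
REQUIRED = [
    "reference_rgb_image.png",
    "sparse_depthmap_rgba_image.png",
    "edge_rgba_image.png",
    "gimp_paths.svg",
]


def render_input():
    """Return the text of the3dconverter_input.txt."""
    return "\n".join(INPUT_LINES) + "\n"


def missing_required(directory):
    """Return the required files that are not present in `directory`."""
    folder = Path(directory)
    return [name for name in REQUIRED if not (folder / name).is_file()]


print(render_input())
print("missing:", missing_required("."))
```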


Reference image aka reference_rgb_image.png.


Reference image, sparse depth map, edge image, and gimp paths as seen in great_white.xcf.

The sparse depth map consists of a white blob for the tip of the nose and a black scribble to denote the background. I know it is a bit weird to relegate the water to the background but I don't see any other way to do it. Still, it kinda places the shark in front of the water. Weird, right? For the edge image, I simply traced the outline of the shark using the pencil tool (keeping the shift key pressed so that the line segments are always straight) with the smallest possible hard brush. The purpose of the edge image is to prevent the depths from bleeding across object boundaries. The gimp paths are shown in blue. The ones that kinda look like half circles are the equal_depth paths. The ones that kinda connect the equal_depth paths together are the relative_depth paths. What is cool about using gimp paths is that it is very easy to modify them. In particular, it is very easy to change the relative depths between equal_depth paths as all you have to do is rename the relative_depth paths.


Gimp paths: equal_depth and relative_depth paths.

The +XX in the name of a relative_depth path indicates the relative depth between the beginning and end of the path. So, if you need to change the relative depth, you just need to change the XX in the name of the path.
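As an illustration of the renaming idea, here is a small Python sketch that reads and changes the signed number at the end of a path name. The exact naming format is an assumption: I am supposing the name ends with a signed integer such as "relative_depth+20", which may not match what The 3d Converter expects character for character, so check against the paths in great_white.xcf. The function names are made up.

```python
import re

# Assumed format: the path name ends with a signed integer, e.g. "+20".
_SUFFIX = re.compile(r"([+-]\d+)\s*$")


def parse_relative_depth(name):
    """Return the signed integer at the end of a relative_depth path name."""
    match = _SUFFIX.search(name)
    if match is None:
        raise ValueError(f"no signed depth found in {name!r}")
    return int(match.group(1))


def rename_relative_depth(name, new_depth):
    """Return the path name with its trailing signed integer replaced."""
    if _SUFFIX.search(name) is None:
        raise ValueError(f"no signed depth found in {name!r}")
    # The '+d' format keeps an explicit sign for positive values.
    return _SUFFIX.sub(f"{new_depth:+d}", name)


print(parse_relative_depth("relative_depth+20"))
print(rename_relative_depth("relative_depth+20", 35))
```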


Dense depth map produced by The 3d Converter.


Wiggle/wobble created using depthy.me.


Wiggle/wobble created using wigglemaker.

Check the 3D effect on Facebook (no need to be registered or logged into Facebook): Facebook 3D photo. I have got to say that Facebook did an excellent job rendering depth maps with their 3D photos. Clap clap!

This is kinda a trial and error process. So, to help me in visualizing the dense depth map, I use Depth Player. I think it is a great tool. Note that the depth map doesn't have to be too accurate if you just want to post it as a Facebook 3D photo.

Video that kinda explains how to get the input files needed by The3dConverter: