## Tuesday, June 18, 2013

### 3D Photos - Mountaineer

Here's the left view of a stereo picture you can find at FinePix REAL 3D W3 sample 3D image & movies. Getting the correct disparity for the walking stick was a bit of a challenge so it was painted red in order to stand out against the background. This is an interesting case since there are some decent half-occlusions on the left side of the mountaineer for image 1 (right side for image 2).

This is the depth map for image 1 obtained with Depth Map Automatic Generator 2 (DMAG2). As you can see, the disparity for the sky is wrong almost everywhere (should be all black) but that's not a problem since there's no texture.

This is the animated sequence (created in Gimp) that shows the intermediate synthesized frames obtained with Frame Sequence Generator 2 (FSG2).

## Thursday, June 6, 2013

### Disparity Finder 2 (DF2)

Disparity Finder 2 (DF2) is a little program that computes the disparity between two corresponding points on a stereo pair. This is a rather painless way to find the minimum and maximum disparity of a stereo pair. To get the minimum and maximum disparity, simply match the furthest point in the background and the closest point in the foreground, respectively. Note that the disparities can be negative (in the computer vision world they're usually positive, but that's not a problem here). This may be obvious, but the min disparity has to be smaller than the max disparity.

If I1 is the left image and I2 the right image, given a pixel (x,y), the disparity d is such that I1(x,y)=I2(x-d,y).
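
To make the sign convention concrete, here's a minimal Python sketch of the arithmetic (this is not DF2's actual code, and the function names `disparity` and `disparity_range` are made up for illustration). It assumes you have already picked one matching point pair in the background and one in the foreground.

```python
def disparity(x_left, x_right):
    """Disparity d such that I1(x, y) = I2(x - d, y), that is, d = x_left - x_right."""
    return x_left - x_right


def disparity_range(background_match, foreground_match):
    """Return (min disparity, max disparity) from two manual matches.

    Each match is ((x1, y1), (x2, y2)): the same scene point located in
    image 1 (left) and in image 2 (right).
    """
    (bx1, _), (bx2, _) = background_match
    (fx1, _), (fx2, _) = foreground_match
    d_min = disparity(bx1, bx2)  # furthest background point
    d_max = disparity(fx1, fx2)  # closest foreground point
    if d_min >= d_max:
        raise ValueError("min disparity has to be smaller than max disparity")
    return d_min, d_max
```

For example, a background point at x=100 in the left image and x=105 in the right image gives a (negative) disparity of -5.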

The Windows executable (guaranteed to be virus free) is available for free via the 3D Software Page. Please refer to the 'Help->About' page in the actual program for how to use it.

### Frame Sequence Generator 2 (FSG2)

Frame Sequence Generator 2 (FSG2) generates synthesized intermediate frames given a left image (image 1), a disparity/depth map, and the min and max disparities (needed to compute the actual disparity from the depth map). The basic idea is, for each frame, to shift the pixels in the left image according to the disparity map. This frame will have "holes" (disoccluded pixels), which are filled by an ultra-simplistic in-painting algorithm. When the foreground of an image stays stationary and the background shifts (to the right), disoccluded areas are created, which can be filled, in a very simplistic manner, by copying the rightmost visible (not disoccluded) pixel. For a more sophisticated method, check out Frame Sequence Generator 4 (FSG4).
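
The shift-then-fill idea can be sketched on a single scanline in Python. This is only an illustration, not FSG2's actual code: the name `synthesize_row` and the frame position `t` are hypothetical, and I'm making two assumptions. First, when two pixels land on the same spot, the one with the larger disparity (the closer one) wins. Second, "copying the rightmost visible pixel" is read as copying the nearest visible pixel to the right of the hole, which is normally background.

```python
def synthesize_row(row, disp, t):
    """Shift one scanline of the left image by t * disparity, then fill holes.

    row  : pixel values of the scanline in image 1
    disp : disparities (same length), with d such that I1(x) = I2(x - d)
    t    : 0.0 reproduces the left image, 1.0 approximates the right image
    Returns (new_row, hole_mask).
    """
    width = len(row)
    out = [None] * width
    out_disp = [None] * width
    for x in range(width):
        xt = x - int(round(t * disp[x]))
        # Assumption: on a collision, the closer pixel (larger disparity) wins.
        if 0 <= xt < width and (out_disp[xt] is None or disp[x] > out_disp[xt]):
            out[xt] = row[x]
            out_disp[xt] = disp[x]

    holes = [d is None for d in out_disp]

    # Fill each hole with the nearest visible pixel on its right.
    fill = None
    for x in range(width - 1, -1, -1):
        if holes[x]:
            out[x] = fill
        else:
            fill = out[x]

    # Holes touching the right border have nothing on their right,
    # so fall back to the nearest visible pixel on their left.
    fill = None
    for x in range(width):
        if out[x] is None:
            out[x] = fill
        else:
            fill = out[x]
    return out, holes
```

With `t = 1.0` you get the pseudo-right image; intermediate frames would use values of `t` between 0 and 1.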

If imageL.xxx is the left image (image 1), the frames generated will be named frame1.xxx, frame2.xxx, etc. The last frame corresponds to the right image. Animated gifs or lenticular prints (see Lenticular Printing - Interlacing with SuperFlip) can be made with imageL.xxx, frame1.xxx, frame2.xxx, etc. (don't use the actual right image). The program automatically creates an animated gif of those images after generating and saving the frames.

The disparity/depth map should be a grayscale image where white indicates the closest foreground object and black indicates the furthest background object. The darkest value in the depth map corresponds to the minimum disparity and the brightest value corresponds to the maximum disparity. The disparity/depth map doesn't need to come from any of my programs.
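
Here's a small Python sketch of how gray levels could be turned back into disparities. The text above only pins down the two endpoints (darkest maps to min disparity, brightest maps to max disparity); the linear interpolation in between is my assumption, and `depth_to_disparity` is a hypothetical name, not something taken from FSG2.

```python
def depth_to_disparity(gray, d_min, d_max):
    """Map a grayscale depth map (list of rows) to disparities.

    The darkest gray level maps to d_min and the brightest to d_max.
    Assumption: values in between are interpolated linearly.
    """
    flat = [g for r in gray for g in r]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Flat depth map: every pixel sits at the same depth.
        return [[float(d_min)] * len(r) for r in gray]
    scale = (d_max - d_min) / (hi - lo)
    return [[d_min + (g - lo) * scale for g in r] for r in gray]
```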

Here's an example:

Left image of the "Art" stereo pair (part of 2005 Middlebury dataset).

Depth/disparity map.

Disoccluded areas ("holes") which result from the shifting of the reference image (left image) to get the "right" image (which you get when you ask to generate just 1 frame).

Pseudo-right image obtained with FSG2 (you get the "right" image when you generate just 1 frame).

The Windows executable (guaranteed to be virus free) is available for free via the 3D Software Page. Please refer to the 'Help->About' page in the actual program for how to use it.

### Depth Map Automatic Generator 2 (DMAG2)

Depth Map Automatic Generator 2 (DMAG2) automatically generates two disparity maps and two occlusion maps for a given stereo pair. The algorithm is one local method among many stereo matching local methods. The program computes two disparity maps, performs a left-right consistency check to get the occlusions for each disparity map, and finally fills the occlusions in each disparity map. When DMAG2 finishes (it will show the left and right disparity maps as well as the left and right occlusion maps in four separate windows), you can save the left and right disparity maps as well as the left and right occlusion maps.
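
The left-right consistency check can be sketched in a few lines of Python. This is a generic version of the technique, not DMAG2's code: it assumes integer disparities, assumes both maps use the same sign convention (a left pixel at x with disparity d matches the right pixel at x-d), and the names `occlusion_mask_left` and `tolerance` are made up for the illustration.

```python
def occlusion_mask_left(disp_left, disp_right, tolerance=0):
    """Return a mask that is True where the left disparity fails the check.

    disp_left, disp_right : lists of rows of integer disparities
    tolerance             : allowed mismatch between the two maps
    """
    height = len(disp_left)
    width = len(disp_left[0])
    occluded = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            d = disp_left[y][x]
            xr = x - d  # where this pixel should land in the right image
            if not 0 <= xr < width:
                occluded[y][x] = True  # matches outside the right image
            elif abs(d - disp_right[y][xr]) > tolerance:
                occluded[y][x] = True  # the two maps disagree
    return occluded
```

On a toy scanline with a foreground object of disparity 2 in front of a background of disparity 0, the pixels flagged are the background pixels just to the left of the object in the left image, which is where half-occlusions are expected for image 1.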

DMAG2 is an implementation of the algorithm presented in "Adaptive Support-Weight Approach for Correspondence Search" by Kuk-Jin Yoon et al. Have a look at Weighted Windows for Stereo Matching since I kinda explain there how the matching cost is computed. I changed the code on 03/17/15 so that the matching cost is now the same as the one used in Depth Map Automatic Generator 5 (DMAG5).

Let's go over the parameters that control DMAG2's behavior:

- Minimum disparity is the disparity corresponding to the furthest point in the background.
- Maximum disparity is the disparity corresponding to the closest point in the foreground.
I suggest using Disparity Finder 2 (DF2) to get the minimum and maximum disparity. Better yet, you can use Epipolar Rectification 9b (ER9b) to rectify/align the two images (crucial for good quality depth map generation) and get the min and max disparities automatically.
- Window radius is the guided filter size. The larger the radius, the more accurate the matches are supposed to be (to a certain extent). If the window radius becomes too large, errors are likely to appear at object boundaries. Note that the running time is not dependent on the window radius, which is a really good thing.
- Alpha is the term that balances the color matching cost and the gradient matching cost. The closer alpha is to 0, the more importance is given to the color. The closer alpha is to 1, the more importance is given to the gradient. In theory, a higher alpha works better when there's quite a bit of texture in the image while a lower alpha works better when the image is relatively flat color wise.
- Truncation value (color) limits the value the color matching cost can take. It reduces the effects of occluded pixels (pixels that appear in only one image).
- Truncation value (gradient) limits the value the gradient matching cost can take. It reduces the effects of occluded pixels (pixels that appear in only one image).
I tend to set the truncation value (color) to 20.0 or 30.0 and the truncation value (gradient) to 2.0. Feel free to experiment though!
- Gamma proximity controls the weight given to a neighboring pixel in terms of spatial distance. Typically, you want to give less weight to a neighboring pixel that's far away from the pixel under consideration than to a pixel that is close by. As gamma proximity goes to infinity, the weight becomes equal to unity for any neighboring pixel no matter how far it is. As gamma proximity goes to zero, the weight depends more and more on the distance to the pixel under consideration.
- Gamma color (similarity) controls the weight given to a neighboring pixel in terms of color difference. Typically, you want to give less weight to a neighboring pixel that's far away in the color space from the pixel under consideration than to a pixel that is close by. As gamma color goes to infinity, the weight becomes equal to unity for any neighboring pixel no matter how far it is in the color space. As gamma color goes to zero, the weight depends more and more on the distance in the color space to the pixel under consideration.
- Disparity tolerance (occlusion detection). The larger the value, the more mismatch is allowed (between left and right depth maps) before declaring that the disparity computed at a pixel is unreliable.
- Window radius (occlusion smoothing).
- Sigma space (occlusion smoothing).
- Sigma color (occlusion smoothing).
The parameters that relate to occlusion detection and smoothing should probably be left alone since they only have an effect on the "occluded" pixels, that is, the pixels that show up in black in the occlusion maps.
- Downsampling factor. This parameter enables DMAG2 to run faster by downsampling the images prior to computing the depth maps. If set to 1, the images are used as is and there's no speedup. If set to 2, the images are resized by reducing each dimension by a factor of 2 and DMAG2 should go 4 times faster. The more downsampling is requested, the faster DMAG2 will go, but the more pixelated the depth maps will look upon completion (as they are upsampled). If downsampling is turned on, the parameters that are spatial, that is, min and max disparity, window radius, gamma proximity, window radius (occlusion smoothing), and sigma space (occlusion smoothing) are automatically adjusted to adapt to the level of downsampling that is requested. In other words, you don't have to wonder if you should change those parameters when switching, for example, from downsampling factor = 1 to downsampling factor = 2 as DMAG2 does it automatically for you.
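
To give a feel for how alpha, the truncation values, and the two gammas interact, here's a minimal Python sketch. It is not DMAG2's code. The cost formula is the usual truncated color/gradient blend, which matches the description of alpha above, and the weight formula is the exponential form from the adaptive support-weight paper; whether DMAG2 uses exactly these expressions is an assumption on my part. The function names are hypothetical.

```python
import math


def matching_cost(color_diff, grad_diff, alpha, trunc_color, trunc_grad):
    """Blend the truncated color and gradient costs for one pixel pair.

    alpha close to 0 favors the color term, alpha close to 1 the gradient term.
    """
    color_term = min(color_diff, trunc_color)  # truncation caps the cost
    grad_term = min(grad_diff, trunc_grad)
    return (1.0 - alpha) * color_term + alpha * grad_term


def support_weight(color_dist, spatial_dist, gamma_color, gamma_proximity):
    """Weight of a neighboring pixel inside the matching window.

    The weight tends to 1 as both gammas go to infinity and drops quickly
    with distance (in space or in color) as they go to zero.
    """
    return math.exp(-(color_dist / gamma_color + spatial_dist / gamma_proximity))
```

For instance, with truncation values of 20.0 (color) and 2.0 (gradient), a pixel pair with a color difference of 50 and a gradient difference of 5 costs 20.0 when alpha is 0 and 2.0 when alpha is 1, which shows how truncation keeps occluded pixels from dominating.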

Here's an example:

Left image (after rectification).

Right image (after rectification).

Left disparity map obtained with DMAG2.

Input used for DMAG2: