Monday, December 29, 2014

3D Photos - Nathaniel Bowditch

I would like, in this post, to pit Depth Map Automatic Generator 5 (DMAG5) against Depth Map Automatic Generator 3 (DMAG3).


Left image of Nathaniel Bowditch's tomb in Mount Auburn cemetery.


Right image of Nathaniel Bowditch's tomb in Mount Auburn cemetery.

The stereo pair was taken with my Fuji W3. I resized it in StereoPhoto Maker (without resampling, this time) and aligned it in StereoPhoto Maker as well.

Alright, let's get started and run DMAG5 focusing first on the window radius parameter.


Depth map obtained by DMAG5 with window radius = 12.


Occlusion map obtained by DMAG5 with window radius = 12.


Depth map obtained by DMAG5 with window radius = 24.


Occlusion map obtained by DMAG5 with window radius = 24.

Recall that the occlusion map shows in black the pixels for which the computed disparity is wrong (mismatch between the left and right disparity maps). With window radius = 24, there are fewer wrong disparities in the background but there are more errors at object boundaries (between statue and background). One can safely say that increasing the window radius is a good thing until the object boundaries become too blurry (depth map becomes too smooth).
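To make the idea behind the occlusion map a little more concrete, here is a minimal sketch of a left-right consistency check (my own illustration, not DMAG5's actual code). The arrays dispL and dispR and the tolerance tol are assumptions: left and right disparity maps as numpy arrays (here assumed to have opposite signs, so a consistent pair sums to roughly zero) and a pixel tolerance.

import numpy as np

# Minimal sketch of a left-right consistency check (not DMAG5's actual code).
def consistency_mask(dispL, dispR, tol=1.0):
    h, w = dispL.shape
    ok = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            xr = int(round(x + dispL[y, x]))   # matching pixel in the right image
            if 0 <= xr < w and abs(dispL[y, x] + dispR[y, xr]) <= tol:
                ok[y, x] = True                # left and right disparities agree
    return ok   # False where the disparity is deemed wrong (black in the occlusion map)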

I guess I could tinker with DMAG5's other parameters but I don't think it's gonna make a tremendous difference, so I'll just stop here as far as DMAG5 is concerned.


Animated gif created using the depth map obtained by DMAG5 with window radius = 12 (thanks to depthy.me).

On to DMAG3 which as you probably recall is a graph cuts type of depth map generator. The parameters to play around with are lambda (smoothing parameter) and K (occlusion parameter). In general, if the occlusion map shows a whole lot of black pixels, it's a good idea to increase K. Also, if the depth map shows a lot of noise, it's generally a good idea to increase lambda.


Depth map obtained by DMAG3 with lambda = 10 and K = 30.


Occlusion map obtained by DMAG3 with lambda = 10 and K = 30.

Alright, right off the bat, it would make sense to increase K to make the occlusion map less dark. Let's bump K from 30 to 50.


Depth map obtained by DMAG3 with lambda = 10 and K = 50.


Occlusion map obtained by DMAG3 with lambda = 10 and K = 50.

Hmm, the depth is kinda noisy in the background, so it would be good to increase the smoothing parameter lambda, say from 10 to 40.


Depth map obtained by DMAG3 with lambda = 40 and K = 50.


Occlusion map obtained by DMAG3 with lambda = 40 and K = 50.

Yeah, I guess it might be a good idea at this point to increase K and keep playing with the parameters, but I am gonna stop right here; you get the idea, I'm sure.


Animated gif created using the depth map obtained by DMAG3 with lambda = 40 and K = 50 (thanks to depthy.me).

Obviously, we have a bit of a problem here because the background in the depth map is a tad too noisy to make a good wiggle. One option to remedy this problem is to smooth the disparities with Edge Preserving Smoothing 5 (EPS5), which is what we're gonna do.


Depth map obtained by DMAG3 with lambda = 40 and K = 50 and then smoothed with EPS5.


Animated gif created using the depth map obtained by DMAG3 with lambda = 40 and K = 50 and then smoothed by EPS5 (thanks to depthy.me).

I am not sure what kind of conclusion can be drawn here, especially with only one test case, but it seems to me that a local stereo matching method like DMAG5 (a weight-based window matcher) can outperform (in many ways) a global method like DMAG3 (which relies on graph cuts). I know this goes against the academic grain a bit as global methods have kinda been put on a pedestal in the literature for years now, but in practice, one has to recognize that the simpler window-based local methods, if done right, are more than adequate to produce depth maps for making 3d wiggles and lenticulars.

Thursday, November 27, 2014

Lenticulars - How to create a lenticular from a stereo pair with free software

In this post, we are gonna try to describe the whole process of producing a lenticular (in this case, using a 6x4 60 lpi lenticular lens from vuethru) from a stereo pair (in this case, an mpo taken with the Fuji W3 at the 3:2 aspect ratio setting and aligned in StereoPhotoMaker). The cool part is that it's all free, well, at least as far as the software is concerned.

This tutorial was written quite a long time ago and since then quite a few things have changed:
1) You don't need to reduce the image size prior to using DMAG5. Just specify a "downsampling factor" greater than 1 and you're golden.
2) You don't need to flip the images to get the right depth map. DMAG5 now produces the left and right depth maps.


Left image (dimensions: 3617 x 2410).


Right image (dimensions: 3617 x 2410).

We are not gonna use those to produce the left and right depth maps because those images are, in my opinion, too big. What we are gonna do is reduce them in size by a factor of 4 (in StereoPhotoMaker) and enlarge back the reduced size depth maps later:


Reduced size left image (dimensions: 914 x 608).


Reduced size right image (dimensions: 914 x 608).

We are now ready to get the (reduced size) left and right depth maps. Disparity Finder 2 (DF2) tells us that the min disparity is -32 and the max disparity is 0. We use Depth Map Automatic Generator 5 (DMAG5) to get the (reduced size) left depth map.


Reduced size left depth map (dimensions: 914 x 608).

To get the (reduced size) right depth map, we need to horizontally flip the (reduced size) right image, horizontally flip the (reduced size) left image, use DMAG5 to get the (reduced size) depth map without changing the min and max disparities, and horizontally flip the (reduced size) depth map to get the (reduced size) right depth map. It is very important to use the same parameters to get the left and right depth maps.


Horizontally flipped reduced size right image (dimensions: 914 x 608).


Horizontally flipped reduced size left image (dimensions: 914 x 608).


Reduced size depth map (dimensions: 914 x 608).

This depth map needs to be horizontally flipped. The resulting depth map will be the (reduced size) right depth map.


Reduced size right depth map (dimensions: 914 x 608).

We are gonna enlarge (in Gimp or Photoshop) the reduced size left depth map and the reduced size right depth map so that their new sizes correspond to the original left and right images.


Full size left depth map (dimensions: 3617 x 2410).


Full size right depth map (dimensions: 3617 x 2410).

We can now generate the frames for our lenticular using Frame Sequence Generator 6 (FSG6). Since my lenticular lens is 60 lpi and my printer has a resolution of 600 dpi, I am gonna go with a 10 frame lenticular (600 divided by 60 is 10). Because the reduced size is one fourth the original size, the min and max disparities that were used to generate the reduced size depth maps must be multiplied by 4. The min disparity to use in FSG6 is thus 4x(-32) = -128. The max disparity is 4x0 = 0.
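Just to make the arithmetic explicit, here is a tiny sketch using the numbers from this post (nothing more than the divisions and multiplications above):

# Frame count and disparity scaling, using the numbers from this post.
printer_dpi = 600
lens_lpi = 60
num_frames = printer_dpi // lens_lpi            # 10 frames

downsample = 4                                   # images were reduced by a factor of 4
min_disp_reduced, max_disp_reduced = -32, 0      # disparities found on the reduced images
min_disp_full = downsample * min_disp_reduced    # -128, to give to FSG6
max_disp_full = downsample * max_disp_reduced    # 0
print(num_frames, min_disp_full, max_disp_full)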

Now that we have the frames for our lenticular, we need to interlace them into a single image to print and laminate to our lenticular lens. To interlace, we are gonna use Lenticular Image Creator (LIC). I have talked about LIC before in Lenticulars - Interlacing with Lenticular Image Creator. I have tried to use SuperFlip but it doesn't like the .png file format and I am too lazy to switch to .tif. Before all that though, one needs to know the exact lpi for the given printer (and paper) by performing a pitch test. I have talked about the pitch test in Lenticulars - Pitch Test with SuperFlip. Let's just say that for my printer, it is 60.12 lpi. In LIC, I use no registration marks but you can certainly add them if you want. Since my lens is 6x4, I choose an output width of 6 inches. Because the ratio of my stereo pair is 3:2, I expect the height to be close to 4 inches, which is the case.

Lenticular interlacing is a bit of a mysterious affair but I have attempted to explain it in Lenticulars - Lenticular Interlacing.


10 frame interlaced image (dimensions: 3600 x 2399).

We now need to print the interlaced image onto glossy photo paper. In Gimp, I load the image, go into "Print Size" under "Image", and choose 6 inches for the width. This should give us a resolution of 600 ppi, which is what we requested in LIC. The image is now ready to be printed.

All that is left to do is to properly align the printed interlaced image to the lenticular lens and laminate it. This is where it can get frustrating. What follows explains how I do it. There are probably better ways but this works for me.

- Make a vertical cut in the protective film near the border of the lens and attach a tiny bit of tape to it so that the strip of protective film can be easily removed.

- Position the lens until you obtain a good 3d effect straight on. You should get a 3d effect as soon as the lens is vertically aligned with the interlaced image. That's not all though. The lens should also be "centered" with respect to the interlaced image, that is, it should take the same amount of head shifting to the left and to the right before the image "jumps" (closing one eye might help for this). In other words, you should see the middle frame when you look at the lenticular straight on (with one eye closed). During the centering process, if it looks like you need to move the lens one way, well, move it the other way because everything is inverted in the interlaced image (the first frame is to the right and the last frame is to the left).

- Pull on the piece of tape to release the strip of protective film and press the lens onto the interlaced image.

- Remove the rest of the protective film and press the lens onto the interlaced image underneath using either a laminator or a brayer. For a 6x4 lens, you don't really need any of that, you just have to be careful not to trap air when you press the lens against the paper underneath.

You may want to check this video on youtube about how to register a 3d photo to the lenticular lens since it's pretty well explained:


Of course, an article about 3d would not be complete without an animated gif of some sort, so here's the 3d wiggle obtained using the reduced size left and right images and the reduced size left and right depth maps:


In the above, I may have used dpi instead of ppi, or vice versa. This is very common as a lot of software out there uses the two interchangeably. The dpi is a printer only thing: it tells you how many dots the printer can lay per inch on the paper. By the way, if the printer resolution is 4800 and it uses four colors, the printer resolution is really 1200 (4800/4). In theory, the dpi should be larger than the ppi and people in the know say that it should be a multiple of the ppi (that latter point, I am not completely sure about, so if you know of a good source, let me know).

Please note that I am no expert in making diy lenticulars, so if you have suggestions, please let me know your thoughts on the subject. Making lenticulars at home with an inkjet printer is no easy task and the more streamlined and foolproof the process can be, the better.

Saturday, November 15, 2014

3D Photos - Pipes


Left image.


Right image.

We are gonna use Depth Map Automatic Generator 5 (DMAG5) to generate the (left) depth map using a min disparity of -28 and a max disparity of 2. I haven't changed the (current) default settings except the number of smoothing iterations (for the disparities of occluded pixels) which I have set to 0.


Depth map.


Occlusion map.

Let's have some fun and use depthy.me to generate a wiggly gif using the left image and the depth map. I highly recommend that website by the way. Because depthy.me expects near objects to be black and far objects to be white, you need to invert the depth map before you load it into depthy.me.
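If you don't feel like firing up Gimp or Photoshop just to invert the depth map, a couple of lines of Python with Pillow will do the same thing (the filenames below are just placeholders):

# Invert a grayscale depth map, e.g., before loading it into depthy.me.
from PIL import Image, ImageOps

depth = Image.open("depth_left.png").convert("L")
ImageOps.invert(depth).save("depth_left_inverted.png")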


Wiggle gif obtained from the left image and the (left) depth map via depthy.me.

Now, what we really want is to generate interpolated views between the left and right images using Frame Sequence Generator 6 (FSG6) but, for that, we need a depth map for the right image.

The following procedure to obtain the right depth map (as opposed to the left depth map) is a bit obsolete as DMAG6 now automatically generates the left and right depth maps in one shot. For FSG6 to work, you still have to color invert the right depth map so that white is near and black is far.

Here's how you get a right depth map:
1) You horizontally flip the right image (that's your new image 1 or "left" image),
2) You horizontally flip the left image (that's your new image 2 or "right" image),
3) You use your favorite depth map generator using the exact same parameters (including the min and max disparities) you used to get the left depth map, and
4) You horizontally flip the depth map you have just obtained (That's the right depth map).
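If you'd rather script the flipping steps above, here is a rough sketch with Pillow; the depth map generator itself (step 3) is still run separately, and the filenames are just placeholders.

from PIL import Image, ImageOps

# Steps 1 and 2: flip the right and left images horizontally and swap their roles.
ImageOps.mirror(Image.open("right.png")).save("image1_flipped.png")
ImageOps.mirror(Image.open("left.png")).save("image2_flipped.png")

# Step 3: run the depth map generator on image1_flipped.png / image2_flipped.png
# with the exact same parameters (including min and max disparities).

# Step 4: flip the resulting depth map back; that's the right depth map.
ImageOps.mirror(Image.open("depth_flipped.png")).save("depth_right.png")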


Right image horizontally flipped.


Left image horizontally flipped.


Depth map obtained with DMAG5.


Depth map flipped. That's the right depth map.

Now, we are finally ready to use FSG6. Here, I am gonna generate 5 frames which I am gonna use to create a wiggle gif. It's very easy to do in The Gimp, for example, by loading up the frames with "Open as Layers...".


Wiggle gif from frames created with FSG6.

Monday, October 13, 2014

Frame Sequence Generator 6 (FSG6)

Frame Sequence Generator 6 (FSG6) interpolates views between a left image and a right image using the left and right disparity/depth maps. FSG6 relies on backward mapping/warping as explained in View Interpolation (forward mapping/warping and backward mapping/warping).

The input to FSG6 is:
- left image,
- right image,
- left depth map,
- right depth map,
- minimum disparity and maximum disparity (the ones you used to generate the left depth map), and
- the number of frames needed.

The number of frames includes the left and right images. As an example, if the number of frames requested is 3 and the file type of image 1 (left image) is png, the interpolated views will be named frame0.png, frame1.png, and frame2.png. The first frame (in this case, frame0.png) is always a copy of image 1 (left image) and the last frame (in this case, frame2.png) is always a copy of image 2 (right image).

What follows is for depth map generators that only generate the left depth map:

To generate the right depth map after having generated the left depth map (using any automatic depth map generator that only outputs a left depth map), all you have to do is flip the right image horizontally (that's gonna be your new image 1), flip the left image horizontally (that's gonna be your new image 2), and give the two new images to the depth map generator keeping the minimum and maximum disparities unchanged (and any other parameter unchanged as well). The resulting depth map is the right depth map once it has been flipped horizontally. Flipping an image horizontally is quite easy to do in Photoshop or The Gimp.

In contrast, if your favorite depth map generator generates both depth maps in one shot like DMAG5 or DMAG6, then you don't have to do any of the flipping mentioned above. All you need to do is color invert the right depth map so that the foreground is white and the background is black, just like for the left depth map.

The interpolated views can be used to make an animated 3d gif (wiggle) or to make a lenticular (via interlacing).

If you want to see FSG6 in action, you may want to have a look at 3D Photos - Pumpkins since I give an illustrated example that starts with a stereo pair and ends with a wiggle gif containing the intermediate images computed by FSG6.

The windows executable (guaranteed to be virus free) is available for free via the 3D Software Page. Please, refer to the 'Help->About' page in the actual program for how to use it.

Depth Map Automatic Generator 6 (DMAG6)

DMAG6 is an automatic depth map generator that attempts to minimize a cost function that combines a data matching cost (you want the colors for matching pixels to be similar) and a smoothness cost (you want neighboring disparities to be smooth unless you are at an object boundary). It relies heavily on the very efficient implementation of multi-scale belief propagation described in Efficient Belief Propagation for Early Vision by Pedro Felzenszwalb and Daniel Huttenlocher.
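To give a rough idea of what that cost function looks like, here is a sketch in my own notation (not DMAG6's actual source): the per-pixel matching cost blends a truncated color term with a truncated gradient term, and the smoothness cost penalizes disparity jumps between neighbors up to a cap. All names (colorL, gradL, lam, etc.) are assumptions made for the illustration.

# Rough sketch of the kind of cost a stereo matcher like DMAG6 minimizes
# (my own notation, not the actual source code).
def data_cost(colorL, colorR, gradL, gradR, x, y, d,
              alpha, trunc_color, trunc_grad):
    # Color and gradient terms, each truncated so occluded pixels
    # (visible in only one image) cannot dominate the cost.
    color_term = min(abs(colorL[y][x] - colorR[y][x + d]), trunc_color)
    grad_term = min(abs(gradL[y][x] - gradR[y][x + d]), trunc_grad)
    # alpha close to 0 favors color, alpha close to 1 favors gradient.
    return (1.0 - alpha) * color_term + alpha * grad_term

def smoothness_cost(d_p, d_q, lam, trunc_disc):
    # Penalize neighboring disparities that differ, but cap the penalty so that
    # genuine depth discontinuities (object boundaries) are not over-penalized.
    return lam * min(abs(d_p - d_q), trunc_disc)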

DMAG6 is similar to Depth Map Automatic Generator 3 (DMAG3). Indeed, DMAG3 is also a "global method" stereo matcher but instead of using belief propagation to minimize the cost function, it uses graph cuts.

The disparity map is obtained for both the left image and the right image. This provides a way to figure out the occlusions (and pixels where the depth cannot be evaluated accurately) which are shown in the occlusion map.

Let's go over the parameters that control DMAG6's behavior:

- Minimum disparity is the disparity corresponding to the furthest point in the background.
- Maximum disparity is the disparity corresponding to the closest point in the foreground.
I suggest using Disparity Finder 2 (DF2) to get the minimum and maximum disparity. Better yet, you can use Epipolar Rectification 9b (ER9b) to rectify/align the two images (crucial for good quality depth map generation) and get the min and max disparities automatically.
- Alpha is the term that balances the color matching cost and the gradient matching cost. The closer alpha is to 0, the more importance is given to the color. The closer alpha is to 1, the more importance is given to the gradient. In theory, a higher alpha works better when there's quite a bit of texture in the image while a lower alpha works better when the image is relatively flat color wise.
- Truncation value (color) limits the value the color matching cost can take. It reduces the effects of occluded pixels (pixels that appear in only one image).
- Truncation value (gradient) limits the value the gradient matching cost can take. It reduces the effects of occluded pixels (pixels that appear in only one image).
- Truncation value (discontinuity) reduces the penalty when neighboring disparities are different. In the interior of an object, you want the disparities to be pretty much the same assuming the object is in the same depth plane, therefore you want to penalize disparity changes quite a bit. At the boundary of an object where depths can change abruptly, you don't want to penalize the disparity change too much.
I tend to set the truncation value (color) to 20.0 or 30.0, the truncation value (gradient) to 2.0, and the truncation value (discontinuity) to something very large, like 10000.0. Feel free to experiment though!
- Iteration number controls the number of belief propagation iterations at each level.
- Level number controls how many levels the coarse to fine pyramid has. Multi-scale belief propagation means that belief propagation is performed at different levels of the coarse to fine pyramid. The pyramid is built by recursively downsampling the original images.
- Disparity tolerance (occlusion detection). The larger the value, the more mismatch is allowed (between left and right depth maps) before declaring that the disparity computed at a pixel is wrong. Obviously, the larger the value, the less black the occlusion map will look.
- Window radius (occlusion smoothing).
- Sigma space (occlusion smoothing).
- Sigma color (occlusion smoothing).
The parameters that relate to occlusion detection and smoothing should probably be left alone since they only have an effect on the "occluded" pixels, that is, the pixels that show up in black in the occlusion maps.
- Downsampling factor. This parameter enables DMAG6 to run faster by downsampling the images prior to computing the depth maps. If set to 1, the images are used as is and there's no speedup. If set to 2, the images are resized by reducing each dimension by a factor of 2 and DMAG6 should go 4 times faster. The more downsampling is requested, the faster DMAG6 will go, but the more pixelated the depth maps will look upon completion. Downsampling is a quick way to test what parameters do. If downsampling is turned on, the parameters that are spatial, that is, min and max disparity, window radius, window radius (occlusion smoothing), and sigma space (occlusion smoothing) are automatically adjusted to adapt to the level of downsampling that is requested. In other words, you don't have to wonder if you should change those parameters when switching, for example, from downsampling factor = 1 to downsampling factor = 2 as DMAG6 does it automatically for you.

Here's an example:


Left image after rectification.


Right image after rectification.


Left depth map obtained with DMAG6 using alpha = 0.9, truncation value (color) = 20., truncation value (gradient) = 10., truncation value (discontinuity) = 10000., iteration number = 5, level number = 5, and lambda = 0.5.

More examples (that compare DMAG6 with DMAG5):
3D Photos - Stevenson tombstone
3D Photos - Civil War reenactors
3D Photos - Looking down at the tombstones

The windows executable (guaranteed to be virus free) is available for free via the 3D Software Page. Please, refer to the 'Help->About' page in the actual program for how to use it.

Sunday, October 12, 2014

Flow-based Delaunay Surface Reconstruction

When you scan a three-dimensional object, you end up with a point cloud (points on the surface of the object). The process of creating a three-dimensional model off that point cloud is called "surface reconstruction". There are many surface reconstruction methods but here, we are gonna focus on methods based on the Delaunay triangulation. Actually, we are even gonna restrict ourselves to methods based on the Delaunay triangulation which carve/sculpt the mesh away. The starting point of such methods is a Delaunay triangulation of the convex hull defined by the point cloud, which is then carved away until, hopefully, the mesh accurately represents the scanned object. The pioneering paper on this methodology is: "Surface Reconstruction by Wrapping Finite Sets in Space" by Herbert Edelsbrunner. A more approachable paper is probably: "A geometric convection approach of 3-D reconstruction" by Raphaëlle Chaine. Edelsbrunner talks about flow as the driving force of his algorithm. For Chaine, it's convection, but it's basically the same exact thing. So, I think it may be a good idea to refer to this type of approach as flow-based. Other Delaunay based surface reconstruction approaches include the so-called "crust" algorithms pioneered by Nina Amenta and Marshall Bern but we are not gonna talk about those here.

Flow-based Delaunay Surface Reconstruction in Two Dimensions


The process starts with a Delaunay triangulation of the convex hull. In two dimensions, the Delaunay triangulation of a set of points is such that the circumcircle of any element (circle passing through the three vertices of the element) is empty.

Here's an example of the Delaunay triangulation of the convex hull of a point cloud (note that I did this by hand, so I don't actually guarantee it's a Delaunay triangulation but it does look like one):


We basically want to remove elements from the boundary inward until we get this (the two-dimensional model in mesh form):


Looking very carefully at the Delaunay triangulation of the convex hull, for a given boundary edge and its unique connected element, the circumcenter (center of the circumcircle) is either on the outside side of the boundary edge or the inside. Any time the circumcenter is on the outside side of the boundary edge, it's an element we probably don't want because it's very likely to be an element that's outside whatever the point cloud is supposed to model. Note that if the vertex opposite the boundary edge is inside the smallest circle passing through the edge's end vertices (we will refer to that circle as the circumcircle of the boundary edge), the circumcenter is on the outside side of the boundary edge and vice versa. This means that we have two ways to tell if an element should be removed. In practice, the latter interpretation of the test is preferred.

The two equivalent ways to tell if an element connected to a boundary edge should be removed:


Here's the pseudo code for the carving or sculpting algorithm that starts from the Delaunay triangulation of the convex hull and ends with the proper (or not so proper) mesh of the two-dimensional model:
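In Python-like terms, the loop looks roughly like this. This is a sketch only: the triangulation data structure and the geometric helpers (inside_edge_circumcircle, opposite_vertex, other_edges) are assumed to exist, and the bookkeeping of a real implementation is glossed over.

# Rough sketch of the 2D carving/sculpting loop (data structure and geometric
# helpers are assumed, not actual code).
def carve_2d(triangulation):
    # Start from the Delaunay triangulation of the convex hull.
    front = set(triangulation.boundary_edges())
    while front:
        edge = front.pop()
        tri = triangulation.element_of(edge)   # unique element attached to a boundary edge
        # Remove the element if the vertex opposite the boundary edge lies inside the
        # smallest circle through the edge's end vertices (equivalently, the circumcenter
        # of the element is on the outside side of the edge).
        if inside_edge_circumcircle(edge, opposite_vertex(tri, edge)):
            triangulation.remove(tri)
            # The other two edges of the removed element become boundary edges
            # (dangling edges are glossed over in this sketch).
            for e in other_edges(tri, edge):
                front.add(e)
    return triangulation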


Of course, it can't be that easy, all the time. In some cases, the algorithm above may terminate prematurely. Here's an example:


When this happens, the element must be deleted manually or some heuristics based on element size must be used. The algorithm above then continues where it left off. This is clearly the Achilles' heel of the method since any decision making based on size can be dicey when sampling is not dense enough.

Flow-based Delaunay Surface Reconstruction in Three Dimensions


In three dimensions, things are a little bit more complicated but the methodology is exactly the same as in two dimensions. Instead of triangles, the Delaunay triangulation of the convex hull of the point cloud is made up of tetrahedra. The test to see whether an element connected to a boundary face should be removed from the mesh involves either i) computing the circumsphere of the tetrahedron and determining if the circumcenter is on the outside side of the boundary face or ii) computing the circumsphere of the boundary face (the smallest sphere passing through its three bounding vertices) and determining if the vertex opposite the boundary face is inside. Unlike the two dimensional case, there is a bit of a problem when the element connected to a boundary face is bounded by another boundary face. In that case, you can remove the element if both tests (on both boundary faces) tell you that you should remove the element, or if just one test (on one of the two boundary faces) tells you that you should remove the element. The latter is obviously more aggressive than the former at carving elements away.
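To make the two options concrete, here is a small sketch of the removal decision for an element with one or two boundary faces. The names are mine, and removable_from(face) stands for the circumsphere test described above applied to one boundary face.

# Sketch of the removal decision in 3D (names are mine, not from any actual code).
def should_remove(boundary_faces_of_element, removable_from, aggressive=False):
    tests = [removable_from(f) for f in boundary_faces_of_element]
    if len(tests) == 1:
        return tests[0]
    # Element bounded by two boundary faces: removing when either test says so
    # carves more aggressively than requiring both tests to agree.
    return any(tests) if aggressive else all(tests)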

Here's the pseudo code for the carving or sculpting algorithm that starts from the Delaunay triangulation of the convex hull and ends with the proper mesh of the three-dimensional model:


In reality, it's a bit more complicated than that since you also have to deal with boundary faces that become disconnected (dangling boundary faces), which may have to be carved away as well (using a methodology similar to the two dimensional case). It may come as no surprise that when all is said and done, you may have to manually delete some elements (or use some potentially dangerous heuristics based on size) and restart in order to get the mesh you really want. The big issue is how much of this manual deleting has to be done. In two dimensions, it may not be too big of a deal but in three dimensions, it may be a problem. Unless there's a "good" automated way to perform those manual deletions, I don't see how a Delaunay based surface reconstruction like this one can be that useful.

What this methodology is lacking is a global view of the problem, especially when you have to do manual deletion to "unstuck" the algorithm. It's very hard to tell if an element should be removed (is outside) when all the information at your disposal is very local. This is in contrast with another Delaunay-based explicit method called Eigencrust ("Spectral Surface Reconstruction from Noisy Point Clouds" by Ravikrishna Kolluri, Jonathan Richard Shewchuk, and James F. O’Brien) which partitions a graph that connects the elements to find the surface in one global operation (a costly proposition though, clock-wise).

Delaunay-based methods like this one are referred to as explicit because there's no surface approximation involved; in other words, the surface passes through points that are actually in the cloud (to be pedantic, the surface is interpolated, not approximated). This is unlike implicit methods like Poisson or "marching cubes" that approximate the object's surface. Note that those implicit methods need a good normal extrapolation scheme to be effective whereas an explicit method doesn't. If one is able to get decent normals that are consistent with one another, a Poisson surface reconstruction ("Poisson Surface Reconstruction" by Michael Kazhdan, Matthew Bolitho and Hugues Hoppe) is extremely hard to beat.

Saturday, October 11, 2014

View Interpolation (forward and backward mapping/warping)

Given a stereo pair and two depth maps, the problem of getting an intermediate frame is known as view interpolation. The following blurb is heavily inspired by this academic paper: Fast View Interpolation from Stereo: Simpler can be Better by N. Martin and S. Roy. We are gonna look at two ways to get the interpolated image: forward mapping (warping) and backward mapping (warping). In both cases, an interpolated image using the left image and depth map and an interpolated image using the right image and depth map are built, and the two are combined to form the final interpolated image. The interpolated image is defined by the parameter alpha (alpha = 0 corresponds to the left image and alpha = 1 corresponds to the right image).

Let's look at how the left and right interpolated images are defined:


Maybe it's easier to grasp if you think about shifting pixels. To get the left interpolated image, you shift the left image's pixels to the left according to the left depth map. To get the right interpolated image, you shift the right image's pixels to the right according to the right depth map. It's as simple as that.

Forward Mapping (Warping)


Here's the typical pseudo code to get the left interpolated image using forward mapping:


Clearly, the term xL - alpha*dL(xL) is not an integer in most cases. The easiest way to deal with this problem is to round it to the nearest integer. The hardest way is probably to add color contributions to the 2 nearest pixels on either side of xM'. This is way beyond the scope of this blurb but it is known as "splatting" if you want to delve into it. The resulting interpolated image will have cracks and holes. The cracks come from that nearest integer business (no big deal) and the holes come from parts of the scene that are newly revealed.
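Here is a rough sketch of forward mapping for the left interpolated image using nearest-integer rounding (no splatting). It is my own illustration, not FSG6's code; left is an image array, dispL the left disparity map (larger disparity meaning closer, as in the posts above), and holes are reported in a separate mask so they can be filled later.

import numpy as np

# Rough sketch of forward mapping for the left interpolated image
# (nearest-integer rounding, no splatting); not FSG6's actual code.
def forward_warp_left(left, dispL, alpha):
    h, w = dispL.shape
    out = np.zeros_like(left)
    filled = np.zeros((h, w), dtype=bool)   # False where a crack or hole remains
    best = np.full((h, w), -np.inf)         # keep the most foreground contribution
    for y in range(h):
        for x in range(w):
            xm = int(round(x - alpha * dispL[y, x]))
            if 0 <= xm < w and dispL[y, x] > best[y, xm]:
                out[y, xm] = left[y, x]
                best[y, xm] = dispL[y, x]
                filled[y, xm] = True
    return out, filled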


The right interpolated image can be obtained in a similar fashion. As for the left interpolated image, the right interpolated image will exhibit cracks and holes. When the two are combined, it is hoped that all holes will be filled. In reality, it can be a little bit more complicated than that as, for a given pixel of the interpolated image, one can decide whether the color information should come from the left image only, the right image only, or both.

As a side note, if you don't have a right depth map and therefore there is no right interpolated image, holes (coming from a left image and a left depth map) are usually filled by getting the first non-empty pixel to the right and using its color to fill the hole, line by line. It's easy to spot as it produces very distinctive streaks. Another option is to propagate its color toward the left but considering the whole image (as opposed to line by line).

Backward Mapping (Warping)


The idea behind backward mapping is that, given a pixel xM in the intermediate image, you want to be able to get its color by using linear interpolation on the left (right) image. Because of this, the interpolated image will be guaranteed to have no cracks or holes. It doesn't mean the interpolated image will be perfect. The holes that you would get with a forward mapping won't be there but their filling (inpainting) might not be the best.

Here's some pseudo code to get the left interpolated image using backward mapping (warping):


The above can be done scanline by scanline (scanline is a fancy way of saying horizontal line). There might be more than one segment that contains xM. In that case, it's a good idea to consider the segment corresponding to the largest disparity (the most foreground object). Also, the segment search needs only be done within the minimum and maximum disparity (times alpha) that corresponds to the stereo pair.
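Here is a rough scanline sketch of backward mapping for the left interpolated image (again my own illustration, not FSG6's code): for each target pixel xm, we look for a pair of neighboring source pixels whose warped positions bracket xm, prefer the segment with the largest disparity, and linearly interpolate the color. d_min and d_max are the stereo pair's disparity range.

import numpy as np

# Rough scanline sketch of backward mapping for the left interpolated image
# (not FSG6's actual code).
def backward_warp_left(left, dispL, alpha, d_min, d_max):
    h, w = dispL.shape
    out = np.zeros_like(left)
    for y in range(h):
        warped = np.arange(w) - alpha * dispL[y]      # where each source pixel lands
        for xm in range(w):
            # Only source pixels within alpha*[d_min, d_max] of xm can map onto it.
            lo = max(0, int(np.floor(xm + alpha * d_min)) - 1)
            hi = min(w - 2, int(np.ceil(xm + alpha * d_max)))
            best_d = -np.inf
            for xl in range(lo, hi + 1):
                a, b = warped[xl], warped[xl + 1]
                if min(a, b) <= xm <= max(a, b):      # the segment [a, b] contains xm
                    d = max(dispL[y, xl], dispL[y, xl + 1])
                    if d > best_d:                    # keep the most foreground segment
                        best_d = d
                        t = 0.0 if a == b else (xm - a) / (b - a)
                        out[y, xm] = (1 - t) * left[y, xl] + t * left[y, xl + 1]
    return out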

The right interpolated image can be obtained in a similar fashion. The two interpolated images are then combined to give the final interpolated image.