Friday, February 28, 2014

2D to 3D Conversion - The Girl with the Pearl Earring

This is an attempt at the 2D to 3D conversion of a famous painting by Vermeer called "The Girl with the Pearl Earring" using Depth Map Automatic Generator 4 (DMAG4) - free semi-automatic 2D to 3D image converter.


Here she is, the girl with the pearl earring in all her flat two-dimensional glory.


Sparse depth map painted in Gimp.

Painting in the depth map can be quite challenging, especially when you have depth changes occurring in two directions. A good example of this is the forehead, which recedes horizontally as well as vertically. That can be challenging to scribble correctly. Here, I chose to use the gradient option in the pencil toolbox. The difficulty though is getting the fade length of the gradient correct. The alternative is to make a selection and run the gradient fill, which should work also.
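For those who prefer to see what a gradient fill amounts to outside of Gimp, here's a minimal numpy sketch. The function name and the rectangle-only selection are my own simplifications; Gimp's gradient fill of course handles arbitrary selections and directions:

```python
import numpy as np

def gradient_fill(depth, row0, row1, col0, col1, g_start, g_end):
    """Fill a rectangular selection of a grayscale depth map with a
    vertical linear gradient from g_start (top row) to g_end (bottom row),
    mimicking a Gimp gradient fill over a selection."""
    ramp = np.linspace(g_start, g_end, row1 - row0)  # one gray value per row
    depth[row0:row1, col0:col1] = ramp[:, None]      # broadcast across columns
    return depth

# e.g. a forehead that recedes vertically: bright (near) at the top of the
# selection, darker (farther) toward the hairline
depth = np.zeros((100, 100), dtype=float)
depth = gradient_fill(depth, 10, 40, 20, 80, 230, 180)
```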


Dense depth map obtained with DMAG4.

Here, I am gonna go the extra mile and smooth the depth map with Edge Preserving Smoothing 2 (EPS2).


Dense depth map smoothed with EPS2.


3D scene rendered in Gimpel3D. As you can see, it's far from being perfect but that should do for our purposes.

Note that Gimpel3D seems to have trouble loading depth maps that are indexed (256 levels of gray), so it may be a good idea to convert the depth map to RGB prior to loading it into Gimpel3D.


Here's the wigglegram created in Gimp. In Gimpel3D, I positioned the viewing plane right in front of the nose and saved the left and right images for eye separations of 4.0 and 2.0. If you add the reference image, that gives a total of 5 frames for the 3D animation.

Tuesday, February 25, 2014

Frame Sequence Generator 4 (FSG4)

Frame Sequence Generator 4 (FSG4) generates synthetic frames (virtual views) given a reference image and a disparity/depth map. The inpainting process which fills areas that become disoccluded uses a Gaussian blur (controlled by radius and gamma proximity).

If image.xxx is the reference image, the frames generated will be named frame01.xxx, frame02.xxx, etc. The file format of the reference image dictates the file format of the generated frames. For example, if the reference image is in the png image file format, the generated frames will be in the png file format. Same idea for jpeg and tiff, the other two main image file formats.

Below is a little blurb about what each parameter does.

- Stereo window (grayscale value): Positions the stereo window in the depth map. As you know, 0 is black and 255 is white, so any value between 0 and 255 positions the stereo window in the depth map. If the stereo window is set to 0, the stereo window is in the background, which is gonna remain stationary while the foreground moves. If the stereo window is set to 255, the stereo window is in the foreground, which is gonna remain stationary while the background moves. Clearly, the stereo window can be set anywhere between 0 and 255.
- Stereo effect (percentage): The larger the stereo effect, the more things are gonna move and the harder it's gonna be to paint the disoccluded areas.
- Number of frames: No need to explain further. If it's an odd number, the reference image will be one of the frames, the one in the middle of the sequence (for example, frame 3 if 5 frames are requested).
- Radius: The larger the radius, the more blurring will be applied to the inpainted areas.
- Gamma proximity: Controls the shape of the bell curve that's used in the Gaussian blur. The larger the gamma, the flatter the bell curve and the more blurring is applied for a given radius.
- Maxiter: Controls the number of iterations performed when solving the equations needed to solve the Gaussian blur.
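To make the stereo window and stereo effect parameters a bit more concrete, here is a toy numpy sketch of the kind of per-pixel shifting they imply. The shift formula is my own guess at how such a program might work, not FSG4's actual code, and a real view synthesizer also has to worry about occlusion ordering:

```python
import numpy as np

def synth_view(image, depth, window=128, effect=2.0):
    """Toy forward-warp of a grayscale image given its depth map.
    Pixels whose depth equals the stereo window's gray level stay put;
    pixels in front of / behind the window shift in proportion to their
    distance from it.  `effect` is taken as a percentage of image width
    (an assumption).  Disoccluded pixels are left as NaN, to be filled
    by a later inpainting pass (FSG4 uses a Gaussian blur for that)."""
    h, w = depth.shape
    max_shift = effect / 100.0 * w
    out = np.full_like(image, np.nan, dtype=float)
    for y in range(h):
        for x in range(w):
            s = int(round((depth[y, x] - window) / 255.0 * max_shift))
            if 0 <= x + s < w:
                out[y, x + s] = image[y, x]
    return out  # NaNs mark the disoccluded areas
```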

As far as inpainting quality is concerned, the parameter that gives the most bang for the buck is without a doubt the radius. If the inpainted areas do not look quite right, the radius should be increased (so that more neighboring pixels are involved in the blurring). If the inpainted areas look horrible no matter what in some places, it's very likely the depth map is bad in those areas.

The disparity/depth map should be a grayscale image where white indicates the closest foreground object and black indicates the furthest background object. The disparity/depth map doesn't need to come from any of my programs.

Here's an example:


Reference image.


Dense depth map.


Wigglegram/wobble made up of 12 synthetic views.

The windows executable (guaranteed to be virus free) is available for free via the 3D Software Page. Please, refer to the 'Help->About' page in the actual program for how to use it.

Saturday, February 22, 2014

2D to 3D Conversion - Who's Next

Here's an attempt at doing a 2D to 3D image conversion of The Who's "Who's Next" album cover using Depth Map Automatic Generator 4 (DMAG4) - free semi-automatic 2D to 3D image converter.


Here's the album cover as we know it in 2D.


This is the sparse depth map generated in Gimp using the pencil tool.


Here's the dense depth map obtained using DMAG4, the semi-automatic 2D to 3D image converter.


Video showing the 3D rendered scene in Gimpel3D.


Wigglegram (made with just 2 frames). The left and right images were obtained in Gimpel3D. The stereo window (viewing plane in Gimpel3D) was placed right at the block's leading edge.

Gimpel3D can be used to generate the left and right images for different eye separation values. Those in turn can make up a nice wigglegram. Frame Sequence Generator 4 (FSG4) can also be used to generate the left and right images for an equally nice wigglegram.

I am not a big fan of the anaglyph format, mostly because it requires red-cyan glasses which not everybody has or even wants to use. I much prefer an animated 3D gif (wigglegram) or a 3D rendered scene.

Thursday, February 20, 2014

2D to 3D Conversion - Van Gogh Painting

The following is an example of a 2D to 3D image conversion using Depth Map Automatic Generator 4 (DMAG4) - free semi-automatic 2D to 3D image conversion software. This time, the victim is a Vincent Van Gogh painting.


Reference image.


Sparse depth map obtained by scribbling on a new layer with the pencil tool in Gimp. Transparent pixels show up in white here (they show up in red in DMAG4.)


Dense depth map obtained with DMAG4.


Video showing the rendered scene in 3D (using Gimpel3d). The left and right images can be exported in Gimpel3D once you're satisfied with the focal length (viewing plane position) and eye separation.

Clearly, this ain't gonna win me any prize at the next 2D to 3D image conversion convention. The hard part is figuring out at what depth things in the reference image are supposed to be. Getting the dense depth map from the sparse depth map is really the easy part.

If you have made some 2D to 3D image conversions using DMAG4, I would be happy to feature them on this very blog.

Sunday, February 16, 2014

2D to 3D Conversion - Portrait of a Redditor

This is a 2D to 3D image conversion done using Depth Map Automatic Generator 4 (DMAG4) - free 2D to 3D semi-automatic image converter.


This is the image I am gonna try to convert into 3d, a portrait in front view.

To generate the sparse depth map needed by DMAG4 (made up of "clumsy" brush strokes), I made use of key measurements of the human head when viewed in profile:


Key dimensions of the average human male head in profile. The drawing, grid included, is a copy of a drawing in "Drawing the Head and Hands" by Andrew Loomis.

Then I set up a correspondence between gray levels (recall that 255 is white and 0 is black) and distances (in cm):

255 -> 0 cm
245 -> 2.1 cm
235 -> 4.2 cm
225 -> 6.3 cm
...
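That correspondence is just a linear mapping (every 10 gray levels below white adds 2.1 cm of depth), which can be written out as a pair of little helper functions (the function names are mine):

```python
def gray_to_cm(gray):
    """Gray level -> distance in cm, per the table above:
    255 is 0 cm, and every 10 gray levels below white adds 2.1 cm."""
    return (255 - gray) / 10.0 * 2.1

def cm_to_gray(cm):
    """Inverse mapping: distance in cm -> gray level for the pencil tool."""
    return round(255 - cm / 2.1 * 10)
```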

Then I went to work in Gimp using the pencil tool on a transparent layer:


This is the sparse depth map I made in Gimp (here, shown on top of the reference image).


This is the sparse depth map ready to be fed into DMAG4. Note that the white areas are really transparent. In gimp, those areas show up with a checkerboard pattern but here, on blogger, they show up as white (red in DMAG4).

In Gimp, there are a few things you kinda need to know in order to make your life just a bit easier when working on a sparse depth map:

1) How to create a gradient driven brush stroke between two already colored areas? Click on "Dynamics"->"Color from Gradient" in the pencil tool options, use the eye dropper tool so that the foreground color corresponds to your start point, use the eye dropper tool to make the background color your end point, and draw with the pencil from the start point to the end point. The brush stroke will have a gradient that goes from the foreground color to the background color. You can control the fade from foreground color to background color with the "Fade length" slider under "Dynamics Options" in the pencil toolbox.

1b) How to fill a selected area with a color gradient? I think this is a much better way to do gradient stuff than what's described in 1). What you do is make a polygonal selection with the "Free Select Tool", choose a foreground and background color, and then use the "Blend Tool" to create a gradient fill between the foreground color and the background color within your selected area. Note that the gradient fill is gonna be anti-aliased so you will want to do 3) at some point.

2) How to use the eraser? That's easy enough, just pick up the eraser tool but make sure you check the "Hard edge" box in the eraser tool options.

3) How to make all the pixels in a layer either fully transparent or fully opaque? If, for whatever reason, some pixels in the layer are semi-transparent due to anti-aliasing shenanigans, you need to make those semi-transparent pixels either fully transparent or fully opaque. That's easy enough if you click on "Layer"->"Transparency"->"Threshold Alpha ...", and click OK. Now, all your pixels in the layer are either transparent or opaque. DMAG4 checks for the presence of semi-transparent pixels in the sparse depth map and complains vividly if there are some.
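For reference, the effect of "Threshold Alpha" can be sketched in a few lines of numpy. This is my reading of what Gimp does, and the default cutoff value is an assumption:

```python
import numpy as np

def threshold_alpha(rgba, cutoff=127):
    """Sketch of Gimp's Layer->Transparency->Threshold Alpha: every
    pixel's alpha becomes either fully transparent (0) or fully opaque
    (255), which is what DMAG4 expects in a sparse depth map."""
    out = rgba.copy()
    out[..., 3] = np.where(rgba[..., 3] > cutoff, 255, 0)
    return out
```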


This is the dense depth map automatically created by DMAG4.

It took me a couple of iterations of going into Gimp and running DMAG4 to get this dense depth map. Between Gimp and DMAG4, I also use Gimpel3D to "visualize" the depth map in 3d and make corrections to the sparse depth map which I then re-feed into DMAG4 (I use Gimpel3d just for visualization purposes although it could be used for much more).

In Gimpel3d, use the "Single Frame" tab to load the reference image ("Load Fresh Frame") and the depth map ("Import Depth Mask"). Note that you may have to flip the image and the depth map vertically in the "Image Settings" tab if they are upside down (you either do nothing or you flip them both). To view the scene in 3D, click on the "3D" button. Use the combo "Alt"+"left mouse button" to rotate and "Alt"+"right mouse button" to zoom in or out. To control the depth, first click on "Default Layer" (the scene will get a red border) in the lower right menu and then click on "Depth" in the upper right menu. Moving the slider will change the depth of the scene.


This is the 3d scene as rendered in Gimpel3d.

Gimpel3d can be used to get the left and right images, assuming the reference image is the center image. Before doing that, you may want to change the location of the viewing plane (defines the stereo window) and/or the eye separation (determines the severity of the stereo effect). Click on "Views"->"Show Camera" to see where the camera and the viewing plane are. To change the location of the viewing plane, click on "Stereo Settings" in the upper right menu and move the slider for the "Focal Length". To change the eye separation, move the slider for "Eye Separation". The "Linear Scale" slider maintains the ratio between the focal length and the eye separation. Click on "File"->"Export Files" to get to the export options screen and click on "Save Left" to save the left image and "Save Right" (bottom row of buttons below "Export Current Frame Only") to save the right image. You can also save the anaglyph with "Save Anaglyph". I do have the impression, however, that what Gimpel3d calls the left image is actually the right image and vice versa. Indeed, the right image should have a gap fill on the right side and the left image should have the gap fill on the left. Here, I am getting the opposite effect and that's quite odd. In any case, this is to be verified. For this particular case, note that the reference image and depth map had to be flipped vertically; maybe that's got something to do with it.


This is the 3d scene as rendered in Gimpel3d showing the viewing plane.


Left image.


Right image.


Animated GIF created in StereoPhoto Maker (as is, without changing the location of the viewing plane) switching between the left and right images.

If you want to generate more frames for a smoother animated GIF or a lenticular, the easiest way is probably to keep reducing the eye separation in Gimpel3d and save the left and right images.

Thursday, February 6, 2014

Depth Map Automatic Generator 4 (DMAG4)

DMAG4 is a semi-automatic 2D to 3D image converter. DMAG4 is an implementation of what's explained in Semi-Automatic 2D to 3D Image Conversion using Random Walks.

The sparse depth map (which the program requires) is a transparent image partially filled with brush strokes in various shades of gray to indicate (approximate) depth. This is the non-automatic part. So, how do you create that sparse depth map? In Gimp or Photoshop, you open the reference image and create a new transparent layer on top of the reference image. This is where you are gonna draw the sparse depth map. The brush strokes should be done with the pencil tool using a 100% hard brush so that any painted pixel is fully opaque (no anti-aliasing). To verify that you are not generating semi-transparent pixels, you may want to zoom in on your painted brush strokes and look at the edges. They should be hard and look jagged (no anti-aliasing involved). In case semi-transparent pixels get created for whatever reason (maybe you used a tool with anti-aliasing turned on), they can be easily removed prior to saving the sparse depth map by doing the following (in Gimp): click on "Layer"->"Transparency"->"Threshold Alpha ...", and click OK. After you are done scribbling on the sparse depth map layer, save the layer (only the layer) in whatever format you want as long as it is lossless (png for example). To save the layer only, you can simply make the reference image not visible by clicking off the eye icon in the layer dialog box.

Let's go through the parameters DMAG4 uses. Then, we will introduce the notion of an "edge image" to simplify the process.

The maximum number of iterations is a parameter used in the linear equation solver (which happens to be Lis, a Library of Iterative Solvers for Linear Systems). If you see that the depths don't seem to propagate correctly across areas of similar color (in the reference image), it's probably because the number of iterations is set too low.

The beta parameter is explained in Semi-Automatic 2D to 3D Image Conversion using Random Walks. The lower the beta, the smoother the depth map is going to be, although the more bleeding may occur across object boundaries (if no "edge image" is used). In other words, the lower the beta, the easier it is for depths to propagate and the easier it is for depths to bleed across object boundaries (if no "edge image" is used). If an "edge image" is used, beta can be lowered quite a bit because bleeding is prevented by the existence of the "edge image". With a low beta, depths propagate very easily within an object as the bilateral filter (that's really what DMAG4 is) becomes a mere Gaussian filter.
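For the curious, a typical random-walks edge weight looks something like this. This is the standard formula from the literature; DMAG4's exact implementation may differ:

```python
import numpy as np

def edge_weight(c_i, c_j, beta):
    """Weight of the graph edge between two neighboring pixels with
    colors c_i and c_j (RGB triples in [0, 255]).  A high weight means
    similar colors, so depths propagate easily across that edge.
    Lowering beta flattens the weights toward 1 everywhere, which is
    why a low beta behaves like a plain Gaussian filter."""
    d2 = np.sum(((np.asarray(c_i) - np.asarray(c_j)) / 255.0) ** 2)
    return np.exp(-beta * d2)
```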

The number of scales controls the size of the scale space. DMAG4 can build a scale space in order to mitigate the effects of noise or fine texture in the reference image (at the expense of respect of object boundaries). The more scales, the smoother the depth map is going to be, the more bleeding is going to occur, the longer it is going to take and the more memory is going to be used. If you use an "edge image", setting this parameter to 1 should be just fine.

When the level of graph connection within a scale (con_level) is set to 1, each pixel is connected (by a graph edge) to four adjacent pixels (left, right, top, and bottom). If con_level is set to 2, it is connected to eight adjacent pixels (the four from level 1 and four more situated diagonally). When the level of graph connection across scales (con_level2) is set to 1, each pixel is connected to one pixel on the previous and next scale. If it is set to 2, each pixel is connected to five pixels on the previous and next scale. If it is set to 3, each pixel is connected to nine pixels on the previous and next scale. The more connections, the more the depths get propagated. Using con_level = 1 and con_level2 = 1 is probably a good idea and there's really no need to change that.
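In code, the two within-scale connectivity levels just correspond to two sets of neighbor offsets (a trivial sketch with my own function name):

```python
def neighbors(con_level):
    """Neighbor offsets (row, col) within one scale: con_level 1 gives
    the 4-connected neighborhood, con_level 2 adds the diagonals for
    the 8-connected one."""
    offs = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if con_level == 2:
        offs += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    return offs
```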

Now, let's talk about the "edge image" which is really supposed to make everything easier at not much of a cost for the user.

In order to eliminate the bleeding of depths across objects of similar colors, it is possible to give a so-called "edge image" to DMAG4. An edge image is basically a trace over the object boundaries of the reference image. Just like for the sparse depth map, it is created by adding a transparent layer over the reference image and tracing over the object boundaries. The tracing on the edge image should be 1 pixel wide (in most places). The presence of an edge image enables you to be less conservative regarding possible bleeding. Thus, beta can be lowered (say, from 90. to 10.) and the number of scales increased with less fear. Also, the sparse depth map probably doesn't need to be as dense. Note that the edge image does not need to be complete, that is, only areas where bleeding occurs need to be traced over. If you do use an edge image, make sure con_level and con_level2 are set to 1. To sum things up and be perfectly clear, DMAG4 does not need an edge image but if colors are similar between foreground and background, it is probably a good idea to create an edge image to avoid bleeding in some areas.

What is the best way to trace over object boundaries to create the "edge image"? In Gimp, I like to use the "Paths Tool" and draw a path along the object boundaries by left-clicking. When done drawing the path, click on "Stroke Path" in the "Tool Options" of the "Paths Tool", click on "Stroke line", click on "Solid color", uncheck the "Antialiasing" box, choose a "Line width" of 1.0 pixel, and click on "Stroke". When you zoom in on the stroke, it should be 1 pixel wide with no antialiasing. The color of the stroke does not matter although I like to use red in most cases. It is very easy to do.

If DMAG4 gives an error about the pixels being semi-transparent, it's because you probably used a tool that's anti-aliased (soft edges) at some point. To fix this in Gimp (Photoshop is probably similar), click on "Layer"->"Transparency"->"Threshold Alpha ...", and click OK. All your semi-transparent pixels are now either transparent or opaque and DMAG4 will thank you for it.

Here's a post that shows the process: 3D Image Conversion - Top Gun.

Here's a video tutorial for DMAG4 used with an "edge image" (I am so sorry there is no sound):


If you have any problem with DMAG4, feel free to send your reference image, sparse depth map, edge image (if any), and dense depth map (if you have gotten that far) to the email that should be somewhere in the right sidebar.

The windows executable (guaranteed to be virus free) is available for free via the 3D Software Page. Please, refer to the 'Help->About' page in the actual program for how to use it.

Wednesday, February 5, 2014

Semi-Automatic 2D to 3D Image Conversion using Random Walks

The "Random Walks" methodology used for semi-automatic 2D to 3D conversion is quite similar to the one used in semi-automatic image segmentation (see "Random Walks for Image Segmentation" by Leo Grady). The goal of image segmentation is to split an image into a set of homogeneous regions, say, in terms of color. A semi-automatic image segmentation approach relies on having the user indicate the number of regions (say, K) and paint a brush stroke (or less) within each region. The painted pixels are called "seeds". If one considers an image to be a graph (by connecting a pixel to its neighboring pixels with a weighted edge), the question that is to be asked for each unseeded pixel is as follows: Given a random walker starting at that pixel, what is the probability that he will first reach a seed associated with region S (S goes from 1 to K)? The edges are weighted (say, from 0 to 1) knowing that a random walker is much more likely to go along an edge whose weight is high (connected pixels are similar in color) than an edge whose weight is low. For now, let's assume this problem can be solved. It is then a matter of picking the highest probability for a given unseeded pixel to obtain the region it would reach first and associate the pixel with that region. Do this for every unseeded pixel and you've got your segmentation.

Now, how do you solve the "Random Walker" problem? Well, one way of doing so is by using an electrical circuit analogy. Given a region S, ground all seeded pixels that are not associated with S, apply a voltage of unity to all seeded pixels associated with S, and solve for the unknown voltages using Kirchhoff's Current Law and Ohm's Law. It's a linear system of equations which can be solved rather easily. In all cases, the obtained voltages are the probabilities we were discussing just above. The following pictures kinda explain the main points of the method. A great deal comes from Richard Rzeszutek's Master's thesis entitled "Image Segmentation through the Scale Space Random Walker".


Using Kirchhoff's Current Law and Ohm's Law to obtain the equation each graph node (pixel) must satisfy.


Edge weights.


Matrix assembly.

Well, we can use the exact same methodology to do semi-automatic depth map generation for 2D to 3D image conversion. Again, the user is asked to work a little, but this time, he/she has to put down some depth values (shades of gray) using brush strokes, creating a sparse depth map. The depth values which vary from 0 to 255 (the grayscale range) are converted into voltages which vary from 0 to 1. Just like with image segmentation, using Kirchhoff's Current Law and Ohm's Law, we obtain a system of linear equations, which, when solved, gives the unknown voltages at the pixels that were not painted. Multiply the voltage by the grayscale range and you get a grayscale intensity or depth.
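Here is a toy 1D version of that linear system, using scipy to clamp the seeded voltages and solve for the unseeded ones. This is a sketch of the method on a chain of pixels, not DMAG4's actual code (which works on the full 2D/3D graph with a dedicated iterative solver):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def random_walk_depth(weights, seeds):
    """Pixels 0..n-1 on a chain; weights[i] is the edge weight between
    pixels i and i+1; seeds maps pixel index -> clamped voltage in [0, 1].
    Kirchhoff + Ohm give a graph Laplacian; clamping the seeds turns it
    into a linear system for the unseeded voltages."""
    n = len(weights) + 1
    L = sp.lil_matrix((n, n))
    for i, w in enumerate(weights):
        L[i, i] += w
        L[i + 1, i + 1] += w
        L[i, i + 1] -= w
        L[i + 1, i] -= w
    free = [i for i in range(n) if i not in seeds]
    # move the known (seeded) voltages to the right-hand side
    b = np.zeros(len(free))
    for r, i in enumerate(free):
        for j, v in seeds.items():
            b[r] -= L[i, j] * v
    A = L.tocsr()[free, :].tocsc()[:, free]
    x = spla.spsolve(A, b)
    volts = dict(seeds)
    volts.update(zip(free, x))
    return [volts[i] for i in range(n)]

# uniform weights, black seed at one end, white seed at the other:
# the solved voltages form a linear ramp, i.e. a smooth depth gradient
vals = random_walk_depth([1.0, 1.0, 1.0], {0: 0.0, 3: 1.0})
```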

Random Walks, as is, is quite sensitive to noise, not unlike many computer vision methodologies. To alleviate this problem, the Ryerson team uses a scale space approach which they call SSRW (Scale Space Random Walks) as opposed to plain RW (Random Walks). In the scale space approach, the original image is smoothed (convolved by an isotropic Gaussian kernel) as many times as desired. By stacking the images on top of each other, the graph is given an extra dimension to make it a 6-connected three-dimensional graph. This apparently improves the quality of the depth maps produced but it comes at a price: the number of equations to solve is multiplied by the number of (smoothing) levels. See "Image Segmentation using Scale-Space Random Walks" by Richard Rzeszutek, Thomas El-Maraghi, and Dimitrios Androutsos for more info on SSRW.
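The scale space itself is nothing more than a stack of progressively blurred copies of the image. Here's a sketch using scipy's Gaussian filter; the actual kernel sizes SSRW uses are not something I am specifying here:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def scale_space(image, n_scales, sigma=1.0):
    """Build the stack of progressively smoothed images used by SSRW:
    scale 0 is the original, and each further scale blurs the previous
    one a bit more.  Stacking them turns the 4-connected 2D pixel graph
    into a 6-connected 3D one."""
    scales = [image.astype(float)]
    for _ in range(n_scales - 1):
        scales.append(gaussian_filter(scales[-1], sigma))
    return np.stack(scales)  # shape: (n_scales, height, width)
```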


Scale Space Random Walks.

The following pictures were taken from "Semi-Automatic 2D to 3D Image Conversion Using Scale-Space Random Walks and a Graph Cuts Based Depth Prior" by R. Phan, R. Rzeszutek and D. Androutsos to illustrate the SSRW process:


Reference image


Reference image with painted strokes showing desired depths (sparse depth map)


Depth map obtained with SSRW