Friday, October 11, 2013

An example of 2D to 3D conversion using Gimpel3D

Lately, I've been looking into ways to convert flat 2D images into 3D scenes, either directly or via a depth map. One can certainly create depth maps using Photoshop or GIMP, but I have always been intrigued by 3D scene reconstruction. One program (free, and it runs on Windows) that can do that has caught my eye: Gimpel3D. Gimpel3D was originally written to convert video streams, but it can certainly be used to convert a single frame or image (although it might be a tad overkill just to get a depth map).

The documentation for Gimpel3D is not the best, but you can pretty much figure out what all the tools do thanks to the HTML "help" inside the program. Some things are not very well explained though, in particular the "Alignment Options" and "Gap Fill".

Something that should be stressed: whatever you do in Gimpel3D in terms of modeling, when you look at the scene straight on from the virtual camera (the red dot when Views->Show Camera is checked), you will always see the 2D picture you loaded, as if the scene were flat. This explains why things may look weird when you look at the 3D scene from other viewpoints.
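To see why that is, here is a minimal Python sketch. It assumes a simple pinhole camera and that modeling only slides each pixel nearer or farther along the line through the camera; that is my reading of the behaviour, not something taken from Gimpel3D's code. The names `project` and `push_along_ray` and the sample coordinates are made up for the illustration.

```python
def project(point, camera=(0.0, 0.0, 0.0), focal=1.0):
    """Pinhole projection of a 3D point for a camera looking down the +z axis."""
    x, y, z = (p - c for p, c in zip(point, camera))
    return (focal * x / z, focal * y / z)


def push_along_ray(point, factor):
    """Slide a point along the ray from a camera at the origin by scaling it."""
    return tuple(coordinate * factor for coordinate in point)


flat_point = (0.5, -0.25, 2.0)                   # a pixel on the flat picture
modeled_point = push_along_ray(flat_point, 1.5)  # same pixel, pushed back in depth

original_camera = (0.0, 0.0, 0.0)
shifted_camera = (0.5, 0.0, 0.0)                 # a viewpoint off to the side

# From the original camera both points land on the same image position.
print(project(flat_point, original_camera), project(modeled_point, original_camera))

# From the shifted camera they no longer coincide, so the depth becomes visible.
side_view_flat = project(flat_point, shifted_camera)
side_view_modeled = project(modeled_point, shifted_camera)
print(side_view_flat, side_view_modeled)
```

Under that assumption, any depth error is invisible from the virtual camera and only shows up once you orbit around the scene.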

In the following YouTube video, I've tried to reconstruct the Mona Lisa (actually, a close-up) using Gimpel3D:


Here's the corresponding depth map:


As you can see, there are some problems: the nose is a bit on the manly side and the eye sockets are set way too deep. There's also a depth discontinuity between the face and the hair on both sides. It's not really Gimpel3D's fault as the model file for the human head is probably not the greatest for our Mona Lisa.

Here's another YouTube video where I attempt to model the nose of the Mona Lisa using 5 planes (layers) and the "orientation" tool:


Here's the depth map:


I guess I could have also used the "scale" tool but kinda forgot about it when I made the video. In theory, one could model the whole face manually using a bunch of facial planes (see books about drawing the human head). That would certainly be a bit tedious, especially knowing that Gimpel3D can project the image onto models. But if you don't have a model file handy for a foreground object you want to render precisely, I'm afraid manual modeling may be the only choice. One could also probably use the "contour extrusion" tool and the "anchor points" tool to speed up the "dimensionalization" of an object. I've tried those and they work quite well (as described), but obviously these tools have their own limitations.

Friday, October 4, 2013

3D Photos - Art

In this post, we are gonna have a look at the stereo pair called "Art" from the Middlebury 2005 Stereo datasets with ground truth and see if we can get a reasonable depth map with Depth Map Automatic Generator (DMAG3) using various lambda values.

Here's "Art" with its ground truth:


Left image


Right image


Ground truth (supplied by Middlebury)

This is not an easy stereo pair to deal with because of the occlusions. Let's give DMAG3 a shot using different values for lambda:


Depth map obtained with DMAG3 (lambda=0)


Occlusion map obtained with DMAG3 (lambda=0)


Depth map obtained with DMAG3 (lambda=1)


Occlusion map obtained with DMAG3 (lambda=1)


Depth map obtained with DMAG3 (lambda=10)


Occlusion map obtained with DMAG3 (lambda=10)
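To put a number on "reasonable", one common approach is to count the pixels whose value is too far off from the ground truth. Below is a rough Python sketch of that idea, not something DMAG3 does for you. It assumes both maps are plain lists of rows in the same units (a grayscale depth map would first have to be rescaled to disparities) and that a ground-truth value of 0 means "unknown". The function name, the threshold and the tiny sample maps are made up for the illustration.

```python
def bad_pixel_rate(estimate, truth, threshold=1.0, unknown=0):
    """Fraction of known ground-truth pixels where the estimate is off by more than threshold."""
    bad = 0
    counted = 0
    for estimate_row, truth_row in zip(estimate, truth):
        for est, gt in zip(estimate_row, truth_row):
            if gt == unknown:
                continue  # no ground truth here, so skip the pixel
            counted += 1
            if abs(est - gt) > threshold:
                bad += 1
    return bad / counted if counted else 0.0


# Tiny made-up maps, just to exercise the function.
truth = [[10, 10, 0],
         [20, 20, 20]]
estimate = [[10, 12, 5],
            [20, 21, 25]]

print(bad_pixel_rate(estimate, truth))  # 2 of the 5 known pixels are off by more than 1
```

Running something like this for each lambda would give a way to compare the depth maps beyond eyeballing them.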

3D Photos - Dolls

Let's consider the stereo pair "Dolls" from the Middlebury 2005 Stereo datasets with ground truth:


Left image


Right image


Ground truth (that's the depth map we really want to get automatically)

We're gonna use Depth Map Automatic Generator 3 (DMAG3) to generate a depth map using various values for the parameter lambda:


Depth map (lambda=0)


Occlusion map (lambda=0)


Depth map (lambda=10)


Occlusion map (lambda=10)


Depth map (lambda=20)


Occlusion map (lambda=20)

The larger lambda is, the smoother the depth map looks, as expected. At some point, however, the object boundaries are bound to become blurred and accuracy becomes an issue. It should be noted that "Dolls" is an easy stereo pair to deal with since you have a lot of non-repeating textures.
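To get a feel for that trade-off, here is a toy 1D Python sketch. It is not DMAG3's algorithm. It just minimizes a generic energy of the form (depth - observed)^2 + lambda * (difference between neighbours)^2 on a noisy step, using simple Gauss-Seidel sweeps. The function names and the sample signal are made up for the illustration.

```python
def smooth(observed, lam, sweeps=200):
    """Minimize sum (d_i - o_i)^2 + lam * sum (d_i - d_{i+1})^2 by Gauss-Seidel sweeps."""
    depth = [float(value) for value in observed]
    last = len(depth) - 1
    for _ in range(sweeps):
        for i in range(len(depth)):
            neighbours = []
            if i > 0:
                neighbours.append(depth[i - 1])
            if i < last:
                neighbours.append(depth[i + 1])
            # Setting the derivative of the energy with respect to d_i to zero gives this update.
            depth[i] = (observed[i] + lam * sum(neighbours)) / (1.0 + lam * len(neighbours))
    return depth


def roughness(depth):
    """Sum of squared differences between neighbouring samples."""
    return sum((b - a) ** 2 for a, b in zip(depth, depth[1:]))


# A noisy step: background around 0, foreground around 10.
observed = [0, 1, 0, 1, 0, 10, 9, 10, 9, 10]

for lam in (0, 1, 10):
    result = smooth(observed, lam)
    edge_jump = result[5] - result[4]  # height of the step at the object boundary
    print(lam, round(roughness(result), 3), round(edge_jump, 3))
```

With lambda=0 the result is just the noisy input. As lambda grows the noise gets ironed out, but the step at the boundary shrinks as well, which is the blurring effect mentioned above.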