My good friend Gordon sent me a set of 5 pictures taken with an iPhone 4s and asked for a 3d reconstruction with as many points as possible using Structure from Motion 10 (SfM10) and Multi View Stereo 10 (MVS10).

Step 1 is structure from motion using sfm10. The purpose of sfm10 is to compute the camera positions and orientations corresponding to each input image. It also outputs a very coarse 3d reconstruction.

Input file for sfm10:

Number of cameras = 5

Image name for camera 0 = IMG_0071.JPG

Image name for camera 1 = IMG_0076.JPG

Image name for camera 2 = IMG_0078.JPG

Image name for camera 3 = IMG_0083.JPG

Image name for camera 4 = IMG_0100.JPG

Focal length = 2800

initial camera pair = 0 2

Number of trials (good matches) = 10000

Max number of iterations (Bundle Adjustment) = 1000

Min separation angle (low-confidence 3D points) = 0.1

Max reprojection error (low-confidence 3D points) = 10

Radius (animated gif frames) = 5

Angle amplitude (animated gif frames) = 1

It's the same as the input file that's in the sfm10_test directory except for the focal length. The focal length needs to be adjusted depending on the size of the images, here 2448x3264. I chose a focal length equal to 2800 (something that's about the same as the width/height). You could use different focal lengths in the same ballpark and it wouldn't make much of a difference (until you go too far and end up with sfm10 issuing errors about the initial camera pair).
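If you want a quick starting value for the focal length, here's a minimal Python sketch of the ballpark rule above. Averaging the width and the height is just one way to land in the right ballpark (it's an assumption, not something sfm10 asks for), and guess_focal_length is a made-up helper name.

```python
def guess_focal_length(width, height):
    """Return a ballpark focal length: the average of the image width and height.

    This is only a starting guess (an assumption, not an sfm10 rule).
    """
    return (width + height) / 2.0


if __name__ == "__main__":
    # Image size used in this post: 2448x3264.
    print(guess_focal_length(2448, 3264))
```

For the 2448x3264 images, this gives 2856, which is in the same ballpark as the 2800 I used.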

Output from sfm10 (the last few bits):

Number of 3D points = 2547

Average reprojection error = 1.13272

Max reprojection error = 13.0476

Adding camera 3 to the 3D reconstruction ... done.

Looking for next camera to add to 3D reconstruction ...

Looking for next camera to add to 3D reconstruction ... done.

No more cameras to add to the 3D reconstruction

Average depth = 8.197 min depth = 1.43428 max depth = 44.7613

Step 2 is multi-view stereo with mvs10 to actually build the dense 3d reconstruction using the results from sfm10.

Input for mvs10:

nvm file = duh.nvm

Min match number (camera pair selection) = 100

Max mean vertical disparity error (camera pair selection) = 1

Min average separation angle (camera pair selection) = 0.1

radius (disparity map) = 16

alpha (disparity map) = 0.9

color truncation (disparity map) = 30

gradient truncation (disparity map) = 10

epsilon = 255^2*10^-4 (disparity map)

disparity tolerance (disparity map)= 0

downsampling factor (disparity map)= 4

sampling step (dense reconstruction)= 1

Min separation angle (low-confidence 3D points) = 0.1

Max reprojection error (low-confidence image points) = 10

Min image point number (low-confidence 3D points) = 3

Radius (animated gif frames) = 1

Angle amplitude (animated gif frames) = 1

This is pretty much the same as the input file in mvs10_test directory. I did change a few things though. For the radius (disparity map), I used 16 instead of 32 for no good reason really. For the sampling step (dense reconstruction), I used 1 instead of 2 so that the number of 3d points would be as large as possible.
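If you end up tweaking these values a lot, you could script the edits instead of doing them by hand. Here's a minimal Python sketch that assumes the input file has one "name = value" entry per line, like the listing above. The helper name set_param is made up, and it only replaces whatever comes after the "=" sign, so it's not suited for a line like the epsilon one where the annotation comes after the value.

```python
def set_param(text, name, value):
    """Return the input-file text with the value of parameter `name` replaced.

    Assumes one "name = value" entry per line, as in the listings in this post.
    Everything left of the "=" sign is kept exactly as it was.
    Raises KeyError if no line has that parameter name.
    """
    out = []
    found = False
    for line in text.splitlines():
        key, sep, _ = line.partition("=")
        if sep and key.strip() == name:
            line = f"{key}{sep} {value}"
            found = True
        out.append(line)
    if not found:
        raise KeyError(name)
    return "\n".join(out)


if __name__ == "__main__":
    # The two values I changed from the mvs10_test defaults (32 -> 16, 2 -> 1).
    sample = "radius (disparity map) = 32\nsampling step (dense reconstruction)= 2"
    sample = set_param(sample, "radius (disparity map)", 16)
    sample = set_param(sample, "sampling step (dense reconstruction)", 1)
    print(sample)
```

You'd read the input file into a string, call set_param for each value you want to change, and write the result back out.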

Output from mvs10 (the last bits):

Number of 3D points = 1626906

Average reprojection error = 1.59341

Max reprojection error = 12.6571

Now, having a 3d wobble that looks alright doesn't mean the 3d reconstruction is as good as it could be since you may have bad 3d points, mostly outliers that stretch the depth of the 3d scene way too much. You can see that when loading the point cloud (duh.ply) into Sketchfab, MeshLab, or CloudCompare (for Sketchfab, make sure to zip the file before uploading so you don't hit the max upload limit, which is 50 MB). If you can't even zoom in on what you think the 3d scene should look like, you have outliers and it's time to tighten the parameters that relate to the low-confidence 3d points or image points. First, you need to make sure "Min image point number (low-confidence 3D points)" is greater than or equal to 3. I made sure of that so let's move on. You may want to increase "Min separation angle (low-confidence 3D points)", say, from 0.1 to 0.5. You may also want to decrease "Max reprojection error (low-confidence image points)", say, from 10.0 to 2.0. Let's do just that, rerun mvs10, and see what happens. Note that mvs10 should run much faster as it doesn't have to recompute the depth maps (saved in the .mvs files).
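For the zipping part, any zip tool will do. If you'd rather script it, here's a minimal Python sketch using the standard zipfile module. The helper name zip_point_cloud is made up, and duh.ply is simply the point cloud name from this post.

```python
import zipfile
from pathlib import Path


def zip_point_cloud(ply_path, zip_path=None):
    """Compress a .ply point cloud into a .zip archive and return the archive path."""
    ply_path = Path(ply_path)
    # Default: same name as the point cloud, with a .zip extension.
    zip_path = Path(zip_path) if zip_path else ply_path.with_suffix(".zip")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Store only the file name, not the full directory path.
        zf.write(ply_path, arcname=ply_path.name)
    return zip_path


if __name__ == "__main__" and Path("duh.ply").exists():
    archive = zip_point_cloud("duh.ply")
    print(archive, archive.stat().st_size, "bytes")
```

Printing the archive size lets you check you're under the upload limit before you even open the browser.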

Input to mvs10:

nvm file = duh.nvm

Min match number (camera pair selection) = 100

Max mean vertical disparity error (camera pair selection) = 1

Min average separation angle (camera pair selection) = 0.1

radius (disparity map) = 16

alpha (disparity map) = 0.9

color truncation (disparity map) = 30

gradient truncation (disparity map) = 10

epsilon = 255^2*10^-4 (disparity map)

disparity tolerance (disparity map)= 0

downsampling factor (disparity map)= 4

sampling step (dense reconstruction)= 1

Min separation angle (low-confidence 3D points) = 0.5

Max reprojection error (low-confidence image points) = 2

Min image point number (low-confidence 3D points) = 3

Radius (animated gif frames) = 1

Angle amplitude (animated gif frames) = 1

Output from mvs10 (the last bits):

Number of 3D points = 1114293

Average reprojection error = 0.837404

Max reprojection error = 2.58606

Of course, the number of 3d points has decreased but the 3d reconstruction should be more accurate.
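To put a number on how many points the tighter settings cost, here's a tiny Python sketch using the two point counts reported by mvs10 above. The helper name retained_percent is made up for the occasion.

```python
def retained_percent(before, after):
    """Percentage of 3D points kept after tightening the filters."""
    return 100.0 * after / before


if __name__ == "__main__":
    # Point counts reported by mvs10 in the two runs above.
    print(f"{retained_percent(1626906, 1114293):.1f}% of the points kept")
```

So roughly two thirds of the points survive, while the max reprojection error drops from 12.6571 to 2.58606.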

If you see stepping in the 3d reconstruction (depth jumps), it's probably because the "downsampling factor (disparity map)" is too large. You can clearly see that when you load up the point cloud that's in the mvs10_test directory (assuming the "downsampling factor (disparity map)" was not changed and is still equal to 4). If you change it from 4 to 2, the depth maps should have more grayscale values (and therefore there will be more possible depth values for the 3d points) but it's gonna take 4 times longer to run mvs10. Note that if you change the "downsampling factor (disparity map)", you need to delete the *.mvs files from your directory to force mvs10 to recompute the depth maps. Once mvs10 completes, if you later want to change parameters related to the low-confidence 3d or image points, mvs10 will run much faster as it doesn't have to recompute the depth maps. Anyways, let's rerun mvs10 using "downsampling factor (disparity map) = 2" and see what happens (remember to remove the .mvs files beforehand).
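Deleting the *.mvs files by hand works fine, but if you'd rather script it, here's a minimal Python sketch. The helper name remove_cached_depth_maps is made up, and it assumes the .mvs files sit directly in the working directory. It defaults to a dry run so nothing gets deleted by accident.

```python
from pathlib import Path


def remove_cached_depth_maps(directory=".", dry_run=True):
    """Delete the *.mvs files in `directory` so mvs10 recomputes the depth maps.

    With dry_run=True, nothing is deleted; the matching names are only listed.
    Returns the sorted list of matching file names.
    """
    matched = []
    for path in sorted(Path(directory).glob("*.mvs")):
        if not dry_run:
            path.unlink()
        matched.append(path.name)
    return matched


if __name__ == "__main__":
    # Dry run: show what would be deleted from the current directory.
    print(remove_cached_depth_maps(".", dry_run=True))
```

Call it with dry_run=False once you're happy with the list it prints.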

Input to mvs10:

nvm file = duh.nvm

Min match number (camera pair selection) = 100

Max mean vertical disparity error (camera pair selection) = 1

Min average separation angle (camera pair selection) = 0.1

radius (disparity map) = 16

alpha (disparity map) = 0.9

color truncation (disparity map) = 30

gradient truncation (disparity map) = 10

epsilon = 255^2*10^-4 (disparity map)

disparity tolerance (disparity map)= 0

downsampling factor (disparity map)= 2

sampling step (dense reconstruction)= 1

Min separation angle (low-confidence 3D points) = 0.5

Max reprojection error (low-confidence image points) = 2

Min image point number (low-confidence 3D points) = 3

Radius (animated gif frames) = 1

Angle amplitude (animated gif frames) = 1

Output from mvs10 (the last bits):

Number of 3D points = 1346440

Average reprojection error = 0.789732

Max reprojection error = 2.7325

Just for fun, let's increase "Min separation angle (low-confidence 3D points)" from 0.5 to 1.0 and rerun mvs10 (without deleting the .mvs files, of course).

Input to mvs10:

nvm file = duh.nvm

Min match number (camera pair selection) = 100

Max mean vertical disparity error (camera pair selection) = 1

Min average separation angle (camera pair selection) = 0.1

radius (disparity map) = 16

alpha (disparity map) = 0.9

color truncation (disparity map) = 30

gradient truncation (disparity map) = 10

epsilon = 255^2*10^-4 (disparity map)

disparity tolerance (disparity map)= 0

downsampling factor (disparity map)= 2

sampling step (dense reconstruction)= 1

Min separation angle (low-confidence 3D points) = 1

Max reprojection error (low-confidence image points) = 2

Min image point number (low-confidence 3D points) = 3

Radius (animated gif frames) = 1

Angle amplitude (animated gif frames) = 1

Output from mvs10 (the last bits):

Number of 3D points = 1333404

Average reprojection error = 0.78909

Max reprojection error = 2.73256
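To compare the four mvs10 runs side by side, here's a small Python sketch that prints them as a table. The numbers are copied from the inputs and outputs above; the names RUNS and format_runs are made up for this post.

```python
# Each run: (min separation angle, max reprojection error, downsampling factor,
#            number of 3D points, average reprojection error, max reprojection error)
RUNS = [
    (0.1, 10, 4, 1626906, 1.59341, 12.6571),
    (0.5, 2, 4, 1114293, 0.837404, 2.58606),
    (0.5, 2, 2, 1346440, 0.789732, 2.7325),
    (1.0, 2, 2, 1333404, 0.78909, 2.73256),
]


def format_runs(runs):
    """Return the runs as a plain-text table, one line per run plus a header."""
    header = (
        f"{'angle':>5} {'reproj':>6} {'down':>4} "
        f"{'points':>8} {'avg err':>8} {'max err':>8}"
    )
    lines = [header]
    for angle, reproj, down, points, avg_err, max_err in runs:
        lines.append(
            f"{angle:>5} {reproj:>6} {down:>4} "
            f"{points:>8} {avg_err:>8.3f} {max_err:>8.3f}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_runs(RUNS))
    # Points dropped by raising the min separation angle from 0.5 to 1.0.
    print(RUNS[2][3] - RUNS[3][3])
```

Going from 0.5 to 1.0 for the min separation angle only drops 13036 points and barely moves the reprojection errors.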

Note that the parameters for sfm10 and mvs10 are explained in sfm10_manual.pdf and mvs10_manual.pdf, respectively, which should be sitting in the directory where you extracted the archive ugosoft3d-10-x64.rar.