Tuesday, August 2, 2016

Multi View Stereo - Simpson tombstone

The video below shows the input still images (extracted from an iPhone 4 video using Avidemux), the sparse 3D reconstructions as well as the camera poses obtained by SfM10, and the dense 3D reconstruction obtained by MVS10. SfM10 and MVS10 (Windows 64-bit) can be downloaded from 3D Software with no strings attached.


Input to SfM10:

Number of cameras = 7
Image name for camera 0 = IMG_0160_00.jpg
Focal length for camera 0 = 1866
Image name for camera 1 = IMG_0160_01.jpg
Focal length for camera 1 = 1866
Image name for camera 2 = IMG_0160_02.jpg
Focal length for camera 2 = 1866
Image name for camera 3 = IMG_0160_03.jpg
Focal length for camera 3 = 1866
Image name for camera 4 = IMG_0160_04.jpg
Focal length for camera 4 = 1866
Image name for camera 5 = IMG_0160_05.jpg
Focal length for camera 5 = 1866
Image name for camera 6 = IMG_0160_06.jpg
Focal length for camera 6 = 1866
Number of trials (good matches) = 10000
Max number of iterations (Bundle Adjustment) = 1000
Min separation angle (low-confidence 3D points) = 0.1
Max reprojection error (low-confidence 3D points) = 2
Radius (animated gif frames) = 10
Angle amplitude (animated gif frames) = 5
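
The per-camera lines above follow a fixed pattern, so they can be generated rather than typed by hand. Here is a minimal Python sketch that reproduces the layout of the listing; the helper name `sfm10_input_lines` is hypothetical and not part of SfM10, and how SfM10 actually reads these values is not covered here.

```python
def sfm10_input_lines(image_names, focal_length):
    """Build the camera part of the listing shown in this post.

    Hypothetical helper: it only reproduces the text layout above,
    with the same focal length (in pixels) for every camera.
    """
    lines = [f"Number of cameras = {len(image_names)}"]
    for i, name in enumerate(image_names):
        lines.append(f"Image name for camera {i} = {name}")
        lines.append(f"Focal length for camera {i} = {focal_length}")
    return lines


# The seven frames used in this post.
names = [f"IMG_0160_{i:02d}.jpg" for i in range(7)]
for line in sfm10_input_lines(names, 1866):
    print(line)
```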

For the focal lengths (in pixels), I used the formula "max image dimension" * "35mm equivalent focal length" / 36, where 36 is the width in mm of a 35mm film frame. The "35mm equivalent focal length" comes from the EXIF data of the image. If the EXIF data is not available and the lens is standard, one could use the formula "max image dimension" * 1.2 instead. Another way to get the focal lengths is to use the epipolar rectifier ER9b.
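
As a quick illustration of the two formulas, here is a small Python sketch. The function name and the sample numbers (a 4000-pixel-wide image with a 28mm equivalent focal length) are made up for the example and do not come from the tombstone data set.

```python
def focal_length_pixels(max_image_dimension, focal_35mm_equiv=None):
    """Focal length in pixels from the 35mm equivalent focal length.

    Falls back to the "max image dimension" * 1.2 rule of thumb when
    no EXIF value is available (assumes a standard lens).
    """
    if focal_35mm_equiv is None:
        return max_image_dimension * 1.2
    return max_image_dimension * focal_35mm_equiv / 36.0


# Illustrative values only, not taken from this post's images.
print(round(focal_length_pixels(4000, 28.0)))  # with EXIF data
print(round(focal_length_pixels(4000)))        # without EXIF data
```
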
When running SfM10, keep an eye on the number of 3D points that are rejected because of the reprojection error. If that number is large (relative to the number of 3D points prior to the check), then something is wrong somewhere. Usually it's the focal lengths that are wrong. Note that all focal lengths are assumed to be the same.
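
To see why a wrong focal length shows up as rejected points, here is a toy Python sketch. It is not how SfM10 works internally: it assumes a simple pinhole camera with the principal point at an assumed image center, uses synthetic 3D points, and keeps the points fixed instead of re-running bundle adjustment. All names and values are hypothetical apart from the focal length of 1866 and the max reprojection error of 2 used above.

```python
import math


def project(point, focal_length, cx, cy):
    """Pinhole projection of a camera-frame 3D point (x, y, z), z > 0."""
    x, y, z = point
    return (focal_length * x / z + cx, focal_length * y / z + cy)


def rejected_fraction(points, observations, focal_length, cx, cy, max_error=2.0):
    """Fraction of points whose reprojection error exceeds max_error pixels."""
    rejected = 0
    for point, (u_obs, v_obs) in zip(points, observations):
        u, v = project(point, focal_length, cx, cy)
        if math.hypot(u - u_obs, v - v_obs) > max_error:
            rejected += 1
    return rejected / len(points)


# Synthetic scene (assumed values): observations made with the true focal length.
true_f, cx, cy = 1866.0, 640.0, 360.0
points = [(0.1 * i - 0.5, 0.05 * i - 0.2, 4.0 + 0.3 * i) for i in range(10)]
observations = [project(p, true_f, cx, cy) for p in points]

print(rejected_fraction(points, observations, true_f, cx, cy))  # correct focal length
print(rejected_fraction(points, observations, 1500.0, cx, cy))  # wrong focal length
```

With the correct focal length nothing is rejected; with the wrong one, most of the points fail the 2-pixel check.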

Input to MVS10:

nvm file = duh.nvm
Min match number (camera pair selection) = 100
Min average separation angle (camera pair selection) = 0.1
radius (disparity map) = 16
alpha (disparity map) = 0.9
color truncation (disparity map) = 30
gradient truncation (disparity map) = 10
epsilon (disparity map) = 255^2*10^-4
disparity tolerance (disparity map) = 0
downsampling factor (disparity map) = 2
sampling step (dense reconstruction) = 1
Min separation angle (low-confidence 3D points) = 0.1
Min image point number (low-confidence 3D points) = 3
Max reprojection error (low-confidence image points) = 2
Radius (animated gif frames) = 1
Angle amplitude (animated gif frames) = 2

I usually use a downsampling factor of 4 to speed up the computation of the depth maps, but here I used a downsampling factor of 2, which increases the running time a lot but (I think) leads to better results.
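
For a rough feel of the slowdown, here is a back-of-the-envelope Python sketch. It assumes the cost of a depth map is proportional to the number of pixels times the number of disparity levels, both of which grow as the downsampling factor shrinks. That cost model is my assumption for illustration; the actual running time of MVS10 may scale differently.

```python
def relative_cost(downsampling_factor, reference_factor=4):
    """Estimated cost relative to the reference downsampling factor.

    Assumed model: cost ~ pixel count * disparity range.
    """
    ratio = reference_factor / downsampling_factor
    pixel_ratio = ratio ** 2   # width and height both scale by ratio
    disparity_ratio = ratio    # disparities are measured in pixels
    return pixel_ratio * disparity_ratio


print(relative_cost(2))  # 8.0
```

Under that assumption, going from a factor of 4 to a factor of 2 makes each depth map about 8 times more expensive.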
