Wednesday, August 24, 2016

Multi View Stereo - Head of some Roman guy

Sequence of (just) 5 images (1080 x 1920 pixels):


Note that the head is in a plexiglass box at the Museum of Fine Arts in Boston, MA. Can't remember whose head it is though.

The goal is to get a dense 3D reconstruction (photogrammetry) from those 5 images using SfM10 (Structure from Motion) and MVS10 (Multi-View Stereo). Using SfM10 to get the camera poses (in the form of an nvm file) is usually a no-brainer so I won't talk about it here. MVS10 is a bit trickier as you have to balance accuracy against density of the reconstruction (number of 3D points), usually by playing with the max reprojection error.

Input to MVS10:

nvm file = duh.nvm
Min match number (camera pair selection) = 100
Min average separation angle (camera pair selection) = 0.1
Radius (disparity map) = 32
Alpha (disparity map) = 0.9
Color truncation (disparity map) = 20
Gradient truncation (disparity map) = 10
Epsilon (disparity map) = 255^2 * 10^-4
Disparity tolerance (disparity map) = 0
Downsampling factor (disparity map) = 2
Sampling step (dense reconstruction) = 1
Min separation angle (low-confidence 3D points) = 0.1
Min image point number (low-confidence 3D points) = 3
Max reprojection error (low-confidence image points) = 8
Radius (animated gif frames) = 1
Angle amplitude (animated gif frames) = 5

I used a max reprojection error of 8.0 pixels instead of the usual 2.0 pixels to get more 3d points in the reconstruction.
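To see why loosening the threshold densifies the point cloud, here is a minimal sketch. The reprojection errors are made up for illustration and MVS10's internal filtering code is not public; this only shows the idea that a looser cutoff keeps more (but less accurate) points:

```python
# Hypothetical reprojection errors (in pixels) for candidate image points.
# Points whose error exceeds the max reprojection error get discarded,
# so a looser threshold yields a denser but less accurate reconstruction.
errors = [0.4, 1.1, 1.8, 2.5, 3.9, 5.2, 6.7, 7.6, 9.3, 12.0]

def kept(errors, max_reproj_error):
    """Count points surviving a given max reprojection error."""
    return sum(1 for e in errors if e <= max_reproj_error)

print(kept(errors, 2.0))  # tight threshold: 3 points survive
print(kept(errors, 8.0))  # loose threshold: 8 points survive
```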


Here's the point cloud in sketchfab viewer:

Tuesday, August 23, 2016

Multi View Stereo - Bust of Hadrian

This is an attempt at reconstructing the bust of Hadrian, which is proudly exhibited in the "coin room" at the Museum of Fine Arts in Boston, MA. The input to SfM10 is a sequence of 9 images extracted with Avidemux from a video I took with my iPhone and rotated 90 degrees using XnView. The image size is 1080x1920.

This is the sequence of images:


This is the input to MVS10:

nvm file = duh.nvm
Min match number (camera pair selection) = 100
Min average separation angle (camera pair selection) = 0.1
Radius (disparity map) = 32
Alpha (disparity map) = 0.9
Color truncation (disparity map) = 20
Gradient truncation (disparity map) = 10
Epsilon (disparity map) = 255^2 * 10^-4
Disparity tolerance (disparity map) = 0
Downsampling factor (disparity map) = 2
Sampling step (dense reconstruction) = 1
Min separation angle (low-confidence 3D points) = 0.1
Min image point number (low-confidence 3D points) = 3
Max reprojection error (low-confidence image points) = 2
Radius (animated gif frames) = 1
Angle amplitude (animated gif frames) = 5

The nvm file is the output of the structure from motion software SfM10. I use a downsampling factor of 2 for depth map generation: a factor of 1 takes too long, and a factor of 4 is a little too loose with respect to the quality of the depth maps produced, so 2 is a good compromise. I use a max reprojection error of 2.0 pixels, which is quite tight. To get a denser, albeit less accurate, 3D reconstruction, you may want to use something looser (maybe up to 16.0 pixels) and re-run MVS10 (without deleting any mvs file).

This is the animated gif representing the dense 3d reconstruction obtained by MVS10:


This is the sketchfab viewer which enables you to move around the 3D scene:

Friday, August 12, 2016

Multi View Stereo - Weber monument

The following shows one of the 7 views that were used to generate a dense 3d reconstruction of the Weber family monument using SfM10 (Structure from Motion) and MVS10 (Multi-View Stereo).


This is an animated gif (reduced size) of the dense 3d reconstruction.


Here's the point cloud on sketchfab:

Multi View Stereo - Dudley tombstone

The following shows one of the 7 views that were used to generate a dense 3d reconstruction of the Dudley tombstone using SfM10 (Structure from Motion) and MVS10 (Multi-View Stereo).


This is an animated gif (reduced size) of the dense 3d reconstruction.


Here's the point cloud on sketchfab:

Tuesday, August 2, 2016

Multi View Stereo - Harry Race mausoleum


Input to SfM10 (Structure from Motion):

Number of cameras = 9
Image name for camera 0 = IMG_0164_00.jpg
Focal length for camera 0 = 2304
Image name for camera 1 = IMG_0164_01.jpg
Focal length for camera 1 = 2304
Image name for camera 2 = IMG_0164_02.jpg
Focal length for camera 2 = 2304
Image name for camera 3 = IMG_0164_03.jpg
Focal length for camera 3 = 2304
Image name for camera 4 = IMG_0164_04.jpg
Focal length for camera 4 = 2304
Image name for camera 5 = IMG_0164_05.jpg
Focal length for camera 5 = 2304
Image name for camera 6 = IMG_0164_06.jpg
Focal length for camera 6 = 2304
Image name for camera 7 = IMG_0164_07.jpg
Focal length for camera 7 = 2304
Image name for camera 8 = IMG_0164_08.jpg
Focal length for camera 8 = 2304
Number of trials (good matches) = 10000
Max number of iterations (Bundle Adjustment) = 1000
Min separation angle (low-confidence 3D points) = 0.1
Max reprojection error (low-confidence 3D points) = 2
Radius (animated gif frames) = 10
Angle amplitude (animated gif frames) = 2

I computed the focal length using the formula f = "max image dimension" * 1.2. When running SfM10, keep an eye on the number of 3D points that get rejected due to the reprojection error right after the initial pair has been selected. If it's high, something is amiss, and the focal lengths are probably to blame. If you want to change the focal lengths after SfM10 has been run, delete the file "initial_camera_pair.sfm" (only that "sfm" file) before re-running SfM10.
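The heuristic above can be checked directly: for 1080 x 1920 images the max dimension is 1920, which gives the 2304 listed for every camera. A quick sketch (my own helper, not part of SfM10):

```python
def focal_from_heuristic(width, height, factor=1.2):
    """Rough focal length in pixels for a standard (non-wide) lens:
    f = max image dimension * 1.2."""
    return round(max(width, height) * factor)

print(focal_from_heuristic(1080, 1920))  # -> 2304
```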

Animated wiggle gif:


Input to MVS10 (Multi-View Stereo):

nvm file = duh.nvm
Min match number (camera pair selection) = 100
Min average separation angle (camera pair selection) = 0.1
Radius (disparity map) = 16
Alpha (disparity map) = 0.9
Color truncation (disparity map) = 30
Gradient truncation (disparity map) = 10
Epsilon (disparity map) = 255^2 * 10^-4
Disparity tolerance (disparity map) = 0
Downsampling factor (disparity map) = 4
Sampling step (dense reconstruction) = 1
Min separation angle (low-confidence 3D points) = 0.1
Min image point number (low-confidence 3D points) = 3
Max reprojection error (low-confidence image points) = 2
Radius (animated gif frames) = 1
Angle amplitude (animated gif frames) = 2

I used a downsampling factor equal to 4 in order to speed up the process. If the downsampling factor is set to 2 instead of 4, the running time will increase by about a factor of 8 (2*2*2). That's almost an order of magnitude.
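The factor of 8 comes from halving the downsampling factor doubling the image width, the image height, and (roughly) the disparity search range. A back-of-the-envelope sketch of that reasoning (my own model of the cost, not MVS10 internals):

```python
def relative_cost(downsampling):
    """Relative disparity-map cost: width, height, and disparity range
    each scale as 1/downsampling, so cost ~ (1/downsampling)^3."""
    return (1.0 / downsampling) ** 3

# Going from a downsampling factor of 4 to 2 multiplies the cost by 2*2*2:
print(relative_cost(2) / relative_cost(4))  # -> 8.0
```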

Animated wiggle gif:


Sketchfab viewer:


SfM10 and MVS10 are available for download at 3D Software.

Multi View Stereo - Simpson tombstone

The video below shows the input still images (extracted from an iPhone 4 video using Avidemux), the sparse 3d reconstruction as well as the camera poses obtained by SfM10, and the dense 3d reconstruction obtained by MVS10. SfM10 and MVS10 (windows 64 bit) can be downloaded from 3D Software with no strings attached.


Input to SfM10:

Number of cameras = 7
Image name for camera 0 = IMG_0160_00.jpg
Focal length for camera 0 = 1866
Image name for camera 1 = IMG_0160_01.jpg
Focal length for camera 1 = 1866
Image name for camera 2 = IMG_0160_02.jpg
Focal length for camera 2 = 1866
Image name for camera 3 = IMG_0160_03.jpg
Focal length for camera 3 = 1866
Image name for camera 4 = IMG_0160_04.jpg
Focal length for camera 4 = 1866
Image name for camera 5 = IMG_0160_05.jpg
Focal length for camera 5 = 1866
Image name for camera 6 = IMG_0160_06.jpg
Focal length for camera 6 = 1866
Number of trials (good matches) = 10000
Max number of iterations (Bundle Adjustment) = 1000
Min separation angle (low-confidence 3D points) = 0.1
Max reprojection error (low-confidence 3D points) = 2
Radius (animated gif frames) = 10
Angle amplitude (animated gif frames) = 5

For the focal lengths, I used the formula "max image dimension" * "35mm equivalent focal length" / 36. The "35mm equivalent focal length" comes from the EXIF data of the image. One could also use the formula "max image dimension" * 1.2 if the EXIF data is not available and the lens is standard. Another way to get the focal lengths is to use the epipolar rectifier ER9b.
When running SfM10, keep an eye on the number of 3D points that are rejected because of the reprojection error. If the rejection number is large (relative to the number of 3D points prior to the check), then something is wrong somewhere, and it's usually the focal lengths. Note that all focal lengths are assumed to be the same.
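The EXIF-based formula can be sketched as follows. Assuming a 35mm-equivalent focal length of 35mm (a plausible value for this camera; the helper name and that value are my own assumptions), it reproduces the 1866 used for every camera above:

```python
def focal_from_exif(width, height, f35):
    """Focal length in pixels from the EXIF 35mm-equivalent focal length:
    f = max image dimension * f35 / 36 (36mm is the width of a 35mm frame)."""
    return int(max(width, height) * f35 / 36.0)

print(focal_from_exif(1080, 1920, 35.0))  # -> 1866
```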

Input to MVS10:

nvm file = duh.nvm
Min match number (camera pair selection) = 100
Min average separation angle (camera pair selection) = 0.1
Radius (disparity map) = 16
Alpha (disparity map) = 0.9
Color truncation (disparity map) = 30
Gradient truncation (disparity map) = 10
Epsilon (disparity map) = 255^2 * 10^-4
Disparity tolerance (disparity map) = 0
Downsampling factor (disparity map) = 2
Sampling step (dense reconstruction) = 1
Min separation angle (low-confidence 3D points) = 0.1
Min image point number (low-confidence 3D points) = 3
Max reprojection error (low-confidence image points) = 2
Radius (animated gif frames) = 1
Angle amplitude (animated gif frames) = 2

I usually use a downsampling factor of 4 to speed up the process (of computing depth maps), but here I used a factor of 2, which increases the running time by a lot but leads to better results (I think).

Sketchfab viewer: