If you have a look at the academic papers that deal with depth maps or stereo matching (a quick Google search will do), you will soon be overwhelmed by the amount of material that has been written over the years. The question is: how do you know which method works best?
There's a paper published in 2002, "A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms" by Daniel Scharstein and Richard Szeliski, that does a thorough review of stereo matching methods. "Taxonomy" means that they have identified the "building blocks" that are common to most algorithms. Doing that makes things easier since it shows that a lot of algorithms are actually quite similar in the way they are designed. "Dense" means that the matching is done at the pixel level, that is, a match is computed for every pixel. "Dense Two-Frame Stereo Correspondence" is basically another way of saying "computing a depth map from a stereo pair".
The authors are kind enough to provide the source code that was used to do the testing (which you can use on your own stereo pairs). If you are interested in trying this out yourself, go to vision.middlebury.edu and download the source code, the scripts and the images. You need to compile the code to be able to use the program, called StereoMatch, which is driven by a script that you can customize. On a Windows PC, you're probably going to need Microsoft Visual Studio (the Express version is free). On a Linux box, it's a bit easier since the GNU C++ compiler is usually there already as part of the install. Once you have created the executable and read the scripts that are used to run the provided test cases, it's not that hard to figure out how to run your own stuff.
The only catch is that you have to save or convert your stereo images to the PPM format (GIMP or Photoshop can do that). You also have to be well aware that the stereo pairs you feed to those stereo matching algorithms need to be rectified, that is, for any point in the scene, the two projected points (on the left and right images) must be on the same horizontal line, or scanline (see Stereo Matching Rectified Geometry).
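If you would rather script the PPM conversion than go through an image editor, the format is simple enough to write by hand. Here is a minimal Python sketch, assuming the binary "P6" flavor of PPM with 8 bits per channel is what you want to produce. The function names ppm_bytes and write_ppm are made up for this example and are not part of the Middlebury code.

```python
def ppm_bytes(width, height, pixels):
    """Encode 8-bit RGB pixels as a binary (P6) PPM image.

    pixels is a flat list of (r, g, b) tuples, row by row, top to bottom.
    """
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width * height")
    # Header: magic number, image size, maximum channel value.
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    # Body: raw RGB triplets, one byte per channel.
    body = bytes(channel for pixel in pixels for channel in pixel)
    return header + body


def write_ppm(path, width, height, pixels):
    """Write the encoded image to disk (hypothetical helper for this sketch)."""
    with open(path, "wb") as f:
        f.write(ppm_bytes(width, height, pixels))
```

You would call something like write_ppm("left.ppm", width, height, pixels) once for the left image and once for the right image, with the pixel data coming from whatever image library you already use.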
I took the liberty of running StereoMatch on the (overused) test image called "Tsukuba":
Here are the depth maps for each available algorithm:
SSD (Sum of Squared Differences).
SO (Scanline Optimization).
SAD (Sum of Absolute Differences).
SA (Simulated Annealing).
GC (Graph Cut).
DP (Dynamic Programming).
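To give a feel for what the two simplest methods in that list, SSD and SAD, actually compute, here is a toy Python sketch of window-based matching on a rectified grayscale pair. It is not the StereoMatch code, just an illustration under a few assumptions: images are plain lists of rows of intensities, the window is a small square, borders are handled by clamping, and the function name match_scanlines and its parameters are invented for this example. For each pixel of the left image, it slides a window along the same scanline of the right image and keeps the disparity with the lowest cost ("winner takes all").

```python
def match_scanlines(left, right, max_disp, radius=1, squared=False):
    """Winner-take-all block matching on a rectified grayscale pair.

    left and right are lists of rows of intensities with the same size.
    Returns a disparity map d such that left[y][x] matches right[y][x - d].
    squared=False gives SAD, squared=True gives SSD.
    """
    height, width = len(left), len(left[0])
    disparity = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            best_cost, best_d = None, 0
            # Only search along the same scanline (rectified geometry).
            for d in range(min(max_disp, x) + 1):
                cost = 0
                # Aggregate the per-pixel cost over a square window.
                for dy in range(-radius, radius + 1):
                    for dx in range(-radius, radius + 1):
                        yy = min(max(y + dy, 0), height - 1)
                        xl = min(max(x + dx, 0), width - 1)
                        xr = min(max(x + dx - d, 0), width - 1)
                        diff = left[yy][xl] - right[yy][xr]
                        cost += diff * diff if squared else abs(diff)
                # Keep the lowest cost; ties go to the smaller disparity.
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disparity[y][x] = best_d
    return disparity
```

The only difference between SAD and SSD here is the per-pixel cost (absolute versus squared difference). The other methods in the list keep a matching cost of this kind but replace the independent per-pixel winner-take-all step with an optimization that also rewards smooth disparities.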
GC (Graph Cut) appears to be the best performing algorithm for this particular benchmark stereo pair which, to be honest, has very little to do with the "real world" (piecewise planar objects perpendicular to the optical axes are not commonly found in nature). If this subject fascinates you, check Stereo Matching - Local Methods, Stereo Matching - Global Methods, and Stereo Matching - Variational Methods for more info on the various stereo matching methodologies.
I, Ugo Capeto, have written some software to generate depth maps from stereo pairs. It's available on the 3D Software page right here on this blog and it's free to download.