Introduction to Stereo Imaging -- Theory

To aid understanding of the tasks involved, let us take a simplified approach to the mathematics of the problem.
We will consider a set-up using two cameras in stereo; other methods that involve stereo are similar.
Let us now consider a simplified optical set-up:
Fig. 5 A simplified stereo imaging system
Fig. 5 shows:
Consider a point (x,y,z), in three-dimensional world coordinates, on an object.
Let this point have image coordinates (x_l, y_l) and (x_r, y_r) in the left and right image planes of the respective cameras.
Let f be the focal length of both cameras, that is, the perpendicular distance between the lens centre and the image plane. Then by similar triangles:

Solving for (x,y,z) gives:

The quantity x_l - x_r (the difference between the horizontal image coordinates of the point in the left and right images), which appears in each of the above equations, is called the disparity.
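The relations referred to above can be sketched for the usual parallel-axis geometry. This is an illustration under stated assumptions rather than the only possible formulation: the two lens centres are taken to be a distance d apart along the x axis, with the world origin midway between them (the symbol d and this choice of origin are assumptions made here), and (x_l, y_l), (x_r, y_r) denote the left and right image coordinates of the point. Similar triangles then give the first row, and solving for (x,y,z) gives the second:

```latex
\begin{align*}
\frac{x_l}{f} &= \frac{x + d/2}{z}, &
\frac{x_r}{f} &= \frac{x - d/2}{z}, &
\frac{y_l}{f} &= \frac{y_r}{f} = \frac{y}{z} \\[4pt]
x &= \frac{d\,(x_l + x_r)}{2\,(x_l - x_r)}, &
y &= \frac{d\,(y_l + y_r)}{2\,(x_l - x_r)}, &
z &= \frac{d\,f}{x_l - x_r}
\end{align*}
```

Since z is inversely proportional to the disparity x_l - x_r, a fixed error in the measured disparity matters less when the disparity is large, which for a given depth means a wide camera separation d.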
There are several practical problems with this set up:

However, as the camera separation becomes large, difficulties arise in correlating the two camera images.
In order to measure the depth of a point, it must be visible to both cameras and we must also be able to identify this point in both images.
As the camera separation increases, so do the differences in the scene as recorded by each camera.
Thus it becomes increasingly difficult to match corresponding points in the images.
This problem is known as the stereo correspondence problem.
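A minimal sketch of what matching corresponding points involves, assuming the two images are rectified (so a point in the left image lies on the same row of the right image) and that similarity is scored with a sum of absolute differences over a small window. The function names, window size and search range are illustrative choices, not part of the set-up described above:

```python
def window_cost(left, right, row, col_left, col_right, half_window):
    """Sum of absolute differences between two square windows."""
    total = 0.0
    for dr in range(-half_window, half_window + 1):
        for dc in range(-half_window, half_window + 1):
            total += abs(left[row + dr][col_left + dc]
                         - right[row + dr][col_right + dc])
    return total


def match_along_row(left, right, row, col, half_window=2, max_disparity=16):
    """Return the disparity d >= 0 (in pixels) for the left-image point (row, col).

    The right-image window centred at column col - d is compared with the
    left-image window centred at col; the d with the lowest cost wins.
    Images are lists of rows of grey levels (an assumed representation).
    """
    height, width = len(left), len(left[0])
    k = half_window
    if not (k <= row < height - k and k <= col < width - k):
        raise ValueError("window does not fit inside the image")
    best_disparity, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        candidate = col - d
        if candidate - k < 0:
            break  # the window would leave the right image
        cost = window_cost(left, right, row, col, candidate, k)
        if cost < best_cost:
            best_disparity, best_cost = d, cost
    return best_disparity
```

When the cameras are far apart the two windows look less alike, so the lowest cost is less distinct and false matches become more likely.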

Methods of Acquisition

Laser Ranging Systems
Laser ranging works on the principle that the surface of the object reflects laser light back towards a receiver which then measures the time (or phase difference) between transmission and reception in order to calculate the depth.
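As a rough sketch of the two measurement principles, assuming an idealised sensor, light travelling at the vacuum speed of light and, for the phase case, a sinusoidally modulated beam. The function names and the modulation-frequency parameter are illustrative and not taken from any particular rangefinder:

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second (vacuum value)


def range_from_time_of_flight(round_trip_seconds):
    """Depth from the delay between transmission and reception.

    The light travels to the surface and back, hence the division by two.
    """
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0


def range_from_phase(phase_radians, modulation_hz):
    """Depth from the phase shift of an amplitude-modulated beam.

    A phase shift of 2*pi corresponds to one modulation wavelength of
    round-trip travel, so the result is only unambiguous for depths
    below SPEED_OF_LIGHT / (2 * modulation_hz).
    """
    return SPEED_OF_LIGHT * phase_radians / (4.0 * math.pi * modulation_hz)
```

For example, a round-trip delay of about 6.67 nanoseconds corresponds to a depth of roughly one metre, which indicates the timing precision such a receiver needs.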
Most laser rangefinders:
Structured Light Methods
Basic idea:

Moire Fringe Methods
The essence of the method is that a grating is projected onto an object and an image is formed in the plane of some reference grating as shown in Fig. 6.
The image then interferes with the reference grating to form Moire fringe contour patterns which appear as dark and light stripes, as demonstrated by Fig. 7. Analysis of the patterns then gives accurate descriptions of changes in depth and hence shape.
Fig. 6 A moire projection system
Fig. 7 Moire fringe patterns
NOTE: Ambiguities arise in interrogating the fringe patterns.
Moire fringe methods are capable of producing very accurate depth data (resolution to within about 10 microns) but the methods have certain drawbacks.
Shape from Shading Methods
Methods based on shape from shading employ photometric stereo techniques to produce depth measurements.
Using a single camera, two or more images are taken of an object in a fixed position but under different lighting conditions.
By studying the changes in brightness over a surface and employing constraints in the orientation of surfaces, certain depth information may be calculated.
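One common formulation of photometric stereo can be sketched as follows. It rests on assumptions not stated in the text above: a Lambertian (matte) surface, three images, and three known, non-coplanar light directions. Under those assumptions the brightness at a pixel is albedo * (light direction . surface normal), so three readings give a 3x3 linear system for the surface orientation. Note that this recovers orientation rather than depth directly. All names are hypothetical:

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def surface_normal(lights, intensities):
    """Solve lights * g = intensities for g = albedo * normal (Cramer's rule).

    lights      -- three unit light-direction vectors, one per image
    intensities -- the three brightness values observed at one pixel
    Returns (unit normal, albedo).
    """
    d = det3(lights)
    if abs(d) < 1e-12:
        raise ValueError("light directions must not be coplanar")
    g = []
    for col in range(3):
        m = [list(row) for row in lights]
        for r in range(3):
            m[r][col] = intensities[r]  # replace one column by the readings
        g.append(det3(m) / d)
    albedo = math.sqrt(sum(c * c for c in g))
    if albedo == 0.0:
        raise ValueError("pixel is dark in every image; normal is undefined")
    return [c / albedo for c in g], albedo
```

Turning the per-pixel orientations into depth requires integrating them over the surface, which is where the constraints on surface orientation come into play.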
Methods based on these techniques are not suited for general three-dimensional depth data acquisition:
Passive Stereoscopic Methods
Stereoscopy, as a technique for measuring range by triangulation to selected locations in a scene imaged by two cameras, has already been introduced; further details on general stereo configurations can be found in Books.
The primary computational problem of stereoscopy is to find the correspondence of various points in the two images.
This requires:
Active Stereoscopic Methods
The problems of passive stereoscopic techniques may be overcome by

Our Active Stereo Vision System

This Section describes the active stereoscopic subsystem which provides the three-dimensional data to our system for automatically inspecting mechanical parts.
NOTE: Whilst this Section considers some specific active stereo problems, many of the other issues discussed are not specific to any particular three-dimensional data acquisition technique, and will be of general interest.
The main components of the Vision System are illustrated by the schematic diagram in Fig. 8.
Fig. 8 Schematic diagram of vision system
The vision system consists of:
Initially the cameras of the system must be calibrated in order to
Depth maps are extracted from the scene by the following procedure:
Fig. 9 Measuring a depth value
  1. For each vertical stripe of laser light, form an image of the stripe in the pair of frames from each camera.
  2. For each row in the master camera image, search until the stripe is found at point P(i,j), say.
  3. Form a three-dimensional line l passing through the centre C_m of the master camera and P(i,j).
  4. Construct the epipolar line, which is the projection of the line l into the image formed by the other camera. Do this by projecting two arbitrary points P_1 and P_2 on l into the image and constructing a line between the two projected points.
  5. Search along the epipolar line for the laser stripe. If it is found at Q, proceed to Step 6.
  6. Find the point P' on line l which corresponds to Q. Calculate the (x,y,z) coordinates of P', and store the z value at position (i,j) corresponding to x and y in the depth map.

The position of the point P' is easily found by projecting a line l' from the centre C_s of the secondary camera passing through Q. The intersection of the lines l and l' gives the coordinates of P'.
The depth map is formed by using a world coordinate system fixed on the master camera with its origin at C_m.
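The final intersection step can be sketched as follows. With noisy measurements the two lines rarely meet exactly, so this sketch takes the midpoint of the shortest segment joining them; that choice is an assumption made here for illustration, not the system's documented method. Each line is given by a camera centre and a direction vector in world coordinates, and the function names are hypothetical:

```python
def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _along(origin, direction, t):
    return tuple(origin[i] + t * direction[i] for i in range(3))


def intersect_lines(centre1, direction1, centre2, direction2):
    """Return (point, gap) for two 3D lines.

    point -- midpoint of the shortest segment joining the lines
    gap   -- length of that segment (zero when the lines truly intersect)
    """
    w = _sub(centre1, centre2)
    a = _dot(direction1, direction1)
    b = _dot(direction1, direction2)
    c = _dot(direction2, direction2)
    d = _dot(direction1, w)
    e = _dot(direction2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("lines are parallel; no unique closest point")
    s = (b * e - c * d) / denom  # parameter along the first line
    t = (a * e - b * d) / denom  # parameter along the second line
    p1 = _along(centre1, direction1, s)
    p2 = _along(centre2, direction2, t)
    diff = _sub(p1, p2)
    gap = _dot(diff, diff) ** 0.5
    point = tuple((p1[i] + p2[i]) / 2.0 for i in range(3))
    return point, gap
```

With the world origin at the master camera centre, `centre1` is simply (0, 0, 0) and the z component of the returned point is the value stored in the depth map. A large `gap` is a useful warning that the stripe was matched incorrectly.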
Fig. 10 Depth Map/Image Overlay

The 3D Image - Depth Maps

The simplest and most convenient way of representing and storing the depth measurements taken from a scene is a depth map.
A depth map is a two-dimensional array where the x and y distance information corresponds to the rows and columns of the array as in an ordinary image, and the corresponding depth readings (z values) are stored in the array's elements (pixels).
A depth map is like a grey-scale image, except that the z information (stored as a 32-bit float) replaces the intensity information.
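A depth map of this kind can be sketched with Python's standard `array` module, whose `"f"` type code stores single-precision floats. The class name, the row-major layout and the use of NaN to mark pixels with no reading are assumptions made for this illustration:

```python
from array import array


class DepthMap:
    """A rows x cols grid of single-precision z values, stored row by row."""

    def __init__(self, rows, cols, fill=float("nan")):
        self.rows = rows
        self.cols = cols
        # One float per pixel; NaN marks "no depth measured here".
        self.z = array("f", [fill]) * (rows * cols)

    def set(self, i, j, z_value):
        self.z[i * self.cols + j] = z_value

    def get(self, i, j):
        return self.z[i * self.cols + j]
```

For example, `depth = DepthMap(480, 640)` followed by `depth.set(i, j, z)` stores one reading. An array library with a 32-bit float element type would serve equally well.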
Fig. 3 Artificial depth maps
Fig. 4 Real depth maps

Why use 3D data?

A 3D image has many advantages over its 2D counterpart:
Explicit Geometry
  • 2D images give only limited information about the physical shape and size of an object in a scene.
  • 3D images express the geometry in terms of three-dimensional coordinates.

e.g. the size (and shape) of an object in a scene can be computed straightforwardly from its three-dimensional coordinates.
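As a small illustration, assuming the object has already been segmented into a list of (x, y, z) points in world units, its overall size along each axis follows directly from the coordinates. The function name is hypothetical:

```python
def bounding_box_size(points):
    """Return the (width, height, depth) extent of a set of 3D points."""
    xs, ys, zs = zip(*points)
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))
```

The same measurement from a single 2D image would need extra knowledge of the object's distance from the camera.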

Recent technological advances (e.g. in camera optics, CCD cameras and laser rangefinders) have made the production of reliable and accurate three-dimensional depth data possible.
Consequently many three-dimensional data acquisition systems have been developed.