Abstract
In this paper, we show how the translational motion of a stereo vision system relative
to the scene, and its distance from the scene, can be recovered in closed form directly from
measurements of image gradients and time derivatives. There is no need to estimate image motion or establish
correspondences between features across images. The direction of translational motion is recovered
by minimizing the sum of squared errors of a linear constraint equation
over the image. The solution is given in terms of the eigenvector corresponding to the smallest
eigenvalue of a 3 × 3 positive semi-definite matrix. Using the average disparity, which maximizes the
cross-correlation between the left and right images, we estimate the scale factor needed to compute
the magnitude of the translational motion, and consequently the distance to the scene.
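
As an illustration only, the two numerical steps named above can be sketched in Python with NumPy. The abstract does not give the form of the constraint equation, so the sketch assumes that each pixel contributes a 3-vector v_i satisfying v_i · t ≈ 0 for the translation direction t. It also assumes rectified images with a purely horizontal, integer disparity. The function names `translation_direction` and `average_disparity` are hypothetical and are not taken from the paper.

```python
import numpy as np


def translation_direction(constraints):
    """Unit vector t minimizing sum_i (v_i . t)^2 over rows v_i.

    `constraints` is an (N, 3) array of per-pixel constraint vectors
    (an assumed form; the abstract does not specify them).
    """
    V = np.asarray(constraints, dtype=float)
    M = V.T @ V  # 3 x 3 symmetric positive semi-definite matrix
    # eigh returns eigenvalues in ascending order for symmetric matrices,
    # so column 0 is the eigenvector of the smallest eigenvalue.
    _, eigvecs = np.linalg.eigh(M)
    return eigvecs[:, 0]  # defined only up to sign


def average_disparity(left, right, max_shift):
    """Integer horizontal shift d that maximizes the mean-removed
    cross-correlation between left[:, d:] and right[:, :W - d]."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    left = left - left.mean()
    right = right - right.mean()
    width = left.shape[1]
    best_shift, best_score = 0, -np.inf
    for d in range(max_shift + 1):
        overlap = width - d
        # Normalize by the overlap width so larger shifts are not penalized.
        score = np.sum(left[:, d:] * right[:, :overlap]) / overlap
        if score > best_score:
            best_shift, best_score = d, score
    return best_shift
```

The eigenvector fixes the direction of translation only up to sign, and the disparity search here is a brute-force scan over integer shifts. Both are simplifications chosen for brevity and are not claimed to match the procedure used in the paper.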