All of our mobile systems use passive stereo vision as the main sensor, since other sensors are not suitable for our applications. Laser scanners are typically heavy and can only measure depth in one plane. Time-of-Flight (ToF) cameras have limited resolution and field of view. Active systems using structured light, like the Kinect, fail in sunlight and when multiple systems scan the same area.

Dense Stereo Matching

We use the Semi-Global Matching (SGM) method (Hirschmüller, 2008), as it combines accuracy with efficiency. For radiometric robustness, Census is employed as the matching cost (Hirschmüller and Scharstein, 2009). The method is quite insensitive to the choice of parameters, which means that it usually does not require parameter tuning.
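The radiometric robustness of Census can be illustrated with a minimal sketch. The census transform encodes only whether each neighbor is darker than the center pixel, so the codes stay the same under any monotonically increasing brightness change, and the matching cost is the Hamming distance between two codes. The function names, the 3x3 window and the border clamping below are assumptions made for illustration; this is pure Python written for clarity, not the implementation used on our systems.

```python
def census_transform(img, radius=1):
    """Return per-pixel census codes for a 2D list of intensities.

    Each neighbor in the (2*radius+1)^2 window contributes one bit:
    1 if the neighbor is darker than the center pixel, else 0.
    Coordinates outside the image are clamped to the border (assumption).
    """
    h, w = len(img), len(img[0])
    codes = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            code = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue  # the center pixel is not compared with itself
                    ny = min(max(y + dy, 0), h - 1)
                    nx = min(max(x + dx, 0), w - 1)
                    code = (code << 1) | (img[ny][nx] < img[y][x])
            codes[y][x] = code
    return codes


def hamming_cost(code_a, code_b):
    """Matching cost: number of census bits that differ."""
    return bin(code_a ^ code_b).count("1")


# A gain and offset change leaves the census codes untouched.
left = [[10, 20, 30, 25], [12, 40, 35, 22], [11, 18, 33, 27]]
brighter = [[2 * v + 7 for v in row] for row in left]
same_codes = census_transform(left) == census_transform(brighter)
```

Here `same_codes` evaluates to `True`, because doubling the intensities and adding an offset does not change which neighbors are darker than the center.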

SGM is well suited for parallel implementations on the CPU using vector instructions, as well as on the GPU (Ernst and Hirschmüller, 2008) and on FPGAs (Hirschmüller, 2011). For our mobile systems, we use a Spartan 6 FPGA implementation that is part of the 6D vision system of Daimler for supporting driver assistance systems. Our version processes 0.5 MPixel images with a disparity range of 128 pixels at 14.7 frames per second.
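Part of the reason SGM parallelizes well is that its cost aggregation runs along independent one-dimensional paths. The following sketch shows the path recurrence from Hirschmüller (2008) for a single left-to-right scanline. The function name and the penalty values `p1` and `p2` are illustrative assumptions; a full implementation sums the aggregated costs of several path directions and then selects the disparity with the lowest sum for each pixel.

```python
def aggregate_path(costs, p1=1, p2=8):
    """Aggregate matching costs along one scanline, from left to right.

    costs[x][d] is the matching cost of pixel x at disparity d.
    p1 penalizes disparity changes of one pixel, p2 penalizes larger jumps.
    """
    n_disp = len(costs[0])
    aggregated = [list(costs[0])]  # the first pixel has no predecessor
    for x in range(1, len(costs)):
        prev = aggregated[-1]
        prev_min = min(prev)
        row = []
        for d in range(n_disp):
            best = min(
                prev[d],                                               # same disparity
                prev[d - 1] + p1 if d > 0 else float("inf"),           # one lower
                prev[d + 1] + p1 if d + 1 < n_disp else float("inf"),  # one higher
                prev_min + p2,                                         # any larger jump
            )
            # Subtracting prev_min keeps the values bounded along the path.
            row.append(costs[x][d] + best - prev_min)
        aggregated.append(row)
    return aggregated


# Three pixels, three disparities: the cheapest disparity rises by one per pixel.
example = aggregate_path([[0, 5, 5], [5, 0, 5], [5, 5, 0]])
```

Each scanline (and each path direction) can be processed independently of the others, and the inner loop over disparities has no dependencies between its iterations, which is what vector units, GPUs and FPGAs can exploit.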

Visual Odometry

Visual odometry (Hirschmüller et al., 2002) computes the relative transformation between successive images with six degrees of freedom (6D). We are currently using AGAST (Mair et al., 2010) as the feature detector and Rank as the feature descriptor. The approach has been extended (Stelzer et al., 2012) to estimate the error of visual odometry and to use keyframes, which makes it locally drift free.
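To show how relative transformations between successive images add up to a pose, the following sketch chains 4x4 homogeneous transforms. It is a simplified illustration only: the function names are hypothetical, and the rotation is restricted to yaw for brevity, whereas the actual method estimates all six degrees of freedom.

```python
from math import cos, sin, pi


def transform(yaw, tx, ty, tz):
    """4x4 homogeneous transform: rotation about z by yaw plus a translation."""
    c, s = cos(yaw), sin(yaw)
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def compose(a, b):
    """Matrix product a * b of two 4x4 transforms."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def integrate(relative_motions):
    """Chain relative motions, each expressed in the previous camera frame."""
    pose = transform(0.0, 0.0, 0.0, 0.0)  # start at the identity
    for motion in relative_motions:
        pose = compose(pose, motion)
    return pose


# Four steps of "1 m forward, then turn 90 degrees" describe a closed square.
square = integrate([transform(pi / 2, 1.0, 0.0, 0.0)] * 4)
```

With error-free motions the square closes and `square` is the identity up to rounding. With real measurements, the error of every step is carried into all later poses, which is why estimating against a keyframe instead of the immediately preceding image helps to limit drift locally.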

Our visual odometry method can cope with large displacements between successive images, which are easily caused by fast rotations. Furthermore, it is quite robust and can handle a large number of outliers in the correspondences.
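One way to tolerate many outliers when stereo provides 3D points is to exploit rigidity: the distance between two static scene points must be the same before and after the camera moves. The sketch below greedily collects correspondences that are mutually consistent in this sense. It is an illustration under stated assumptions, not a description of our implementation; the function name, the greedy strategy and the tolerance `tol` are hypothetical.

```python
from math import dist


def consistent_inliers(prev_pts, curr_pts, tol=0.05):
    """Return indices of correspondences whose pairwise 3D distances agree.

    prev_pts[i] and curr_pts[i] are the 3D positions of feature i in the
    previous and current stereo frame. tol is in the same unit as the points.
    """
    n = len(prev_pts)
    # compat[i][j] is True if the distance between features i and j is preserved.
    compat = [
        [abs(dist(prev_pts[i], prev_pts[j]) - dist(curr_pts[i], curr_pts[j])) <= tol
         for j in range(n)]
        for i in range(n)
    ]
    # Start with the correspondence that agrees with the most others.
    seed = max(range(n), key=lambda i: sum(compat[i]))
    inliers = [seed]
    candidates = sorted((i for i in range(n) if i != seed),
                        key=lambda i: -sum(compat[i]))
    for i in candidates:
        # Accept a candidate only if it agrees with every inlier so far.
        if all(compat[i][j] for j in inliers):
            inliers.append(i)
    return sorted(inliers)


# Five features move by 0.5 along x; feature 2 is a wrong match.
prev_pts = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
curr_pts = [(0.5, 0, 0), (1.5, 0, 0), (5, 5, 5), (0.5, 0, 1), (1.5, 1, 1)]
kept = consistent_inliers(prev_pts, curr_pts)
```

In this example `kept` is `[0, 1, 3, 4]`, so the wrong match is discarded before any transformation is estimated. The check needs no prior guess of the motion, which is why it is unaffected by large displacements.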


While stereo matching is implemented on an FPGA, rectification, post filtering of disparity images and visual odometry calculation are implemented in different threads on the CPU (Schmid and Hirschmüller, 2013). For flying systems, we currently use a Core2Duo CPU. About 80 % of one core is needed for processing 14.6 frames per second. This is less than previously reported (Schmid and Hirschmüller, 2013), since further code optimizations have been made. For ground based systems, we use a more powerful quad-core i7 CPU, on which less than half of one core is required for reaching 14.6 Hz. Thus, in both cases, sufficient computation time is left for obstacle avoidance, mapping and path planning.
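The CPU load figures can be converted into approximate processing time per frame. This is a rough back-of-the-envelope calculation based only on the numbers quoted above; the helper name is hypothetical, and the ground-system value is an upper bound because the load there is stated as less than half of one core.

```python
def cpu_ms_per_frame(core_fraction, frames_per_second):
    """CPU time in milliseconds spent per frame on one core."""
    return 1000.0 * core_fraction / frames_per_second


frame_period_ms = 1000.0 / 14.6               # time between two frames
flying_ms = cpu_ms_per_frame(0.80, 14.6)      # Core2Duo, about 80 % of one core
ground_max_ms = cpu_ms_per_frame(0.50, 14.6)  # i7, upper bound (less than 50 %)

print(f"frame period:  {frame_period_ms:.1f} ms")
print(f"flying system: {flying_ms:.1f} ms per frame")
print(f"ground system: below {ground_max_ms:.1f} ms per frame")
```

At 14.6 Hz a new frame arrives roughly every 68.5 ms, of which about 54.8 ms of one core are used on the flying system and less than about 34.2 ms on the ground based system. The remaining cores are free for the other tasks.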




K. Schmid and H. Hirschmüller, "Stereo Vision and IMU based Real-Time Ego-Motion and Depth Image Computation on a Handheld Device", IEEE International Conference on Robotics and Automation, May 2013 in Karlsruhe, Germany.

A. Stelzer, H. Hirschmüller and M. Görner, "Stereo-Vision-Based Navigation of a Six-Legged Walking Robot in Unknown Rough Terrain", in the International Journal of Robotics Research, Special Issue on Robot Vision, 2012, Volume 31, Issue 4, pp. 381-402.

H. Hirschmüller, "Semi-Global Matching - Motivation, Developments and Applications", Invited Paper at the Photogrammetric Week, September 2011 in Stuttgart, Germany, pp. 173-184.

E. Mair, G. Hager, D. Burschka, M. Suppa and G. Hirzinger, "Adaptive and generic corner detection based on the accelerated segment test", in the European Conference on Computer Vision (ECCV), 2010, pp. 183-196.

H. Hirschmüller and D. Scharstein, "Evaluation of Stereo Matching Costs on Images with Radiometric Differences", in IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 31(9), September 2009, pp. 1582-1599.

I. Ernst and H. Hirschmüller, "Mutual Information based Semi-Global Stereo Matching on the GPU", in Proceedings of the International Symposium on Visual Computing (ISVC08), 1-3 December 2008, Las Vegas, Nevada, USA.

H. Hirschmüller, "Stereo Processing by Semi-Global Matching and Mutual Information", in IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 30(2), February 2008, pp. 328-341.

H. Hirschmüller, P. R. Innocent and J. M. Garibaldi, "Fast, Unconstrained Camera Motion Estimation from Stereo without Tracking and Robust Statistics", in Proceedings of the 7th International Conference on Control, Automation, Robotics and Vision, 2-5 December 2002, Singapore, pp. 1099-1104.


Dr. Heiko Hirschmüller

German Aerospace Center (DLR)
Institute of Robotics and Mechatronics
Department of Perception and Cognition
Münchner Straße 20
82234 Weßling-Oberpfaffenhofen
email: heiko (dot) hirschmueller (at) dlr (dot) de
Copyright © 2014 German Aerospace Center (DLR). All rights reserved.