This paper presents a new method for three-dimensional object tracking that fuses information from stereo
vision and stereo audio. From the audio data, directional information about an object is extracted using
the Generalized Cross Correlation (GCC), while the object’s position in the video data is detected using the
Continuously Adaptive Mean shift (CAMshift) method. The resulting localization estimates, together
with confidence measurements, are then fused by Particle Swarm Optimization (PSO) to track the object.
In our approach, the particles move in 3D space and iteratively evaluate their current positions
against the localization estimates of the audio and video modules and their confidences, which
allows the object’s three-dimensional position to be determined directly. The technique has low
computational complexity and, in contrast to classical methods, its tracking performance does not depend
on a particular model, statistics, or assumptions. The confidence measurements further
increase the robustness and reliability of the overall tracking system and enable adaptive, dynamic
fusion of heterogeneous sensor information.
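
To make the fusion step concrete, the following is a minimal sketch of how particles in 3D space could score themselves against both cues. It is an illustration under stated assumptions, not the implementation evaluated in this paper. The cost function, the function names `fusion_cost` and `pso_track`, the PSO coefficients, and the search bounds are all hypothetical. The sketch also assumes that the audio module delivers a unit direction vector from a microphone pair at the origin and that the video module delivers a triangulated 3D point, each with a confidence in [0, 1].

```python
import random


def fusion_cost(p, audio_dir, audio_conf, video_pos, video_conf):
    """Confidence-weighted disagreement between a 3D point p and both cues.

    Assumed (hypothetical) cost: squared distance from p to the audio ray
    plus squared distance from p to the video estimate, each scaled by the
    confidence of the module that produced it.
    """
    # Project p onto the ray that starts at the origin and follows audio_dir.
    t = max(0.0, sum(a * d for a, d in zip(p, audio_dir)))
    audio_err = sum((a - t * d) ** 2 for a, d in zip(p, audio_dir))
    video_err = sum((a - b) ** 2 for a, b in zip(p, video_pos))
    return audio_conf * audio_err + video_conf * video_err


def pso_track(audio_dir, audio_conf, video_pos, video_conf,
              n_particles=30, n_iters=60, bounds=(-5.0, 5.0), seed=0):
    """Return (position, cost) of the best 3D point found by a basic PSO."""
    rng = random.Random(seed)
    lo, hi = bounds

    def cost(p):
        return fusion_cost(p, audio_dir, audio_conf, video_pos, video_conf)

    # Particles start at random positions inside the search cube, at rest.
    pos = [[rng.uniform(lo, hi) for _ in range(3)] for _ in range(n_particles)]
    vel = [[0.0] * 3 for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    # Illustrative inertia and attraction coefficients, not values from the paper.
    w, c1, c2 = 0.7, 1.5, 1.5
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(3):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search cube.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost
```

In this sketch, lowering `video_conf` (for example during an occlusion) pulls the swarm toward the audio ray, while lowering `audio_conf` pulls it toward the video estimate. This is one simple way to realize the adaptive weighting of the two modules described above.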