Real-time tracking and mapping approaches enable intelligent agents, such as robots, AR/VR devices, and autonomous vehicles, to interact with unknown environments. Visual tracking methods estimate the six-degree-of-freedom (DoF) camera poses, while mapping algorithms reconstruct unknown environments as sparse or dense models.
Commonly, camera poses tend to drift as errors accumulate during tracking. To limit the growth of pose errors, solutions including local bundle adjustment, sliding-window optimization, marginalization, and loop closure build co-visibility graphs from feature correspondences. With these optimization modules, such approaches achieve robust tracking performance. However, co-visibility strategies based on point features still struggle in low-textured or textureless regions, where only a few features can be extracted during tracking.
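To make the optimization objective concrete, the following is a minimal sketch of the point re-projection residual that bundle adjustment minimizes over the poses and landmarks connected in a co-visibility graph. The pinhole model and variable names are illustrative, not taken from any specific system.

```python
import numpy as np

def reprojection_error(K, R, t, point_world, observed_px):
    """Pixel-space re-projection residual of one 3D landmark.

    K: 3x3 camera intrinsics; (R, t): world-to-camera pose;
    point_world: 3D landmark; observed_px: measured 2D feature.
    Bundle adjustment minimizes the sum of squared residuals of
    this form over all poses and landmarks in the graph.
    """
    p_cam = R @ point_world + t          # transform into the camera frame
    p_img = K @ p_cam                    # pinhole projection
    projected_px = p_img[:2] / p_img[2]  # perspective division
    return observed_px - projected_px    # 2D residual

# Toy example: identity pose, landmark on the optical axis,
# so the projection lands exactly on the principal point.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R, t = np.eye(3), np.zeros(3)
err = reprojection_error(K, R, t,
                         np.array([0.0, 0.0, 2.0]),
                         np.array([320.0, 240.0]))
# err is [0, 0]: a perfect observation contributes no residual
```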
Furthermore, line and plane features, especially in indoor scenes, have been explored within the co-visibility architecture to compensate for the reduced number of point correspondences. Given more features, tracker robustness improves further. However, co-visibility graphs rely mainly on visual overlap between frames, which leads to short constraint edges in the graphs; this shortcoming remains to be addressed.
Instead of only using re-projection errors of point, line, and plane correspondences within the co-visibility graph pipeline, our pose estimation modules also leverage structural information, such as vanishing points and the Manhattan/Atlanta world assumptions, by assuming that scenes contain parallel and orthogonal cues. Since these structural cues are loosely organized collections of basic landmarks rather than minimal parameterizations, they are difficult to use in optimization modules. Even though they are often used in visual odometry systems, keeping these structural landmarks consistent during tracking remains an open challenge.
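As a small illustration of how an orthogonality assumption can constrain rotation, the sketch below projects a set of noisy dominant directions (e.g. estimated from vanishing points) onto the nearest proper rotation via SVD, the standard orthogonal Procrustes step. The function name and setup are hypothetical; real systems organize and track Manhattan frames far more carefully.

```python
import numpy as np

def align_to_manhattan(directions):
    """Nearest proper rotation to a 3x3 matrix whose columns are
    noisy, roughly orthogonal dominant scene directions.

    Uses the orthogonal Procrustes solution: R = U @ Vt from the
    SVD, with a sign flip to keep det(R) = +1.
    """
    U, _, Vt = np.linalg.svd(directions)
    R = U @ Vt
    if np.linalg.det(R) < 0:   # reflect to enforce a proper rotation
        U[:, -1] *= -1
        R = U @ Vt
    return R

# Slightly perturbed world axes snap back to an exact rotation,
# which can then serve as a drift-free rotational constraint.
rng = np.random.default_rng(0)
noisy_axes = np.eye(3) + 0.01 * rng.standard_normal((3, 3))
R = align_to_manhattan(noisy_axes)
# R is orthogonal (R @ R.T = I) with det(R) = +1
```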
How to exploit structural regularities in pose estimation and scene reconstruction is the central question of this dissertation. The methods presented here are integrated into a complete tracking and mapping system. Specifically, our tracking module exploits structural regularities in both the front-end and back-end. Moreover, we propose a new type of graph architecture, the Extensibility Graph, which is combined with co-visibility graphs to compensate for the latter's over-reliance on visual overlap.