We present an automatic annotation pipeline to recover 9D cuboids and 3D shapes from pre-trained off-the-shelf 2D detectors and sparse LiDAR data. Our autolabeling method solves this challenging ill-posed inverse problem by relying on learned shape priors and by optimizing geometric and physical parameters. To that end, we propose a novel differentiable shape renderer over signed distance fields (SDF), which we use in combination with normalized object coordinate spaces (NOCS). Initially trained on synthetic data to predict shape and coordinates, our method uses these predictions for projective and geometric alignment over real samples. We also propose a curriculum learning strategy that iteratively retrains on samples of increasing difficulty in subsequent self-improving annotation rounds. Our experiments on the KITTI3D dataset show that we can recover a substantial number of accurate cuboids, and that these autolabels can be used to train 3D vehicle detectors with state-of-the-art results.
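To make the notion of a signed distance field concrete, the following minimal sketch evaluates an analytic SDF and moves query points onto its zero level set by stepping along the gradient. It is an illustration only, not the renderer proposed here: the sphere shape, the function names (`sphere_sdf`, `sphere_sdf_grad`, `project_to_surface`), and the query points are all assumptions, and a learned shape prior would replace the analytic sphere.

```python
import numpy as np

def sphere_sdf(points, radius=1.0):
    """Signed distance to a sphere centred at the origin (negative inside)."""
    return np.linalg.norm(points, axis=-1) - radius

def sphere_sdf_grad(points):
    """Analytic SDF gradient, i.e. the unit outward direction.

    Undefined at the origin, so the hypothetical query points avoid it.
    """
    return points / np.linalg.norm(points, axis=-1, keepdims=True)

def project_to_surface(points, radius=1.0):
    """Move each point along the SDF gradient by its signed distance."""
    distance = sphere_sdf(points, radius)[..., None]
    return points - distance * sphere_sdf_grad(points)

# Hypothetical query points: one outside, one inside, one off-axis.
query = np.array([[2.0, 0.0, 0.0],
                  [0.0, 0.5, 0.0],
                  [1.0, 1.0, 1.0]])
surface = project_to_surface(query, radius=1.0)
```

Because every step is a differentiable array operation, the same projection could in principle be expressed in an autodiff framework so that gradients flow back to shape or pose parameters.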