Chen, Hanzhi; Sun, Boyang; Zhang, Anran; Pollefeys, Marc; Leutenegger, Stefan. VidBot: Learning Generalizable 3D Actions from In-the-Wild 2D Human Videos for Zero-Shot Robotic Manipulation. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025.