In this paper, we address the problem of human body pose estimation from depth data. Previous works based on random forests relied either on a classification strategy to infer the different body parts or on a regression approach to directly predict the joint positions. To allow the inference of very generic poses, those approaches did not consider additional information during the learning phase, such as the performed activity. In the present work, we introduce a novel approach that integrates additional information at training time and thereby improves pose prediction at test time. Our main contribution is a structured output forest that solves a joint regression-classification task: each foreground pixel of a depth image is associated with its relative displacements to the 3D joint positions as well as with an activity class. Integrating activity information into the objective function during forest training better separates the space of 3D poses, leading to a better model of the posterior. Our approach therefore provides improved pose predictions and, as a by-product, an estimate of the performed activity. We conducted experiments on a dataset recorded from 10 people, annotated with ground-truth body poses from a motion capture system. To demonstrate the benefits of our approach, we divided the poses into 10 different activities for the training phase, which improves human pose estimation compared to a pure regression forest approach.
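The core idea of the joint regression-classification objective can be illustrated with a minimal sketch of a node-split criterion that mixes offset-variance reduction (regression on 3D joint displacements) with activity-entropy reduction (classification). The mixing weight `alpha` and the function names are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def regression_uncertainty(offsets):
    """Sum of per-dimension variances of the 3D joint offsets (lower = purer)."""
    return float(np.sum(np.var(offsets, axis=0)))

def classification_entropy(labels):
    """Shannon entropy of the activity-label distribution at a node."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def joint_split_gain(offsets, labels, left_mask, alpha=0.5):
    """
    Gain of a candidate split under a weighted regression-classification
    objective. `alpha` is a hypothetical mixing weight trading off
    offset-variance reduction against activity-entropy reduction.
    """
    right_mask = ~left_mask
    n, n_l, n_r = len(labels), left_mask.sum(), right_mask.sum()
    if n_l == 0 or n_r == 0:
        return -np.inf  # degenerate split: one child is empty

    def impurity(mask):
        return (alpha * regression_uncertainty(offsets[mask])
                + (1.0 - alpha) * classification_entropy(labels[mask]))

    parent = (alpha * regression_uncertainty(offsets)
              + (1.0 - alpha) * classification_entropy(labels))
    # Standard information-gain form: parent impurity minus the
    # size-weighted impurities of the two children.
    return parent - (n_l / n) * impurity(left_mask) \
                  - (n_r / n) * impurity(right_mask)
```

A split that cleanly separates two activities with distinct joint offsets scores a high gain under both terms, which is how the activity labels help partition the space of 3D poses at training time.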