Transformer-like Neural Networks in Application to 3D Instance Segmentation

Hussain, Sajad


Author(s): Hussain, Sajad
Title: Transformer-like Neural Networks in Application to 3D Instance Segmentation
Abstract: Instance segmentation of indoor point clouds remains difficult because of data scale, clutter, and imbalance across object classes. Geometry-driven methods such as SphericalMask provide robust coarse localization through spherical polygons and radial point migration, but they lack learned instance reasoning. Transformer-based decoders offer global context but often suffer from noisy attention and weak geometric grounding. This thesis addresses these limitations by extending the custom SphericalMask pipeline with an instance-aware MaskModule and an optional detection head. The MaskModule predicts a per-query spatial support mask that restricts cross-attention to meaningful regions, reducing global noise and improving the separation of small and under-represented classes. The detection head adds geometric supervision by predicting objectness and bounding boxes from pooled instance features, acting as a backbone regularizer. Both components were integrated into the AIH-3DIS framework and evaluated across three model variants. Experiments show that the MaskModule variant achieves the strongest overall performance, with an AP of 0.320 (+2.7 over the baseline) and improvements in AP50 and AP25. The baseline attains slightly higher strict recall, but its predictions lack spatial precision. In contrast, the MaskModule provides a balanced trade-off, offering more stable, fine-grained segmentation, particularly for small building elements, thereby improving downstream BIM and digital-twin applications.
Keywords: LOCenter; GNI
Subject area: ALL General
Supervisors: Klepaczko, A.; Pryczek, M.; Noichl, F.; Borrmann, A.
Year: 2025
Year / Month: 2025-12
Month: Dec
University: Technische Universität München

Occurrences:

mediaTUM complete holdings > Institutions > Schools > TUM School of Engineering and Design > Departments > Civil and Environmental Engineering > Chair of Computing in Civil and Building Engineering (Prof. Borrmann) > Theses > Master's theses
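
The abstract describes the MaskModule as predicting a per-query spatial support mask that restricts cross-attention to meaningful regions. The record does not give implementation details, so the following is only a minimal NumPy sketch of the general idea of mask-restricted cross-attention, not the thesis's actual code. The function name `masked_cross_attention`, the array shapes, and the fallback for queries with an empty mask are all assumptions made for illustration.

```python
import numpy as np


def masked_cross_attention(queries, keys, values, support_mask):
    """Cross-attention in which each query only attends to its support region.

    queries:      (Q, D) instance query embeddings (hypothetical shapes)
    keys:         (N, D) per-point key features
    values:       (N, D) per-point value features
    support_mask: (Q, N) boolean, True where a query may attend to a point
    Returns the attended features (Q, D) and the attention weights (Q, N).
    """
    dim = queries.shape[-1]
    # Scaled dot-product similarity between every query and every point.
    scores = queries @ keys.T / np.sqrt(dim)

    # Assumption: a query whose mask is empty falls back to attending to
    # all points, so the softmax below never sees a row of only -inf.
    empty = ~support_mask.any(axis=1, keepdims=True)
    mask = support_mask | empty

    # Points outside the support mask get zero weight after the softmax.
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ values, weights
```

In this sketch, points outside a query's mask receive exactly zero attention weight, which is one way to read the abstract's claim that the mask reduces global noise in the decoder.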