OBJECTIVE: To develop a two-phased deep learning sorting algorithm for post-X-ray image acquisition in order to facilitate large musculoskeletal image datasets according to their anatomical entity.
METHODS: In total, 42,608 unstructured and pseudonymized radiographs were retrieved from the PACS of a musculoskeletal tumor center. In phase 1, imaging data were sorted into 1000 clusters by a self-supervised model. A human-in-the-loop radiologist assigned weak, semantic labels to all clusters and clusters with the same label were merged. Three hundred thirty-two non-musculoskeletal clusters were discarded. In phase 2, the initial model was modified by "injecting" the identified labels into the self-supervised model to train a classifier. To provide statistical significance, data split and cross-validation were applied. The hold-out test set consisted of 50% external data. To gain insight into the model's predictions, Grad-CAMs were calculated.
RESULTS: The self-supervised clustering resulted in a high normalized mutual information of 0.930. The expert radiologist identified 28 musculoskeletal clusters. The modified model achieved a classification accuracy of 96.2% and 96.6% for validation and hold-out test data for predicting the top class, respectively. When considering the top two predicted class labels, an accuracy of 99.7% and 99.6% was accomplished. Grad-CAMs as well as final cluster results underlined the robustness of the proposed method by showing that it focused on similar image regions a human would have considered for categorizing images.
CONCLUSION: For efficient dataset building, we propose an accurate deep learning sorting algorithm for classifying radiographs according to their anatomical entity in the assessment of musculoskeletal diseases.
KEY POINTS: • Classification of large radiograph datasets according to their anatomical entity. • Paramount importance of structuring vast amounts of retrospective data for modern deep learning applications. • Optimization of the radiological workflow and increase in efficiency as well as decrease of time-consuming tasks for radiologists through deep learning.
«
OBJECTIVE: To develop a two-phased deep learning sorting algorithm for post-X-ray image acquisition in order to facilitate large musculoskeletal image datasets according to their anatomical entity.
METHODS: In total, 42,608 unstructured and pseudonymized radiographs were retrieved from the PACS of a musculoskeletal tumor center. In phase 1, imaging data were sorted into 1000 clusters by a self-supervised model. A human-in-the-loop radiologist assigned weak, semantic labels to all clusters and cl...
»