A comprehensive tool for creating and evaluating privacy-preserving biomedical prediction models.

Eicher, Johanna; Bild, Raffael; Spengler, Helmut; Kuhn, Klaus A; Prasser, Fabian

doi:10.1186/s12911-020-1041-3

2020

Title:: A comprehensive tool for creating and evaluating privacy-preserving biomedical prediction models.
Document type:: Article; Journal Article
Author(s):: Eicher, Johanna; Bild, Raffael; Spengler, Helmut; Kuhn, Klaus A; Prasser, Fabian
Abstract:: BACKGROUND: Modern data-driven medical research promises to provide new insights into the development and course of disease and to enable novel methods of clinical decision support. To realize this, machine learning models can be trained to make predictions from clinical, paraclinical and biomolecular data. In this process, privacy protection and regulatory requirements need careful consideration, as the resulting models may leak sensitive personal information. To counter this threat, a wide range of methods for integrating machine learning with formal methods of privacy protection have been proposed. However, there is a significant lack of practical tools to create and evaluate such privacy-preserving models. In this software article, we report on our ongoing efforts to bridge this gap. RESULTS: We have extended the well-known ARX anonymization tool for biomedical data with machine learning techniques to support the creation of privacy-preserving prediction models. Our methods are particularly well suited for applications in biomedicine, as they preserve the truthfulness of data (e.g. no noise is added) and they are intuitive and relatively easy to explain to non-experts. Moreover, our implementation is highly versatile, as it supports binomial and multinomial target variables, different types of prediction models and a wide range of privacy protection techniques. All methods have been integrated into a sound framework that supports the creation, evaluation and refinement of models through intuitive graphical user interfaces. To demonstrate the broad applicability of our solution, we present three case studies in which we created and evaluated different types of privacy-preserving prediction models for breast cancer diagnosis, diagnosis of acute inflammation of the urinary system and prediction of the contraceptive method used by women.
In this process, we also used a wide range of different privacy models (k-anonymity, differential privacy and a game-theoretic approach) as well as different data transformation techniques. CONCLUSIONS: With the tool presented in this article, accurate prediction models can be created that preserve the privacy of individuals represented in the training set in a variety of threat scenarios. Our implementation is available as open source software.
Journal title abbreviation:: BMC Med Inform Decis Mak
Year:: 2020
Journal volume:: 20
Journal issue:: 1
Fulltext / DOI:: doi:10.1186/s12911-020-1041-3
Pubmed ID:: http://view.ncbi.nlm.nih.gov/pubmed/32046701
Print-ISSN:: 1472-6947
TUM Institution:: Institut für Medizinische Statistik und Epidemiologie
