Polygenic risk scores outperform machine learning methods in predicting coronary artery disease status.

Gola, Damian; Erdmann, Jeannette; Müller-Myhsok, Bertram; Schunkert, Heribert; König, Inke R

doi:10.1002/gepi.22279

2020


Title:: Polygenic risk scores outperform machine learning methods in predicting coronary artery disease status.
Document type:: Article; Journal Article; Research Support, Non-U.S. Gov't
Author(s):: Gola, Damian; Erdmann, Jeannette; Müller-Myhsok, Bertram; Schunkert, Heribert; König, Inke R
Abstract:: Coronary artery disease (CAD) is the leading global cause of mortality and has substantial heritability with a polygenic architecture. Recent approaches of risk prediction were based on polygenic risk scores (PRS) not taking possible nonlinear effects into account and restricted in that they focused on genetic loci associated with CAD, only. We benchmarked PRS, (penalized) logistic regression, naïve Bayes (NB), random forests (RF), support vector machines (SVM), and gradient boosting (GB) on a data set of 7,736 CAD cases and 6,774 controls from Germany to identify the algorithms for most accurate classification of CAD status. The final models were tested on an independent data set from Germany (527 CAD cases and 473 controls). We found PRS to be the best algorithm, yielding an area under the receiver operating curve (AUC) of 0.92 (95% CI [0.90, 0.95], 50,633 loci) in the German test data. NB and SVM (AUC ~ 0.81) performed better than RF and GB (AUC ~ 0.75). We conclude that using PRS to predict CAD is superior to machine learning methods.
Journal title:: Genet Epidemiol
Year:: 2020
Volume:: 44
Issue:: 2
Pages:: 125-138
Full text / DOI:: doi:10.1002/gepi.22279
PubMed:: http://view.ncbi.nlm.nih.gov/pubmed/31922285
Print ISSN:: 0741-0395
TUM institution:: Klinik für Herz- und Kreislauferkrankungen im Erwachsenenalter (Prof. Schunkert)
