Genetic variation that affects gene expression contributes to disease. Here, I developed machine learning models that predict two major steps of gene expression. First, I modeled RNA stability from DNA sequence. The model explains 59% of the variation in mRNA stability across genes, reveals new regulatory elements, and shows codon usage to be the major determinant. Second, I developed MMSplice, a modular deep neural network architecture that predicts the effects of genetic variants on exon skipping, splice site choice, splicing efficiency, and pathogenicity. MMSplice won the 2018 CAGI5 exon-skipping prediction challenge. These models and modeling approaches will help to pinpoint pathogenic genetic variants.