Upcoming Talks

Designing Bayesian learning models for large genomic datasets

Date: Friday, September 21, 2018 10:00 - 11:00
Speaker: Matt Robinson (University of Lausanne)
Location: Meeting room 3rd floor / Central Bldg. (I01.3OG.Meeting Room)
Series: Life Sciences Seminar
Host: Nick Barton

Abstract:

Genome-wide association studies (GWAS) have detected thousands of genomic regions associated with common complex diseases and quantitative traits, but they rely on single-marker regression approaches, which have poor estimation and prediction properties. Here, we develop a Bayesian penalised regression model that estimates genetic effects jointly from a mixture of distributions, allowing for related individuals and accounting for marker LD and population structure. We first apply this approach to 456,426 individuals from the UK Biobank dataset, finding evidence for thousands of genomic regions with ≥95% posterior probability of contributing ≥0.001% of trait variation captured by SNP markers for body mass index (BMI, 7297 250kb genomic regions, or 63% of the genome), cardiovascular disease (CAD, 6235, 54%), type-2 diabetes (T2D, 5781, 50%) and height (HT, 4978, 43%). We then show how this model can be adapted and applied to DNA methylation data to estimate association between blood biomarkers and clinical outcomes, whilst controlling for cell-count confounding. Finally, we discuss how this regression approach can be used to formulate a Bayesian factor analysis, which when applied to genomic data may provide additional insights into population genetic differentiation either across a gradient, or between groups.