Bayesian variable selection regression for genome-wide association studies and other large-scale problems

Yongtao Guan; Matthew Stephens

doi:10.1214/11-AOAS455

Journal article

Open access

Peer reviewed

The annals of applied statistics, Vol.5(3), pp.1780-1815

2011-10-27

DOI: https://doi.org/10.1214/11-AOAS455

Abstract

Statistics - Applications

We consider applying Bayesian Variable Selection Regression, or BVSR, to genome-wide association studies and similar large-scale regression problems. Currently, typical genome-wide association studies measure hundreds of thousands, or millions, of genetic variants (SNPs), in thousands or tens of thousands of individuals, and attempt to identify regions harboring SNPs that affect some phenotype or outcome of interest. This goal can naturally be cast as a variable selection regression problem, with the SNPs as the covariates in the regression. Characteristic features of genome-wide association studies include the following: (i) a focus primarily on identifying relevant variables, rather than on prediction; and (ii) many relevant covariates may have tiny effects, making it effectively impossible to confidently identify the complete "correct" subset of variables. Taken together, these factors put a premium on having interpretable measures of confidence for individual covariates being included in the model, which we argue is a strength of BVSR compared with alternatives such as penalized regression methods. Here we focus primarily on analysis of quantitative phenotypes, and on appropriate prior specification for BVSR in this setting, emphasizing the idea of considering what the priors imply about the total proportion of variance in outcome explained by relevant covariates. We also emphasize the potential for BVSR to estimate this proportion of variance explained, and hence shed light on the issue of "missing heritability" in genome-wide association studies.

Files and links (1)

url

https://doi.org/10.1214/11-AOAS455

Published (Version of record) Open

Metrics

233 Times Cited - Web of Science

InCites Highlights

These are selected metrics from InCites Benchmarking & Analytics tool, related to this output

Collaboration types: Domestic collaboration
Citation topics: 1 Clinical & Life Sciences; 1.189 Genome Studies; 1.189.455 Genome-Wide Association Studies
Web Of Science research areas: Statistics & Probability
ESI research areas: Mathematics

Details

Title: Bayesian variable selection regression for genome-wide association studies and other large-scale problems
Creators: Yongtao Guan; Matthew Stephens
Publication Details: The annals of applied statistics, Vol.5(3), pp.1780-1815
Academic Unit: Leadership Department; Miami Herbert Business School; MHBS - Management Science
Language: English
Resource Type: Journal article
Record Identifier: 991031620109902976