get_lead_snps() Get the top variants within 1 MB windows of the genome with association p-values below the given threshold

Usage

get_lead_snps(
  df,
  thresh = 5e-08,
  region_size = 1e+06,
  protein_coding_only = FALSE,
  chr = NULL,
  .checked = FALSE,
  verbose = NULL,
  keep_chr = TRUE
)

Arguments

df

Dataframe

thresh

A number. P-value threshold; only variants with p-values below this threshold are extracted (5e-08 by default)

region_size

An integer (default = 1e+06, as shown under Usage), or a string such as "200kb" or "2MB", indicating the size of the window from which a single lead variant is taken. Increase this number to return fewer, more widely spaced variants, and decrease it to return more.

protein_coding_only

Logical, set this variable to TRUE to only use protein_coding genes for annotation

chr

String, get the top variants from one chromosome only, e.g. chr="chr1"

.checked

Logical, if the input data has already been checked, this can be set to TRUE so it won't be checked again (FALSE by default)

verbose

Logical, set to TRUE to get printed information on number of SNPs extracted

keep_chr

Logical, set to FALSE to remove the "chr" prefix before each chromosome if present (TRUE by default)

Value

Dataframe of lead variants. Returns the best variant (the one with the lowest p-value) per 1 MB window among the variants with p-values below the input threshold (thresh=5e-08 by default). Change the window size with the region_size argument.
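
To make the idea of "best variant per window" concrete, here is a minimal base R sketch of one common greedy approach. It is not taken from the package source and may differ from what get_lead_snps() does internally. The helper name pick_lead_variants and the column names CHROM, POS and P are assumptions made for this illustration.

```r
# Hypothetical helper for illustration only; NOT the package's implementation.
# Assumes columns CHROM (character), POS (numeric) and P (numeric).
pick_lead_variants <- function(df, thresh = 5e-08, region_size = 1e+06) {
  # Keep only variants below the p-value threshold
  sig <- df[df$P < thresh, , drop = FALSE]
  # Sort so that the strongest association is considered first
  sig <- sig[order(sig$P), , drop = FALSE]
  lead <- sig[0, , drop = FALSE]
  while (nrow(sig) > 0) {
    # The remaining variant with the lowest p-value becomes a lead variant
    top <- sig[1, , drop = FALSE]
    lead <- rbind(lead, top)
    # Discard everything on the same chromosome within half a window of it
    near <- sig$CHROM == top$CHROM &
      abs(sig$POS - top$POS) <= region_size / 2
    sig <- sig[!near, , drop = FALSE]
  }
  # Return the lead variants in genomic order
  lead[order(lead$CHROM, lead$POS), , drop = FALSE]
}

# Toy data: two nearby hits on chr1, one distant hit on chr1, two on chr2
toy <- data.frame(
  CHROM = c("chr1", "chr1", "chr1", "chr2", "chr2"),
  POS   = c(100000, 300000, 5000000, 200000, 250000),
  P     = c(1e-10, 1e-12, 1e-09, 1e-06, 3e-08),
  stringsAsFactors = FALSE
)
leads <- pick_lead_variants(toy)
```

In the toy data, the chr1 variant at position 100000 is absorbed by its stronger neighbour at 300000, and the chr2 variant with P = 1e-06 never passes the threshold, so three lead variants remain.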

Examples

if (FALSE) { # \dontrun{
get_lead_snps(CD_UKBB)
} # }
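
The calls below sketch how the documented arguments can be combined. They assume the topr package is installed and that CD_UKBB is the example dataset shipped with it, as in the example above. The specific threshold and window values are arbitrary choices for illustration, and the code is skipped if the package is not available.

```r
# Runs only if topr is installed; otherwise nothing is executed.
if (requireNamespace("topr", quietly = TRUE)) {
  library(topr)

  # Defaults: thresh = 5e-08, region_size = 1e+06
  leads_default <- get_lead_snps(CD_UKBB)

  # Stricter threshold and a wider window, given as a string
  leads_sparse <- get_lead_snps(CD_UKBB, thresh = 1e-09, region_size = "2MB")

  # Restrict to one chromosome and print how many variants were extracted
  leads_chr1 <- get_lead_snps(CD_UKBB, chr = "chr1", verbose = TRUE)

  # Drop the "chr" prefix from the chromosome names in the result
  leads_noprefix <- get_lead_snps(CD_UKBB, keep_chr = FALSE)
}
```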