Standardise GWAS
standardise_gwas.Rmd
Here is a simple example of using the
genepi.utils::GWAS()
object to standardise your GWAS input
ready for use with other features in this package. First some
unprocessed example data:
library(genepi.utils)
filepath <- system.file("extdata", "example2_gwas_sumstats.tsv", package="genepi.utils")
gwas <- data.table::fread(filepath)
gwas
Column mapping
A number of column mapping options for standard GWAS formats are
provided through the ColumnMap()
class.
- a string - valid id for a pre-defined map
(e.g. ‘giant’ or ‘gwama’)
- an unnamed character vector or list - the
constructor will create a
ColumnMap
object as long as the provided column names are found within the list of known aliases
- a named character vector or list - the names must
be the standard column names to map to, and the values the current
column names
- a list of
Column
class objects
# possible inputs to mapping object
colmap_input1 <- c('MARKER', 'CHR', 'POS', 'BETA', 'SE', 'P', 'EAF', 'A1', 'A2') # will try to guess standard name mapping
colmap_input2 <- list(rsid='MARKER', chr='CHR', bp='POS', beta='BETA', se='SE', p='P', eaf='EAF', ea='A1', oa='A2') # standard naming
colmap_input3 <- list(Column(name='rsid', alias=c('MARKER'), type='character'),
Column(name='chr', alias=c('CHR'), type='character'),
Column(name='bp', alias=c('POS'), type='integer'),
Column(name='beta', alias=c('BETA'), type='numeric'),
Column(name='se', alias=c('SE'), type='numeric'),
Column(name='p', alias=c('P'), type='numeric'),
Column(name='eaf', alias=c('EAF'), type='numeric'),
Column(name='ea', alias=c('A1'), type='character'),
Column(name='oa', alias=c('A2'), type='character'))
# column mapping object
map <- ColumnMap("ns_map")
map <- ColumnMap(colmap_input1)
map <- ColumnMap(colmap_input2)
map <- ColumnMap(colmap_input3)
map
Standardise GWAS
A GWAS can then be standardised like so:
filepath <- system.file("extdata", "example2_gwas_sumstats.tsv", package="genepi.utils")
gwas <- GWAS(dat=filepath, map=map)
gwas
Custom quality control
A GWAS object is created after a number of data checks and filters.
The default filters can be altered via the filters argument, which takes
a named list of strings that can be evaluated as expressions to filter a
data.table. The results of the filtering can be found in the
@qc
property of the GWAS
object which is a
named list of integer vectors identifying rows that fail that particular
filter.
gwas <- data.table::fread(filepath)
gwas <- GWAS(dat = gwas,
map = map,
filters = list(beta_invalid = "!is.infinite(beta) & abs(beta) < 10",
eaf_invalid = "eaf >= 0.01 & eaf <= 0.99",
p_invalid = "!is.infinite(p) & sign(p) != -1 & p < 1",
se_invalid = "!is.infinite(se) & abs(se) < 10",
chr_missing = "!is.na(chr)",
bp_missing = "!is.na(bp)",
beta_missing = "!is.na(beta)",
se_missing = "!is.na(se)",
p_missing = "!is.na(p)",
eaf_missing = "!is.na(eaf)"))
gwas@qc