Filters and cleans GWAS summary statistics for downstream pleiotropy analysis. Removes missing values, applies significance thresholds, and standardizes column names.
Usage
preprocess_gwas(
  gwas_data,
  pvalue_threshold = 5e-08,
  columns = c("SNPS", "MAPPED_TRAIT", "PVALUE_MLOG", "CHR_ID", "CHR_POS")
)
Examples
data(gwas_subset)
gwas_clean <- preprocess_gwas(gwas_subset, pvalue_threshold = 1e-5)
head(gwas_clean)
#>         SNPS          MAPPED_TRAIT PVALUE_MLOG CHR_ID   CHR_POS
#>       <char>                <char>       <num> <char>    <char>
#> 1:  rs814573     Alzheimer disease   672.69900     19  44908684
#> 2:  rs814573 myocardial infarction    15.00000     19  44908684
#> 3:    rs7412     Alzheimer disease   122.39790     19  44919689
#> 4:    rs7412       LDL cholesterol  9629.00000     19  44919689
#> 5: rs2817462    memory performance     6.30103      6 156588550
#> 6: rs2817462  tyrosine measurement     9.69897      6 156588550
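
The filtering steps named in the description (column selection, removal of missing values, significance threshold) can be approximated in base R as follows. This is a minimal sketch and not the package's implementation: `sketch_preprocess()` and the `toy` data are hypothetical, the input is assumed to be a plain data.frame, and `PVALUE_MLOG` is assumed to hold -log10(p), so that `p <= pvalue_threshold` corresponds to `PVALUE_MLOG >= -log10(pvalue_threshold)`. Column-name standardization is left out because the page does not say what it involves.

```r
# Hypothetical stand-in for preprocess_gwas(); NOT the package's actual code.
sketch_preprocess <- function(gwas_data,
                              pvalue_threshold = 5e-08,
                              columns = c("SNPS", "MAPPED_TRAIT", "PVALUE_MLOG",
                                          "CHR_ID", "CHR_POS")) {
  # Keep only the requested columns (assumes a plain data.frame input).
  out <- gwas_data[, columns, drop = FALSE]

  # Drop rows with a missing value in any retained column.
  out <- out[stats::complete.cases(out), , drop = FALSE]

  # Assumption: PVALUE_MLOG stores -log10(p), so the p-value threshold
  # is converted to the same scale before comparing.
  keep <- out$PVALUE_MLOG >= -log10(pvalue_threshold)
  out <- out[keep, , drop = FALSE]

  rownames(out) <- NULL
  out
}

# Made-up toy data; the SNP labels and traits are placeholders, not real records.
toy <- data.frame(
  SNPS         = c("rsA", "rsB", "rsC", "rsD"),
  MAPPED_TRAIT = c("trait 1", "trait 2", NA, "trait 3"),
  PVALUE_MLOG  = c(9.2, 3.1, 12.0, 6.5),
  CHR_ID       = c("1", "2", "3", "4"),
  CHR_POS      = c("1000", "2000", "3000", "4000"),
  EXTRA        = 1:4,
  stringsAsFactors = FALSE
)

toy_clean <- sketch_preprocess(toy, pvalue_threshold = 1e-5)
print(toy_clean)
```

With a threshold of 1e-5 the cutoff on the -log10 scale is 5, so under these assumptions `rsB` is dropped for falling below the cutoff, `rsC` is dropped for its missing trait, and the `EXTRA` column is discarded.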