Reproducing a High-Impact (IF = 11.274) Mendelian Randomization Paper: Day 3

Without further ado, here is the code.

library(TwoSampleMR)
library(data.table)

#step 1. read exposure data (no clumping here; clumping is done explicitly in step 2)
exposure_dat <- read_exposure_data(
  filename = 'met-a-746_MR_format.txt', clump = FALSE, sep = "\t",
  phenotype_col = "Phenotype", snp_col = "SNP", beta_col = "beta", se_col = "se",
  effect_allele_col = "effect_allele", other_allele_col = "other_allele",
  pval_col = "pval", samplesize_col = "samplesize", eaf_col = "eaf")

#step 2. clump the exposure data; parameters follow the article (package defaults are clump_r2 = 0.001, clump_kb = 10000)
exposure_dat_clump <- clump_data(exposure_dat, clump_r2 = 0.1, clump_kb = 500, pop = "EUR")

#step 3. read outcome data
# data.table = FALSE makes fread return a plain data.frame for format_data
outcome_data <- fread('Alzheimer_GWAS_summary_example.txt', data.table = FALSE)
outcome_dat <- format_data(
  dat = outcome_data, type = "outcome", snps = exposure_dat_clump$SNP, header = TRUE,
  phenotype_col = "Phenotype", snp_col = "SNP", beta_col = "beta", se_col = "se",
  effect_allele_col = "effect_allele", other_allele_col = "other_allele",
  pval_col = "pval", samplesize_col = "samplesize", eaf_col = "eaf")

#step 4. harmonise
dat <- harmonise_data(exposure_dat_clump, outcome_dat)

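#step 5. calculate the F-statistic for each SNP
# MAF is the frequency of the less common allele
dat$EAF2 <- (1 - dat$eaf.exposure)
dat$MAF <- pmin(dat$eaf.exposure, dat$EAF2)
# PVE: proportion of variance in the exposure explained by a single SNP
PVEfx <- function(BETA, MAF, SE, N){
  pve <- (2*(BETA^2)*MAF*(1 - MAF))/((2*(BETA^2)*MAF*(1 - MAF)) + ((SE^2)*2*N*MAF*(1 - MAF)))
  return(pve)
}
dat$PVE <- mapply(PVEfx, dat$beta.exposure, dat$MAF, dat$se.exposure, N = dat$samplesize.exposure)
# F = ((N - k - 1) / k) * (PVE / (1 - PVE)), with k = 1 because each SNP is tested on its own
dat$FSTAT <- ((dat$samplesize.exposure - 1 - 1)/1)*(dat$PVE/(1 - dat$PVE))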

#step 6. heterogeneity test
# The Q p-value for Inverse variance weighted is 0.3858222 > 0.05, so the fixed-effects IVW method is used
mr_results_het <- mr_heterogeneity(dat)

#step 7. MR analysis using Inverse variance weighted (fixed effects) method
res <- mr(dat, method_list = c("mr_ivw_fe"))

#step 8. Add OR and CI information
res <- generate_odds_ratios(res)


Selected results

step 5.

Every SNP has an F-statistic greater than 10, so all of them are kept in the analysis.
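The F > 10 check can be tried on toy numbers. The sketch below uses the same formulas as step 5; the SNP names and values are made up for illustration (they are not from the met-a-746 data), and `pve_fx` and `f_stat` are hypothetical helper names.

```r
# Proportion of variance explained by one SNP (same formula as step 5).
pve_fx <- function(beta, maf, se, n) {
  num <- 2 * beta^2 * maf * (1 - maf)
  num / (num + se^2 * 2 * n * maf * (1 - maf))
}

# F-statistic for k instruments; k = 1 when each SNP is tested on its own.
f_stat <- function(pve, n, k = 1) {
  ((n - k - 1) / k) * (pve / (1 - pve))
}

# Toy input: illustrative values only, not real GWAS results.
toy <- data.frame(
  SNP  = c("rs_a", "rs_b", "rs_c"),
  beta = c(0.05, 0.02, 0.08),
  se   = c(0.010, 0.012, 0.015),
  eaf  = c(0.30, 0.85, 0.50),
  n    = c(7800, 7800, 7800)
)
toy$maf   <- pmin(toy$eaf, 1 - toy$eaf)
toy$pve   <- pve_fx(toy$beta, toy$maf, toy$se, toy$n)
toy$fstat <- f_stat(toy$pve, toy$n)

# Keep only instruments above the conventional F > 10 threshold.
strong <- toy[toy$fstat > 10, ]
print(strong[, c("SNP", "fstat")])
```

With this PVE formula the MAF terms cancel, so F reduces to (N - 2)/N × (beta/se)², which gives a quick way to sanity-check the FSTAT column.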


Figure: step 5 results

step 6. Heterogeneity test results

p = 0.3858222 > 0.05, so there is no evidence of heterogeneity and the Inverse variance weighted (fixed effects) method is used. If p < 0.05, use the Inverse variance weighted (multiplicative random effects) method instead.
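This decision rule can be written as a small helper. It is a sketch and not part of TwoSampleMR: `choose_ivw_method` is a hypothetical name, and it assumes the heterogeneity table has `method` and `Q_pval` columns like the output of `mr_heterogeneity()`. The returned strings `mr_ivw_fe` and `mr_ivw_mre` are the TwoSampleMR method names for the fixed-effects and multiplicative random-effects IVW estimators. The MR Egger p-value in the toy table is a placeholder.

```r
# Pick the IVW variant from the Cochran's Q p-value of the IVW row.
choose_ivw_method <- function(het, alpha = 0.05) {
  ivw <- het[het$method == "Inverse variance weighted", , drop = FALSE]
  if (nrow(ivw) != 1) {
    stop("expected exactly one IVW row in the heterogeneity table")
  }
  if (ivw$Q_pval < alpha) "mr_ivw_mre" else "mr_ivw_fe"
}

# Toy table: the IVW p-value is the one reported in step 6,
# the MR Egger p-value is a made-up placeholder.
het_toy <- data.frame(
  method = c("MR Egger", "Inverse variance weighted"),
  Q_pval = c(0.30, 0.3858222)
)
method <- choose_ivw_method(het_toy)
print(method)
```

In the pipeline this would be called as `mr(dat, method_list = choose_ivw_method(mr_results_het))`, which avoids hard-coding the method when looping over many exposures.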


Figure: step 6 results

step 7. Inverse variance weighted (fixed effects) results

Figure: step 7 and step 8 results

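Comparison with the original paper: Table S6 in the paper's supplement lists all MR results for the metabolites and Alzheimer's disease, and its last row is the metabolite tested here. The paper and this reproduction agree on every value: the Q estimate is 5.25, the P value for the Q estimate is 0.39, a fixed-effect model is chosen, the OR is 0.69, the 95% CI is 0.57-0.84, and the P value is 1.98 × 10^-4.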
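For readers who want to see where an OR and its 95% CI come from, here is a minimal base-R sketch. It assumes the usual convention of exponentiating the log-odds estimate `b` and `b ± 1.96 × se`, which is what `generate_odds_ratios()` is understood to do. `odds_ratio_ci` is a hypothetical helper, and the `b` and `se` values are illustrative placeholders, not the actual step 7 output.

```r
# Convert a log-odds estimate and its standard error to an OR with a 95% CI.
odds_ratio_ci <- function(b, se, z = 1.96) {
  data.frame(
    or       = exp(b),
    or_lci95 = exp(b - z * se),
    or_uci95 = exp(b + z * se)
  )
}

# Placeholder inputs, not the real estimates from step 7.
out <- odds_ratio_ci(b = -0.37, se = 0.10)
print(round(out, 3))
```

Because the interval is built on the log scale and then exponentiated, it is not symmetric around the OR.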

Reproduction complete, hooray... The next step is to loop the same analysis over all metabolites.
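Looping over all metabolites could look roughly like the sketch below. It assumes one exposure file per metabolite and that steps 1 to 8 are wrapped in a function passed in as `run_one`; `run_all_metabolites`, `fake_run`, and the file names are all hypothetical. `tryCatch` keeps one failing metabolite (for example, no SNPs left after clumping) from stopping the whole run.

```r
# Apply run_one to every file and stack the results, skipping failures.
run_all_metabolites <- function(files, run_one) {
  results <- lapply(files, function(f) {
    tryCatch(
      data.frame(file = f, run_one(f)),
      error = function(e) {
        message("skipping ", f, ": ", conditionMessage(e))
        NULL
      }
    )
  })
  # Drop the NULL entries left by failed files before stacking.
  results <- Filter(Negate(is.null), results)
  do.call(rbind, results)
}

# Stand-in for steps 1-8; a real version would return the `res` data frame.
fake_run <- function(f) {
  if (grepl("bad", f)) stop("no instruments left")
  data.frame(b = -0.1, se = 0.05)
}

all_res <- run_all_metabolites(c("met_1.txt", "met_bad.txt", "met_2.txt"), fake_run)
print(all_res)
```

Passing the per-metabolite pipeline in as a function keeps the loop itself independent of TwoSampleMR, so it can be tested with a stand-in before running the real analysis.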

Figure: Table S6 from the original paper

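Please give this post a like. Once it passes 100 likes I will share the day-4 code, which covers computing the F-statistic for multiple instrumental variables.
Also, some reviewers care about statistical power; R code for power calculation is on the way.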
