TwoSampleMR: local clumping (MR-Base exceeded 300 seconds), with fixes for R on Windows and Linux


Windows first

While running a Mendelian randomization analysis I hit the following error:

bmi_exp_dat <- clump_data(bmi_exp_dat,clump_r2=0.01,pop = "EUR")

Please look at vignettes for options on running this locally if you need to run many instances of this command.
Clumping C5nTuK, 5340156 variants, using EUR population reference
Error in api_query("ld/clump", query = list(rsid = dat[["rsid"]], pval = dat[["pval"]],  : 
  The query to MR-Base exceeded 300 seconds and timed out. Please simplify the query
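The log above shows more than five million variants being sent to the clumping server, which is the likely reason for the timeout. The Linux example later in this post filters by p-value before clumping. A minimal sketch of that filtering step follows. The function name filter_by_pval, the default threshold of 5e-8 and the toy data are assumptions for illustration; the column name pval.exposure follows the TwoSampleMR exposure format.

```r
# Keep only variants below a p-value threshold before clumping.
# filter_by_pval() is a hypothetical helper, not part of TwoSampleMR.
filter_by_pval <- function(dat, threshold = 5e-8, pval_col = "pval.exposure") {
  stopifnot(pval_col %in% names(dat))
  keep <- !is.na(dat[[pval_col]]) & dat[[pval_col]] < threshold
  dat[keep, , drop = FALSE]
}

# Toy data standing in for a formatted exposure data set.
toy <- data.frame(
  SNP = c("rs1", "rs2", "rs3"),
  pval.exposure = c(3e-9, 0.2, 4e-6),
  stringsAsFactors = FALSE
)

filtered <- filter_by_pval(toy, threshold = 1e-5)
print(filtered$SNP)
```

Clumping a few thousand significant variants is far less likely to hit the 300-second limit than clumping the full summary statistics.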

The package author suggests the following workarounds:

  1. Extracting without clumping
  2. Performing clumping on each chromosome separately

Alternatively, you could try downloading the VCF from IEU OpenGWAS project and using the gwasvcf to extract based on p-value, and use the ieugwasr package to do clumping locally. We're trying to get to a point where it's easy to do heavier computation on these data locally.

LD reference files listed in the gwasvcf page
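The second suggestion, clumping each chromosome separately, could be sketched roughly as below. This only shows the control flow: clump_by_chromosome and keep_top_hit are hypothetical names, the column chr.exposure is an assumption about how the chromosome is stored, and keep_top_hit is a stand-in for a real clumping call so that the example runs without any network access.

```r
# Split a data frame by chromosome and apply a clumping function to each part.
# clump_fun is any function that takes a data frame and returns the retained rows.
clump_by_chromosome <- function(dat, clump_fun, chr_col = "chr.exposure") {
  stopifnot(chr_col %in% names(dat))
  parts <- split(dat, dat[[chr_col]], drop = TRUE)
  clumped <- lapply(parts, clump_fun)
  out <- do.call(rbind, clumped)
  rownames(out) <- NULL
  out
}

# Stand-in for a real clumping call: keep the most significant variant per chunk.
keep_top_hit <- function(d) d[which.min(d$pval.exposure), , drop = FALSE]

toy <- data.frame(
  SNP = c("rs1", "rs2", "rs3", "rs4"),
  chr.exposure = c("1", "1", "2", "2"),
  pval.exposure = c(1e-8, 1e-6, 1e-5, 1e-9),
  stringsAsFactors = FALSE
)

res <- clump_by_chromosome(toy, keep_top_hit)
print(res$SNP)
```

In real use, clump_fun could be something like function(d) TwoSampleMR::clump_data(d, clump_r2 = 0.01, pop = "EUR"), so that each request to the server covers one chromosome only.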

So the author suggests using the ieugwasr package to clump locally.

Below I try local clumping.

First, install the ieugwasr package:

devtools::install_github("mrcieu/ieugwasr")

The package documentation then lists:

ld_clump_local()

Wrapper for clump function using local plink binary and ld reference dataset

ld_clump_local() is the function for local clumping. Its usage and arguments are:

ld_clump_local(dat, clump_kb, clump_r2, clump_p, bfile, plink_bin)

Arguments

dat

Dataframe. Must have a variant name column ("rsid") and pval column called "pval". If id is present then clumping will be done per unique id.

clump_kb

Clumping kb window. Default is very strict, 10000

clump_r2

Clumping r2 threshold. Default is very strict, 0.001

clump_p

Clumping sig level for index variants. Default = 1 (i.e. no threshold)

bfile

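Path prefix of the local LD reference files. If this is NULL (the default), ld_clump() uses the API; if it is provided, clumping runs locally through ld_clump_local().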

plink_bin

Specify path to plink binary. Default = NULL. See https://github.com/explodecomputer/plinkbinr for convenient access to plink binaries

This needs plink. Install the plinkbinr package and use it to get the path to the plink executable:

devtools::install_github("explodecomputer/plinkbinr")
library(plinkbinr)
get_plink_exe()
#[1] "D:/R-4.1.1/library/plinkbinr/bin/plink_Windows.exe"

Next, download the LD reference bfile. My GWAS is in a European-ancestry population, so I use the EUR* files.

wget http://fileserve.mrcieu.ac.uk/ld/1kg.v3.tgz
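After downloading, the .tgz archive still has to be unpacked (in R, utils::untar() can do this), and the bfile argument must then point at a file prefix rather than a single file. The sketch below checks that the plink binary and the three reference files are in place before clumping. The helper names are hypothetical, and the assumption that the unpacked archive contains EUR.bed, EUR.bim and EUR.fam is based on the bfile path used later in this post.

```r
# A plink "bfile" is a prefix shared by three files:
# <prefix>.bed, <prefix>.bim and <prefix>.fam.
bfile_is_complete <- function(prefix) {
  all(file.exists(paste0(prefix, c(".bed", ".bim", ".fam"))))
}

# Check that the plink binary and the reference panel can both be found.
check_local_clump_setup <- function(plink_bin, bfile) {
  c(plink_found = file.exists(plink_bin),
    bfile_complete = bfile_is_complete(bfile))
}

# Demonstration with empty placeholder files in a temporary directory.
demo_dir <- tempfile("ref_")
dir.create(demo_dir)
prefix <- file.path(demo_dir, "EUR")
invisible(file.create(paste0(prefix, c(".bed", ".bim", ".fam"))))
fake_plink <- file.path(demo_dir, "plink")
invisible(file.create(fake_plink))

status <- check_local_clump_setup(fake_plink, prefix)
print(status)
```

With real files, the call would look like check_local_clump_setup(get_plink_exe(), "D:/EUR_ref/EUR"), using the paths from this post.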

Calling ld_clump_local() directly raised an error. The author has merged the two functions: if ld_clump() is given a bfile argument, it calls ld_clump_local() automatically.

So I switched to ld_clump(). The general form is:

ld_clump( dplyr::tibble(rsid=dat$rsid, pval=dat$pval, id=dat$trait_id), plink_bin = genetics.binaRies::get_plink_binary(), bfile = "/path/to/reference/EUR" )

In my case:

b <- ld_clump(
    dplyr::tibble(rsid=a$rsid, pval=a$p, id=a$id),
    # path returned by get_plink_exe()
    plink_bin = "D:/R-4.1.1/library/plinkbinr/bin/plink_Windows.exe",
    # location of the European (EUR) LD reference panel
    bfile = "D:/EUR_ref/EUR"
)

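Pay attention to the column names of the table passed to ld_clump(). It must have the following columns:

  • rsid
  • pval
  • id (the trait identifier; in the general form above it is built from trait_id)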
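The ld_clump() calls in this post pass a table with the columns rsid, pval and id. When the data come from TwoSampleMR, the corresponding columns are named SNP, pval.exposure and id.exposure, so a small conversion step is needed. A minimal sketch of that step is shown below. as_clump_input is a hypothetical helper, and a plain data.frame is used instead of dplyr::tibble only to keep the example free of dependencies.

```r
# Build the three-column clumping input from TwoSampleMR-style exposure data.
# as_clump_input() is a hypothetical helper, not part of ieugwasr.
as_clump_input <- function(dat, snp_col = "SNP", pval_col = "pval.exposure",
                           id_col = "id.exposure") {
  needed <- c(snp_col, pval_col, id_col)
  missing_cols <- setdiff(needed, names(dat))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  data.frame(
    rsid = as.character(dat[[snp_col]]),
    pval = as.numeric(dat[[pval_col]]),
    id = as.character(dat[[id_col]]),
    stringsAsFactors = FALSE
  )
}

toy <- data.frame(
  SNP = c("rs1", "rs2"),
  pval.exposure = c(1e-9, 2e-8),
  id.exposure = c("A9o7Sb", "A9o7Sb"),
  stringsAsFactors = FALSE
)

clump_input <- as_clump_input(toy)
print(names(clump_input))
```

The result can be passed as the first argument of ld_clump(). Failing early on a missing column is easier to debug than an error raised from inside the plink call.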

That completes the clumping. The number of SNPs removed for LD is the same as with the online method.

PLINK v1.90b6.10 64-bit (17 Jun 2019)          www.cog-genomics.org/plink/1.9/
(C) 2005-2019 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to C:\Users\Lenovo\AppData\Local\Temp\RtmpS8aq1j\file2e646a65362c.log.
Options in effect:
  --bfile D:/EUR_ref/EUR
  --clump C:\Users\Lenovo\AppData\Local\Temp\RtmpS8aq1j\file2e646a65362c
  --clump-kb 10000
  --clump-p1 0.99
  --clump-r2 0.001
  --out C:\Users\Lenovo\AppData\Local\Temp\RtmpS8aq1j\file2e646a65362c

32549 MB RAM detected; reserving 16274 MB for main workspace.
8550156 variants loaded from .bim file.
503 people (0 males, 0 females, 503 ambiguous) loaded from .fam.
Ambiguous sex IDs written to
C:\Users\Lenovo\AppData\Local\Temp\RtmpS8aq1j\file2e646a65362c.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 503 founders and 0 nonfounders present.
Calculating allele frequencies... done.
8550156 variants and 503 people pass filters and QC.
Note: No phenotypes present.
--clump: 78 clumps formed from 2011 top variants.
Results written to
C:\Users\Lenovo\AppData\Local\Temp\RtmpS8aq1j\file2e646a65362c.clumped .
Warning: 'rs9930333' is missing from the main dataset, and is a top variant.
Warning: 'rs8083289' is missing from the main dataset, and is a top variant.
Warning: 'rs2683992' is missing from the main dataset, and is a top variant.
27 more top variant IDs missing; see log file.
Removing 1963 of 2041 variants due to LD with other variants or absence from LD reference panel
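The last line of the log counts two kinds of removal together: variants in LD with an index variant, and variants that are absent from the reference panel (the "missing from the main dataset" warnings). To see how many fall into the second group, the input rsids can be compared with the variant IDs in the reference .bim file. The sketch below assumes the standard plink .bim layout of six whitespace-separated columns with the variant ID in the second column. missing_from_reference is a hypothetical helper, and the toy .bim file is made up for the demonstration.

```r
# Return the rsids that do not appear in a plink .bim file.
# Only the second column (variant ID) is read; the other five are skipped.
missing_from_reference <- function(rsids, bim_path) {
  bim <- read.table(
    bim_path,
    header = FALSE,
    colClasses = c("NULL", "character", "NULL", "NULL", "NULL", "NULL"),
    col.names = c("chr", "id", "cm", "bp", "a1", "a2")
  )
  setdiff(rsids, bim$id)
}

# Toy .bim file with two variants.
bim_path <- tempfile(fileext = ".bim")
writeLines(c("1\trs1\t0\t1000\tA\tG", "1\trs2\t0\t2000\tC\tT"), bim_path)

absent <- missing_from_reference(c("rs1", "rs2", "rs9930333"), bim_path)
print(absent)
```

With the real reference panel (about 8.5 million rows in the .bim file according to the log), reading the file takes noticeable time and memory, so it is worth doing once and reusing the result.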

The same approach works on Linux:

expo_dat <- expo_dat[which(expo_dat$pval.exposure<1e-5),]

b <- ld_clump(
    dplyr::tibble(rsid=expo_dat$SNP, pval=expo_dat$pval.exposure, id=expo_dat$id.exposure),
    # path to the plink binary
    plink_bin = "/GM_GWAS_LD_clumped_snps/plink",
    bfile = "/GM_GWAS_LD_clumped_snps/EUR_ref/EUR",
    clump_kb = 1000,clump_r2 = 0.1
)
expo_dat <- expo_dat[which(expo_dat$SNP %in% b$rsid),]
Clumping A9o7Sb, 50 variants, using EUR population reference
PLINK v1.90b6.21 64-bit (19 Oct 2020)          www.cog-genomics.org/plink/1.9/
(C) 2005-2020 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to /tmp/RtmpdrrA6m/file4bc2f27623281.log.
Options in effect:
  --bfile /share2/pub/zhenggw/zhenggw/GM_GWAS_LD_clumped_snps/EUR_ref/EUR
  --clump /tmp/RtmpdrrA6m/file4bc2f27623281
  --clump-kb 1000
  --clump-p1 0.99
  --clump-r2 0.1
  --out /tmp/RtmpdrrA6m/file4bc2f27623281

515328 MB RAM detected; reserving 257664 MB for main workspace.
Allocated 61144 MB successfully, after larger attempt(s) failed.
8550156 variants loaded from .bim file.
503 people (0 males, 0 females, 503 ambiguous) loaded from .fam.
Ambiguous sex IDs written to /tmp/RtmpdrrA6m/file4bc2f27623281.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 503 founders and 0 nonfounders present.
Calculating allele frequencies... done.
8550156 variants and 503 people pass filters and QC.
Note: No phenotypes present.
Warning: 'rs55654746' is missing from the main dataset, and is a top variant.
--clump: 8 clumps formed from 49 top variants.
Results written to /tmp/RtmpdrrA6m/file4bc2f27623281.clumped .
Removing 42 of 50 variants due to LD with other variants or absence from LD reference panel

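Update

Added: a fix for the error In file(file, "rt") : cannot open file '/tmp/Rtmp8ur92k/xxx': No such file or directory

Many people have run into this problem, and I hit it today as well while computing an LD matrix, so here is how I solved it. The error says the file cannot be found, so my first guess was that the tmp file was being deleted automatically while the program ran. I tried hard-coding the temporary file name, replacing fn <- tempfile() in the package's ld_matrix_local() with a fixed path (the original is shown first, then the modified version):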

ld_matrix_local <- function(variants, bfile, plink_bin, with_alleles=TRUE)
{
	# Make textfile
	shell <- ifelse(Sys.info()['sysname'] == "Windows", "cmd", "sh")
	fn <- tempfile()


ld_matrix_local <- function(variants, bfile, plink_bin, with_alleles=TRUE)
{
    # Make textfile
    shell <- ifelse(Sys.info()['sysname'] == "Windows", "cmd", "sh")
    fn <- "~/plinkLD/tmpfile/tmp"

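The same error still occurred.

However, the plink output included "Using up to 39 threads (change this with --threads)". I suspected that a conflict between threads was preventing the output file from being written, so I added a thread setting to the source code: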

    fun2 <- paste0(
        shQuote(plink_bin, type=shell),
        " --bfile ", shQuote(bfile, type=shell),
        " --extract ", shQuote(fn, type=shell), 
        " --r square ", 
        " --keep-allele-order ",
        " --threads 1 ",  # added line: force plink to use a single thread
        " --out ", shQuote(fn, type=shell)
    )
515328 MB RAM detected; reserving 257664 MB for main workspace.
Allocated 61144 MB successfully, after larger attempt(s) failed.
8550156 variants loaded from .bim file.
503 people (0 males, 0 females, 503 ambiguous) loaded from .fam.
Ambiguous sex IDs written to /tmp/Rtmp8ur92k/file557ea8a4015e.nosex .
--extract: 3839 variants remaining.
Using 1 thread.
Before main variant filters, 503 founders and 0 nonfounders present.
Calculating allele frequencies... done.
3839 variants and 503 people pass filters and QC.
Note: No phenotypes present.
--r square to /tmp/Rtmp8ur92k/file557ea8a4015e.ld ... done.

Problem solved!

So adding one line to the original function, --threads 1, is enough. A single thread also seems reasonably fast; I am not sure what other effects it has.
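Instead of editing the package source in place, the command string can be assembled in a small function of your own, which makes the thread count an explicit argument. The sketch below mirrors the fragment shown above. build_ld_matrix_cmd is a hypothetical name, the paths are placeholders, and the function only builds the command string; it does not run plink.

```r
# Assemble the plink command for an LD matrix, with an explicit thread count.
# The flags are the ones shown in the modified source fragment above.
build_ld_matrix_cmd <- function(plink_bin, bfile, fn, threads = 1) {
  shell <- ifelse(Sys.info()[["sysname"]] == "Windows", "cmd", "sh")
  paste0(
    shQuote(plink_bin, type = shell),
    " --bfile ", shQuote(bfile, type = shell),
    " --extract ", shQuote(fn, type = shell),
    " --r square",
    " --keep-allele-order",
    " --threads ", as.integer(threads),
    " --out ", shQuote(fn, type = shell)
  )
}

# Placeholder paths, for illustration only.
cmd <- build_ld_matrix_cmd("/path/to/plink", "/path/to/reference/EUR", tempfile())
cat(cmd, "\n")
```

The resulting string could be passed to system(), after writing the variant list to the file named by fn, in the same way the package function does.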
