2019-08-12 [Third-generation sequencing] Installing cDNA_Cupcake

Last Updated: 07/26/2019
Cupcake is a collection of tools for downstream analysis of third-generation sequencing data. From its README:
cDNA_Cupcake is a miscellaneous collection of Python and R scripts used for analyzing sequencing data. Most of the scripts only require Biopython. For scripts that require additional libraries, it will be specified in documentation.
https://github.com/Magdoll/cDNA_Cupcake
Current version: 8.2

I found a good introduction:

https://github.com/Magdoll/cDNA_Cupcake/wiki#refgmap

  1. First, clone the repository with git
git clone https://github.com/Magdoll/cDNA_Cupcake.git

This failed with an error:

(base) [jing@localhost ~]$ git clone https://github.com/Magdoll/cDNA_Cupcake.git
Cloning into 'cDNA_Cupcake'...
error: RPC failed; curl 56 OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 104
fatal: the remote end hung up unexpectedly

Searching for the error ("git clone error: RPC failed") turned up this fix:

# Solution:
# Raise Git's HTTP buffer limit (524288000 bytes = 500 MiB).
git config --global http.postBuffer 524288000

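After running the command above, the clone went through.
This step is slow: it ran from 14:20 to 15:05, the connection dropped, and I had to start again. About 60 minutes in total.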

(base) [jing@localhost ~]$ git clone https://github.com/Magdoll/cDNA_Cupcake.git
Cloning into 'cDNA_Cupcake'...
remote: Enumerating objects: 164, done.
remote: Counting objects: 100% (164/164), done.
remote: Compressing objects: 100% (115/115), done.
Receiving objects:  18% (301/1615), 9.45 MiB | 49.00 KiB/s
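Since the clone dropped once and had to be restarted by hand, a small retry wrapper may save some babysitting. This is only a sketch: `retry` and `RETRY_DELAY` are names I made up for illustration, not part of git or Cupcake. The usage line adds `--depth 1`, a standard git option that fetches only the latest commit, which should shrink the download considerably on a slow link.

```shell
# retry N CMD...: run CMD up to N times, stopping at the first success.
# RETRY_DELAY (hypothetical variable, default 5) is the pause in seconds
# between failed attempts.
retry() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max_attempts failed" >&2
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "${RETRY_DELAY:-5}"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Usage (left commented out because it needs network access):
# retry 5 git clone --depth 1 https://github.com/Magdoll/cDNA_Cupcake.git
```

A shallow clone has no history, which is fine if the goal is only to run the scripts.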

Then add the script directories to PATH:

export PATH=$PATH:/home/jing/cDNA_Cupcake/sequence/
export PATH=$PATH:/home/jing/cDNA_Cupcake/rarefaction/
(Replace these with your own paths.)
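The two export lines only last for the current shell session, and running them repeatedly keeps appending the same directories. A minimal sketch of a repeat-safe version follows; `CUPCAKE_HOME` and `add_to_path` are hypothetical names of my own, and the default location assumes the repository was cloned into the home directory.

```shell
# CUPCAKE_HOME is a hypothetical variable; point it at your own clone.
CUPCAKE_HOME="${CUPCAKE_HOME:-$HOME/cDNA_Cupcake}"

# add_to_path DIR: append DIR to PATH only if it is not already listed.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

add_to_path "$CUPCAKE_HOME/sequence"
add_to_path "$CUPCAKE_HOME/rarefaction"
export PATH
```

Putting these lines in `~/.bashrc` would make the setting apply to every new shell.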

  2. Install Cupcake ToFU
    This is needed because: The only exception is Cupcake ToFU, which does require compiling and installation.
    https://github.com/Magdoll/cDNA_Cupcake/wiki/Cupcake-ToFU%3A-supporting-scripts-for-Iso-Seq-after-clustering-step

After cloning, build and install:
cd cDNA_Cupcake
python setup.py build
python setup.py install

This failed with an error:

image.png

I installed whatever was reported missing:
conda install numpy
yum search zlib
After installing the zlib packages, I re-ran the install:

image.png

Next I ran yum search gcc
and installed gcc.
It still failed:

image.png

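Then I tried:
yum install gcc libffi-devel python-devel openssl-devel
Still no luck.
I have installed a whole pile of packages and it still fails. If anyone has gotten this to install, could you tell me how?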
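Rather than installing packages one error at a time, it may help to check up front which tools the active environment can actually see. The sketch below is a generic check, not something from the Cupcake docs: both function names are hypothetical, and the lists in the usage comment (gcc, numpy, Cython) are my assumption about what the build needs. Note that the prompt shows a conda `(base)` environment, so the `python` that runs setup.py is conda's and may not see libraries installed for the system Python with yum.

```shell
# missing_commands NAME...: print each NAME that is not found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# missing_python_modules PYTHON MODULE...: print each MODULE that the
# given PYTHON interpreter cannot import.
missing_python_modules() {
    py=$1
    shift
    for mod in "$@"; do
        "$py" -c "import $mod" >/dev/null 2>&1 || echo "$mod"
    done
}

# Usage (the tool and module lists are assumptions, adjust as needed):
# missing_commands gcc git
# missing_python_modules python numpy Cython
```

Empty output from both calls means everything listed was found by the interpreter and shell that will run the build.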



What to do after Iso Seq Cluster?
https://github.com/PacificBiosciences/IsoSeq_SA3nUP/wiki/What-to-do-after-Iso-Seq-Cluster%3F

What can Cupcake ToFU do?

After the cluster step, we should already have high-quality (HQ) isoform sequences that meet the following criteria:

  1. Full-length (contains the 5' UTR, and the sequence includes the polyA tail)
  2. High-quality (predicted accuracy by default is >= 99%)
  3. Supported by at least 2 full-length reads (subreads?)

Aside: reads of lower quality than this may not be useless; plenty of useful information can probably still be mined from them.

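These high-quality isoforms still contain redundant sequences, so the output of the previous step does not truly represent all the unique isoforms in the sample. There are two reasons: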

  1. Clustering algorithm tradeoff between sensitivity and specificity.

  2. Natural 5' degradation in RNA.
So the following steps are still needed:

  1. Best practice for aligning Iso Seq to reference genome: minimap2, GMAP, STAR, BLAT

  2. Collapse identical isoforms to obtain final set of unique, full-length, high-quality isoforms

  3. Obtain associated count information for each unique isoform

  4. Robust ORF prediction using ANGEL

  5. Fusion finding -- tutorial to come soon

Cupcake ToFU handles steps (2), (3), and (5).
