Predicting open reading frames (CDS) in transcripts with TransDecoder

First, you need the FASTA file of transcripts assembled by Trinity.
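Before going further it can help to check how many transcripts the FASTA file holds, since TransDecoder.LongOrfs reports a "total transcripts to examine" figure in its log. The snippet below is a minimal sketch: count_fasta_records is a helper name made up for this post, and the two-record sample file is invented so the snippet runs without real data.

```shell
#!/bin/sh
# Count FASTA records by counting header lines (lines starting with ">").
# count_fasta_records is a hypothetical helper, not part of Trinity or TransDecoder.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Tiny made-up sample so the sketch runs anywhere.
sample=$(mktemp)
printf '>TRINITY_DN0_c0_g1_i1\nATGGCC\n>TRINITY_DN0_c0_g1_i2\nATGGCCTAA\n' > "$sample"
count_fasta_records "$sample"
rm -f "$sample"
```

On real data you would pass the path of your own Trinity FASTA file instead of the sample.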

Install TransDecoder (I installed it with conda; you can also download the release archive, extract it yourself, and add it to your PATH).
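A quick way to confirm the install worked is to check that both entry points are on your PATH. This is only a sketch: check_transdecoder is a made-up helper name, and the hint it prints assumes the Bioconda package called transdecoder, which is one common way to install it with conda.

```shell
#!/bin/sh
# Report whether the two TransDecoder entry points can be found on PATH.
# check_transdecoder is a hypothetical helper written for this post.
check_transdecoder() {
    missing=0
    for tool in TransDecoder.LongOrfs TransDecoder.Predict; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumption: installing from the Bioconda channel; adjust if you installed another way.
check_transdecoder || echo "hint: conda install -c bioconda transdecoder"
```

If a tool is reported as missing after a conda install, make sure the right environment is activated.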

Run TransDecoder.LongOrfs

$ TransDecoder.LongOrfs -t /data1/spider/ytbiosoft/data/trinity.all/trinity_out_dir_all.Trinity.fasta

The output looks like this:


(bioinforspace) [spider 04:01:18]/data1/spider/ytbiosoft/data/trinity.all/TransDecoder.LongOrfs

$ TransDecoder.LongOrfs -t /data1/spider/ytbiosoft/data/trinity.all/trinity_out_dir_all.Trinity.fasta

-- Skipping CMD: /data1/spider/miniconda3/envs/bioinforspace/opt/transdecoder/util/compute_base_probs.pl /data1/spider/ytbiosoft/data/trinity.all/trinity_out_dir_all.Trinity.fasta 0 > /data1/spider/ytbiosoft/data/trinity.all/TransDecoder.LongOrfs/trinity_out_dir_all.Trinity.fasta.transdecoder_dir/base_freqs.dat, checkpoint [/data1/spider/ytbiosoft/data/trinity.all/TransDecoder.LongOrfs/trinity_out_dir_all.Trinity.fasta.transdecoder_dir.__checkpoints_longorfs/base_freqs_file.ok] exists.

- extracting ORFs from transcripts.

-total transcripts to examine: 375779

[375700/375779] = 99.98% done    CMD: touch /data1/spider/ytbiosoft/data/trinity.all/TransDecoder.LongOrfs/trinity_out_dir_all.Trinity.fasta.transdecoder_dir.__checkpoints_longorfs/TD.longorfs.ok

#################################

### Done preparing long ORFs.  ###

##################################

    Use file: /data1/spider/ytbiosoft/data/trinity.all/TransDecoder.LongOrfs/trinity_out_dir_all.Trinity.fasta.transdecoder_dir/longest_orfs.pep  for Pfam and/or BlastP searches to enable homology-based coding region identification.

    Then, run TransDecoder.Predict for your final coding region predictions.
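The log above suggests searching longest_orfs.pep against Pfam and/or with BlastP before the final prediction. I did not run that step here, so the following is only a sketch of what it might look like. The database names (uniprot_sprot.fasta, Pfam-A.hmm), the thread counts, and the output file names are assumptions, and the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical paths: adjust them to your own data and databases.
TRANSCRIPTS="trinity_out_dir_all.Trinity.fasta"
LONG_ORFS="${TRANSCRIPTS}.transdecoder_dir/longest_orfs.pep"
BLAST_DB="uniprot_sprot.fasta"   # assumed: a protein database already formatted with makeblastdb
PFAM_DB="Pfam-A.hmm"             # assumed: Pfam HMMs already prepared with hmmpress

# Print (do not run) the optional homology searches and the Predict call that uses them.
print_homology_commands() {
    echo "blastp -query $LONG_ORFS -db $BLAST_DB -max_target_seqs 1 -outfmt 6 -evalue 1e-5 -num_threads 8 > blastp.outfmt6"
    echo "hmmscan --cpu 8 --domtblout pfam.domtblout $PFAM_DB $LONG_ORFS > pfam.log"
    echo "TransDecoder.Predict -t $TRANSCRIPTS --retain_pfam_hits pfam.domtblout --retain_blastp_hits blastp.outfmt6"
}

print_homology_commands
```

Printing the commands first makes it easy to review the paths before committing to a long search on hundreds of thousands of ORFs.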


Run TransDecoder.Predict


$ TransDecoder.Predict -t /data1/spider/ytbiosoft/data/trinity.all/trinity_out_dir_all.Trinity.fasta

The results are as follows:

-rw-rw-r-- 1 spider spider      1449 Apr 11 19:48 pipeliner.38661.cmds

-rw-rw-r-- 1 spider spider       296 Apr 11 20:01 pipeliner.38864.cmds

-rw-rw-r-- 1 spider spider      3644 Apr 11 20:41 pipeliner.40446.cmds

-rw-rw-r-- 1 spider spider      3350 Apr 11 20:38 pipeliner.41631.cmds

-rw-rw-r-- 1 spider spider  16515704 Apr 11 20:40 trinity_out_dir_all.Trinity.fasta.transdecoder.bed

-rw-rw-r-- 1 spider spider 104584265 Apr 11 20:43 trinity_out_dir_all.Trinity.fasta.transdecoder.cds

-rw-rw-r-- 1 spider spider  75371916 Apr 11 20:39 trinity_out_dir_all.Trinity.fasta.transdecoder.gff3

-rw-rw-r-- 1 spider spider  44730662 Apr 11 20:41 trinity_out_dir_all.Trinity.fasta.transdecoder.pep

drwxrwxr-x 3 spider spider      4096 Apr 11 20:38 trinity_out_dir_all.Trinity.fasta.transdecoder_dir

drwxrwxr-x 2 spider spider      4096 Apr 11 20:43 trinity_out_dir_all.Trinity.fasta.transdecoder_dir.__checkpoints

drwxrwxr-x 2 spider spider        52 Apr 11 20:12 trinity_out_dir_all.Trinity.fasta.transdecoder_dir.__checkpoints_longorfs

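Among these:

*.pep (protein sequences encoded by the final candidate ORFs)

*.cds (nucleotide sequences of the predicted coding regions)

*.gff3 (positions of the ORFs within the transcripts)

*.bed (for later visualization in IGV)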
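To get a quick feel for the predictions, you can tally the ORF types recorded in the .pep headers. This sketch assumes each header carries a token of the form type:<label> (for example type:complete), which is how the headers looked in my output; check a few of your own headers first. tally_orf_types is a made-up helper name, and the sample records are invented so the snippet runs without real data.

```shell
#!/bin/sh
# Tally ORF types from the header lines of a TransDecoder .pep file.
# Assumption: headers contain a "type:<label>" token; tally_orf_types is a hypothetical helper.
tally_orf_types() {
    grep '^>' "$1" | grep -o 'type:[A-Za-z0-9_]*' | sort | uniq -c | awk '{print $2 "\t" $1}'
}

# Tiny made-up sample so the sketch runs anywhere.
sample=$(mktemp)
cat > "$sample" <<'EOF'
>tx1.p1 ORF type:complete len:120 (+)
MAAA*
>tx2.p1 ORF type:5prime_partial len:101 (-)
AAAA*
>tx3.p1 ORF type:complete len:200 (+)
MCCC*
EOF

tally_orf_types "$sample"
rm -f "$sample"
```

On real data, pass the .pep file from the listing above in place of the sample; the output is one line per ORF type followed by its count.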

Feel free to get in touch so we can learn from each other: [email protected]
