2018-04-18 Metagenomics in Practice with QIIME 2 2018.2 (Part 4): Filtering with DADA2 and Building a Tree

My data are paired-end, so for this step I mainly followed this tutorial:
https://docs.qiime2.org/2018.2/tutorials/atacama-soils/

The previous step gave me the primer-trimmed data, so the first thing to do is look at its quality distribution.


After trimming

Here is the code first:

qiime dada2 denoise-paired \
  --p-n-threads 0 \
  --i-demultiplexed-seqs trimmed-seqs.qza \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-trunc-len-f 220 \
  --p-trunc-len-r 220

Don't forget to set the number of threads. With --p-n-threads 0, all available cores are used.
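Handing DADA2 every core is not always welcome, for example on a shared server. A minimal sketch of picking an explicit value instead is shown here; the "leave one core free" rule is my own assumption, not something the tutorial prescribes.

```shell
# Hypothetical helper: count the online cores and keep one free for other users.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
threads=$(( cores > 1 ? cores - 1 : 1 ))

# The value would then replace the 0 in the denoise-paired call.
echo "would pass: --p-n-threads ${threads}"
```

The echoed value can be substituted into the command above in place of 0.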

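Choose how much to trim from the left (--p-trim-left-f / --p-trim-left-r) and where to truncate on the right (--p-trunc-len-f / --p-trunc-len-r) according to your own analysis.
I crudely truncated both reads at position 220 here. Truncating too much or too little both hurt, so base the choice on the actual quality of your reads.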
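One thing worth checking before settling on truncation lengths for paired-end data is whether the truncated forward and reverse reads still overlap enough to be merged. The sketch below is only back-of-the-envelope arithmetic: the amplicon length of 400 is a made-up placeholder, and the 12-nt minimum overlap is what I understand DADA2's merging default to be, so treat both as assumptions.

```shell
# Assumed values for illustration; replace them with your own.
amplicon_len=400   # hypothetical amplicon length after primer removal
trunc_f=220        # --p-trunc-len-f
trunc_r=220        # --p-trunc-len-r
min_overlap=12     # assumed minimum overlap needed for merging

# Bases shared by the two truncated reads (with trim-left set to 0).
overlap=$(( trunc_f + trunc_r - amplicon_len ))

if [ "$overlap" -ge "$min_overlap" ]; then
  verdict="ok"
else
  verdict="too short"
fi
echo "expected overlap: ${overlap} nt (${verdict})"
```

If the overlap comes out too short, the pairs are unlikely to merge and few sequences will be left, so the truncation positions would need to be moved further out.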

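This step is really critical: it produces a feature table (table.qza) and the representative sequences (rep-seqs.qza),
both of which are used in the downstream analysis.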

Next, the table and rep-seqs can be visualized:

qiime feature-table summarize \
  --i-table table.qza \
  --o-visualization table.qzv \
  --m-sample-metadata-file sample-metadata.tsv
qiime tools view table.qzv

Summarizing the table takes a sample-metadata.tsv file as input. It holds a lot of per-sample information and is needed again in later analyses.
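For readers who have not prepared such a file yet, here is a rough sketch of what a minimal one could look like: tab-separated, with the sample IDs in the first column. The sample IDs and the site column are invented for illustration, and the file is written to example-metadata.tsv so that a real sample-metadata.tsv is not overwritten. The IDs in a real file need to match the sample IDs in your data.

```shell
# Write a tiny metadata file; sample IDs and the "site" column are made up.
printf '#SampleID\tsite\n'   > example-metadata.tsv
printf 'sample-1\tforest\n' >> example-metadata.tsv
printf 'sample-2\tdesert\n' >> example-metadata.tsv

# Sanity check: every row should have as many tab-separated fields as the header.
bad_rows=$(awk -F'\t' 'NR == 1 { n = NF } NF != n { c++ } END { print c + 0 }' example-metadata.tsv)
echo "rows with a wrong field count: ${bad_rows}"
```

A non-zero count usually points to a stray tab or a missing value, which is worth fixing before passing the file to qiime.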

qiime feature-table tabulate-seqs \
  --i-data rep-seqs.qza \
  --o-visualization rep-seqs.qzv
qiime tools view rep-seqs.qzv 

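Here is an example from an earlier run that failed:

The quality was very poor.

The sequence count here is close to zero, so the downstream analysis could not go any further.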

Building the tree

This step basically just follows the tutorial:

https://docs.qiime2.org/2018.2/tutorials/moving-pictures/
The Chinese version is a bit easier to follow: https://blog.csdn.net/woodcorpse/article/details/75204871

First, we perform a multiple sequence alignment of the sequences.

qiime alignment mafft \
  --i-sequences rep-seqs.qza \
  --o-alignment aligned-rep-seqs.qza

Next, we mask (or filter) the alignment to remove positions that are highly variable. These positions are generally considered to add noise to a resulting phylogenetic tree.

qiime alignment mask \
  --i-alignment aligned-rep-seqs.qza \
  --o-masked-alignment masked-aligned-rep-seqs.qza

Next, we’ll apply FastTree to generate a phylogenetic tree from the masked alignment.

qiime phylogeny fasttree \
  --i-alignment masked-aligned-rep-seqs.qza \
  --o-tree unrooted-tree.qza

Finally, we apply midpoint rooting to place the root of the tree at the midpoint of the longest tip-to-tip distance in the unrooted tree.

qiime phylogeny midpoint-root \
  --i-tree unrooted-tree.qza \
  --o-rooted-tree rooted-tree.qza

I was just copying the recipe here too.
I should still take the time to understand it properly.
