Notes on analyzing microbiome 16S rRNA data with QIIME 2

The data are pooled paired-end reads from the V3-V4 region.

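The data in 00.RawData have already been demultiplexed, and barcodes and primers have been removed. Each sample folder contains five files: the first, extendedFrags.fastq, holds the merged reads; raw_1.fq.gz and raw_2.fq.gz are the paired-end reads with barcodes and primers still attached; the last two files are the raw reads after primer and barcode removal.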

extendedFrags.fastq is produced by FLASH, which merges the paired-end reads (read merging).

Workflow:

1. Import the data

1) Create the manifest file seq-list.tsv (tab-separated; file paths must be absolute)

sample-id	absolute-filepath
A1	$PWD/data/A1_16S.fastq
A2	$PWD/data/A2_16S.fastq
A3	$PWD/data/A3_16S.fastq
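
Writing the manifest by hand gets tedious with many samples. The sketch below shows one way to generate it, assuming each sample has a single merged FASTQ named <sample>_16S.fastq under data/, as in the table above. The temporary directory and the empty placeholder files are only there so that the example runs on its own; they are not part of the original workflow.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Demo layout (assumption): data/<sample>_16S.fastq, one merged file per sample.
workdir=$(mktemp -d)
mkdir -p "$workdir/data"
touch "$workdir/data/A1_16S.fastq" "$workdir/data/A2_16S.fastq" "$workdir/data/A3_16S.fastq"

manifest="$workdir/seq-list.tsv"

# Header line, with a literal tab between the two columns.
printf 'sample-id\tabsolute-filepath\n' > "$manifest"

# One row per FASTQ: the sample ID is the file name without the _16S.fastq suffix.
for fq in "$workdir"/data/*_16S.fastq; do
  sample=$(basename "$fq" _16S.fastq)
  printf '%s\t%s\n' "$sample" "$fq" >> "$manifest"
done

cat "$manifest"
```

In a real project, replace "$workdir" with the project directory and drop the mktemp and touch lines.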

2) Import the data

qiime tools import \
--type 'SampleData[SequencesWithQuality]' \
--input-path seq-list.tsv \
--output-path seqs.qza \
--input-format SingleEndFastqManifestPhred33V2

2. Filter reads by base quality score to obtain clean data

qiime quality-filter q-score \
--i-demux seqs.qza \
--o-filtered-sequences demux-filtered.qza \
--o-filter-stats demux-filter-stats.qza

###Saved SampleData[SequencesWithQuality] to: demux-filtered.qza
###Saved QualityFilterStats to: demux-filter-stats.qza

3. Quality control and feature table construction (with deblur or vsearch)

1) Denoise with deblur denoise-16S (chimera removal is built in)

deblur requires all input sequences to have the same length when denoising, so the reads must be trimmed to a common length (--p-trim-length).

The deblur developers recommend setting this value to a length where the median quality score begins to drop too low.
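
Because deblur discards reads shorter than the trim length, it can help to check how many reads would survive a candidate value before running the denoising step. The following is a minimal sketch, assuming standard four-line FASTQ records (no wrapped sequence lines); the toy reads and the candidate trim_length are made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Toy FASTQ with reads of length 8, 6 and 4 (illustrative data only).
fastq=$(mktemp)
cat > "$fastq" <<'EOF'
@r1
ACGTACGT
+
IIIIIIII
@r2
ACGTAC
+
IIIIII
@r3
ACGT
+
IIII
EOF

trim_length=6   # hypothetical candidate value for --p-trim-length

# Sequence lines are every 4th line starting at line 2.
summary=$(awk -v t="$trim_length" '
  NR % 4 == 2 { total++; if (length($0) >= t) kept++ }
  END { printf "%d/%d reads are at least %d nt long\n", kept, total, t }
' "$fastq")

echo "$summary"
```

On real data, point fastq at one of the merged files (for example an extendedFrags.fastq) and try a few trim lengths; this only checks read lengths and does not replace looking at the quality profile.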

qiime deblur denoise-16S \
--i-demultiplexed-seqs demux-filtered.qza \
--p-trim-length 120 \
--o-representative-sequences new-seqs.qza \
--o-table new-table.qza \
--p-sample-stats \
--o-stats deblur-stats.qza

###Saved FeatureTable[Frequency] to: new-table.qza
###Saved FeatureData[Sequence] to: new-seqs.qza
###Saved DeblurStats to: deblur-stats.qza

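2) vsearch (dereplication only; an alternative to deblur that writes to the same output file names, so run only one of the two)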

qiime vsearch dereplicate-sequences \
--i-sequences demux-filtered.qza \
--o-dereplicated-table new-table.qza \
--o-dereplicated-sequences new-seqs.qza

###Saved FeatureTable[Frequency] to: new-table.qza
###Saved FeatureData[Sequence] to: new-seqs.qza
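
To make the dereplication step concrete: it collapses identical sequences into one representative and records how often each occurs. The toy pipeline below illustrates only that idea with standard Unix tools; it is not how vsearch is implemented, and the real feature table keeps counts per sample rather than one overall count. The sequences are made up.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Toy input (illustrative only): one sequence per line.
seqs=$(printf '%s\n' ACGT TTGA ACGT ACGT TTGA GGCC)

# Collapse identical sequences, count them, and sort by decreasing abundance.
derep=$(printf '%s\n' "$seqs" \
  | sort \
  | uniq -c \
  | sort -k1,1nr -k2,2 \
  | awk '{ print $2 "\t" $1 }')

echo "$derep"
```

Each output line is a unique sequence followed by its count, which corresponds roughly to one feature and its total frequency.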

4. Cluster into OTUs

1) Closed-reference

# Import the reference database rep_set/97_otus.fasta as a .qza artifact
qiime tools import \
--input-path  rep_set/97_otus.fasta \
--output-path 97_otus.qza \
--type 'FeatureData[Sequence]'

#Imported rep_set/97_otus.fasta as DNASequencesDirectoryFormat to 97_otus.qza


qiime vsearch cluster-features-closed-reference \
  --i-table new-table.qza \
  --i-sequences new-seqs.qza \
  --i-reference-sequences 97_otus.qza \
  --p-perc-identity 0.97 \
  --o-clustered-table table-cr-97.qza \
  --o-clustered-sequences seqs-cr-97.qza \
  --o-unmatched-sequences unmatched-cr-97.qza

#Saved FeatureTable[Frequency] to: table-cr-97.qza
#Saved FeatureData[Sequence] to: seqs-cr-97.qza
#Saved FeatureData[Sequence] to: unmatched-cr-97.qza

2) De novo

qiime vsearch cluster-features-de-novo \
  --i-table new-table.qza \
  --i-sequences new-seqs.qza \
  --p-perc-identity 0.99 \
  --o-clustered-table table-dn-99.qza \
  --o-clustered-sequences rep-seqs-dn-99.qza

3) Open-reference

qiime vsearch cluster-features-open-reference \
  --i-table new-table.qza \
  --i-sequences new-seqs.qza \
  --i-reference-sequences 97_otus.qza \
  --p-perc-identity 0.97 \
  --o-clustered-table table-or-97.qza \
  --o-clustered-sequences rep-seqs-or-97.qza \
  --o-new-reference-sequences new-ref-seqs-or-97.qza

Note: merging paired-end reads with vsearch

Create the manifest file seq-list.tsv (tab-separated, one row per sample)

sample-id	forward-absolute-filepath	reverse-absolute-filepath
A1	$PWD/data/A1_16S_R1.fastq	$PWD/data/A1_16S_R2.fastq
A2	$PWD/data/A2_16S_R1.fastq	$PWD/data/A2_16S_R2.fastq
A3	$PWD/data/A3_16S_R1.fastq	$PWD/data/A3_16S_R2.fastq
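
The paired-end manifest can be generated in the same spirit as the single-end one. The sketch below assumes the <sample>_16S_R1.fastq / <sample>_16S_R2.fastq naming shown above and stops if a forward file has no reverse mate. The temporary directory, the placeholder files, and the output name demux-paired.qza in the trailing comment are assumptions made for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Demo layout (assumption): data/<sample>_16S_R1.fastq and data/<sample>_16S_R2.fastq.
workdir=$(mktemp -d)
mkdir -p "$workdir/data"
for s in A1 A2 A3; do
  touch "$workdir/data/${s}_16S_R1.fastq" "$workdir/data/${s}_16S_R2.fastq"
done

paired_manifest="$workdir/seq-list.tsv"
printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n' > "$paired_manifest"

for r1 in "$workdir"/data/*_16S_R1.fastq; do
  # Derive the reverse-read path from the forward-read path.
  r2="${r1%_R1.fastq}_R2.fastq"
  [ -f "$r2" ] || { echo "missing reverse reads for $r1" >&2; exit 1; }
  sample=$(basename "$r1" _16S_R1.fastq)
  printf '%s\t%s\t%s\n' "$sample" "$r1" "$r2" >> "$paired_manifest"
done

cat "$paired_manifest"

# The manifest could then be imported along these lines
# (demux-paired.qza is a hypothetical output name):
#   qiime tools import \
#     --type 'SampleData[PairedEndSequencesWithQuality]' \
#     --input-path seq-list.tsv \
#     --output-path demux-paired.qza \
#     --input-format PairedEndFastqManifestPhred33V2
```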

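Merge (primer-trimmed-demux.qza is assumed here to be the paired-end artifact obtained from the manifest above):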

qiime vsearch join-pairs \
  --i-demultiplexed-seqs primer-trimmed-demux.qza \
  --p-threads  4 \
  --o-joined-sequences demux-joined.qza
# Summarize the merged reads
qiime demux summarize \
  --i-data demux-joined.qza \
  --o-visualization demux-joined.qzv
