10x data type:
Each sample yields three FASTQ files, distinguished by I1, R1, and R2.
Download and install cellranger.
Download the required reference.
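As a quick illustration of the three-files-per-sample layout, the sketch below prints the file names expected for one sample. The naming pattern is inferred from the file names used later in these notes; the function name fastq_names is hypothetical. In 10x Chromium 3' libraries, I1 usually carries the sample index, R1 the cell barcode and UMI, and R2 the cDNA read.

```shell
# Print the three FASTQ file names expected for one sample.
# Arguments: sample prefix, sample number (the N in _S<N>_), lane number (e.g. 6).
# fastq_names is a hypothetical helper, not part of cellranger or dropEst.
fastq_names() {
    prefix=$1
    index=$2
    lane=$3
    for read in I1 R1 R2; do
        printf '%s_S%s_L%03d_%s_001.fastq.gz\n' "$prefix" "$index" "$lane" "$read"
    done
}

fastq_names WBJPE18018647-1_HMVMKCCXY_L6_WBJPE18018647_20180729_P 1 6
```

Comparing this output with a directory listing is a cheap way to spot a missing read file before starting a long run.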
(1) Run cellranger count
/home/XXX/software/biosoftware/cellranger-2.2.0/cellranger count \
--id=ID24 \
--fastqs=/path/data/20180810_10x/10x/ \
--sample=WBJPE18018647-1_HMVMKCCXY_L6_WBJPE18018647_20180729_P,WBJPE18018647-2_HMVMKCCXY_L6_WBJPE18018647_20180729_P,WBJPE18018647-3_HMVMKCCXY_L6_WBJPE18018647_20180729_P,WBJPE18018647-4_HMVMKCCXY_L6_WBJPE18018647_20180729_P,WBJPE18018648-1_HMVMKCCXY_L7_WBJPE18018648_20180729_P,WBJPE18018648-2_HMVMKCCXY_L7_WBJPE18018648_20180729_P,WBJPE18018648-3_HMVMKCCXY_L7_WBJPE18018648_20180729_P,WBJPE18018648-4_HMVMKCCXY_L7_WBJPE18018648_20180729_P \
--transcriptome=/home/XXX/database/refdata-cellranger-GRCh38-1.2.0
8 samples, about 132 GB of data; the run took 38 hours using 20 threads and 128 GB of memory.
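The long --sample argument can be assembled rather than typed by hand. The sketch below is a dry run that only prints the command. It assumes the installed cellranger accepts --localcores and --localmem (memory in GB) to cap resources; the values simply mirror the figures quoted above and are not necessarily the flags used for the original run. The helper join_samples is hypothetical.

```shell
# Join all arguments with commas, the format --sample expects.
join_samples() {
    list=""
    for name in "$@"; do
        list="${list:+$list,}$name"
    done
    printf '%s\n' "$list"
}

# Build the eight sample prefixes from the two libraries (lanes L6 and L7).
names=""
for lib in WBJPE18018647:L6 WBJPE18018648:L7; do
    id=${lib%%:*}
    lane=${lib##*:}
    for i in 1 2 3 4; do
        names="$names ${id}-${i}_HMVMKCCXY_${lane}_${id}_20180729_P"
    done
done
samples=$(join_samples $names)   # word splitting on $names is intentional

# Dry run: print the command instead of executing it.
# --localcores/--localmem values are illustrative; a lower memory cap
# may be safer on a machine with exactly 128 GB.
echo cellranger count --id=ID24 \
    --fastqs=/path/data/20180810_10x/10x/ \
    --sample="$samples" \
    --transcriptome=/home/XXX/database/refdata-cellranger-GRCh38-1.2.0 \
    --localcores=20 --localmem=128
```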
The final results are in the outs directory:
Outputs:
- Run summary HTML: /path/data/20180810_10x/work/L006/outs/web_summary.html
- Run summary CSV: /path/data/20180810_10x/work/L006/outs/metrics_summary.csv
- BAM: /path/data/20180810_10x/work/L006/outs/possorted_genome_bam.bam
- BAM index: /path/data/20180810_10x/work/L006/outs/possorted_genome_bam.bam.bai
- Filtered gene-barcode matrices MEX: /path/data/20180810_10x/work/L006/outs/filtered_gene_bc_matrices
- Filtered gene-barcode matrices HDF5: /path/data/20180810_10x/work/L006/outs/filtered_gene_bc_matrices_h5.h5
- Unfiltered gene-barcode matrices MEX: /path/data/20180810_10x/work/L006/outs/raw_gene_bc_matrices
- Unfiltered gene-barcode matrices HDF5: /path/data/20180810_10x/work/L006/outs/raw_gene_bc_matrices_h5.h5
- Secondary analysis output CSV: /path/data/20180810_10x/work/L006/outs/analysis
- Per-molecule read information: /path/data/20180810_10x/work/L006/outs/molecule_info.h5
- Loupe Cell Browser file: /path/data/20180810_10x/work/L006/outs/cloupe.cloupe
2018-08-29 03:45:03 [perform] Serializing pipestance performance data.
Waiting 6 seconds for UI to do final refresh.
Pipestance completed successfully!
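After a run this long it can be worth confirming that every output listed in the log above actually exists. The following is a minimal sketch; check_outs is a hypothetical helper, and the file list is copied from the log above, so it may differ for other cellranger versions.

```shell
# Report which expected cellranger outputs are missing from an outs directory.
# Returns 0 when everything is present, 1 otherwise.
# check_outs is a hypothetical helper, not part of cellranger.
check_outs() {
    outs=$1
    missing=0
    for f in web_summary.html metrics_summary.csv \
             possorted_genome_bam.bam possorted_genome_bam.bam.bai \
             filtered_gene_bc_matrices filtered_gene_bc_matrices_h5.h5 \
             raw_gene_bc_matrices raw_gene_bc_matrices_h5.h5 \
             analysis molecule_info.h5 cloupe.cloupe; do
        if [ ! -e "$outs/$f" ]; then
            echo "missing: $outs/$f"
            missing=1
        fi
    done
    return "$missing"
}
```

For the run above it would be called as: check_outs /path/data/20180810_10x/work/L006/outs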
(2) Run the 10x data with dropEst
1. Create the directories and the configuration file
mkdir -p 01_dropTag 02_alignment 03_dropEst
# Arguments, in order: dropEst build directory, config file, STAR index directory, GTF file
sh pipeline.sh \
/home/XXX/software/biosoftware/dropEst/build \
/path/work/02.dropEst/10x.test.xml \
/path/work/02.dropEst/star \
/home/XXX/database/refdata-cellranger-GRCh38-1.2.0/genes/genes.gtf
The XML configuration file is as follows (element values only):
10x
8
16
10
0
10
10000000
AAAAAAAA
/home/XXX/software/biosoftware/dropEst/data/barcodes/10x_aug_2016_split
const
0.2
2
1
100
20
1e-5
1e-7
The pipeline.sh used here is as follows:
$ cat pipeline.sh
if [ "$#" -ne 4 ]; then
echo "usage: $0 dropest_directory config_file star_index_folder gtf_with_genes"
echo "example: $0 ~/dropEst/build ~/dropEst/configs/indrop_v3.xml ~/star/mm10/index/ ~/star/mm10/genes.gtf"
exit 1
fi
dropest_dir=$1
config_file=$2
star_index=$3
gtf_file=$4
cd 01_dropTag
$dropest_dir/droptag -c $config_file -r 0 -p 20 -S -s -n sample1 -l sample1 /path/work/02.dropEst/data/WBJPE18018647-1_HMVMKCCXY_L6_WBJPE18018647_20180729_P_S1_L006_I1_001.fastq.gz /path/work/02.dropEst/data/WBJPE18018647-1_HMVMKCCXY_L6_WBJPE18018647_20180729_P_S1_L006_R1_001.fastq.gz /path/work/02.dropEst/data/WBJPE18018647-1_HMVMKCCXY_L6_WBJPE18018647_20180729_P_S1_L006_R2_001.fastq.gz
$dropest_dir/droptag -c $config_file -r 0 -p 20 -S -s -n sample2 -l sample2 /path/work/02.dropEst/data/WBJPE18018647-2_HMVMKCCXY_L6_WBJPE18018647_20180729_P_S2_L006_I1_001.fastq.gz /path/work/02.dropEst/data/WBJPE18018647-2_HMVMKCCXY_L6_WBJPE18018647_20180729_P_S2_L006_R1_001.fastq.gz /path/work/02.dropEst/data/WBJPE18018647-2_HMVMKCCXY_L6_WBJPE18018647_20180729_P_S2_L006_R2_001.fastq.gz
$dropest_dir/droptag -c $config_file -r 0 -p 20 -S -s -n sample3 -l sample3 /path/work/02.dropEst/data/WBJPE18018647-3_HMVMKCCXY_L6_WBJPE18018647_20180729_P_S3_L006_I1_001.fastq.gz /path/work/02.dropEst/data/WBJPE18018647-3_HMVMKCCXY_L6_WBJPE18018647_20180729_P_S3_L006_R1_001.fastq.gz /path/work/02.dropEst/data/WBJPE18018647-3_HMVMKCCXY_L6_WBJPE18018647_20180729_P_S3_L006_R2_001.fastq.gz
$dropest_dir/droptag -c $config_file -r 0 -p 20 -S -s -n sample4 -l sample4 /path/work/02.dropEst/data/WBJPE18018647-4_HMVMKCCXY_L6_WBJPE18018647_20180729_P_S4_L006_I1_001.fastq.gz /path/work/02.dropEst/data/WBJPE18018647-4_HMVMKCCXY_L6_WBJPE18018647_20180729_P_S4_L006_R1_001.fastq.gz /path/work/02.dropEst/data/WBJPE18018647-4_HMVMKCCXY_L6_WBJPE18018647_20180729_P_S4_L006_R2_001.fastq.gz
$dropest_dir/droptag -c $config_file -r 0 -p 20 -S -s -n sample5 -l sample5 /path/work/02.dropEst/data/WBJPE18018648-1_HMVMKCCXY_L7_WBJPE18018648_20180729_P_S5_L007_I1_001.fastq.gz /path/work/02.dropEst/data/WBJPE18018648-1_HMVMKCCXY_L7_WBJPE18018648_20180729_P_S5_L007_R1_001.fastq.gz /path/work/02.dropEst/data/WBJPE18018648-1_HMVMKCCXY_L7_WBJPE18018648_20180729_P_S5_L007_R2_001.fastq.gz
$dropest_dir/droptag -c $config_file -r 0 -p 20 -S -s -n sample6 -l sample6 /path/work/02.dropEst/data/WBJPE18018648-2_HMVMKCCXY_L7_WBJPE18018648_20180729_P_S6_L007_I1_001.fastq.gz /path/work/02.dropEst/data/WBJPE18018648-2_HMVMKCCXY_L7_WBJPE18018648_20180729_P_S6_L007_R1_001.fastq.gz /path/work/02.dropEst/data/WBJPE18018648-2_HMVMKCCXY_L7_WBJPE18018648_20180729_P_S6_L007_R2_001.fastq.gz
$dropest_dir/droptag -c $config_file -r 0 -p 20 -S -s -n sample7 -l sample7 /path/work/02.dropEst/data/WBJPE18018648-3_HMVMKCCXY_L7_WBJPE18018648_20180729_P_S7_L007_I1_001.fastq.gz /path/work/02.dropEst/data/WBJPE18018648-3_HMVMKCCXY_L7_WBJPE18018648_20180729_P_S7_L007_R1_001.fastq.gz /path/work/02.dropEst/data/WBJPE18018648-3_HMVMKCCXY_L7_WBJPE18018648_20180729_P_S7_L007_R2_001.fastq.gz
$dropest_dir/droptag -c $config_file -r 0 -p 20 -S -s -n sample8 -l sample8 /path/work/02.dropEst/data/WBJPE18018648-4_HMVMKCCXY_L7_WBJPE18018648_20180729_P_S8_L007_I1_001.fastq.gz /path/work/02.dropEst/data/WBJPE18018648-4_HMVMKCCXY_L7_WBJPE18018648_20180729_P_S8_L007_R1_001.fastq.gz /path/work/02.dropEst/data/WBJPE18018648-4_HMVMKCCXY_L7_WBJPE18018648_20180729_P_S8_L007_R2_001.fastq.gz
cd ../02_alignment
STAR --runThreadN 20 --genomeDir $star_index --readFilesCommand zcat --outSAMtype BAM Unsorted --readFilesIn /path/work/02.dropEst/01_dropTag/sample1.fastq.gz.tagged.fastq.gz,/path/work/02.dropEst/01_dropTag/sample2.fastq.gz.tagged.fastq.gz,/path/work/02.dropEst/01_dropTag/sample3.fastq.gz.tagged.fastq.gz,/path/work/02.dropEst/01_dropTag/sample4.fastq.gz.tagged.fastq.gz,/path/work/02.dropEst/01_dropTag/sample5.fastq.gz.tagged.fastq.gz,/path/work/02.dropEst/01_dropTag/sample6.fastq.gz.tagged.fastq.gz,/path/work/02.dropEst/01_dropTag/sample7.fastq.gz.tagged.fastq.gz,/path/work/02.dropEst/01_dropTag/sample8.fastq.gz.tagged.fastq.gz
cd ../03_dropEst
# $dropest_dir/dropest -w -M -u -G 20 -g $gtf_file -c $config_file ../02_alignment/Aligned.out.bam
$dropest_dir/dropest -w -m -r "/path/work/02.dropEst/01_dropTag/sample8.params.gz /path/work/02.dropEst/01_dropTag/sample7.params.gz /path/work/02.dropEst/01_dropTag/sample6.params.gz /path/work/02.dropEst/01_dropTag/sample5.params.gz /path/work/02.dropEst/01_dropTag/sample4.params.gz /path/work/02.dropEst/01_dropTag/sample3.params.gz /path/work/02.dropEst/01_dropTag/sample2.params.gz /path/work/02.dropEst/01_dropTag/sample1.params.gz" -g $gtf_file -c $config_file ../02_alignment/Aligned.out.bam
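Run the steps one at a time: first droptag on each sample, then combine the results of the 8 samples as input to the second step (alignment), and finally run the third step, dropest.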
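The eight near-identical droptag lines, the STAR input list, and the dropest params list all follow one naming pattern, so they can be generated in a loop. The sketch below is a dry run that only prints the commands. The paths, flags, and output file names are copied from pipeline.sh above; the variable names data_dir, tag_dir, tagged, and params are hypothetical.

```shell
data_dir=/path/work/02.dropEst/data
tag_dir=/path/work/02.dropEst/01_dropTag

tagged=""   # comma-separated list for STAR --readFilesIn
params=""   # space-separated list for dropest -r (sample8 first, as in pipeline.sh)
n=0
for lib in WBJPE18018647:6 WBJPE18018648:7; do
    id=${lib%%:*}
    lane=${lib##*:}
    for i in 1 2 3 4; do
        n=$((n + 1))
        stem="$data_dir/${id}-${i}_HMVMKCCXY_L${lane}_${id}_20180729_P_S${n}_L00${lane}"
        # Dry run: print the droptag command instead of executing it.
        printf '%s\n' "\$dropest_dir/droptag -c \$config_file -r 0 -p 20 -S -s -n sample$n -l sample$n ${stem}_I1_001.fastq.gz ${stem}_R1_001.fastq.gz ${stem}_R2_001.fastq.gz"
        tagged="${tagged:+$tagged,}$tag_dir/sample$n.fastq.gz.tagged.fastq.gz"
        params="$tag_dir/sample$n.params.gz${params:+ $params}"
    done
done

printf '%s\n' "STAR --readFilesIn $tagged"
printf '%s\n' "dropest -r \"$params\""
```

Printing the commands first makes it easy to diff them against the hand-written lines before replacing anything.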
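The third step, dropest, failed with an out-of-memory error: it needed more than 128 GB, and my server has only 128 GB of RAM, so I moved the job to the Tianhe supercomputer.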
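A pre-flight check can avoid discovering the memory limit only after hours of work. The sketch below reads total memory from /proc/meminfo on Linux. The helper names are hypothetical, and the threshold to test against is an assumption: these notes establish only that dropest needed more than 128 GB for this dataset, not how much more.

```shell
# Print total physical memory in GiB (rounded down) from a meminfo-style file.
# Defaults to /proc/meminfo, where MemTotal is reported in kB.
total_mem_gib() {
    awk '/^MemTotal:/ { print int($2 / 1024 / 1024) }' "${1:-/proc/meminfo}"
}

# Succeed only if the machine has at least the requested number of GiB.
# Arguments: required GiB, optional meminfo file (for testing).
has_memory() {
    need=$1
    have=$(total_mem_gib "${2:-/proc/meminfo}")
    [ "${have:-0}" -ge "$need" ]
}
```

It could be used as a guard before the dropest call, for example: has_memory 256 || echo "not enough memory for dropest", where 256 is a purely illustrative figure.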