Installing and Using BWA



1. Installation: see reference 【1】.


2. Usage:

hadoop@Mcnode1:~/cloud/adam/xubo/data/down-sratool/sra$ bwa aln ../../dmel-all-chromosome-r5.37/dmel-all-chromosome-r5.37.fasta DRR047093.fastq >RAL357_1.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 6.11 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 9261 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa aln ../../dmel-all-chromosome-r5.37/dmel-all-chromosome-r5.37.fasta DRR047093.fastq
[main] Real time: 6.259 sec; CPU: 6.196 sec
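
The max_diff lines in the log follow from the default -n 0.04 setting of bwa aln, which the BWA manual describes as the fraction of missing alignments given a 2% uniform base error rate. The sketch below assumes the number of errors per read follows a Poisson distribution; that is an interpretation of the documented behaviour, not code taken from BWA, and the function name max_diff is hypothetical. Under that assumption it gives 2, 3 and 4 for 17, 38 and 64 bp, matching the first three log lines above.

```shell
# Sketch: smallest k such that P(more than k errors in a read) < thres,
# assuming errors per read follow Poisson(len * err).
# err=0.02 and thres=0.04 mirror the documented bwa aln default (-n 0.04).
max_diff() {
    awk -v len="$1" -v err="${2:-0.02}" -v thres="${3:-0.04}" 'BEGIN {
        lambda = len * err
        term = exp(-lambda)      # P(0 errors)
        sum = term
        for (k = 1; k < 1000; k++) {
            term *= lambda / k   # P(k errors), derived from P(k-1 errors)
            sum += term
            if (1 - sum < thres) { print k; exit }
        }
        print 2   # assumed fallback; not reached for realistic read lengths
    }'
}

for len in 17 38 64; do
    echo "${len}bp reads: max_diff = $(max_diff "$len")"
done
```

Passing an explicit integer to -n bypasses this calculation and fixes the maximum edit distance directly.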




hadoop@Mcnode1:~/cloud/adam/xubo/data/down-sratool/sra$ bwa samse ../../dmel-all-chromosome-r5.37/dmel-all-chromosome-r5.37.fasta RAL357_1.sai  DRR047093.fastq > RAL357_1.sam
[bwa_aln_core] convert to sequence coordinate... 0.13 sec
[bwa_aln_core] refine gapped alignments... 0.19 sec
[bwa_aln_core] print alignments... 0.03 sec
[bwa_aln_core] 9261 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa samse ../../dmel-all-chromosome-r5.37/dmel-all-chromosome-r5.37.fasta RAL357_1.sai DRR047093.fastq
[main] Real time: 1.121 sec; CPU: 0.381 sec
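
bwa aln expects a BWA index next to the reference FASTA, built once with bwa index; that step is not shown in the session above. The aln and samse commands can be wrapped together with it in a small helper. This is only a sketch: the function name align_single_end and the BWA override variable are hypothetical conveniences, not part of BWA.

```shell
# Sketch of the single-end workflow: index (once), aln, samse.
# BWA may be set to point at a specific binary (hypothetical convenience).
BWA="${BWA:-bwa}"

align_single_end() {
    ref="$1"; fastq="$2"; prefix="$3"

    # Build the index only if it is missing; bwa index writes <ref>.bwt among other files.
    if [ ! -e "${ref}.bwt" ]; then
        "$BWA" index "$ref" || return 1
    fi

    # Step 1: compute suffix-array coordinates for the reads.
    "$BWA" aln "$ref" "$fastq" > "${prefix}.sai" || return 1

    # Step 2: convert them to single-end alignments in SAM format.
    "$BWA" samse "$ref" "${prefix}.sai" "$fastq" > "${prefix}.sam"
}

# Usage (paths as in the session above):
# align_single_end ../../dmel-all-chromosome-r5.37/dmel-all-chromosome-r5.37.fasta DRR047093.fastq RAL357_1
```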

Inspect the generated SAM file:
hadoop@Mcnode1:~/cloud/adam/xubo/data/down-sratool/sra$ more RAL357_1.sam 
@SQ	SN:YHet	LN:347038
@SQ	SN:dmel_mitochondrion_genome	LN:19517
@SQ	SN:2L	LN:23011544
@SQ	SN:X	LN:22422827
@SQ	SN:3L	LN:24543557
@SQ	SN:4	LN:1351857
@SQ	SN:2R	LN:21146708
@SQ	SN:3R	LN:27905053
@SQ	SN:Uextra	LN:29004656
@SQ	SN:2RHet	LN:3288761
@SQ	SN:2LHet	LN:368872
@SQ	SN:3LHet	LN:2555491
@SQ	SN:3RHet	LN:2517507
@SQ	SN:U	LN:10049037
@SQ	SN:XHet	LN:204112
@PG	ID:bwa	PN:bwa	VN:0.7.12-r1039	CL:bwa samse ../../dmel-all-chromosome-r5.37/dmel-all-chromosome-r5.37.fasta RAL357_1.sai DRR047093.fastq
DRR047093.1	4	*	0	0	*	*	0	0	CAAAGTGGCGTCGTCTTGAGCCCATCATCAATATCATCGTTTACATTAAGTAGAAAGTGTAACTAGACAAATGTTTTCATTTCCGCCTCGTTGTTGAACTCCCGTGGAGAA
CCCATGCTTCCCCTGATTTAACATCGGTATTGTATTCAATCCTTCTGCTCTCCCCGGCGAATGCATCGTTAATGGTTGGTTTCCGCGTAAACG	I555IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIBBBEHIIIIIIIIIIIIIIIIIIIIFIBD111;7///-;<557;;FFIIIIIIIII
IIIHH<<<@BIIIIIIIIIIIIIII>>>>HHIHHIIHIHHIHHHHHHIHIHIIIHIHHFCBBDDD<;1111779@DHIIIIIIIIIIIIIIIIIIB>>IIIIIIFFFII
DRR047093.2	4	*	0	0	*	*	0	0	CAAAGTGGCGTCGTCTTGAGCCCGTCATCAATATCATCGTTTACATTAAGTAGAAAAGTGTAACTAGACAAAATGTTTTTCATTTCCGCCTCGTTGTTGAACTCCCGTGGA
GAACTCATGCTTCCCCTCGATTTAACTATCGGTATTGTATTCAATCCTTCTGCTCTCCCCGGCCGAATGCATCGTTAATGGTTGGTTTCCGTCGTAAACG	IIIIIIIIIIIIIIIIIIIIIIIFIIIIIIIIIIIIIIIIFFIIIIIIIIIIIEECCIIIIIIIIIIIIECCCII99999HIIIIII
IIIIIIIIIIIIIIIIIIIIIIIIIIIIFCGGDC<<66667/66>>@?:9399FIIAAAAFIIIFFFFEFFBBBAEFEDD:===;;0038?@@@@BBDCC=====;;CCBBBE::3:=FIFF==
DRR047093.3	0	3R	17056839	37	235M	*	0	0	GAGAGATCCCGTGCCGTTAGCTTTAGATCCTCAGGAACCTGCGAGTAGTCAAAGTCCAGAACGATACTGGAGTCACCTTCGTTGTTATTGGCCGTCTCATAGG
TTTTGAGCAGCGCCTGGCGATCCACCTTGCCGTTGACCAGCAATGGAACGTGCTCCAGGATGACCACCTGCGGCGTCATGTAATCGGCTAGCTTGTCCTTGAGACGAGCCTCCATCTGCATCTCGGTGACCA	IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIBBBBIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII	XT:A:UNM:i:1	X0:i:1	X1:i:0	XM:i:1	XO:i:0	XG:i:0	MD:Z:0A234
DRR047093.4	16	3L	6707450	37	218M	*	0	0	ACCGGGAATACTATATCGCGTGTCTATATAGTCTAGGTAAATATTGTGAGAGGCATAATGAAGATAATAATAATACAAAAACAATTTTTGTCTGAGTATACAATCGGTTTT
TGTGTGGTACTTTGCCTACTAAGTGCGGATGTATCTGAACTTTGCTTTCCCAGCTTTTCACTTCACTTAATTCGCT

hadoop@Mcnode1:~/cloud/adam/xubo/data/down-sratool/sra$ cat RAL357_1.sam | wc -l
9277
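
The 9277 lines counted by wc -l include the SAM header: 15 @SQ lines plus one @PG line make 16, leaving 9261 alignment records, the same number of sequences bwa reported. Header lines start with @, and the second column of each record is the FLAG field, where bit 4 marks an unmapped read and bit 16 a read aligned to the reverse strand (per the SAM specification). The sketch below separates these counts. The helper name sam_summary is hypothetical, and the tiny SAM file is made up for illustration in place of RAL357_1.sam.

```shell
# Summarise a SAM file: header lines, alignment records,
# unmapped reads and reverse-strand reads.
sam_summary() {
    awk -F '\t' '
        /^@/ { header++; next }                          # header lines start with "@"
        {
            records++
            flag = $2
            if (int(flag / 4) % 2 == 1)  unmapped++      # FLAG bit 4: read unmapped
            if (int(flag / 16) % 2 == 1) revstrand++     # FLAG bit 16: reverse strand
        }
        END {
            printf "header=%d records=%d unmapped=%d reverse=%d\n",
                   header, records, unmapped, revstrand
        }' "$1"
}

# Made-up sample: one header line, one unmapped, one forward, one reverse read.
sample=$(mktemp)
printf '@SQ\tSN:chrA\tLN:100\n' > "$sample"
printf 'r1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n' >> "$sample"
printf 'r2\t0\tchrA\t1\t37\t4M\t*\t0\t0\tACGT\tIIII\n' >> "$sample"
printf 'r3\t16\tchrA\t5\t37\t4M\t*\t0\t0\tACGT\tIIII\n' >> "$sample"
sam_summary "$sample"
rm -f "$sample"
```

On the real output the same call would be sam_summary RAL357_1.sam.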


Data sources:

curl -O ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r5.37_FB2011_05/fasta/dmel-all-chromosome-r5.37.fasta.gz
gunzip dmel-all-chromosome-r5.37.fasta.gz

fastq-dump -Z DRR047093
fastq-dump  DRR047093
 
  

More data:

Beginning with a list of desired SRA data sets (e.g., a list of SRA Run accessions, “SRRs”), the exact download location for that data file can be determined as follows:

wget/FTP root: ftp://ftp-trace.ncbi.nih.gov

ascp root: anonftp@ftp.ncbi.nlm.nih.gov:

Remainder of path:

/sra/sra-instant/reads/ByRun/sra/{SRR|ERR|DRR}///.sra

Where


Source: http://www.ncbi.nlm.nih.gov/books/NBK158899/#SRA_download.downloading_sra_data_using




References:

【1】 http://mingkang1217.blog.163.com/blog/static/2035227201101254921398/

【2】 http://ged.msu.edu/angus/tutorials-2011/bwa_tutorial.html



