Installing and using BWA:
1. Installation: see 【1】
2. Usage:
hadoop@Mcnode1:~/cloud/adam/xubo/data/down-sratool/sra$ bwa aln ../../dmel-all-chromosome-r5.37/dmel-all-chromosome-r5.37.fasta DRR047093.fastq >RAL357_1.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 6.11 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 9261 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa aln ../../dmel-all-chromosome-r5.37/dmel-all-chromosome-r5.37.fasta DRR047093.fastq
[main] Real time: 6.259 sec; CPU: 6.196 sec
bwa samse ../../dmel-all-chromosome-r5.37/dmel-all-chromosome-r5.37.fasta RAL357_1.sai DRR047093.fastq > RAL357_1.sam
hadoop@Mcnode1:~/cloud/adam/xubo/data/down-sratool/sra$ bwa samse ../../dmel-all-chromosome-r5.37/dmel-all-chromosome-r5.37.fasta RAL357_1.sai DRR047093.fastq > RAL357_1.sam
[bwa_aln_core] convert to sequence coordinate... 0.13 sec
[bwa_aln_core] refine gapped alignments... 0.19 sec
[bwa_aln_core] print alignments... 0.03 sec
[bwa_aln_core] 9261 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa samse ../../dmel-all-chromosome-r5.37/dmel-all-chromosome-r5.37.fasta RAL357_1.sai DRR047093.fastq
[main] Real time: 1.121 sec; CPU: 0.381 sec
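The two steps above (bwa aln, then bwa samse) can be sketched as a small dry-run helper. This is only an illustration: the function name bwa_se_commands and its three arguments are made up here, and the bwa index step is an assumption, since the session above does not show how the reference was indexed. The helper only prints the commands; it assumes paths without spaces.

```shell
# Print the single-end BWA commands for one FASTQ file (dry run).
# Hypothetical helper; pipe its output to `sh` to actually run the steps.
bwa_se_commands() {
    ref="$1"      # reference FASTA
    fastq="$2"    # single-end reads
    prefix="$3"   # output name without extension

    # Assumption: `bwa index` leaves "$ref.bwt" next to the reference,
    # so the index step is only printed when that file is missing.
    if [ ! -e "$ref.bwt" ]; then
        echo "bwa index $ref"
    fi
    echo "bwa aln $ref $fastq > $prefix.sai"
    echo "bwa samse $ref $prefix.sai $fastq > $prefix.sam"
}
```

For example, `bwa_se_commands ../../dmel-all-chromosome-r5.37/dmel-all-chromosome-r5.37.fasta DRR047093.fastq RAL357_1` prints the commands used in this post, preceded by an index command if no index is found.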
hadoop@Mcnode1:~/cloud/adam/xubo/data/down-sratool/sra$ more RAL357_1.sam
@SQ SN:YHet LN:347038
@SQ SN:dmel_mitochondrion_genome LN:19517
@SQ SN:2L LN:23011544
@SQ SN:X LN:22422827
@SQ SN:3L LN:24543557
@SQ SN:4 LN:1351857
@SQ SN:2R LN:21146708
@SQ SN:3R LN:27905053
@SQ SN:Uextra LN:29004656
@SQ SN:2RHet LN:3288761
@SQ SN:2LHet LN:368872
@SQ SN:3LHet LN:2555491
@SQ SN:3RHet LN:2517507
@SQ SN:U LN:10049037
@SQ SN:XHet LN:204112
@PG ID:bwa PN:bwa VN:0.7.12-r1039 CL:bwa samse ../../dmel-all-chromosome-r5.37/dmel-all-chromosome-r5.37.fasta RAL357_1.sai DRR047093.fastq
DRR047093.1 4 * 0 0 * * 0 0 CAAAGTGGCGTCGTCTTGAGCCCATCATCAATATCATCGTTTACATTAAGTAGAAAGTGTAACTAGACAAATGTTTTCATTTCCGCCTCGTTGTTGAACTCCCGTGGAGAACCCATGCTTCCCCTGATTTAACATCGGTATTGTATTCAATCCTTCTGCTCTCCCCGGCGAATGCATCGTTAATGGTTGGTTTCCGCGTAAACG I555IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIBBBEHIIIIIIIIIIIIIIIIIIIIFIBD111;7///-;<557;;FFIIIIIIIIIIIIHH<<<@BIIIIIIIIIIIIIII>>>>HHIHHIIHIHHIHHHHHHIHIHIIIHIHHFCBBDDD<;1111779@DHIIIIIIIIIIIIIIIIIIB>>IIIIIIFFFII
DRR047093.2 4 * 0 0 * * 0 0 CAAAGTGGCGTCGTCTTGAGCCCGTCATCAATATCATCGTTTACATTAAGTAGAAAAGTGTAACTAGACAAAATGTTTTTCATTTCCGCCTCGTTGTTGAACTCCCGTGGAGAACTCATGCTTCCCCTCGATTTAACTATCGGTATTGTATTCAATCCTTCTGCTCTCCCCGGCCGAATGCATCGTTAATGGTTGGTTTCCGTCGTAAACG IIIIIIIIIIIIIIIIIIIIIIIFIIIIIIIIIIIIIIIIFFIIIIIIIIIIIEECCIIIIIIIIIIIIECCCII99999HIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFCGGDC<<66667/66>>@?:9399FIIAAAAFIIIFFFFEFFBBBAEFEDD:===;;0038?@@@@BBDCC=====;;CCBBBE::3:=FIFF==
DRR047093.3 0 3R 17056839 37 235M * 0 0 GAGAGATCCCGTGCCGTTAGCTTTAGATCCTCAGGAACCTGCGAGTAGTCAAAGTCCAGAACGATACTGGAGTCACCTTCGTTGTTATTGGCCGTCTCATAGGTTTTGAGCAGCGCCTGGCGATCCACCTTGCCGTTGACCAGCAATGGAACGTGCTCCAGGATGACCACCTGCGGCGTCATGTAATCGGCTAGCTTGTCCTTGAGACGAGCCTCCATCTGCATCTCGGTGACCA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIBBBBIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:0A234
DRR047093.4 16 3L 6707450 37 218M * 0 0 ACCGGGAATACTATATCGCGTGTCTATATAGTCTAGGTAAATATTGTGAGAGGCATAATGAAGATAATAATAATACAAAAACAATTTTTGTCTGAGTATACAATCGGTTTTTGTGTGGTACTTTGCCTACTAAGTGCGGATGTATCTGAACTTTGCTTTCCCAGCTTTTCACTTCACTTAATTCGCT
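In the records above, the second column is the SAM FLAG: 4 marks an unmapped read (DRR047093.1 and DRR047093.2), 0 a read mapped to the forward strand, and 16 a read mapped to the reverse strand. As a rough sketch, the unmapped reads in a SAM file could be counted like this; the function name sam_unmapped_count is made up for illustration, and the check relies on bit 0x4 of the FLAG being the "unmapped" bit.

```shell
# Count unmapped reads (FLAG bit 0x4 set) in a SAM file.
# Hypothetical helper; uses plain awk arithmetic instead of bitwise functions.
sam_unmapped_count() {
    # Skip header lines (they start with "@"); int(FLAG / 4) % 2 is bit 0x4.
    awk -F '\t' '!/^@/ && int($2 / 4) % 2 == 1 { n++ } END { print n + 0 }' "$1"
}
```

Usage: `sam_unmapped_count RAL357_1.sam`.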
hadoop@Mcnode1:~/cloud/adam/xubo/data/down-sratool/sra$ cat RAL357_1.sam | wc -l
9277
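The 9277 lines are consistent with the numbers shown earlier: 15 @SQ lines plus 1 @PG line make 16 header lines, and 16 + 9261 reads = 9277. A minimal sketch for checking this split directly follows; sam_line_counts is a made-up name, and it assumes that every header line, and only a header line, starts with "@".

```shell
# Report header lines, alignment records and their sum for a SAM file.
# Hypothetical helper for illustration.
sam_line_counts() {
    sam="$1"
    header=$(grep -c '^@' "$sam")     # lines starting with "@"
    records=$(grep -vc '^@' "$sam")   # all other lines
    echo "header=$header records=$records total=$((header + records))"
}
```

Usage: `sam_line_counts RAL357_1.sam`.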
Data sources:
curl -O ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r5.37_FB2011_05/fasta/dmel-all-chromosome-r5.37.fasta.gz
gunzip dmel-all-chromosome-r5.37.fasta.gz
fastq-dump -Z DRR047093
fastq-dump DRR047093
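After downloading, it can be useful to check that the FASTQ file holds the expected number of reads (bwa reported 9261 sequences for DRR047093.fastq). A minimal sketch, assuming the file uses exactly four lines per read; fastq_read_count is a made-up name, not part of the SRA toolkit.

```shell
# Estimate the number of reads in a FASTQ file.
# Assumption: every record occupies exactly 4 lines (header, sequence, "+", quality).
fastq_read_count() {
    lines=$(wc -l < "$1")
    echo $((lines / 4))
}
```

Usage: `fastq_read_count DRR047093.fastq`.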
More data:
Beginning with a list of desired SRA data sets (e.g., a list of SRA Run accessions, “SRRs”), the exact download location for that data file can be determined as follows:
wget/FTP root: ftp://ftp-trace.ncbi.nih.gov
ascp root: anonftp@ftp.ncbi.nlm.nih.gov:
Remainder of path:
/sra/sra-instant/reads/ByRun/sra/{SRR|ERR|DRR}/<first 6 characters of accession>/<accession>/<accession>.sra
Where
Source: http://www.ncbi.nlm.nih.gov/books/NBK158899/#SRA_download.downloading_sra_data_using
References:
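The path rule quoted above can be sketched as a small function that builds the wget/FTP location from a run accession. The name sra_url is made up for illustration, and it assumes the first three characters of the accession are the SRR/ERR/DRR prefix. Whether the server still serves this layout has not been checked here.

```shell
# Build the FTP location of a .sra file from a run accession,
# following the path pattern quoted above.
# Hypothetical helper; assumes the accession starts with SRR, ERR or DRR.
sra_url() {
    acc="$1"
    prefix=$(printf '%s' "$acc" | cut -c1-3)   # SRR, ERR or DRR
    first6=$(printf '%s' "$acc" | cut -c1-6)   # first 6 characters of the accession
    echo "ftp://ftp-trace.ncbi.nih.gov/sra/sra-instant/reads/ByRun/sra/$prefix/$first6/$acc/$acc.sra"
}
```

Usage: `sra_url DRR047093` prints the location for the run used in this post, which could then be passed to wget.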
【1】 http://mingkang1217.blog.163.com/blog/static/2035227201101254921398/
【2】 http://ged.msu.edu/angus/tutorials-2011/bwa_tutorial.html