Downloading SRA data with fastq-dump
For environment setup and configuration, see the other posts in this series.
1. Download:
fastq-dump -Z DRR047093
You can also limit the output to a specified number of spots (records) with -X, here the first 5:
fastq-dump -X 5 -Z DRR047093
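Because -Z writes the FASTQ text to standard output, the result can be redirected to a file or piped into another tool. The following is a minimal sketch, assuming the output uses the usual four-lines-per-record FASTQ layout; the function name count_fastq_records is hypothetical and not part of sratoolkit.

```shell
#!/bin/sh
# Count FASTQ records read from stdin.
# Assumption: every record is exactly four lines
# (header, sequence, '+' separator, quality string).
count_fastq_records() {
    awk 'END { print int(NR / 4) }'
}

# Example usage (requires sratoolkit and network access):
#   fastq-dump -X 5 -Z DRR047093 | count_fastq_records
#   fastq-dump -X 5 -Z DRR047093 > DRR047093.head.fastq
```

Counting lines this way is only a quick sanity check; it does not validate the content of the records.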
File location: the directory you configured when installing sratoolkit.
hadoop@Mcnode1:~/cloud/adam/xubo/data/down-sratool/sra$ ll
total 7532
drwxrwxr-x 2 hadoop hadoop 4096 1月 13 17:17 ./
drwxrwxr-x 7 hadoop hadoop 4096 1月 13 16:16 ../
-rw-rw-r-- 1 hadoop hadoop 5270322 1月 13 17:17 DRR047093.fastq
-rw-rw-r-- 1 hadoop hadoop 1043468 1月 13 17:16 DRR047093.sra
-rw-rw-r-- 1 hadoop hadoop 1701387928 1月 13 17:12 SRR003161.sra.cache
2. prefetch
hadoop@Mcnode1:~/.aspera/connect/bin$ prefetch -c DRR047084
Maximum file size download limit is 20,971,520KB
2016-01-13T13:42:20 prefetch.2.5.5: 1) Downloading 'DRR047084'...
2016-01-13T13:42:20 prefetch.2.5.5: Downloading via fasp...
2016-01-13T13:42:41 prefetch.2.5.5 err: process failed while waiting process - ascp failed with 1
2016-01-13T13:43:13 prefetch.2.5.5 err: process failed while waiting process - ascp failed with 1
2016-01-13T13:43:13 prefetch.2.5.5: fasp download failed
2016-01-13T13:43:13 prefetch.2.5.5: Downloading via http...
2016-01-13T13:43:41 prefetch.2.5.5: 1) 'DRR047084' was downloaded successfully
2016-01-13T13:43:41 prefetch.2.5.5: 'DRR047084' has 0 dependencies
hadoop@Mcnode1:~/cloud/adam/xubo/data/down-sratool/sra$ ll -h
total 252M
drwxrwxr-x 2 hadoop hadoop 4.0K 1月 13 21:43 ./
drwxrwxr-x 7 hadoop hadoop 4.0K 1月 13 16:16 ../
-rw-rw-r-- 1 hadoop hadoop 877K 1月 13 21:40 DRR047083.sra
-rw-rw-r-- 1 hadoop hadoop 1007K 1月 13 21:43 DRR047084.sra
-rw-rw-r-- 1 hadoop hadoop 1.1M 1月 13 21:33 DRR047091.sra
-rw-rw-r-- 1 hadoop hadoop 1.2M 1月 13 17:22 DRR047092.sra
-rw-rw-r-- 1 hadoop hadoop 5.1M 1月 13 20:31 DRR047093_1.fastq
-rw-rw-r-- 1 hadoop hadoop 5.1M 1月 13 17:17 DRR047093.fastq
-rw-rw-r-- 1 hadoop hadoop 1020K 1月 13 17:16 DRR047093.sra
-rw-rw-r-- 1 hadoop hadoop 180K 1月 13 20:32 RAL357_1.sai
-rw-rw-r-- 1 hadoop hadoop 5.2M 1月 13 20:36 RAL357_1.sam
-rw-rw-r-- 1 hadoop hadoop 271M 1月 13 21:28 SRR002664.sra.cache
-rw-rw-r-- 1 hadoop hadoop 1.6G 1月 13 20:26 SRR003161.sra.cache
-rw-rw-r-- 1 hadoop hadoop 0 1月 13 21:19 SRR003162.sra.lock
-rw-rw-r-- 1 hadoop hadoop 15M 1月 13 21:34 SRR003162.sra.tmp.98592.tmp
-rw-rw-r-- 1 hadoop hadoop 0 1月 13 21:43 SRR1482462.sra.lock
-rw-rw-r-- 1 hadoop hadoop 0 1月 13 21:40 --user=anonftp
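The download URL it resolves to is: http://sra-download.ncbi.nlm.nih.gov/srapub/SRR003162
The file is about 1.6 GB.
Earlier, both of these commands failed; I am not sure whether the network was the cause.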
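If the failures are intermittent (for example, a flaky network, which is only a guess here), one option is to retry the command a few times before giving up. The following is a minimal sketch; the retry function is a hypothetical helper, not part of sratoolkit, and the retry count and delay are arbitrary choices.

```shell
#!/bin/sh
# Run a command up to MAX_TRIES times, sleeping DELAY seconds between attempts.
# Usage: retry MAX_TRIES DELAY COMMAND [ARGS...]
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
    max_tries=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        "$@" && return 0
        if [ "$attempt" -ge "$max_tries" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example usage (requires sratoolkit and network access):
#   retry 3 30 prefetch DRR047084
```

Retrying helps only if the underlying problem is transient; a configuration error will fail on every attempt.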
3. prefetch -v
hadoop@Mcnode1:~/.aspera/connect/bin$ prefetch -v DRR047083
Maximum file size download limit is 20,971,520KB
2016-01-13T13:38:40 prefetch.2.5.5: Using 'ascp'
2016-01-13T13:38:40 prefetch.2.5.5: Using 'ascp'
2016-01-13T13:38:40 prefetch.2.5.5: Using '/home/hadoop/.aspera/connect/bin/ascp'
2016-01-13T13:39:00 prefetch.2.5.5: 1) Downloading 'DRR047083'...
2016-01-13T13:39:00 prefetch.2.5.5: Downloading via fasp...
/home/hadoop/.aspera/connect/bin/ascp /home/hadoop/.aspera/connect/bin/ascp -i /home/hadoop/.aspera/connect/etc/asperaweb_id_dsa.openssh -pQTk1 -l 1000m [email protected]:data/sracloud/srapub/DRR047083 /home/hadoop/cloud/adam/xubo/data/down-sratool/sra/DRR047083.sra.tmp.96547.tmp
2016-01-13T13:39:15 prefetch.2.5.5 err: process failed while waiting process - ascp failed with 1
/home/hadoop/.aspera/connect/bin/ascp /home/hadoop/.aspera/connect/bin/ascp -i /home/hadoop/.aspera/connect/etc/asperaweb_id_dsa.openssh -pQTk1 -l 1000m [email protected]:data/sracloud/srapub/DRR047083 /home/hadoop/cloud/adam/xubo/data/down-sratool/sra/DRR047083.sra.tmp.96547.tmp
2016-01-13T13:39:30 prefetch.2.5.5 err: process failed while waiting process - ascp failed with 1
2016-01-13T13:39:30 prefetch.2.5.5: fasp download failed
2016-01-13T13:39:30 prefetch.2.5.5: Downloading via http...
2016-01-13T13:40:06 prefetch.2.5.5: http://sra-download.ncbi.nlm.nih.gov/srapub/DRR047083 -> /home/hadoop/cloud/adam/xubo/data/down-sratool/sra/DRR047083.sra.tmp.96547.tmp
2016-01-13T13:40:10 prefetch.2.5.5: /home/hadoop/cloud/adam/xubo/data/down-sratool/sra/DRR047083.sra.tmp.96547.tmp (897071)
2016-01-13T13:40:10 prefetch.2.5.5: 1) 'DRR047083' was downloaded successfully
2016-01-13T13:40:10 prefetch.2.5.5: 'DRR047083' has 0 unresolved dependencies
2016-01-13T13:40:10 prefetch.2.5.5: 'DRR047083' is not cSRA
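The verbose log prints a number in parentheses after the temporary file path, 897071, which looks like the downloaded size in bytes. This is an assumption, though it is consistent with the 877K that ll -h reports for DRR047083.sra. Under that assumption, a quick size check could be sketched as follows; file_size_bytes and size_matches are hypothetical helper names.

```shell
#!/bin/sh
# Print the size of a file in bytes.
file_size_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Succeed (exit status 0) if FILE is exactly EXPECTED bytes long.
# Usage: size_matches FILE EXPECTED
size_matches() {
    [ "$(file_size_bytes "$1")" -eq "$2" ]
}

# Example usage, with the byte count taken from the log
# (assumed to be the file size):
#   size_matches DRR047083.sra 897071 && echo "size OK"
```

A matching size does not prove the file is intact; it only rules out an obviously truncated download.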
Success:
hadoop@Mcnode1:~/cloud/adam/xubo/data/down-sratool/sra$ ll -h
total 251M
drwxrwxr-x 2 hadoop hadoop 4.0K 1月 13 21:42 ./
drwxrwxr-x 7 hadoop hadoop 4.0K 1月 13 16:16 ../
-rw-rw-r-- 1 hadoop hadoop 877K 1月 13 21:40 DRR047083.sra
-rw-rw-r-- 1 hadoop hadoop 0 1月 13 21:42 DRR047084.sra.lock
-rw-rw-r-- 1 hadoop hadoop 1.1M 1月 13 21:33 DRR047091.sra
-rw-rw-r-- 1 hadoop hadoop 1.2M 1月 13 17:22 DRR047092.sra
-rw-rw-r-- 1 hadoop hadoop 5.1M 1月 13 20:31 DRR047093_1.fastq
-rw-rw-r-- 1 hadoop hadoop 5.1M 1月 13 17:17 DRR047093.fastq
-rw-rw-r-- 1 hadoop hadoop 1020K 1月 13 17:16 DRR047093.sra
-rw-rw-r-- 1 hadoop hadoop 180K 1月 13 20:32 RAL357_1.sai
-rw-rw-r-- 1 hadoop hadoop 5.2M 1月 13 20:36 RAL357_1.sam
-rw-rw-r-- 1 hadoop hadoop 271M 1月 13 21:28 SRR002664.sra.cache
-rw-rw-r-- 1 hadoop hadoop 1.6G 1月 13 20:26 SRR003161.sra.cache
-rw-rw-r-- 1 hadoop hadoop 0 1月 13 21:19 SRR003162.sra.lock
-rw-rw-r-- 1 hadoop hadoop 15M 1月 13 21:34 SRR003162.sra.tmp.98592.tmp
-rw-rw-r-- 1 hadoop hadoop 0 1月 13 21:40 --user=anonftp
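The listing still contains .lock and .tmp files from unfinished downloads, plus a zero-byte file named --user=anonftp. That file was presumably created by accident when an option was taken as a filename, though the post does not say so. The following is a minimal sketch for inspecting and cleaning up such files. Both function names are hypothetical, and find's -maxdepth is assumed to be available (it is in GNU and BSD find, but it is not POSIX).

```shell
#!/bin/sh
# List leftover lock and temporary files directly inside directory DIR.
# Usage: list_leftovers DIR
list_leftovers() {
    find "$1" -maxdepth 1 -type f \( -name '*.lock' -o -name '*.tmp' \) | sort
}

# Remove a file whose name starts with '-' from directory DIR.
# The '--' tells rm that everything after it is a filename, not an option.
# Usage: remove_dash_file DIR NAME
remove_dash_file() {
    ( cd "$1" && rm -f -- "$2" )
}

# Example usage:
#   list_leftovers ~/cloud/adam/xubo/data/down-sratool/sra
#   remove_dash_file ~/cloud/adam/xubo/data/down-sratool/sra --user=anonftp
```

Lock and temporary files should be deleted only when no prefetch process is still running, since an active download may be using them.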