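The archive command:
[zhouhh@Hadoop48 ~]$ hadoop archive
archive -archiveName NAME -p <parent path> <src>* <dest>
The parent directory may be followed by any number of source subdirectories, given as paths relative to the parent. If none are given, the whole parent directory is archived.
For example: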
hadoop archive -archiveName foo.har -p /user/hadoop dir1/dir2 dir3 /user/zoo/
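Here dir1/dir2 and dir3 are both subdirectories of /user/hadoop, so only those parts of the parent directory are archived, and foo.har is created under /user/zoo/.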
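The argument order (name, parent, optional sources, destination) is easy to get wrong, so here is a minimal sketch of a dry-run helper that only prints the command line it would use. The function name har_cmd and its own argument order are hypothetical, chosen for illustration; it does not call Hadoop at all.

```shell
# Hypothetical helper: print (do not run) a hadoop archive command line.
# Usage: har_cmd NAME PARENT DEST [SRC...]
har_cmd() {
  name="$1"; parent="$2"; dest="$3"
  shift 3
  if [ "$#" -gt 0 ]; then
    # Sources are paths relative to the parent directory.
    echo "hadoop archive -archiveName $name -p $parent $* $dest"
  else
    # No sources: the whole parent directory is archived.
    echo "hadoop archive -archiveName $name -p $parent $dest"
  fi
}

har_cmd foo.har /user/hadoop /user/zoo/ dir1/dir2 dir3
har_cmd output1.har /user/zhouhh/output1 /user/zhouhh/
```

The first call prints the example command shown above; the second prints the form with no source list, which archives the entire parent directory.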
In practice:
The directory to archive:
[zhouhh@Hadoop48 ~]$ hadoop fs -lsr output1
drwxr-xr-x - zhouhh supergroup 0 2012-06-04 11:05 /user/zhouhh/output1/_logs
drwxr-xr-x - zhouhh supergroup 0 2012-06-04 11:05 /user/zhouhh/output1/_logs/history
-rw-r--r-- 3 zhouhh supergroup 16856 2012-06-04 11:05 /user/zhouhh/output1/_logs/history/job_201205231824_0007_1338779151666_zhouhh_wordcount.py+%281%2F1%29
-rw-r--r-- 3 zhouhh supergroup 22357 2012-06-04 11:05 /user/zhouhh/output1/_logs/history/job_201205231824_0007_conf.xml
Create the archive, then list the result:
[zhouhh@Hadoop48 ~]$ hadoop archive -archiveName output1.har -p /user/zhouhh/output1 /user/zhouhh/
[zhouhh@Hadoop48 ~]$ hadoop fs -lsr output1.har
-rw-r--r-- 3 zhouhh supergroup 0 2012-07-27 15:30 /user/zhouhh/output1.har/_SUCCESS
-rw-r--r-- 5 zhouhh supergroup 555 2012-07-27 15:30 /user/zhouhh/output1.har/_index
-rw-r--r-- 5 zhouhh supergroup 23 2012-07-27 15:30 /user/zhouhh/output1.har/_masterindex
-rw-r--r-- 3 zhouhh supergroup 39213 2012-07-27 15:30 /user/zhouhh/output1.har/part-0
The archive was created successfully.
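As a quick sanity check, the size of part-0 should match the combined size of the two archived files, on the understanding that a HAR part file holds the file contents concatenated back to back. The sketch below just redoes that arithmetic with the byte counts copied from the listings above; the variable names are made up for illustration.

```shell
# Sizes in bytes, copied from the `hadoop fs -lsr` listings above.
history_log=16856   # job history file under _logs/history
job_conf=22357      # job_201205231824_0007_conf.xml
part0=39213         # output1.har/part-0

total=$((history_log + job_conf))
echo "sum of inputs: $total, part-0: $part0"
```

Both numbers come out to 39213, which is consistent with the archive containing both files in full.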
List the files inside the archive:
[zhouhh@Hadoop48 ~]$ hadoop fs -lsr har:///user/zhouhh/output1.har
drwxr-xr-x - zhouhh supergroup 0 2012-06-04 11:05 /user/zhouhh/output1.har/_logs
drwxr-xr-x - zhouhh supergroup 0 2012-06-04 11:05 /user/zhouhh/output1.har/_logs/history
-rw-r--r-- 3 zhouhh supergroup 22357 2012-06-04 11:05 /user/zhouhh/output1.har/_logs/history/job_201205231824_0007_conf.xml
-rw-r--r-- 3 zhouhh supergroup 16856 2012-06-04 11:05 /user/zhouhh/output1.har/_logs/history/job_201205231824_0007_1338779151666_zhouhh_wordcount.py+%281%2F1%29
Or, with the underlying file system spelled out in the URI:
[zhouhh@Hadoop48 ~]$ hadoop fs -lsr har://hdfs-Hadoop48:54310/user/zhouhh/output1.har
drwxr-xr-x - zhouhh supergroup 0 2012-06-04 11:05 /user/zhouhh/output1.har/_logs
drwxr-xr-x - zhouhh supergroup 0 2012-06-04 11:05 /user/zhouhh/output1.har/_logs/history
-rw-r--r-- 3 zhouhh supergroup 22357 2012-06-04 11:05 /user/zhouhh/output1.har/_logs/history/job_201205231824_0007_conf.xml
-rw-r--r-- 3 zhouhh supergroup 16856 2012-06-04 11:05 /user/zhouhh/output1.har/_logs/history/job_201205231824_0007_1338779151666_zhouhh_wordcount.py+%281%2F1%29
Here 54310 is the HDFS port configured in my Hadoop core-site.xml.
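The fully qualified form follows the pattern har://scheme-host:port/path, that is, the underlying scheme is joined to the host with a hyphen. A small sketch of that rewrite, using a hypothetical helper named to_har_uri and plain POSIX parameter expansion, might look like this:

```shell
# Hypothetical helper: turn scheme://host:port/path into har://scheme-host:port/path
to_har_uri() {
  scheme="${1%%://*}"   # part before "://", e.g. hdfs
  rest="${1#*://}"      # part after "://", e.g. Hadoop48:54310/user/zhouhh/output1.har
  echo "har://${scheme}-${rest}"
}

to_har_uri hdfs://Hadoop48:54310/user/zhouhh/output1.har
```

For the archive in this post, that yields the same URI that was passed to hadoop fs -lsr in the second listing.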
Delete the archive:
[zhouhh@Hadoop48 ~]$ hadoop fs -rmr output1.har
Deleted hdfs://Hadoop48:54310/user/zhouhh/output1.har
Original article; when reposting, please credit 瀚海星空.
Permalink: http://abloz.com/2012/07/27/hdfs-file-package-archive.html