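Common Hadoop Commands

  • I. Overview
  • II. HDFS management command: fs
  • III. Job management command: job
  • IV. Job submission command: jar
  • V. How to stop a running Hadoop program
  • VI. Appendix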


I. Overview

The hadoop script in the bin directory is the most basic cluster management script. It covers tasks such as HDFS file management and MapReduce job management. Its usage is:

hadoop [--config confdir] COMMAND
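  • --config sets the Hadoop configuration directory; the default is ${HADOOP_HOME}/etc/hadoop/
  • COMMAND is the specific command to run. The most commonly used ones are:
    • HDFS management command: fs
    • Job management command: job
    • Job submission command: jar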
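
As a small illustration of the --config option, the sketch below wraps the launcher so that the configuration directory is always passed explicitly. The function name run_with_conf and the HADOOP_BIN override are assumptions made for this example; they are not part of Hadoop.

```shell
#!/bin/sh
# Sketch: run a hadoop COMMAND against an explicit configuration directory.
# run_with_conf is a hypothetical helper name, not something Hadoop ships.
run_with_conf() (
  confdir=$1
  shift
  # HADOOP_BIN (hypothetical override) defaults to the real launcher;
  # set it to `echo` to print the arguments instead of running them.
  ${HADOOP_BIN:-hadoop} --config "$confdir" "$@"
)
```

With Hadoop on the PATH, a call such as run_with_conf /opt/cluster-a/conf fs -ls / (the path is made up) would list the HDFS root using the configuration files in that directory.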

Running hadoop with no arguments prints the full list of commands:

Usage: hadoop [--config confdir] COMMAND
       where COMMAND is one of:
  fs                   run a generic filesystem user client
  version              print the version
  jar <jar>            run a jar file
  checknative [-a|-h]  check native hadoop and compression libraries availability
  distcp <srcurl> <desturl> copy file or directories recursively
  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
  classpath            prints the class path needed to get the
  credential           interact with credential providers
                       Hadoop jar and the required libraries
  daemonlog            get/set the log level for each daemon
 or
  CLASSNAME            run the class named CLASSNAME

Most commands print help when invoked w/o parameters.



II. HDFS management command: fs

[hadoop5@master5 ~]$ hadoop fs -help
Usage: hadoop fs [generic options]
        [-appendToFile <localsrc> ... <dst>]
        [-cat [-ignoreCrc] <src> ...]
        [-checksum <src> ...]
        [-chgrp [-R] GROUP PATH...]
        [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
        [-chown [-R] [OWNER][:[GROUP]] PATH...]
        [-copyFromLocal [-f] [-p] <localsrc> ... <dst>]
        [-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
        [-count [-q] <path> ...]
        [-cp [-f] [-p | -p[topax]] <src> ... <dst>]
        [-createSnapshot <snapshotDir> [<snapshotName>]]
        [-deleteSnapshot <snapshotDir> <snapshotName>]
        [-df [-h] [<path> ...]]
        [-du [-s] [-h] <path> ...]
        [-expunge]
        [-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
        [-getfacl [-R] <path>]
        [-getfattr [-R] {-n name | -d} [-e en] <path>]
        [-getmerge [-nl] <src> <localdst>]
        [-help [cmd ...]]
        [-ls [-d] [-h] [-R] [<path> ...]]
        [-mkdir [-p] <path> ...]
        [-moveFromLocal <localsrc> ... <dst>]
        [-moveToLocal <src> <localdst>]
        [-mv <src> ... <dst>]
        [-put [-f] [-p] <localsrc> ... <dst>]
        [-renameSnapshot <snapshotDir> <oldName> <newName>]
        [-rm [-f] [-r|-R] [-skipTrash] <src> ...]
        [-rmdir [--ignore-fail-on-non-empty] <dir> ...]
        [-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
        [-setfattr {-n name [-v value] | -x name} <path>]
        [-setrep [-R] [-w] <rep> <path> ...]
        [-stat [format] <path> ...]
        [-tail [-f] <file>]
        [-test -[defsz] <path>]
        [-text [-ignoreCrc] <src> ...]
        [-touchz <path> ...]
        [-usage [cmd ...]]

To check the usage of one particular subcommand, run:

hadoop fs -usage ls
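
To show how a few of these subcommands fit together, here is a minimal sketch that creates an HDFS directory, uploads a local file into it, and lists the result. It only uses flags that appear in the help output above (-mkdir -p, -put -f, -ls). The function name upload_to_hdfs and the HADOOP_BIN override are assumptions introduced for this example.

```shell
#!/bin/sh
# Sketch: upload a local file to an HDFS directory and list that directory.
# upload_to_hdfs and HADOOP_BIN are hypothetical conveniences, not Hadoop features.
upload_to_hdfs() (
  local_file=$1
  hdfs_dir=$2
  hadoop_bin=${HADOOP_BIN:-hadoop}
  # Create the target directory (and any missing parents).
  "$hadoop_bin" fs -mkdir -p "$hdfs_dir" || exit 1
  # Copy the local file, overwriting a file of the same name if present.
  "$hadoop_bin" fs -put -f "$local_file" "$hdfs_dir" || exit 1
  # Show what the directory now contains.
  "$hadoop_bin" fs -ls "$hdfs_dir"
)
```

Each step stops the sequence on failure, so a failed -mkdir never leads to a -put against a missing directory.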

For more details, see "Hadoop Shell Commands" and the appendix.



III. Job management command: job

[hadoop5@master5 ~]$ hadoop job -help
DEPRECATED: Use of this script to execute mapred command is deprecated.
Instead use the mapred command for it.

Usage: CLI <command> <args>
        [-submit <job-file>]
        [-status <job-id>]
        [-counter <job-id> <group-name> <counter-name>]
        [-kill <job-id>]
        [-set-priority <job-id> <priority>]. Valid values for priorities are: VERY_HIGH HIGH NORMAL LOW VERY_LOW
        [-events <job-id> <from-event-#> <#-of-events>]
        [-history <jobHistoryFile>]
        [-list [all]]
        [-list-active-trackers]
        [-list-blacklisted-trackers]
        [-list-attempt-ids <job-id> <task-type> <task-state>]. Valid values for <task-type> are MAP REDUCE. Valid values for <task-state> are running, completed
        [-kill-task <task-attempt-id>]
        [-fail-task <task-attempt-id>]
        [-logs <job-id> <task-attempt-id>]

Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|jobtracker:port>    specify a job tracker
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]



IV. Job submission command: jar

hadoop jar <jar> [mainClass] args..
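  • <jar> is the name of the jar file
  • mainClass is the name of the main class; it can be omitted, in which case the Main-Class entry in the jar's manifest is used
  • args are the arguments passed to the main class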
bin/hadoop  jar hadoop-examples-1.0.0.jar wordcount /text/input /test/output
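
A MapReduce job normally fails at submission when its output directory already exists, so it can help to check first. The sketch below does that with hadoop fs -test -e before calling hadoop jar. The function name submit_wordcount and the HADOOP_BIN override are assumptions for this example, and the wordcount program name is taken from the command above.

```shell
#!/bin/sh
# Sketch: submit the wordcount example only if the output path is still free.
# submit_wordcount and HADOOP_BIN are hypothetical names used for illustration.
submit_wordcount() (
  jar=$1
  input=$2
  output=$3
  hadoop_bin=${HADOOP_BIN:-hadoop}
  # `fs -test -e` exits with status 0 when the path exists.
  if "$hadoop_bin" fs -test -e "$output"; then
    echo "output path $output already exists; not submitting" >&2
    exit 1
  fi
  "$hadoop_bin" jar "$jar" wordcount "$input" "$output"
)
```

The check is advisory only: another job could still create the same path between the test and the submission.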



V. How to stop a running Hadoop program

The procedure depends on the Hadoop version.

1. Versions earlier than 2.3.0

  • List the running Hadoop jobs
hadoop job -list
  • Kill a Hadoop job
hadoop job -kill $jobId

Combining the two commands above kills every job that belongs to a given user:

for i in $(hadoop job -list | grep -w username | awk '{print $1}' | grep job_); do hadoop job -kill "$i"; done
  • username is the user whose Hadoop jobs you want to kill
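
The one-liner above matches the user name anywhere on the line, so it could also catch a queue or job name that happens to contain the same word. A slightly stricter sketch compares only the user column. It assumes the user name is the fourth whitespace-separated column of the hadoop job -list output, which should be checked against your version. The function names and the HADOOP_BIN override are hypothetical.

```shell
#!/bin/sh
# list_user_jobs: read `hadoop job -list` output on stdin and print the IDs
# of jobs owned by user $1.
# Assumption: column 1 is the job ID and column 4 is the user name.
list_user_jobs() {
  awk -v user="$1" '$1 ~ /^job_/ && $4 == user { print $1 }'
}

# kill_user_jobs: kill every listed job that belongs to user $1.
kill_user_jobs() (
  hadoop_bin=${HADOOP_BIN:-hadoop}
  "$hadoop_bin" job -list | list_user_jobs "$1" |
  while read -r job_id; do
    # Redirect stdin so the kill command cannot consume the remaining IDs.
    "$hadoop_bin" job -kill "$job_id" < /dev/null
  done
)
```

Running the list command piped into list_user_jobs on its own first shows which jobs would be affected before anything is killed.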

2. Version 2.3.0 and later

  • List the running Hadoop applications
yarn application -list
  • Kill a Hadoop application
yarn application -kill $ApplicationId
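
The per-user loop from the previous subsection can be adapted to YARN along the same lines. The sketch below assumes that yarn application -list prints tab-separated, space-padded columns with the application ID first and the user fourth; verify this against your version before relying on it. The function names and the YARN_BIN override are hypothetical.

```shell
#!/bin/sh
# list_user_apps: read `yarn application -list` output on stdin and print the
# IDs of applications owned by user $1.
# Assumption: tab-separated columns, application ID in column 1, user in column 4.
list_user_apps() {
  awk -F'\t' -v user="$1" '
    {
      # Strip the space padding around every column.
      for (i = 1; i <= NF; i++) gsub(/^[ ]+|[ ]+$/, "", $i)
    }
    $1 ~ /^application_/ && $4 == user { print $1 }'
}

# kill_user_apps: kill every listed application that belongs to user $1.
kill_user_apps() (
  yarn_bin=${YARN_BIN:-yarn}
  "$yarn_bin" application -list | list_user_apps "$1" |
  while read -r app_id; do
    # Redirect stdin so the kill command cannot consume the remaining IDs.
    "$yarn_bin" application -kill "$app_id" < /dev/null
  done
)
```

Splitting on tabs rather than on whitespace keeps application names that contain spaces from shifting the user column.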



VI. Appendix

[Figures 1 to 3: Common Hadoop commands (images not included)]
