Overview

All HDFS commands are invoked by the $HADOOP_HOME/bin/hdfs script. Running the hdfs script without any arguments prints the description for all commands.

Usage: hdfs [--config confdir] [COMMAND] [GENERIC_OPTIONS] [COMMAND_OPTIONS]

Command option

Description

--config confdir

Overrides the default configuration directory. Default is $HADOOP_HOME/conf.

GENERIC_OPTIONS

The common set of options supported by multiple commands.

COMMAND_OPTIONS

Various commands with their options are described in the following sections. The commands have been grouped into User Commands and Administration Commands.
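The general syntax can be sketched as a small Python helper that assembles the argument list in the documented order. This is only an illustration: the function name `hdfs_argv` and the example configuration directory are assumptions, and only the `hdfs` script name and the `--config` option come from the usage line.

```python
from typing import List, Optional, Sequence


def hdfs_argv(command: str,
              command_options: Sequence[str] = (),
              generic_options: Sequence[str] = (),
              confdir: Optional[str] = None) -> List[str]:
    """Assemble: hdfs [--config confdir] COMMAND [GENERIC_OPTIONS] [COMMAND_OPTIONS]."""
    argv = ["hdfs"]
    if confdir is not None:
        # --config is placed before the command name, as in the usage line.
        argv += ["--config", confdir]
    argv.append(command)
    argv += list(generic_options)
    argv += list(command_options)
    return argv


# Hypothetical configuration directory, used only for illustration.
print(hdfs_argv("version", confdir="/etc/hadoop/conf"))
```

On a machine where the hdfs script is on the PATH, the resulting list could be handed to `subprocess.run`; the helper itself never executes anything.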

User Commands

dfs

Usage: hdfs dfs [GENERIC_OPTIONS] [COMMAND_OPTIONS]

Runs a filesystem command on the file systems supported in Hadoop. The various COMMAND_OPTIONS can be found in the File System Shell Guide.


fetchdt

Gets a delegation token from a NameNode. See fetchdt for more info.

Usage: hdfs fetchdt [GENERIC_OPTIONS] [--webservice <https_address>] <fileName>

Command option

Description

fileName

File name to store the token into.

--webservice https_address

Use the HTTP protocol instead of RPC.

fsck

Runs the HDFS filesystem checking utility. See fsck for more info.

Usage: hdfs fsck <path> [-list-corruptfileblocks | [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]] [-includeSnapshots]

Command option

Description

path

Start checking from this path.

-move

Move corrupted files to /lost+found.

-delete

Delete corrupted files.

-files

Print out files being checked.

-openforwrite

Print out files opened for write.

-includeSnapshots

Include snapshot data if the given path indicates a snapshottable directory or there are snapshottable directories under it.

-list-corruptfileblocks

Print out the list of missing blocks and the files they belong to.

-blocks

Print out block report.

-locations

Print out locations for every block.

-racks

Print out network topology for data-node locations.
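The bracket nesting in the fsck usage line implies which options go together. The sketch below encodes one reading of that nesting in Python. The function `fsck_argv` and its parameter names are hypothetical, and the real tool may accept combinations that this sketch rejects.

```python
from typing import List, Optional


def fsck_argv(path: str,
              list_corrupt: bool = False,
              action: Optional[str] = None,
              files: bool = False,
              blocks: bool = False,
              detail: Optional[str] = None,
              include_snapshots: bool = False) -> List[str]:
    """Build an `hdfs fsck` argument list from the options in the table above."""
    if action not in (None, "move", "delete", "openforwrite"):
        raise ValueError("action must be move, delete or openforwrite")
    if detail not in (None, "locations", "racks"):
        raise ValueError("detail must be locations or racks")
    argv = ["hdfs", "fsck", path]
    if list_corrupt:
        # Per the usage line, this option is an alternative to the others.
        if action or files or blocks or detail:
            raise ValueError("-list-corruptfileblocks cannot be combined here")
        argv.append("-list-corruptfileblocks")
    else:
        if blocks and not files:
            raise ValueError("-blocks is nested under -files")
        if detail and not blocks:
            raise ValueError("-locations/-racks are nested under -blocks")
        if action:
            argv.append("-" + action)
        if files:
            argv.append("-files")
        if blocks:
            argv.append("-blocks")
        if detail:
            argv.append("-" + detail)
    if include_snapshots:
        argv.append("-includeSnapshots")
    return argv


print(" ".join(fsck_argv("/", files=True, blocks=True, detail="locations")))
```

Validating before building keeps an invalid combination from ever reaching the cluster, which matters for destructive options such as -delete.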

version

Prints the version.

Usage: hdfs version

Administration Commands

balancer

Runs a cluster balancing utility. An administrator can simply press Ctrl-C to stop the rebalancing process. See Balancer for more details.

Usage: hdfs balancer [-threshold <threshold>] [-policy <policy>]

Command option

Description

-threshold threshold

Percentage of disk capacity. This overrides the default threshold.

-policy policy

datanode (default): Cluster is balanced if each datanode is balanced.

blockpool: Cluster is balanced if each block pool in each datanode is balanced.
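A minimal Python sketch of a balancer invocation builder follows. The two policy names are taken from the table above. The helper name `balancer_argv` and the 0 to 100 range check are assumptions made here because the threshold is described as a percentage; the actual tool may enforce a different range.

```python
from typing import List, Optional

# Policy names listed in the option table.
POLICIES = ("datanode", "blockpool")


def balancer_argv(threshold: Optional[float] = None,
                  policy: Optional[str] = None) -> List[str]:
    """Build an `hdfs balancer` argument list."""
    argv = ["hdfs", "balancer"]
    if threshold is not None:
        # Assumed sanity check: a percentage of disk capacity.
        if not 0 <= threshold <= 100:
            raise ValueError("threshold is a percentage")
        argv += ["-threshold", str(threshold)]
    if policy is not None:
        if policy not in POLICIES:
            raise ValueError("policy must be datanode or blockpool")
        argv += ["-policy", policy]
    return argv


print(balancer_argv(threshold=10, policy="blockpool"))
```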

datanode

Runs an HDFS datanode.

Usage: hdfs datanode [-regular | -rollback | -rollingupgrade rollback]

Command option

Description

-regular

Normal datanode startup (default).

-rollback

Roll back the datanode to the previous version. This should be used after stopping the datanode and distributing the old Hadoop version.

-rollingupgrade rollback

Roll back a rolling upgrade operation.

dfsadmin

Runs an HDFS dfsadmin client.

  Usage: hdfs dfsadmin [GENERIC_OPTIONS]

          [-report [-live] [-dead] [-decommissioning]]

          [-safemode enter | leave | get | wait]

          [-saveNamespace]

          [-rollEdits]

          [-restoreFailedStorage true|false|check]

          [-refreshNodes]

          [-setQuota ...]

          [-clrQuota ...]

          [-setSpaceQuota ...]

          [-clrSpaceQuota ...]

          [-setStoragePolicy ]

          [-getStoragePolicy ]

          [-finalizeUpgrade]

          [-rollingUpgrade [||]]

          [-metasave filename]

          [-refreshServiceAcl]

          [-refreshUserToGroupsMappings]

          [-refreshSuperUserGroupsConfiguration]

          [-refreshCallQueue]

          [-refresh [arg1..argn]]

          [-printTopology]

          [-refreshNamenodes datanodehost:port]

          [-deleteBlockPool datanode-host:port blockpoolId [force]]

          [-setBalancerBandwidth ]

          [-allowSnapshot ]

          [-disallowSnapshot ]

          [-fetchImage ]

          [-shutdownDatanode [upgrade]]

          [-getDatanodeInfo ]

          [-triggerBlockReport [-incremental] ]

          [-help [cmd]]

Command option

Description

-report [-live] [-dead] [-decommissioning]

Reports basic filesystem information and statistics. Optional flags may be used to filter the list of displayed DataNodes.

-safemode enter|leave|get|wait

Safe mode maintenance command. Safe mode is a Namenode state in which it

1. does not accept changes to the name space (read-only)

2. does not replicate or delete blocks.

Safe mode is entered automatically at Namenode startup, and the Namenode leaves safe mode automatically when the configured minimum percentage of blocks satisfies the minimum replication condition. Safe mode can also be entered manually, but then it can only be turned off manually as well.
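The safe mode rules above can be illustrated with a toy Python model. It is not NameNode code: the class name, the method names and the 0.9 threshold in the usage line are hypothetical, and the model only captures the two stated rules (automatic exit once enough blocks are sufficiently replicated, and manual entry requiring manual exit).

```python
class SafeModeModel:
    """Toy model of the safe mode rules described above (not NameNode code)."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold  # configured minimum fraction of blocks
        self.on = True              # safe mode is entered at startup
        self.manual = False         # True once an operator ran "enter"

    def enter(self) -> None:
        """Corresponds to `-safemode enter`."""
        self.on = True
        self.manual = True

    def leave(self) -> None:
        """Corresponds to `-safemode leave`."""
        self.on = False
        self.manual = False

    def get(self) -> bool:
        """Corresponds to `-safemode get`."""
        return self.on

    def block_report(self, replicated_blocks: int, total_blocks: int) -> None:
        """Leave automatically once enough blocks meet minimum replication."""
        if self.on and not self.manual:
            fraction = replicated_blocks / total_blocks if total_blocks else 1.0
            if fraction >= self.threshold:
                self.on = False


model = SafeModeModel(threshold=0.9)  # hypothetical threshold
model.block_report(95, 100)
print(model.get())
```

The `manual` flag is what makes a manually entered safe mode sticky: block reports are ignored until `leave` is called.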

-saveNamespace

Save the current namespace into storage directories and reset the edits log. Requires safe mode.

-rollEdits

Rolls the edit log on the active NameNode.

-restoreFailedStorage true|false|check

This option turns the automatic attempt to restore failed storage replicas on or off. If a failed storage becomes available again, the system will attempt to restore edits and/or fsimage during checkpoint. The 'check' option returns the current setting.

-refreshNodes

Re-read the hosts and exclude files to update the set of Datanodes that are allowed to connect to the Namenode and those that should be decommissioned or recommissioned.

-setQuota ...

See the HDFS Quotas Guide for details.

-clrQuota ...

See the HDFS Quotas Guide for details.

-setSpaceQuota ...

See the HDFS Quotas Guide for details.

-clrSpaceQuota ...

See the HDFS Quotas Guide for details.

-setStoragePolicy

Set a storage policy on a file or a directory.

-getStoragePolicy

Get the storage policy of a file or a directory.

-finalizeUpgrade

Finalize upgrade of HDFS. Datanodes delete their previous version working directories, followed by the Namenode doing the same. This completes the upgrade process.

-rollingUpgrade [||]

See the Rolling Upgrade document for details.

-metasave filename

Save the Namenode's primary data structures to filename in the directory specified by the hadoop.log.dir property. filename is overwritten if it exists. filename will contain one line for each of the following:

1. Datanodes heart beating with Namenode

2. Blocks waiting to be replicated

3. Blocks currently being replicated

4. Blocks waiting to be deleted

-refreshServiceAcl

Reload the service-level authorization policy file.

-refreshUserToGroupsMappings

Refresh user-to-groups mappings.

-refreshSuperUserGroupsConfiguration

Refresh superuser proxy groups mappings.

-refreshCallQueue

Reload the call queue from config.

-refresh [arg1..argn]

Triggers a runtime refresh of the specified resource on the specified host. All remaining arguments are sent to the host.

-printTopology

Print a tree of the racks and their nodes as reported by the Namenode.

-refreshNamenodes datanodehost:port

For the given datanode, reloads the configuration files, stops serving the removed block-pools and starts serving new block-pools.

-deleteBlockPool datanode-host:port blockpoolId [force]

If force is passed, the block pool directory for the given blockpool id on the given datanode is deleted along with its contents; otherwise the directory is deleted only if it is empty. The command will fail if the datanode is still serving the block pool. Refer to refreshNamenodes to shut down a block pool service on a datanode.
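The -deleteBlockPool rules amount to a small decision table, sketched below in Python. The function name and the outcome labels are hypothetical. In particular, the "kept" label for a non-empty directory without force is an interpretation of "deleted only if it is empty" and does not describe how the real command reports that case.

```python
def delete_block_pool_outcome(still_serving: bool,
                              directory_empty: bool,
                              force: bool = False) -> str:
    """Toy decision table for -deleteBlockPool; not DataNode code."""
    if still_serving:
        # The command fails while the datanode still serves the block pool.
        return "fail"
    if force or directory_empty:
        # With force, the directory goes away together with its contents.
        return "deleted"
    # Non-empty directory and force was not passed.
    return "kept"


print(delete_block_pool_outcome(still_serving=False, directory_empty=False, force=True))
```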

-setBalancerBandwidth

Changes the network bandwidth used by each datanode during HDFS block balancing. The argument is the maximum number of bytes per second that will be used by each datanode. This value overrides the dfs.balance.bandwidthPerSec parameter.

NOTE: The new value is not persistent on the DataNode.
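Because the bandwidth is given in bytes per second, a unit conversion is easy to get wrong. The following Python sketch converts from mebibytes per second before building the argument list. The helper name and the choice of MiB as the input unit are assumptions for illustration.

```python
from typing import List


def set_balancer_bandwidth_argv(mebibytes_per_second: int) -> List[str]:
    """Build a `-setBalancerBandwidth` call from a MiB/s figure."""
    if mebibytes_per_second <= 0:
        raise ValueError("bandwidth must be positive")
    # The option expects bytes per second: 1 MiB = 1024 * 1024 bytes.
    bytes_per_second = mebibytes_per_second * 1024 * 1024
    return ["hdfs", "dfsadmin", "-setBalancerBandwidth", str(bytes_per_second)]


print(set_balancer_bandwidth_argv(10))
```

Since the new value is not persistent on the DataNode, a call like this would have to be repeated after a datanode restart.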

-allowSnapshot

Allows snapshots of a directory to be created. If the operation completes successfully, the directory becomes snapshottable.

-disallowSnapshot

Disallows snapshots of a directory from being created. All snapshots of the directory must be deleted before disallowing snapshots.

-fetchImage

Downloads the most recent fsimage from the NameNode and saves it in the specified local directory.

-shutdownDatanode [upgrade]

Submit a shutdown request for the given datanode. See the Rolling Upgrade document for details.

-getDatanodeInfo

Get the information about the given datanode. See the Rolling Upgrade document for details.

-triggerBlockReport [-incremental]

Trigger a block report for the given datanode. If 'incremental' is specified, it will be an incremental block report; otherwise, it will be a full block report.

-help [cmd]

Displays help for the given command or all commands if none is specified.

mover

Runs the data migration utility. See Mover for more details.

Usage: hdfs mover [-p | -f ]

Command option

Description

-p

Specify a space separated list of HDFS files/dirs to migrate.

-f

Specify a local file containing a list of HDFS files/dirs to migrate.

Note that, when both -p and -f options are omitted, the default path is the root directory.
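The usage line presents -p and -f as alternatives, with the root directory as the fallback. A hedged Python sketch of an argument builder that respects this follows. The name `mover_argv` and the example paths are hypothetical, and passing each path as a separate argument is an assumption about how the space separated list is supplied.

```python
from typing import List, Optional, Sequence


def mover_argv(paths: Optional[Sequence[str]] = None,
               list_file: Optional[str] = None) -> List[str]:
    """Build an `hdfs mover` argument list; -p and -f are alternatives."""
    if paths and list_file:
        raise ValueError("use either -p or -f, not both")
    argv = ["hdfs", "mover"]
    if paths:
        # Space separated list of HDFS files/dirs to migrate.
        argv += ["-p", *paths]
    elif list_file:
        # Local file that contains the list of HDFS files/dirs.
        argv += ["-f", list_file]
    # With neither option, the mover works on the root directory.
    return argv


print(mover_argv(paths=["/archive", "/logs"]))  # hypothetical paths
```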

namenode

Runs the namenode. More info about the upgrade, rollback and finalize is at Upgrade Rollback.

   Usage: hdfs namenode [-backup] |

          [-checkpoint] |

          [-format [-clusterid cid ] [-force] [-nonInteractive] ] |

          [-upgrade [-clusterid cid] [-renameReserved] ] |

          [-upgradeOnly [-clusterid cid] [-renameReserved] ] |

          [-rollback] |

          [-rollingUpgrade ] |

          [-finalize] |

          [-importCheckpoint] |

          [-initializeSharedEdits] |

          [-bootstrapStandby] |

          [-recover [-force] ] |

          [-metadataVersion ]

Command option

Description

-backup

Start backup node.

-checkpoint

Start checkpoint node.

-format [-clusterid cid] [-force] [-nonInteractive]

Formats the specified NameNode. It starts the NameNode, formats it and then shuts it down. The -force option formats even if the name directory exists. The -nonInteractive option aborts if the name directory exists, unless the -force option is specified.
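The interplay of -force and -nonInteractive can be summarised as a decision function. This Python sketch is a toy model rather than NameNode code. The function name and outcome labels are hypothetical, and the "prompt" outcome (asking the operator when neither flag is given) is an assumption inferred from the existence of a non-interactive mode.

```python
def format_decision(name_dir_exists: bool,
                    force: bool = False,
                    non_interactive: bool = False) -> str:
    """Toy model of how -format treats an existing name directory."""
    if not name_dir_exists:
        return "format"
    if force:
        # -force formats even though the name directory exists,
        # and takes precedence over -nonInteractive.
        return "format"
    if non_interactive:
        # -nonInteractive aborts rather than asking.
        return "abort"
    # Assumed default: ask the operator for confirmation.
    return "prompt"


print(format_decision(name_dir_exists=True, non_interactive=True))
```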

-upgrade [-clusterid cid] [-renameReserved]

The Namenode should be started with the upgrade option after the distribution of a new Hadoop version.

-upgradeOnly [-clusterid cid] [-renameReserved]

Upgrade the specified NameNode and then shut it down.

-rollback

Roll back the NameNode to the previous version. This should be used after stopping the cluster and distributing the old Hadoop version.

-rollingUpgrade

See the Rolling Upgrade document for details.

-finalize

Finalize removes the previous state of the file system. The most recent upgrade becomes permanent and the rollback option is no longer available. After finalization it shuts the NameNode down.

-importCheckpoint

Loads an image from a checkpoint directory and saves it into the current one. The checkpoint directory is read from the property fs.checkpoint.dir.

-initializeSharedEdits

Format a new shared edits dir and copy in enough edit log segments so that the standby NameNode can start up.

-bootstrapStandby

Allows the standby NameNode's storage directories to be bootstrapped by copying the latest namespace snapshot from the active NameNode. This is used when first configuring an HA cluster.

-recover [-force]

Recover lost metadata on a corrupt filesystem. See the HDFS User Guide for details.

-metadataVersion

Verify that configured directories exist, then print the metadata versions of the software and the image.

secondarynamenode

Runs the HDFS secondary namenode. See Secondary Namenode for more info.

Usage: hdfs secondarynamenode [-checkpoint [force]] | [-format] | [-geteditsize]

Command option

Description

-checkpoint [force]

Checkpoints the SecondaryNameNode if EditLog size >= fs.checkpoint.size. If force is used, checkpoint irrespective of EditLog size.
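The checkpoint condition is a single comparison plus the force override. A minimal Python sketch, with hypothetical function and parameter names and made-up sizes in the usage line:

```python
def should_checkpoint(edit_log_size: int,
                      checkpoint_size: int,
                      force: bool = False) -> bool:
    """Mirror the rule: checkpoint if EditLog size >= fs.checkpoint.size, or if forced."""
    return force or edit_log_size >= checkpoint_size


# Hypothetical sizes in bytes, for illustration only.
print(should_checkpoint(edit_log_size=1024, checkpoint_size=4096, force=True))
```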

-format

Format the local storage during startup.

-geteditsize

Prints the number of uncheckpointed transactions on the NameNode.