Monospaced | Used for commands, HTTP requests and responses, and code blocks.
<Monospaced> | User-entered values.
[Monospaced] | Optional values. When the value is not specified, the default value is used.
Italics | Important phrases and words.
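The HTTP REST API supports the complete FileSystem/FileContext interface for HDFS. The HTTP operations and the corresponding FileSystem/FileContext methods are shown in the next section. The HTTP Query Parameter Dictionary section describes the default and valid values of each parameter in detail.
The FileSystem scheme of WebHDFS is "webhdfs://". A WebHDFS FileSystem URL has the following format:
webhdfs://<HOST>:<HTTP_PORT>/<PATH>
The corresponding HDFS URL is:
hdfs://<HOST>:<RPC_PORT>/<PATH>
In the REST API, the prefix "/webhdfs/v1" is inserted before the path, and a query is appended at the end. Therefore, the corresponding HTTP URL has the following format:
http://<HOST>:<HTTP_PORT>/webhdfs/v1/<PATH>?op=...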
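The mapping from a webhdfs:// URL to its REST URL can be sketched in Python as follows. This is a minimal illustration and not part of WebHDFS: the helper name `to_rest_url` and the sample host, port, and path are assumptions made for the example.

```python
from urllib.parse import quote, urlencode, urlsplit


def to_rest_url(webhdfs_url, op, **params):
    """Build the HTTP REST URL for a webhdfs:// URL (hypothetical helper)."""
    parts = urlsplit(webhdfs_url)
    if parts.scheme != "webhdfs":
        raise ValueError("expected a webhdfs:// URL")
    # Insert the /webhdfs/v1 prefix before the path and append the query.
    query = urlencode({"op": op, **params})
    return f"http://{parts.netloc}/webhdfs/v1{quote(parts.path)}?{query}"


# The host name and port below are made up for illustration.
print(to_rest_url("webhdfs://namenode.example.com:50070/user/alice/a.txt", "OPEN", offset=0))
```

The `op` parameter is always emitted first, followed by any optional parameters in the order they were passed.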
The following HDFS configuration properties relate to WebHDFS:
Property Name | Description
dfs.webhdfs.enabled | Enable/disable WebHDFS in Namenodes and Datanodes.
dfs.web.authentication.kerberos.principal | The HTTP Kerberos principal used by Hadoop-Auth in the HTTP endpoint. The HTTP Kerberos principal MUST start with 'HTTP/' per the Kerberos HTTP SPNEGO specification.
dfs.web.authentication.kerberos.keytab | The Kerberos keytab file with the credentials for the HTTP Kerberos principal used by Hadoop-Auth in the HTTP endpoint.
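When security is off, the authenticated user is the username specified in the user.name query parameter. If the user.name parameter is not set, the server may either set the authenticated user to a default web user, if there is one, or return an error response.
When security is on, authentication is performed by either a Hadoop delegation token or Kerberos SPNEGO. If a token is set in the delegation query parameter, the authenticated user is the user encoded in the token. If the delegation parameter is not set, the user is authenticated by Kerberos SPNEGO.
Below are some examples using the curl command-line tool:
1. Authentication when security is off:
curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?[user.name=<USER>&]op=..."
2. Authentication using Kerberos SPNEGO when security is on:
curl -i --negotiate -u : "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=..."
3. Authentication using a Hadoop delegation token when security is on:
curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?delegation=<TOKEN>&op=..."
See also: HTTP Authentication.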
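The three cases above differ only in which query parameter carries the identity. A minimal Python sketch of that choice follows; the function name `auth_query` and its arguments are hypothetical, and the Kerberos SPNEGO handshake itself is assumed to be handled by the HTTP client, outside this code.

```python
from urllib.parse import urlencode


def auth_query(op, security_on, user=None, token=None):
    """Return the query string for one request (hypothetical helper)."""
    params = {}
    if security_on:
        # With a delegation token, the user is the one encoded in the token.
        # Without one, the caller must authenticate via Kerberos SPNEGO,
        # which adds no query parameter.
        if token is not None:
            params["delegation"] = token
    elif user is not None:
        # Security off: the server takes the user from user.name.
        params["user.name"] = user
    params["op"] = op
    return urlencode(params)
```

Note that `user` is deliberately ignored when security is on, since the text above names only the delegation parameter and SPNEGO for that case.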
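When the proxy user feature is enabled, a proxy user P may submit a request on behalf of another user U. The username of U must be specified in the doas query parameter unless a delegation token is presented for authentication. In that case, the information of both users P and U must be encoded in the delegation token.
- A proxy request when security is off:
curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?[user.name=<USER>&]doas=<USER>&op=..."
- A proxy request using Kerberos SPNEGO when security is on:
curl -i --negotiate -u : "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?doas=<USER>&op=..."
- A proxy request using a Hadoop delegation token when security is on:
curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?delegation=<TOKEN>&op=..."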
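The rule that doas is mandatory unless a delegation token is presented can be expressed as a small check. This is only a sketch; `proxy_params` is a hypothetical helper name, not a WebHDFS API.

```python
def proxy_params(doas=None, delegation=None):
    """Query parameters for a proxy request (hypothetical helper)."""
    if delegation is not None:
        # Both P and U are encoded in the token, so doas is not sent.
        return {"delegation": delegation}
    if not doas:
        raise ValueError("doas is required when no delegation token is presented")
    return {"doas": doas}
```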
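- Step 1: Submit an HTTP PUT request without automatically following redirects and without sending the file data.
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATE
[&overwrite=<true|false>][&blocksize=<LONG>][&replication=<SHORT>]
[&permission=<OCTAL>][&buffersize=<INT>]"
The request is redirected to a datanode where the file data is to be written:
HTTP/1.1 307 TEMPORARY_REDIRECT
Location: http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE...
Content-Length: 0
- Step 2: Submit another HTTP PUT request, with the file data to be written, to the URL in the Location header returned above.
curl -i -X PUT -T <LOCAL_FILE> "http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE..."
The client receives a 201 Created response with zero content length and the WebHDFS URL of the file in the Location header:
HTTP/1.1 201 Created
Location: webhdfs://<HOST>:<PORT>/<PATH>
Content-Length: 0
Note: create/append is split into two steps to prevent clients from sending data before the redirect. HTTP/1.1 addresses this issue with the "Expect: 100-continue" header. Unfortunately, some software libraries have bugs (for example, the Jetty 6 HTTP server and the Java 6 HTTP client) and do not implement "Expect: 100-continue" correctly. The two-step create/append is a temporary workaround for these library bugs.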
See also: overwrite, blocksize, replication, permission, buffersize, FileSystem.create
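The two steps above can be sketched in Python with the standard library. This is an illustrative outline under stated assumptions, not a reference client: `http_send` and `create_file` are hypothetical names, only plain HTTP is handled, and authentication is left out. `http.client` does not follow redirects by itself, which is what step 1 requires.

```python
import http.client
from urllib.parse import urlsplit


def http_send(method, url, body=None):
    """Send one request without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=30)
    try:
        target = parts.path + ("?" + parts.query if parts.query else "")
        conn.request(method, target, body=body)
        resp = conn.getresponse()
        resp.read()  # drain the response body so the connection closes cleanly
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def create_file(namenode_url, data, send=http_send):
    """Two-step CREATE: ask the namenode first, then PUT the data to the datanode."""
    # Step 1: no file data is sent; a 307 redirect is expected.
    status, location = send("PUT", namenode_url)
    if status != 307 or not location:
        raise RuntimeError(f"expected a 307 redirect, got {status}")
    # Step 2: send the file data to the URL from the Location header.
    status, location = send("PUT", location, data)
    if status != 201:
        raise RuntimeError(f"expected 201 Created, got {status}")
    return location  # the WebHDFS URL of the new file
```

The `send` parameter exists so the sequence can be exercised with a stub transport; the same shape would apply to APPEND with POST and a 200 response in step 2.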
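- Step 1: Submit an HTTP POST request without automatically following redirects and without sending the file data:
curl -i -X POST "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=APPEND[&buffersize=<INT>]"
The request is redirected to a datanode where the file data is to be appended:
HTTP/1.1 307 TEMPORARY_REDIRECT
Location: http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=APPEND...
Content-Length: 0
- Step 2: Submit another HTTP POST request, with the file data to be appended, to the URL in the Location header returned above:
curl -i -X POST -T <LOCAL_FILE> "http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=APPEND..."
The client receives a response with zero content length:
HTTP/1.1 200 OK
Content-Length: 0
See the note in the previous section for why this operation requires two steps.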
See also: buffersize, FileSystem.append
- Submit an HTTP POST request:
curl -i -X POST "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CONCAT&sources=<PATHS>"
The client receives a response with zero content length:
HTTP/1.1 200 OK
Content-Length: 0
See also: sources, FileSystem.concat
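- Submit an HTTP GET request that automatically follows redirects:
curl -i -L "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=OPEN
[&offset=<LONG>][&length=<LONG>][&buffersize=<INT>]"
The request is redirected to a datanode where the file data can be read:
HTTP/1.1 307 TEMPORARY_REDIRECT
Location: http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=OPEN...
Content-Length: 0
The client follows the redirect to the datanode and receives the file data:
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 22
Hello, webhdfs user!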
See also: offset, length, buffersize, FileSystem.open
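The optional offset and length parameters make it possible to read a file in pieces. The sketch below only computes the query strings for such a chunked read; the function names and the chunk size are assumptions for illustration, and the file length is assumed to be known in advance (for example from GETFILESTATUS).

```python
def open_ranges(file_length, chunk_size):
    """Yield (offset, length) pairs that together cover file_length bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, file_length, chunk_size):
        yield offset, min(chunk_size, file_length - offset)


def open_queries(file_length, chunk_size):
    """Query strings for reading the file chunk by chunk with op=OPEN."""
    return [
        f"op=OPEN&offset={offset}&length={length}"
        for offset, length in open_ranges(file_length, chunk_size)
    ]


print(open_queries(22, 10))
```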
- Submit an HTTP PUT request:
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=MKDIRS[&permission=<OCTAL>]"
The client receives a response with a boolean JSON object:
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
{"boolean":true}
See also: permission, FileSystem.mkdirs
- Submit an HTTP PUT request:
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATESYMLINK
&destination=<PATH>[&createParent=<true|false>]"
The client receives a response with zero content length:
HTTP/1.1 200 OK
Content-Length: 0
See also: destination, createParent, FileSystem.createSymlink
- Submit an HTTP PUT request:
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=RENAME&destination=<PATH>"
The client receives a response with a boolean JSON object:
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
{"boolean":true}
See also: destination, FileSystem.rename
- Submit an HTTP DELETE request:
curl -i -X DELETE "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=DELETE
[&recursive=<true|false>]"
The client receives a response with a boolean JSON object:
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
{"boolean":true}
See also: recursive, FileSystem.delete
- Submit an HTTP GET request:
curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETFILESTATUS"
The client receives a response with a FileStatus JSON object:
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
{
"FileStatus":
{
"accessTime" : 0,
"blockSize" : 0,
"group" : "supergroup",
"length" : 0, //in bytes, zero for directories
"modificationTime":1320173277227,
"owner" : "webuser",
"pathSuffix" : "",
"permission" : "777",
"replication" : 0,
"type" : "DIRECTORY" //enum {FILE, DIRECTORY, SYMLINK}
}
}
See also: FileSystem.getFileStatus
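A client would typically parse this response into a small record. The following Python sketch assumes the body is plain JSON (the `//` annotations in the sample above are documentation comments and would not parse); the `FileStatus` dataclass and the `mode_string` helper are hypothetical names introduced for the example.

```python
import json
from dataclasses import dataclass


@dataclass
class FileStatus:
    path_suffix: str
    type: str
    length: int
    owner: str
    group: str
    permission: str  # octal string, e.g. "644"

    def mode_string(self):
        """Render the low nine permission bits in rwx form."""
        bits = int(self.permission, 8)
        flags = "rwxrwxrwx"
        return "".join(f if bits & (1 << (8 - i)) else "-" for i, f in enumerate(flags))


def parse_file_status(body):
    """Extract a FileStatus record from a GETFILESTATUS response body."""
    fs = json.loads(body)["FileStatus"]
    return FileStatus(fs["pathSuffix"], fs["type"], fs["length"],
                      fs["owner"], fs["group"], fs["permission"])
```

Only a subset of the fields is kept here; the remaining ones (accessTime, blockSize, and so on) can be added in the same way.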
- Submit an HTTP GET request:
curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=LISTSTATUS"
The client receives a response with a FileStatuses JSON object:
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 427
{
"FileStatuses":
{
"FileStatus":
[
{
"accessTime" : 1320171722771,
"blockSize" : 33554432,
"group" : "supergroup",
"length" : 24930,
"modificationTime":1320171722771,
"owner" : "webuser",
"pathSuffix" : "a.patch",
"permission" : "644",
"replication" : 1,
"type" : "FILE"
},
{
"accessTime" : 0,
"blockSize" : 0,
"group" : "supergroup",
"length" : 0,
"modificationTime": 1320895981256,
"owner" : "szetszwo",
"pathSuffix" : "bar",
"permission" : "711",
"replication" : 0,
"type" : "DIRECTORY"
},
...
]
}
}
See also: FileSystem.listStatus
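Each entry in the listing carries only a pathSuffix, so a client has to join it with the directory it asked about. The sketch below assumes that pathSuffix is relative to the listed directory; `list_paths` is a hypothetical helper name.

```python
import json
import posixpath


def list_paths(directory, body, only_type=None):
    """Return full paths of the entries in a FileStatuses response body."""
    entries = json.loads(body)["FileStatuses"]["FileStatus"]
    return [
        posixpath.join(directory, entry["pathSuffix"])
        for entry in entries
        # only_type may be "FILE", "DIRECTORY" or "SYMLINK"; None keeps everything.
        if only_type is None or entry["type"] == only_type
    ]
```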
- Submit an HTTP GET request:
curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETCONTENTSUMMARY"
The client receives a response with a ContentSummary JSON object:
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
{
"ContentSummary":
{
"directoryCount": 2,
"fileCount" : 1,
"length" : 24930,
"quota" : -1,
"spaceConsumed" : 24930,
"spaceQuota" : -1
}
}
See also: FileSystem.getContentSummary
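- Submit an HTTP GET request:
curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETFILECHECKSUM"
The request is redirected to a datanode:
HTTP/1.1 307 TEMPORARY_REDIRECT
Location: http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=GETFILECHECKSUM...
Content-Length: 0
The client follows the redirect to the datanode and receives a FileChecksum JSON object: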
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
{
"FileChecksum":
{
"algorithm": "MD5-of-1MD5-of-512CRC32",
"bytes" : "eadb10de24aa315748930df6e185c0d ...",
"length" : 28
}
}
See also: FileSystem.getFileChecksum
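Since bytes is a hexadecimal string and length counts the decoded bytes rather than the characters, a client can cross-check the two. This is a minimal sketch with a hypothetical helper name; the sample value in the response above is truncated, so the code is meant for a complete response body.

```python
import json


def checksum_bytes(body):
    """Decode the hexadecimal 'bytes' field and verify it against 'length'."""
    checksum = json.loads(body)["FileChecksum"]
    raw = bytes.fromhex(checksum["bytes"])
    if len(raw) != checksum["length"]:
        raise ValueError(f"expected {checksum['length']} bytes, got {len(raw)}")
    return checksum["algorithm"], raw
```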
- Submit an HTTP GET request:
curl -i "http://<HOST>:<PORT>/webhdfs/v1/?op=GETHOMEDIRECTORY"
The client receives a response with a Path JSON object:
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
{"Path": "/user/szetszwo"}
See also: FileSystem.getHomeDirectory
- Submit an HTTP PUT request:
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETPERMISSION
[&permission=<OCTAL>]"
The client receives a response with zero content length:
HTTP/1.1 200 OK
Content-Length: 0
See also: permission, FileSystem.setPermission
- Submit an HTTP PUT request:
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETOWNER
[&owner=<USER>][&group=<GROUP>]"
The client receives a response with zero content length:
HTTP/1.1 200 OK
Content-Length: 0
See also: owner, group, FileSystem.setOwner
- Submit an HTTP PUT request:
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETREPLICATION
[&replication=<SHORT>]"
The client receives a response with a boolean JSON object:
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
{"boolean":true}
See also: replication, FileSystem.setReplication
- Submit an HTTP PUT request:
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETTIMES
[&modificationtime=<TIME>][&accesstime=<TIME>]"
The client receives a response with zero content length:
HTTP/1.1 200 OK
Content-Length: 0
See also: modificationtime, accesstime, FileSystem.setTimes
- Submit an HTTP PUT request:
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=MODIFYACLENTRIES
&aclspec=<ACLSPEC>"
The client receives a response with zero content length:
HTTP/1.1 200 OK
Content-Length: 0
See also: FileSystem.modifyAclEntries
- Submit an HTTP PUT request:
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=REMOVEACLENTRIES
&aclspec=<ACLSPEC>"
The client receives a response with zero content length:
HTTP/1.1 200 OK
Content-Length: 0
See also: FileSystem.removeAclEntries
- Submit an HTTP PUT request:
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=REMOVEDEFAULTACL"
The client receives a response with zero content length:
HTTP/1.1 200 OK
Content-Length: 0
See also: FileSystem.removeDefaultAcl
- Submit an HTTP PUT request:
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=REMOVEACL"
The client receives a response with zero content length:
HTTP/1.1 200 OK
Content-Length: 0
See also: FileSystem.removeAcl
- Submit an HTTP PUT request:
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETACL
&aclspec=<ACLSPEC>"
The client receives a response with zero content length:
HTTP/1.1 200 OK
Content-Length: 0
See also: FileSystem.setAcl
- Submit an HTTP GET request:
curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETACLSTATUS"
The client receives a response with an AclStatus JSON object:
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
{
"AclStatus": {
"entries": [
"user:carla:rw-",
"group::r-x"
],
"group": "supergroup",
"owner": "hadoop",
"stickyBit": false
}
}
See also: FileSystem.getAclStatus
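The entries array holds colon-separated strings such as "user:carla:rw-". A minimal Python sketch for splitting them into fields follows. The `AclEntry` tuple and `parse_acl_entry` function are hypothetical names, and the optional "default:" prefix handled here is an assumption based on the usual HDFS ACL notation rather than something shown in the sample above.

```python
from typing import NamedTuple


class AclEntry(NamedTuple):
    scope: str  # "access" or "default"
    type: str   # e.g. "user" or "group"
    name: str   # empty string for the unnamed (owner or owning group) entry
    perms: str  # e.g. "rw-"


def parse_acl_entry(text):
    """Split one ACL entry string into its fields."""
    fields = text.split(":")
    scope = "access"
    if fields[0] == "default":
        scope, fields = "default", fields[1:]
    if len(fields) != 3:
        raise ValueError(f"unrecognised ACL entry: {text!r}")
    return AclEntry(scope, *fields)
```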
- Submit an HTTP GET request:
curl -i "http://<HOST>:<PORT>/webhdfs/v1/?op=GETDELEGATIONTOKEN&renewer=<USER>"
The client receives a response with a Token JSON object:
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
{
"Token":
{
"urlString": "JQAIaG9y..."
}
}
See also: renewer, FileSystem.getDelegationToken
- Submit an HTTP GET request:
curl -i "http://<HOST>:<PORT>/webhdfs/v1/?op=GETDELEGATIONTOKENS&renewer=<USER>"
The client receives a response with a Tokens JSON object:
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
{
"Tokens":
{
"Token":
[
{
"urlString":"KAAKSm9i ..."
}
]
}
}
See also: renewer, FileSystem.getDelegationTokens
- Submit an HTTP PUT request:
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/?op=RENEWDELEGATIONTOKEN&token=<TOKEN>"
The client receives a response with a long JSON object:
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
{"long": 1320962673997} //the new expiration time
See also: token, FileSystem.renewDelegationToken
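The returned long can be turned into a readable timestamp. The sketch below assumes the value is milliseconds since the Unix epoch, which matches the magnitude of the sample but is not stated in the text above; `expiration_time` is a hypothetical helper name.

```python
import json
from datetime import datetime, timezone


def expiration_time(body):
    """Convert the 'long' value to a UTC datetime (assumes epoch milliseconds)."""
    millis = json.loads(body)["long"]
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


print(expiration_time('{"long": 1320962673997}').isoformat())
```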
- Submit an HTTP PUT request:
curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/?op=CANCELDELEGATIONTOKEN&token=<TOKEN>"
The client receives a response with zero content length:
HTTP/1.1 200 OK
Content-Length: 0
See also: token, FileSystem.cancelDelegationToken
When an operation fails, the server may throw an exception. The JSON format of an error response is defined in the RemoteException JSON Schema. The table below shows the mapping from exceptions to HTTP response codes.
Exceptions | HTTP Response Codes
IllegalArgumentException | 400 Bad Request
UnsupportedOperationException | 400 Bad Request
SecurityException | 401 Unauthorized
IOException | 403 Forbidden
FileNotFoundException | 404 Not Found
RuntimeException | 500 Internal Server Error
Below are some examples of error responses.
HTTP/1.1 400 Bad Request
Content-Type: application/json
Transfer-Encoding: chunked
{
"RemoteException":
{
"exception" : "IllegalArgumentException",
"javaClassName": "java.lang.IllegalArgumentException",
"message" : "Invalid value for webhdfs parameter \"permission\": ..."
}
}
HTTP/1.1 401 Unauthorized
Content-Type: application/json
Transfer-Encoding: chunked
{
"RemoteException":
{
"exception" : "SecurityException",
"javaClassName": "java.lang.SecurityException",
"message" : "Failed to obtain user group information: ..."
}
}
HTTP/1.1 403 Forbidden
Content-Type: application/json
Transfer-Encoding: chunked
{
"RemoteException":
{
"exception" : "AccessControlException",
"javaClassName": "org.apache.hadoop.security.AccessControlException",
"message" : "Permission denied: ..."
}
}
HTTP/1.1 404 Not Found
Content-Type: application/json
Transfer-Encoding: chunked
{
"RemoteException":
{
"exception" : "FileNotFoundException",
"javaClassName": "java.io.FileNotFoundException",
"message" : "File does not exist: /foo/a.patch"
}
}
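On the client side, such responses can be turned into a native exception. The following is a minimal sketch: `RemoteError` and `raise_for_error` are hypothetical names, and treating every status of 400 or above as an error is an assumption drawn from the mapping table above.

```python
import json


class RemoteError(Exception):
    """Carries the fields of a RemoteException JSON object (hypothetical class)."""

    def __init__(self, status, exception, java_class, message):
        super().__init__(f"{status} {exception}: {message}")
        self.status = status
        self.exception = exception
        self.java_class = java_class


def raise_for_error(status, body):
    """Raise RemoteError for 4xx/5xx responses; return None otherwise."""
    if status < 400:
        return
    err = json.loads(body)["RemoteException"]
    raise RemoteError(status, err["exception"], err["javaClassName"], err["message"])
```

Callers can then branch on `exception` (for example "FileNotFoundException") instead of inspecting raw status codes.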
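All operations, except OPEN, return either a zero-length response or a JSON response. For OPEN, the response is a byte stream. The JSON schemas are shown below.
Note that the default value of additionalProperties is an empty schema, which allows any value for additional properties. Therefore, all WebHDFS JSON responses allow any additional property. However, if additional properties are included in a response, they are treated as optional properties in order to maintain compatibility.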
{
"name" : "AclStatus",
"properties":
{
"AclStatus":
{
"type" : "object",
"properties":
{
"entries":
{
"type": "array",
"items":
{
"description": "ACL entry.",
"type": "string"
}
},
"group":
{
"description": "The group owner.",
"type" : "string",
"required" : true
},
"owner":
{
"description": "The user who is the owner.",
"type" : "string",
"required" : true
},
"stickyBit":
{
"description": "True if the sticky bit is on.",
"type" : "boolean",
"required" : true
}
}
}
}
}
{
"name" : "boolean",
"properties":
{
"boolean":
{
"description": "A boolean value",
"type" : "boolean",
"required" : true
}
}
}
See also: MKDIRS, RENAME, DELETE, SETREPLICATION
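To illustrate how "required": true and the permissive additionalProperties default interact, here is a small Python checker for the handful of schema keywords used in this document. It is a sketch under stated assumptions, not a general JSON Schema validator; `conforms`, `PY_TYPES`, and `BOOLEAN_SCHEMA` are names made up for the example.

```python
PY_TYPES = {"boolean": bool, "string": str, "integer": int, "object": dict, "array": list}


def conforms(value, schema):
    """Check value against the few schema keywords used in this document."""
    expected = schema.get("type")
    if expected is not None:
        # bool is a subclass of int in Python, so reject it for "integer".
        if expected == "integer" and isinstance(value, bool):
            return False
        if not isinstance(value, PY_TYPES[expected]):
            return False
    if "enum" in schema and value not in schema["enum"]:
        return False
    if "items" in schema and isinstance(value, list):
        if not all(conforms(item, schema["items"]) for item in value):
            return False
    properties = schema.get("properties", {})
    if properties and not isinstance(value, dict):
        return False
    for name, sub in properties.items():
        if name in value:
            if not conforms(value[name], sub):
                return False
        elif sub.get("required"):
            return False
    # Properties not named in the schema are accepted as optional extras.
    return True


BOOLEAN_SCHEMA = {
    "name": "boolean",
    "properties": {
        "boolean": {"description": "A boolean value", "type": "boolean", "required": True}
    },
}

print(conforms({"boolean": True, "extra": 1}, BOOLEAN_SCHEMA))
print(conforms({}, BOOLEAN_SCHEMA))
```

The first call accepts the unknown "extra" property, while the second fails because the required "boolean" property is missing.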
{
"name" : "ContentSummary",
"properties":
{
"ContentSummary":
{
"type" : "object",
"properties":
{
"directoryCount":
{
"description": "The number of directories.",
"type" : "integer",
"required" : true
},
"fileCount":
{
"description": "The number of files.",
"type" : "integer",
"required" : true
},
"length":
{
"description": "The number of bytes used by the content.",
"type" : "integer",
"required" : true
},
"quota":
{
"description": "The namespace quota of this directory.",
"type" : "integer",
"required" : true
},
"spaceConsumed":
{
"description": "The disk space consumed by the content.",
"type" : "integer",
"required" : true
},
"spaceQuota":
{
"description": "The disk space quota.",
"type" : "integer",
"required" : true
}
}
}
}
}
{
"name" : "FileChecksum",
"properties":
{
"FileChecksum":
{
"type" : "object",
"properties":
{
"algorithm":
{
"description": "The name of the checksum algorithm.",
"type" : "string",
"required" : true
},
"bytes":
{
"description": "The byte sequence of the checksum in hexadecimal.",
"type" : "string",
"required" : true
},
"length":
{
"description": "The length of the bytes (not the length of the string).",
"type" : "integer",
"required" : true
}
}
}
}
}
See also: GETFILECHECKSUM
{
"name" : "FileStatus",
"properties":
{
"FileStatus": fileStatusProperties //See FileStatus Properties
}
}
See also: FileStatus Properties, GETFILESTATUS, FileStatus
JavaScript syntax is used to define fileStatusProperties so that it can be referenced in both the FileStatus and FileStatuses JSON schemas.
var fileStatusProperties =
{
"type" : "object",
"properties":
{
"accessTime":
{
"description": "The access time.",
"type" : "integer",
"required" : true
},
"blockSize":
{
"description": "The block size of a file.",
"type" : "integer",
"required" : true
},
"group":
{
"description": "The group owner.",
"type" : "string",
"required" : true
},
"length":
{
"description": "The number of bytes in a file.",
"type" : "integer",
"required" : true
},
"modificationTime":
{
"description": "The modification time.",
"type" : "integer",
"required" : true
},
"owner":
{
"description": "The user who is the owner.",
"type" : "string",
"required" : true
},
"pathSuffix":
{
"description": "The path suffix.",
"type" : "string",
"required" : true
},
"permission":
{
"description": "The permission represented as a octal string.",
"type" : "string",
"required" : true
},
"replication":
{
"description": "The number of replication of a file.",
"type" : "integer",
"required" : true
},
"symlink": //an optional property
{
"description": "The link target of a symlink.",
"type" : "string"
},
"type":
{
"description": "The type of the path object.",
"enum" : ["FILE", "DIRECTORY", "SYMLINK"],
"required" : true
}
}
};
A FileStatuses JSON object represents an array of FileStatus JSON objects.
{
"name" : "FileStatuses",
"properties":
{
"FileStatuses":
{
"type" : "object",
"properties":
{
"FileStatus":
{
"description": "An array of FileStatus",
"type" : "array",
"items" : fileStatusProperties //See FileStatus Properties
}
}
}
}
}
See also: FileStatus Properties, LISTSTATUS, FileStatus