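Configuring Knox involves the following steps:
1. Related cluster configuration: changes that must be made within the Hadoop cluster to allow Knox to communicate with the various services.
2. Gateway server configuration: the configurable elements of the server itself, which apply to behavior that spans all topologies or managed Hadoop clusters.
3. Topology descriptors: the descriptors that control access to Hadoop clusters in various ways.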
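The following configuration changes must be made to the cluster to allow Apache Knox to dispatch requests to the various service components on behalf of end users.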
Update core-site.xml and add the following lines towards the end of the file. Replace FQDN_OF_KNOX_HOST with the fully qualified domain name of the host running the Knox gateway. You can usually find this by running hostname -f on that host. You can use * for local developer testing if the Knox host does not have a static IP.
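<property>
    <name>hadoop.proxyuser.knox.groups</name>
    <value>users</value>
</property>
<property>
    <name>hadoop.proxyuser.knox.hosts</name>
    <value>FQDN_OF_KNOX_HOST</value>
</property>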
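As a minimal sketch, the two proxy-user entries can also be generated rather than typed by hand. The helper names knox_proxyuser_properties and render are hypothetical and not part of Knox or Hadoop, and socket.getfqdn() is assumed to be a reasonable stand-in for hostname -f, which may not hold on every host.

```python
import socket
import xml.etree.ElementTree as ET


def knox_proxyuser_properties(knox_host, groups="users"):
    """Build the two hadoop.proxyuser.knox.* <property> elements for core-site.xml."""
    settings = {
        "hadoop.proxyuser.knox.groups": groups,
        "hadoop.proxyuser.knox.hosts": knox_host,
    }
    elements = []
    for name, value in settings.items():
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
        elements.append(prop)
    return elements


def render(elements):
    """Serialize the property elements, one per line, ready to paste into core-site.xml."""
    return "\n".join(ET.tostring(e, encoding="unicode") for e in elements)


if __name__ == "__main__":
    # Assumption: socket.getfqdn() approximates `hostname -f`; verify on the Knox host.
    print(render(knox_proxyuser_properties(socket.getfqdn())))
```

The output still has to be pasted inside the configuration element of core-site.xml by hand; the sketch deliberately does not edit the file in place.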
Each service also has its own configuration requirements.
The following table describes the server-level configurable elements of the Apache Knox gateway, which are set in gateway-site.xml.
Property | Description | Default |
---|---|---|
gateway.deployment.dir | The directory within GATEWAY_HOME that contains gateway topology deployments | {GATEWAY_HOME}/data/deployments |
gateway.security.dir | The directory within GATEWAY_HOME that contains the required security artifacts | {GATEWAY_HOME}/data/security |
gateway.data.dir | The directory within GATEWAY_HOME that contains the gateway instance data | {GATEWAY_HOME}/data |
gateway.services.dir | The directory within GATEWAY_HOME that contains the gateway services definitions | {GATEWAY_HOME}/services |
gateway.hadoop.conf.dir | The directory within GATEWAY_HOME that contains the gateway configuration | {GATEWAY_HOME}/conf |
gateway.frontend.url | The URL that should be used during rewriting so that it can rewrite the URLs with the correct “frontend” URL | none |
gateway.server.header.enabled | Indicates whether Knox displays service info in HTTP response | false |
gateway.xforwarded.enabled | Indicates whether support for some X-Forwarded-* headers is enabled | true |
gateway.trust.all.certs | Indicates whether all presented client certs should establish trust | false |
gateway.client.auth.needed | Indicates whether clients are required to establish a trust relationship with client certificates | false |
gateway.truststore.password.alias | OPTIONAL Alias for the password to the truststore file holding the trusted client certificates. NOTE: An alias with the provided name should be created using knoxcli.sh create-alias in order to provide the password; else the master secret will be used. | gateway-truststore-password |
gateway.truststore.path | Location of the truststore for client certificates to be trusted | null |
gateway.truststore.type | Indicates the type of truststore at the path declared in gateway.truststore.path | JKS |
gateway.jdk.tls.ephemeralDHKeySize | jdk.tls.ephemeralDHKeySize is defined to customize the ephemeral DH key sizes. The minimum acceptable DH key size is 1024 bits, except for exportable cipher suites or legacy mode (jdk.tls.ephemeralDHKeySize=legacy) | 2048 |
gateway.threadpool.max | The maximum concurrent requests the server will process. Connections beyond this will be queued. | 254 |
gateway.httpclient.connectionTimeout | The amount of time to wait when attempting a connection. The natural unit is milliseconds, but a ‘s’ or ‘m’ suffix may be used for seconds or minutes respectively. | 20s |
gateway.httpclient.maxConnections | The maximum number of connections that a single HttpClient will maintain to a single host:port. | 32 |
gateway.httpclient.socketTimeout | The amount of time to wait for data on a socket before aborting the connection. The natural unit is milliseconds, but a ‘s’ or ‘m’ suffix may be used for seconds or minutes respectively. | 20s |
gateway.httpclient.truststore.password.alias | OPTIONAL Alias for the password to the truststore file holding the trusted service certificates. NOTE: An alias with the provided name should be created using knoxcli.sh create-alias in order to provide the password; else the master secret will be used. | gateway-httpclient-truststore-password |
gateway.httpclient.truststore.path | Location of the truststore for service certificates to be trusted | null |
gateway.httpclient.truststore.type | Indicates the type of truststore at the path declared in gateway.httpclient.truststore.path | JKS |
gateway.httpserver.requestBuffer | The size of the HTTP server request buffer in bytes | 16384 |
gateway.httpserver.requestHeaderBuffer | The size of the HTTP server request header buffer in bytes | 8192 |
gateway.httpserver.responseBuffer | The size of the HTTP server response buffer in bytes | 32768 |
gateway.httpserver.responseHeaderBuffer | The size of the HTTP server response header buffer in bytes | 8192 |
gateway.websocket.feature.enabled | Enable/Disable WebSocket feature | false |
gateway.tls.keystore.password.alias | OPTIONAL Alias for the password to the keystore file holding the Gateway’s TLS certificate and keypair. NOTE: An alias with the provided name should be created using knoxcli.sh create-alias in order to provide the password; else the master secret will be used. | gateway-identity-keystore-password |
gateway.tls.keystore.path | OPTIONAL The path to the keystore file where the Gateway’s TLS certificate and keypair are stored. If not set, the default keystore file will be used: data/security/keystores/gateway.jks. | null |
gateway.tls.keystore.type | OPTIONAL The type of the keystore file where the Gateway’s TLS certificate and keypair are stored. See gateway.tls.keystore.path. | JKS |
gateway.tls.key.alias | OPTIONAL The alias for the Gateway’s TLS certificate and keypair within the default keystore or the keystore specified via gateway.tls.keystore.path. | gateway-identity |
gateway.tls.key.passphrase.alias | OPTIONAL The alias for the passphrase for the Gateway’s TLS private key stored within the default keystore or the keystore specified via gateway.tls.keystore.path. NOTE: An alias with the provided name should be created using knoxcli.sh create-alias in order to provide the password; else the keystore password or the master secret will be used. See gateway.tls.keystore.password.alias | gateway-identity-passphrase |
gateway.signing.keystore.name | OPTIONAL Filename of the keystore file that contains the signing keypair. NOTE: An alias needs to be created using knoxcli.sh create-alias for the alias name signing.key.passphrase in order to provide the passphrase to access the keystore. | null |
gateway.signing.keystore.password.alias | OPTIONAL Alias for the password to the keystore file holding the signing keypair. NOTE: An alias with the provided name should be created using knoxcli.sh create-alias in order to provide the password; else the master secret will be used. | signing.keystore.password |
gateway.signing.keystore.type | OPTIONAL The type of the keystore file where the signing keypair is stored. See gateway.signing.keystore.name. | JKS |
gateway.signing.key.alias | OPTIONAL Alias for the signing keypair within the keystore specified via gateway.signing.keystore.name | null |
gateway.signing.key.passphrase.alias | OPTIONAL The alias for the passphrase for the signing private key stored within the default keystore or the keystore specified via gateway.signing.keystore.name. NOTE: An alias with the provided name should be created using knoxcli.sh create-alias in order to provide the password; else the keystore password or the master secret will be used. See gateway.signing.keystore.password.alias | signing.key.passphrase |
ssl.enabled | Indicates whether SSL is enabled for the Gateway | true |
ssl.include.ciphers | A comma or pipe separated list of ciphers to accept for SSL. See the JSSE Provider docs for possible ciphers. These can also contain regular expressions as shown in the Jetty documentation. | all |
ssl.exclude.ciphers | A comma or pipe separated list of ciphers to reject for SSL. See the JSSE Provider docs for possible ciphers. These can also contain regular expressions as shown in the Jetty documentation. | none |
ssl.exclude.protocols | A comma or pipe separated list of protocols to reject for SSL, or “none” | SSLv3 |
gateway.remote.config.monitor.client | A reference to the remote configuration registry client the remote configuration monitor will employ | null |
gateway.remote.config.monitor.client.allowUnauthenticatedReadAccess | When a remote registry client is configured to access a registry securely, this property can be set to allow unauthenticated clients to continue to read the content from that registry by setting the ACLs accordingly. | false |
gateway.remote.config.registry. | A named remote configuration registry client definition, where name is an arbitrary identifier for the connection | null |
gateway.cluster.config.monitor.ambari.enabled | Indicates whether the Ambari cluster monitoring and associated dynamic topology updating is enabled | false |
gateway.cluster.config.monitor.ambari.interval | The interval (in seconds) at which the Ambari cluster monitor will poll for cluster configuration changes | 60 |
gateway.cluster.config.monitor.cm.enabled | Indicates whether the ClouderaManager cluster monitoring and associated dynamic topology updating is enabled | false |
gateway.cluster.config.monitor.cm.interval | The interval (in seconds) at which the ClouderaManager cluster monitor will poll for cluster configuration changes | 60 |
gateway.remote.alias.service.enabled | Turn on/off remote alias service | true |
gateway.read.only.override.topologies | A comma-delimited list of topology names which should be forcibly treated as read-only. | none |
gateway.discovery.default.address | The default discovery address, which is applied if no address is specified in a descriptor. | null |
gateway.discovery.default.cluster | The default discovery cluster name, which is applied if no cluster name is specified in a descriptor. | null |
gateway.dispatch.whitelist | A semicolon-delimited list of regular expressions controlling the endpoints to which Knox dispatches and redirects are permitted. If DEFAULT is specified, or the property is omitted entirely, a default domain-based whitelist is derived from the Knox host. If HTTPS_ONLY is specified, a default domain-based whitelist is derived from the Knox host for HTTPS URLs only. An empty value means no dispatches will be permitted. | null |
gateway.dispatch.whitelist.services | A comma-delimited list of service roles to which the gateway.dispatch.whitelist will be applied. | none |
gateway.strict.topology.validation | If true, topology XML files will be validated against the topology schema during redeploy | false |
gateway.topology.redeploy.requires.changes | If true, XML topology redeployment happens only if the topology content differs from the currently deployed one. That is, a simple touch command will not result in topology redeployment in this case. | false |
gateway.global.rules.services | The list of service names that have global rules; the rules of all services not in this list are treated as scoped only to that service. | "NAMENODE","JOBTRACKER", "WEBHDFS", "WEBHCAT", "OOZIE", "WEBHBASE", "HIVE", "RESOURCEMANAGER" |
gateway.xforwarded.header.context.append.servicename | Add service name to x-forward-context header for the defined list of services. | LIVYSERVER |
gateway.knox.token.exp.server-managed | Default server-managed token state configuration for all KnoxToken service and JWT provider deployments | false |
gateway.knox.token.eviction.interval | The interval (in seconds) at which the token state reaper evicts state for expired tokens. This configuration only applies when server-managed token state is enabled either in gateway-site or at the topology level. | 300 (5 minutes) |
gateway.knox.token.eviction.grace.period | A duration (in seconds) beyond a token’s expiration to wait before evicting its state. This configuration only applies when server-managed token state is enabled either in gateway-site or at the topology level. | 86400 (24 hours) |
gateway.knox.token.permissive.validation | When this feature and server-managed state are both enabled and Knox is presented with a valid token that is absent from server-managed state, Knox will verify it without throwing an UnknownTokenException | false |
gateway.jetty.max.form.content.size | OPTIONAL Limits the size of the form content that a request can process in Knox’s embedded Jetty server, to protect against Denial of Service attacks. The size in bytes is limited by Jetty’s ContextHandler#getMaxFormContentSize() or, if there is no context, by the “org.eclipse.jetty.server.Request.maxFormContentSize” attribute. | 200000 |
gateway.jetty.max.form.keys | OPTIONAL Limits the number of form parameter keys that a request can carry in Knox’s embedded Jetty server. The number is limited by Jetty’s ContextHandler#getMaxFormKeys() or, if there is no context, by the org.eclipse.jetty.server.Request.maxFormKeys attribute. | |
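The timeout properties in the table (gateway.httpclient.connectionTimeout and gateway.httpclient.socketTimeout) accept plain milliseconds or a value with an ‘s’ or ‘m’ suffix. The sketch below only illustrates that convention as the table describes it; to_millis is a hypothetical helper, not Knox’s own parser, and how Knox treats malformed values is not covered here.

```python
import re

# An integer optionally followed by 's' (seconds) or 'm' (minutes), as the table describes.
_DURATION = re.compile(r"^\s*(\d+)\s*([sm])?\s*$")
_FACTORS = {None: 1, "s": 1000, "m": 60_000}


def to_millis(value):
    """Convert a timeout string such as '20s', '2m' or '500' to milliseconds."""
    match = _DURATION.match(value)
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _FACTORS[unit]


if __name__ == "__main__":
    for raw in ("20s", "2m", "500"):
        print(raw, "->", to_millis(raw), "ms")
```

A helper like this can be handy when comparing the gateway timeouts against the timeouts of a load balancer or client that are expressed in other units.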
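Topology descriptor files provide the gateway with per-cluster configuration information. This includes configuration for both the providers within the gateway and the services within the Hadoop cluster. These files are located in {GATEWAY_HOME}/conf/topologies. The elements that make up a descriptor are as follows: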
/topology
Defines the provider configuration and service topology for a single Hadoop cluster.
/topology/gateway
Groups all of the provider elements
/topology/gateway/provider
Defines the configuration of a specific provider for the cluster.
/topology/service
Defines the location of a specific Hadoop service within the Hadoop cluster.
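To make the element paths above concrete, here is a minimal sketch that parses a small descriptor and lists what it declares. The sample document is an assumption assembled from the element paths and example values in this article (ShiroProvider, WEBHDFS), and summarize_topology is a hypothetical helper, not a Knox API.

```python
import xml.etree.ElementTree as ET

# Assumed sample descriptor built from the element paths described in the text.
SAMPLE_TOPOLOGY = """
<topology>
    <gateway>
        <provider>
            <role>authentication</role>
            <name>ShiroProvider</name>
            <enabled>true</enabled>
        </provider>
    </gateway>
    <service>
        <role>WEBHDFS</role>
        <url>http://localhost:50070/webhdfs</url>
    </service>
</topology>
"""


def summarize_topology(xml_text):
    """Return the providers and services declared in a topology descriptor."""
    root = ET.fromstring(xml_text.strip())
    providers = [
        (p.findtext("role"), p.findtext("name"), p.findtext("enabled") == "true")
        for p in root.findall("./gateway/provider")
    ]
    services = [(s.findtext("role"), s.findtext("url")) for s in root.findall("./service")]
    return {"providers": providers, "services": services}


if __name__ == "__main__":
    print(summarize_topology(SAMPLE_TOPOLOGY))
```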
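Provider configuration is used to customize the behavior of a particular gateway feature. The general outline of a provider element looks like this: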
<provider>
    <role>authentication</role>
    <name>ShiroProvider</name>
    <enabled>true</enabled>
</provider>
/topology/gateway/provider
Groups information for a specific provider.
/topology/gateway/provider/role
Defines the role of a particular provider. There are a number of pre-defined roles used by out-of-the-box provider plugins for the gateway. These roles are: authentication, identity-assertion, rewrite and hostmap
/topology/gateway/provider/name
Defines the name of the provider for which this configuration applies. There can be multiple provider implementations for a given role. Specifying the name is used to identify which particular provider is being configured. Typically each topology descriptor should contain only one provider for each role but there are exceptions.
/topology/gateway/provider/enabled
Allows a particular provider to be enabled or disabled via true or false respectively. When a provider is disabled any filters associated with that provider are excluded from the processing chain.
/topology/gateway/provider/param
These elements are used to supply provider configuration. There can be zero or more of these per provider.
/topology/gateway/provider/param/name
The name of a parameter to pass to the provider.
/topology/gateway/provider/param/value
The value of a parameter to pass to the provider.
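The provider elements listed above can be assembled programmatically. This is a minimal sketch under the assumption that param entries are name/value pairs nested under the provider, as the element paths suggest; build_provider is a hypothetical helper and the sessionTimeout parameter is only an illustrative value.

```python
import xml.etree.ElementTree as ET


def build_provider(role, name, enabled=True, params=None):
    """Create a <provider> element with zero or more <param> name/value children."""
    provider = ET.Element("provider")
    ET.SubElement(provider, "role").text = role
    ET.SubElement(provider, "name").text = name
    ET.SubElement(provider, "enabled").text = "true" if enabled else "false"
    for key, value in (params or {}).items():
        param = ET.SubElement(provider, "param")
        ET.SubElement(param, "name").text = key
        ET.SubElement(param, "value").text = value
    return provider


if __name__ == "__main__":
    element = build_provider("authentication", "ShiroProvider", params={"sessionTimeout": "30"})
    print(ET.tostring(element, encoding="unicode"))
```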
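Service configuration is used to specify the location of services within the Hadoop cluster. The general outline of a service element looks like this: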
<service>
    <role>WEBHDFS</role>
    <url>http://localhost:50070/webhdfs</url>
</service>
/topology/service
Provides information about a particular service within the Hadoop cluster. Not all services are necessarily exposed as gateway endpoints.
/topology/service/role
Identifies the role of this service. Currently supported roles are: WEBHDFS, WEBHCAT, WEBHBASE, OOZIE, HIVE, NAMENODE, JOBTRACKER, RESOURCEMANAGER. Additional service roles can be supported via plugins. Note: The role names are case sensitive and must be upper case.
/topology/service/url
The URL identifying the location of a particular service within the Hadoop cluster.
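Because role names are case sensitive, a small pre-flight check can catch mistakes before a descriptor is deployed. The sketch below is an assumption-laden illustration: check_service is a hypothetical helper, the role set is copied from the list above, and the URL check only verifies that a scheme and host are present, since the text does not state which schemes each role expects.

```python
from urllib.parse import urlparse

# Roles listed in the text as currently supported; plugins may add more.
BUILT_IN_ROLES = {
    "WEBHDFS", "WEBHCAT", "WEBHBASE", "OOZIE",
    "HIVE", "NAMENODE", "JOBTRACKER", "RESOURCEMANAGER",
}


def check_service(role, url):
    """Return a list of human-readable problems for a service role/url pair."""
    problems = []
    if role != role.upper():
        problems.append(f"role {role!r} must be upper case")
    elif role not in BUILT_IN_ROLES:
        problems.append(f"role {role!r} is not a built-in role and would need a plugin")
    parsed = urlparse(url)
    if not parsed.scheme or not parsed.netloc:
        problems.append(f"url {url!r} lacks a scheme or host")
    return problems


if __name__ == "__main__":
    print(check_service("WEBHDFS", "http://localhost:50070/webhdfs"))
    print(check_service("webhdfs", "http://localhost:50070/webhdfs"))
```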
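The purpose of the Hostmap provider is to handle situations where hosts are known by one name within the cluster and by another name externally. The basic structure is shown below.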
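<topology>
    <gateway>
        ...
        <provider>
            <role>hostmap</role>
            <name>static</name>
            <enabled>true</enabled>
            <param><name>external-host-name</name><value>internal-host-name</value></param>
        </provider>
        ...
    </gateway>
    ...
</topology>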
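This mapping is required because Hadoop services running within the cluster are unaware that they are being accessed from outside the cluster. As a result, URLs returned as part of REST API responses typically contain internal host names. Since clients outside the cluster cannot resolve those host names, they must be mapped to external host names.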
Details about each provider configuration element are enumerated below.
topology/gateway/provider/role
The role for a Hostmap provider must always be hostmap.
topology/gateway/provider/name
The Hostmap provider supplied out-of-the-box is selected via the name static.
topology/gateway/provider/enabled
Host mapping can be enabled or disabled by providing true or false.
topology/gateway/provider/param
Host mapping is configured by providing parameters for each external to internal mapping.
topology/gateway/provider/param/name
The parameter names represent the external host names associated with the internal host names provided by the value element. This can be a comma separated list of host names that all represent the same physical host. When mapping from internal to external host name the first external host name in the list is used.
topology/gateway/provider/param/value
The parameter values represent the internal host names associated with the external host names provided by the name element. This can be a comma separated list of host names that all represent the same physical host. When mapping from external to internal host names the first internal host name in the list is used.
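The first-name-wins rule described for param names and values can be sketched as two lookup tables. StaticHostMap is a hypothetical class written for illustration, not the Knox implementation; returning unmapped hosts unchanged is an assumption, and the sample mapping reuses the localhost and sandbox names from this article.

```python
class StaticHostMap:
    """Illustrative two-way host mapping built from hostmap-style params."""

    def __init__(self, params):
        # params maps "ext1,ext2" (param name) to "int1,int2" (param value).
        self._to_internal = {}
        self._to_external = {}
        for name, value in params.items():
            externals = [h.strip() for h in name.split(",") if h.strip()]
            internals = [h.strip() for h in value.split(",") if h.strip()]
            if not externals or not internals:
                continue
            for external in externals:
                # External to internal: the first internal name in the list is used.
                self._to_internal.setdefault(external, internals[0])
            for internal in internals:
                # Internal to external: the first external name in the list is used.
                self._to_external.setdefault(internal, externals[0])

    def resolve_inbound(self, host):
        """Map an external host name to its internal name (assumed: unknown hosts pass through)."""
        return self._to_internal.get(host, host)

    def resolve_outbound(self, host):
        """Map an internal host name to its external name (assumed: unknown hosts pass through)."""
        return self._to_external.get(host, host)


if __name__ == "__main__":
    hostmap = StaticHostMap({"localhost": "sandbox,sandbox.hortonworks.com"})
    print(hostmap.resolve_inbound("localhost"))
    print(hostmap.resolve_outbound("sandbox.hortonworks.com"))
```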
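Simplified Topology Descriptors are a way to facilitate provider configuration sharing and service endpoint discovery. Rather than editing an XML topology descriptor, you can create a simpler YAML (or JSON) descriptor that specifies the desired contents of a topology; the full topology descriptor is then generated and deployed from it.
Sometimes the same provider configuration applies to multiple Knox topologies. By externalizing the provider configuration from the simple descriptors, a single configuration can be referenced by multiple topologies. This reduces duplicated configuration and avoids having to update multiple configuration files when a policy change is needed. Updating a provider configuration triggers an update to every topology that references it.
The content of an externalized provider configuration is identical to the content of the gateway element in a full topology descriptor. The only difference is that these details are defined in a separate JSON/YAML file in {GATEWAY_HOME}/conf/shared-providers/, which is then referenced by one or more descriptors.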
{
  "providers": [
    {
      "role": "authentication",
      "name": "ShiroProvider",
      "enabled": "true",
      "params": {
        "sessionTimeout": "30",
        "main.ldapRealm": "org.apache.knox.gateway.shirorealm.KnoxLdapRealm",
        "main.ldapContextFactory": "org.apache.knox.gateway.shirorealm.KnoxLdapContextFactory",
        "main.ldapRealm.contextFactory": "$ldapContextFactory",
        "main.ldapRealm.userDnTemplate": "uid={0},ou=people,dc=hadoop,dc=apache,dc=org",
        "main.ldapRealm.contextFactory.url": "ldap://localhost:33389",
        "main.ldapRealm.contextFactory.authenticationMechanism": "simple",
        "urls./**": "authcBasic"
      }
    },
    {
      "name": "static",
      "role": "hostmap",
      "enabled": "true",
      "params": {
        "localhost": "sandbox,sandbox.hortonworks.com"
      }
    }
  ]
}
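Since a shared provider configuration carries the same content as the gateway element of a full descriptor, the correspondence can be illustrated by converting one into the other. Knox performs the real generation itself; providers_to_gateway below is a hypothetical helper that only shows the mapping, and the short JSON sample is an assumption derived from the example above.

```python
import json
import xml.etree.ElementTree as ET

# Assumed, shortened sample based on the shared provider configuration shown above.
SAMPLE_SHARED_PROVIDERS = """
{
  "providers": [
    {
      "role": "hostmap",
      "name": "static",
      "enabled": "true",
      "params": {"localhost": "sandbox,sandbox.hortonworks.com"}
    }
  ]
}
"""


def providers_to_gateway(config_text):
    """Build a <gateway> element equivalent to a JSON shared provider configuration."""
    config = json.loads(config_text)
    gateway = ET.Element("gateway")
    for entry in config.get("providers", []):
        provider = ET.SubElement(gateway, "provider")
        for field in ("role", "name", "enabled"):
            ET.SubElement(provider, field).text = str(entry[field])
        for key, value in entry.get("params", {}).items():
            param = ET.SubElement(provider, "param")
            ET.SubElement(param, "name").text = key
            ET.SubElement(param, "value").text = str(value)
    return gateway


if __name__ == "__main__":
    print(ET.tostring(providers_to_gateway(SAMPLE_SHARED_PROVIDERS), encoding="unicode"))
```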
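HA providers require special attention with respect to shared provider configurations, because they contain service-specific (and possibly cluster-specific) configuration.
This matters because the service configuration corresponding to the associated HA provider configuration must contain the correct content for HA to function properly.
For a shared provider configuration that includes an HA provider service:
If the referencing descriptor does not declare the corresponding service, the HA provider configuration is effectively ignored, since that topology does not expose the service.
If the descriptor does declare the corresponding service and service endpoint discovery is used, Knox should populate the URLs correctly to support HA behavior. Otherwise, the URLs for that service must be specified explicitly in the descriptor.
If the descriptor content is correct but the cluster service is not configured for HA, the HA behavior cannot work.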
curl -iku admin:admin-password https://hadoop31:8443/gateway/admin/api/v1/version -H Accept:application/json
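The same request can be issued from Python using only the standard library. This is a hedged sketch: build_version_request and fetch_version are hypothetical helper names, the host and credentials are simply copied from the curl command, and disabling certificate verification mirrors curl's -k flag, which is only appropriate for testing against a self-signed gateway certificate.

```python
import base64
import ssl
import urllib.request


def build_version_request(base_url, user, password):
    """Prepare a basic-auth GET request for the version endpoint used in the curl command."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{base_url}/gateway/admin/api/v1/version",
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    )


def fetch_version(request, verify_tls=False):
    """Send the request; verify_tls=False mirrors curl -k and is for testing only."""
    context = ssl.create_default_context()
    if not verify_tls:
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return response.status, response.read().decode("utf-8")


if __name__ == "__main__":
    # Host and credentials are the ones from the curl example; adjust for your gateway.
    req = build_version_request("https://hadoop31:8443", "admin", "admin-password")
    print(fetch_version(req))
```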