Flink on YARN Cluster Deployment

In this deployment, the Hadoop cluster that Flink runs on is secured with Kerberos and SASL authentication.

Environment Preparation

The Hadoop cluster consists of three machines; one of them is chosen as the master:

Hostname IP Role
master 192.168.0.121 Master node
slave1 192.168.0.111 Worker node
slave2 192.168.0.222 Worker node

Make sure DNS resolution works on every host, that the hosts can ping each other, and that passwordless SSH is set up between them. The /etc/hosts entries are as follows (a quick check script follows the list):

192.168.0.121 master
192.168.0.111 slave1
192.168.0.222 slave2
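
As a quick sanity check of name resolution and passwordless SSH, the following small sketch can be run from master (hostnames as listed above); it prints each hostname only if ping and non-interactive SSH both succeed:

$ for h in master slave1 slave2; do ping -c 1 $h > /dev/null && ssh -o BatchMode=yes $h hostname; done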

Installing Kerberos

1. Install the KDC

Install the KDC server and the Kerberos client on master:

$ yum install krb5-server krb5-libs krb5-auth-dialog krb5-workstation

After these packages are installed, the configuration files /etc/krb5.conf and /var/kerberos/krb5kdc/kdc.conf are created on the KDC host; they define the realm name and the domain-to-realm mappings, respectively.

2. Configure kdc.conf

Default path: /var/kerberos/krb5kdc/kdc.conf. Example configuration:

[kdcdefaults]
 kdc_ports = 88
 kdc_tcp_ports = 88

[realms]
 HADOOP.COM = {
  #master_key_type = aes256-cts
  acl_file = /var/kerberos/krb5kdc/kadm5.acl
  dict_file = /usr/share/dict/words
  admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
  supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
 }

Notes:
HADOOP.COM: the realm being configured. The name is arbitrary; Kerberos supports multiple realms, but that adds complexity and is not covered here. Realm names are case-sensitive and are conventionally written in upper case. The realm has no direct relationship to the machines' hostnames.
max_renewable_life = 7d: must be set if tickets are to be renewable; it is not shown in the example above (see the fragment below).
master_key_type and supported_enctypes default to aes256-cts. Since Java needs extra policy jars to use aes256-cts, relying on it is not recommended.
acl_file: defines the admin users' permissions. The file format is: Kerberos_principal permissions [target_principal] [restrictions]; wildcards are supported.
admin_keytab: the keytab the KDC uses for verification; its creation is covered later.
supported_enctypes: the supported encryption types; no changes are needed.
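
If ticket renewal is needed, max_renewable_life goes inside the realm block of kdc.conf; a minimal fragment showing where it fits (the other realm settings stay as in the example above):

[realms]
 HADOOP.COM = {
  max_renewable_life = 7d
  ...
 }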

3. Configure krb5.conf

Default path: /etc/krb5.conf. It contains the Kerberos client configuration, for example the location of the KDC and the admin server for each realm. The file must be kept identical on every machine that uses Kerberos. Example configuration:

[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
 default_realm = HADOOP.COM
 dns_lookup_realm = false
 dns_lookup_kdc = false
 ticket_lifetime = 24h
 renew_lifetime = 7d
 forwardable = true
 rdns = false
 pkinit_anchors = /etc/pki/tls/certs/ca-bundle.crt
# default_realm = EXAMPLE.COM
 default_ccache_name = KEYRING:persistent:%{uid}

[realms]
 HADOOP.COM = {
  kdc = master
  admin_server = master
 }
# EXAMPLE.COM = {
#  kdc = kerberos.example.com
#  admin_server = kerberos.example.com
# }

[domain_realm]
 .hadoop.com = HADOOP.COM
 hadoop.com = HADOOP.COM
# .example.com = EXAMPLE.COM
# example.com = EXAMPLE.COM

Notes:
[logging]: where the server-side logs are written.
[libdefaults]: defaults for every connection; the following settings matter most:
default_realm = HADOOP.COM: the default realm; it must match the realm being configured.
udp_preference_limit = 1: disabling UDP avoids a known Hadoop issue.
ticket_lifetime: how long a ticket is valid, typically 24 hours.
renew_lifetime: how long a ticket can keep being renewed, typically one week. Once a ticket expires, subsequent access to Kerberos-secured services fails.
[realms]: lists the realms in use.
kdc: the location of the KDC, in host:port form.
admin_server: the location of the admin server, in host:port form.
default_domain: the default domain name.
[appdefaults]: per-application settings that override the defaults.

4. Create and initialize the Kerberos database

With the two files above in place, the Kerberos database can be initialized:

$ /usr/sbin/kdb5_util create -s -r HADOOP.COM
Loading random data
Initializing database '/var/kerberos/krb5kdc/principal' for realm 'HADOOP.COM',
master key name 'K/M@HADOOP.COM'
You will be prompted for the database Master Password.
It is important that you NOT FORGET this password.
Enter KDC database master key:
Re-enter KDC database master key to verify:
kdb5_util: Required parameters in kdc.conf missing while initializing the Kerberos admin interface

Here, [-s] creates a stash file that stores the master server key (used by krb5kdc); [-r] specifies a realm name and is only necessary when krb5.conf defines more than one realm. The initialization takes a fairly long time, roughly ten minutes. During the process you are prompted for the database master password; be sure to remember it, because without it the Kerberos server cannot be administered.
Once the Kerberos database has been created, several new files appear under /var/kerberos/krb5kdc:

$ ll /var/kerberos/krb5kdc/
total 24
-rw------- 1 root root   22 Mar 31  2016 kadm5.acl
-rw------- 1 root root  416 Jun 19 16:29 kdc.conf
-rw------- 1 root root 8192 Jun 19 16:52 principal
-rw------- 1 root root 8192 Jun 19 16:52 principal.kadm5
-rw------- 1 root root    0 Jun 19 16:52 principal.kadm5.lock
-rw------- 1 root root    0 Jun 19 16:52 principal.ok

After the database is created, restart the KDC service:

$ service krb5kdc restart

5. Add a database administrator

We need to add administrative principals to the Kerberos database (principals allowed to manage it); at least one is required so that the Kerberos administration daemon kadmind can talk to the kadmin program over the network.
Run the following on master and set a password:

$ kadmin.local -q "addprinc admin/admin"
Authenticating as principal root/admin@HADOOP.COM with password.
WARNING: no policy specified for admin/admin@HADOOP.COM; defaulting to no policy
Enter password for principal "admin/admin@HADOOP.COM":
Re-enter password for principal "admin/admin@HADOOP.COM":
Principal "admin/admin@HADOOP.COM" created.

Once that is done, list the principals with:

$ kadmin.local -q "listprincs"
Authenticating as principal root/admin@HADOOP.COM with password.
K/M@HADOOP.COM
admin/admin@HADOOP.COM

6. Grant ACL permissions to the database administrator

On the KDC we edit the ACL file to set permissions; its default path is /var/kerberos/krb5kdc/kadm5.acl (this can be changed in kdc.conf). The kadmind daemon uses this file to control access to the Kerberos database; for operations that affect principals, it also controls which principals may act on which other principals.
Now grant the administrator full rights: edit /var/kerberos/krb5kdc/kadm5.acl so that it contains

*/admin@HADOOP.COM      *

This means every principal matching */admin@HADOOP.COM is treated as an administrator, and the permission field * grants all permissions.
7. Start the Kerberos daemons

With the configuration in place, start the Kerberos daemons and enable them at boot:

$ service krb5kdc start
$ service kadmin start
$ chkconfig krb5kdc on
$ chkconfig kadmin on

The KDC is now working. The two daemons (krb5kdc and kadmind) run in the background; their log files are /var/log/krb5kdc.log and /var/log/kadmind.log.
You can check that they are working with the kinit command.
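
For example, a minimal check using the admin principal created in step 5 (the password is the one set there):

$ kinit admin/admin
$ klist

kinit should prompt for the password and return without error, and klist should then show a ticket for admin/admin@HADOOP.COM issued by the KDC.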
8. Deploy the Kerberos clients
Install the Kerberos client on the other two hosts, slave1 and slave2:

$ yum install krb5-workstation krb5-libs krb5-auth-dialog

Configure krb5.conf on each client; its content is identical to the one on the master KDC:

[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
 default_realm = HADOOP.COM
 dns_lookup_realm = false
 dns_lookup_kdc = false
 ticket_lifetime = 24h
 renew_lifetime = 7d
 forwardable = true
 rdns = false
 pkinit_anchors = /etc/pki/tls/certs/ca-bundle.crt
# default_realm = EXAMPLE.COM
 default_ccache_name = KEYRING:persistent:%{uid}

[realms]
 HADOOP.COM = {
  kdc = master
  admin_server = master
 }
# EXAMPLE.COM = {
#  kdc = kerberos.example.com
#  admin_server = kerberos.example.com
# }

[domain_realm]
 .hadoop.com = HADOOP.COM
 hadoop.com = HADOOP.COM
# .example.com = EXAMPLE.COM
# example.com = EXAMPLE.COM


SASL Setup

1. Install openssl

On the machine where the Kerberos master KDC was configured earlier, install openssl:

yum install openssl

2. Generate the keystore and truststore files

Create a CA that will act as the cluster-wide certificate authority:

$ openssl req -new -x509 -keyout test_ca_key -out test_ca_cert -days 9999 -subj '/C=CN/ST=beijing/L=beijing/O=hadoop/OU=hadoop/CN=hadoop.com'
Generating a 2048 bit RSA private key
.........................+++
.........................................................................................+++
writing new private key to 'test_ca_key'
Enter PEM pass phrase:
Verifying - Enter PEM pass phrase:
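
Before distributing the CA files, the certificate's subject and validity period can be inspected (optional, just a sanity check):

$ openssl x509 -in test_ca_cert -noout -subject -dates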

Copy the test_ca_key and test_ca_cert generated above to slave1 and slave2, then run the following on master, slave1 and slave2 to create a key and certificate for each node:

$ keytool -keystore keystore -alias localhost -validity 9999 -genkey -keyalg RSA -keysize 2048 -dname "CN=hadoop.com, OU=hadoop, O=hadoop, L=beijing, ST=beijing, C=cn"
Enter keystore password:
Re-enter new password:
Enter key password for <localhost>
    (RETURN if same as keystore password):
Re-enter new password:
$ keytool -keystore truststore -alias CARoot -import -file test_ca_cert
Enter keystore password:
Re-enter new password:
Owner: CN=hadoop.com, OU=hadoop, O=hadoop, L=beijing, ST=beijing, C=CN
Issuer: CN=hadoop.com, OU=hadoop, O=hadoop, L=beijing, ST=beijing, C=CN
Serial number: f60b93dc251f2239
Valid from: Fri Jun 21 02:52:51 EDT 2019 until: Mon Nov 05 01:52:51 EST 2046
Certificate fingerprints:
     MD5:  CA:8B:6B:A5:6F:3B:E4:A2:30:25:26:1F:B0:45:9F:66
     SHA1: 09:2C:07:0F:18:3C:95:BB:27:0C:5B:B8:D5:12:B4:EC:5A:16:69:72
     SHA256: DB:C4:22:2F:E2:C9:0A:A0:B9:03:51:DA:21:9A:8F:E2:EE:A9:4F:35:1B:F4:53:E2:EC:4E:86:4C:C6:46:BD:C5
     Signature algorithm name: SHA256withRSA
     Version: 3

Extensions:

#1: ObjectId: 2.5.29.35 Criticality=false
AuthorityKeyIdentifier [
KeyIdentifier [
0000: DE DF 33 F3 77 C1 2B FE   C1 42 BB 25 52 D8 F0 BA  ..3.w.+..B.%R...
0010: FE BF DF 6A                                        ...j
]
]

#2: ObjectId: 2.5.29.19 Criticality=false
BasicConstraints:[
  CA:true
  PathLen:2147483647
]

#3: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: DE DF 33 F3 77 C1 2B FE   C1 42 BB 25 52 D8 F0 BA  ..3.w.+..B.%R...
0010: FE BF DF 6A                                        ...j
]
]

Trust this certificate? [no]:  yes
Certificate was added to keystore
$ keytool -certreq -alias localhost -keystore keystore -file cert
Enter keystore password:
# Note the -passin argument: replace jackeryoung in pass:jackeryoung with your actual CA key passphrase.
$ openssl x509 -req -CA test_ca_cert -CAkey test_ca_key -in cert -out cert_signed -days 9999 -CAcreateserial -passin pass:jackeryoung
Signature ok
subject=/C=cn/ST=beijing/L=beijing/O=hadoop/OU=hadoop/CN=hadoop.com
Getting CA Private Key
$ keytool -keystore keystore -alias CARoot -import -file test_ca_cert
Enter keystore password:
Owner: CN=hadoop.com, OU=hadoop, O=hadoop, L=beijing, ST=beijing, C=CN
Issuer: CN=hadoop.com, OU=hadoop, O=hadoop, L=beijing, ST=beijing, C=CN
Serial number: f60b93dc251f2239
Valid from: Fri Jun 21 02:52:51 EDT 2019 until: Mon Nov 05 01:52:51 EST 2046
Certificate fingerprints:
     MD5:  CA:8B:6B:A5:6F:3B:E4:A2:30:25:26:1F:B0:45:9F:66
     SHA1: 09:2C:07:0F:18:3C:95:BB:27:0C:5B:B8:D5:12:B4:EC:5A:16:69:72
     SHA256: DB:C4:22:2F:E2:C9:0A:A0:B9:03:51:DA:21:9A:8F:E2:EE:A9:4F:35:1B:F4:53:E2:EC:4E:86:4C:C6:46:BD:C5
     Signature algorithm name: SHA256withRSA
     Version: 3

Extensions:

#1: ObjectId: 2.5.29.35 Criticality=false
AuthorityKeyIdentifier [
KeyIdentifier [
0000: DE DF 33 F3 77 C1 2B FE   C1 42 BB 25 52 D8 F0 BA  ..3.w.+..B.%R...
0010: FE BF DF 6A                                        ...j
]
]

#2: ObjectId: 2.5.29.19 Criticality=false
BasicConstraints:[
  CA:true
  PathLen:2147483647
]

#3: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: DE DF 33 F3 77 C1 2B FE   C1 42 BB 25 52 D8 F0 BA  ..3.w.+..B.%R...
0010: FE BF DF 6A                                        ...j
]
]

Trust this certificate? [no]:  yes
Certificate was added to keystore
$ keytool -keystore keystore -alias localhost -import -file cert_signed
Enter keystore password:

In the end, the truststore and keystore files used for SASL are generated in the current directory.
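
To double-check the result, the entries in both stores can be listed; keystore should contain the signed localhost key pair plus the CARoot certificate, and truststore should contain CARoot (you will be prompted for the store passwords):

$ keytool -list -keystore keystore
$ keytool -list -keystore truststore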

Hadoop Distributed Deployment

1. Set up the JDK

Install a JDK 1.8 environment via yum:

$ yum install java-1.8.0-openjdk*

Configure JAVA_HOME and the related environment variables (the JDK path below is the one used on this cluster; adjust it to your own installation):

export JAVA_HOME=/home/shadow/JDK/jdk1.8.0_181
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
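
Assuming the exports above are added to /etc/profile, the JDK setup can be verified with:

$ source /etc/profile
$ java -version
$ echo $JAVA_HOME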

2. Configure Hadoop
Pick the 2.8.5 binary release from the Hadoop download page and download it on the master node:

$ wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.8.5/hadoop-2.8.5.tar.gz
$ scp hadoop-2.8.5.tar.gz root@slave1:/home/jackeryoung
$ scp hadoop-2.8.5.tar.gz root@slave2:/home/jackeryoung
$ tar -xvf hadoop-2.8.5.tar.gz
$ mv hadoop-2.8.5 hadoop

Hadoop environment variables
Add the following to /etc/profile on all machines:

export HADOOP_HOME=/home/jackeryoung/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Create the Kerberos principals
Hadoop typically uses three Kerberos principals: hdfs, yarn and HTTP. Create them with:

$ kadmin.local -q "addprinc -randkey hdfs/[email protected]"
$ kadmin.local -q "addprinc -randkey yarn/[email protected]"
$ kadmin.local -q "addprinc -randkey HTTP/[email protected]"

Generate a keytab file for each principal:

$ kadmin.local -q "xst -k hdfs.keytab hdfs/[email protected]"
$ kadmin.local -q "xst -k yarn.keytab yarn/[email protected]"
$ kadmin.local -q "xst -k HTTP.keytab HTTP/[email protected]"

Merge the three keytabs into one:

$ ktutil
ktutil:  rkt hdfs.keytab
ktutil:  rkt yarn.keytab
ktutil:  rkt HTTP.keytab
ktutil:  wkt hadoop.keytab
ktutil:  q
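
The merged keytab can be verified with klist; it should list entries for the hdfs, yarn and HTTP principals:

$ klist -kt hadoop.keytab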

Distribute the keytab file and log in:

$ mv hadoop.keytab /home/jackeryoung/hadoop/etc/hadoop/
$ scp /home/jackeryoung/hadoop/etc/hadoop/hadoop.keytab root@slave1:/home/jackeryoung/hadoop/etc/hadoop
$ scp /home/jackeryoung/hadoop/etc/hadoop/hadoop.keytab root@slave2:/home/jackeryoung/hadoop/etc/hadoop

Configure crontab so that kinit runs once a day (a quick manual check follows below):

$ crontab -e
0  0  *  *  *   kinit -k -t /home/jackeryoung/hadoop/etc/hadoop/hadoop.keytab hdfs/hadoop.com@HADOOP.COM
0  0  *  *  *   kinit -k -t /home/jackeryoung/hadoop/etc/hadoop/hadoop.keytab yarn/hadoop.com@HADOOP.COM
0  0  *  *  *   kinit -k -t /home/jackeryoung/hadoop/etc/hadoop/hadoop.keytab HTTP/hadoop.com@HADOOP.COM
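
It is worth running one of these kinit commands by hand first and confirming that a ticket is issued (the principal below is the hdfs one created earlier):

$ kinit -k -t /home/jackeryoung/hadoop/etc/hadoop/hadoop.keytab hdfs/hadoop.com@HADOOP.COM
$ klist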

Edit the Hadoop configuration files
The configuration files live under /home/jackeryoung/hadoop/etc/hadoop.

slaves

master
slave1
slave2

core-site.xml


   
<configuration>
   <property>
      <name>fs.defaultFS</name>
      <value>hdfs://master:9000</value>
   </property>
   <property>
      <name>hadoop.security.authentication</name>
      <value>kerberos</value>
   </property>
   <property>
      <name>hadoop.security.authorization</name>
      <value>true</value>
   </property>
   <property>
      <name>fs.permissions.umask-mode</name>
      <value>027</value>
   </property>
</configuration>
   

mapred-site.xml


   
<configuration>
   <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
   </property>
</configuration>
   

yarn-site.xml




   
<configuration>
   <property>
      <name>yarn.resourcemanager.hostname</name>
      <value>master</value>
   </property>
   <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
   </property>
   <property>
      <name>yarn.resourcemanager.webapp.address</name>
      <value>192.168.0.121:8888</value>
   </property>
   <property>
      <name>yarn.resourcemanager.keytab</name>
      <value>/home/jackeryoung/hadoop/etc/hadoop/hadoop.keytab</value>
   </property>
   <property>
      <name>yarn.resourcemanager.principal</name>
      <value>yarn/hadoop.com@HADOOP.COM</value>
   </property>
   <property>
      <name>yarn.nodemanager.keytab</name>
      <value>/home/jackeryoung/hadoop/etc/hadoop/hadoop.keytab</value>
   </property>
   <property>
      <name>yarn.nodemanager.principal</name>
      <value>yarn/hadoop.com@HADOOP.COM</value>
   </property>
   <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>16384</value>
   </property>
   <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>1024</value>
   </property>
   <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>16384</value>
   </property>
   <property>
      <name>yarn.nodemanager.vmem-check-enabled</name>
      <value>false</value>
   </property>
</configuration>
   

hdfs-site.xml


   
<configuration>
   <property>
      <name>dfs.namenode.secondary.http-address</name>
      <value>192.168.0.121:50090</value>
   </property>
   <property>
      <name>dfs.replication</name>
      <value>2</value>
   </property>
   <property>
      <name>dfs.http.address</name>
      <value>master:50070</value>
   </property>
   <property>
      <name>dfs.namenode.name.dir</name>
      <value>/home/jackeryoung/hadoop/dfs/name</value>
   </property>
   <property>
      <name>dfs.datanode.data.dir</name>
      <value>/home/jackeryoung/hadoop/dfs/data</value>
   </property>
   <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>4096</value>
   </property>
   <property>
      <name>dfs.block.access.token.enable</name>
      <value>true</value>
   </property>
   <property>
      <name>dfs.namenode.keytab.file</name>
      <value>/home/jackeryoung/hadoop/etc/hadoop/hadoop.keytab</value>
   </property>
   <property>
      <name>dfs.namenode.kerberos.principal</name>
      <value>hdfs/hadoop.com@HADOOP.COM</value>
   </property>
   <property>
      <name>dfs.namenode.kerberos.https.principal</name>
      <value>HTTP/hadoop.com@HADOOP.COM</value>
   </property>
   <property>
      <name>dfs.datanode.address</name>
      <value>0.0.0.0:1034</value>
   </property>
   <property>
      <name>dfs.datanode.http.address</name>
      <value>0.0.0.0:1036</value>
   </property>
   <property>
      <name>dfs.datanode.keytab.file</name>
      <value>/home/jackeryoung/hadoop/etc/hadoop/hadoop.keytab</value>
   </property>
   <property>
      <name>dfs.datanode.kerberos.principal</name>
      <value>hdfs/hadoop.com@HADOOP.COM</value>
   </property>
   <property>
      <name>dfs.datanode.kerberos.https.principal</name>
      <value>HTTP/hadoop.com@HADOOP.COM</value>
   </property>
   <property>
      <name>dfs.http.policy</name>
      <value>HTTPS_ONLY</value>
   </property>
   <property>
      <name>dfs.data.transfer.protection</name>
      <value>integrity</value>
   </property>
   <property>
      <name>dfs.journalnode.keytab.file</name>
      <value>/home/jackeryoung/hadoop/etc/hadoop/hadoop.keytab</value>
   </property>
   <property>
      <name>dfs.journalnode.kerberos.principal</name>
      <value>hdfs/hadoop.com@HADOOP.COM</value>
   </property>
   <property>
      <name>dfs.journalnode.kerberos.internal.spnego.principal</name>
      <value>HTTP/hadoop.com@HADOOP.COM</value>
   </property>
   <property>
      <name>dfs.webhdfs.enabled</name>
      <value>true</value>
   </property>
   <property>
      <name>dfs.web.authentication.kerberos.principal</name>
      <value>HTTP/hadoop.com@HADOOP.COM</value>
   </property>
   <property>
      <name>dfs.web.authentication.kerberos.keytab</name>
      <value>/home/jackeryoung/hadoop/etc/hadoop/hadoop.keytab</value>
   </property>
   <property>
      <name>dfs.datanode.data.dir.perm</name>
      <value>700</value>
   </property>
   <property>
      <name>dfs.nfs.kerberos.principal</name>
      <value>hdfs/hadoop.com@HADOOP.COM</value>
   </property>
   <property>
      <name>dfs.nfs.keytab.file</name>
      <value>/home/jackeryoung/hadoop/etc/hadoop/hadoop.keytab</value>
   </property>
   <property>
      <name>dfs.secondary.https.address</name>
      <value>192.168.0.121:50495</value>
   </property>
   <property>
      <name>dfs.secondary.https.port</name>
      <value>50495</value>
   </property>
   <property>
      <name>dfs.secondary.namenode.keytab.file</name>
      <value>/home/jackeryoung/hadoop/etc/hadoop/hadoop.keytab</value>
   </property>
   <property>
      <name>dfs.secondary.namenode.kerberos.principal</name>
      <value>hdfs/hadoop.com@HADOOP.COM</value>
   </property>
   <property>
      <name>dfs.secondary.namenode.kerberos.https.principal</name>
      <value>HTTP/hadoop.com@HADOOP.COM</value>
   </property>
</configuration>
   

ssl-client.xml



<configuration>

<property>
  <name>ssl.client.truststore.location</name>
  <value>/home/truststore</value>
  <description>Truststore to be used by clients like distcp. Must be
  specified.
  </description>
</property>

<property>
  <name>ssl.client.truststore.password</name>
  <value>jackeryoung</value>
  <description>Optional. Default value is "".
  </description>
</property>

<property>
  <name>ssl.client.truststore.type</name>
  <value>jks</value>
  <description>Optional. The keystore file format, default value is "jks".
  </description>
</property>

<property>
  <name>ssl.client.truststore.reload.interval</name>
  <value>10000</value>
  <description>Truststore reload check interval, in milliseconds.
  Default value is 10000 (10 seconds).
  </description>
</property>

<property>
  <name>ssl.client.keystore.location</name>
  <value>/home/keystore</value>
  <description>Keystore to be used by clients like distcp. Must be
  specified.
  </description>
</property>

<property>
  <name>ssl.client.keystore.password</name>
  <value>jackeryoung</value>
  <description>Optional. Default value is "".
  </description>
</property>

<property>
  <name>ssl.client.keystore.keypassword</name>
  <value>jackeryoung</value>
  <description>Optional. Default value is "".
  </description>
</property>

<property>
  <name>ssl.client.keystore.type</name>
  <value>jks</value>
  <description>Optional. The keystore file format, default value is "jks".
  </description>
</property>

</configuration>
  


ssl-server.xml



<configuration>

<property>
  <name>ssl.server.truststore.location</name>
  <value>/home/truststore</value>
  <description>Truststore to be used by NN and DN. Must be specified.
  </description>
</property>

<property>
  <name>ssl.server.truststore.password</name>
  <value>jackeryoung</value>
  <description>Optional. Default value is "".
  </description>
</property>

<property>
  <name>ssl.server.truststore.type</name>
  <value>jks</value>
  <description>Optional. The keystore file format, default value is "jks".
  </description>
</property>

<property>
  <name>ssl.server.truststore.reload.interval</name>
  <value>10000</value>
  <description>Truststore reload check interval, in milliseconds.
  Default value is 10000 (10 seconds).
  </description>
</property>

<property>
  <name>ssl.server.keystore.location</name>
  <value>/home/keystore</value>
  <description>Keystore to be used by NN and DN. Must be specified.
  </description>
</property>

<property>
  <name>ssl.server.keystore.password</name>
  <value>jackeryoung</value>
  <description>Must be specified.
  </description>
</property>

<property>
  <name>ssl.server.keystore.keypassword</name>
  <value>jackeryoung</value>
  <description>Must be specified.
  </description>
</property>

<property>
  <name>ssl.server.keystore.type</name>
  <value>jks</value>
  <description>Optional. The keystore file format, default value is "jks".
  </description>
</property>

<property>
  <name>ssl.server.exclude.cipher.list</name>
  <value>TLS_ECDHE_RSA_WITH_RC4_128_SHA,SSL_DHE_RSA_EXPORT_WITH_DES40_CBC_SHA,
  SSL_RSA_WITH_DES_CBC_SHA,SSL_DHE_RSA_WITH_DES_CBC_SHA,
  SSL_RSA_EXPORT_WITH_RC4_40_MD5,SSL_RSA_EXPORT_WITH_DES40_CBC_SHA,
  SSL_RSA_WITH_RC4_128_MD5</value>
  <description>Optional. The weak security cipher suites that you want excluded
  from SSL communication.</description>
</property>

</configuration>


Distribute the Hadoop configuration files to the other machines:

$ scp /home/jackeryoung/hadoop/etc/hadoop/* root@slave1:/home/jackeryoung/hadoop/etc/hadoop
$ scp /home/jackeryoung/hadoop/etc/hadoop/* root@slave2:/home/jackeryoung/hadoop/etc/hadoop

Format the NameNode

$ hdfs namenode -format
19/06/30 22:23:45 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   user = root
STARTUP_MSG:   host = master/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.8.5
...
19/06/30 22:23:46 INFO util.ExitUtil: Exiting with status 0
19/06/30 22:23:46 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/127.0.0.1

Start the Hadoop (HDFS) cluster

$ start-dfs.sh
$ jps
19282 DataNode
28324 Jps
19480 SecondaryNameNode
18943 NameNode
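
With a valid Kerberos ticket (for example from the kinit shown in the crontab step), a simple HDFS operation confirms that authenticated RPC works:

$ hdfs dfs -mkdir -p /tmp/smoketest
$ hdfs dfs -ls /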

Start the YARN cluster

$ start-yarn.sh
$ jps
21088 NodeManager
19282 DataNode
28324 Jps
19480 SecondaryNameNode
18943 NameNode
20959 ResourceManager

On the other cluster machines:

30199 NodeManager
30056 DataNode
28254 Jps
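
To confirm that all three NodeManagers registered with the ResourceManager, query the node list from master (again with a valid Kerberos ticket):

$ yarn node -list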

Configure Flink on YARN

Download Flink

$ wget https://archive.apache.org/dist/flink/flink-1.8.0/flink-1.8.0-bin-scala_2.11.tgz
$ tar -xvf flink-1.8.0-bin-scala_2.11.tgz
$ mv flink-1.8.0 flink
$ cd flink/lib
$ wget https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.8.3-7.0/flink-shaded-hadoop-2-uber-2.8.3-7.0.jar

Edit the Flink configuration files
The Flink configuration files are under /home/jackeryoung/flink/conf.
Edit masters:

master

Edit slaves:

slave1
slave2

Edit flink-conf.yaml:

jobmanager.rpc.address: master
jobmanager.heap.size: 2048m
taskmanager.heap.size: 8192m
taskmanager.numberOfTaskSlots: 8
parallelism.default: 24
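
Since the cluster is kerberized, Flink on YARN needs Kerberos credentials: it can reuse the ticket cache created by kinit, or be pointed at a keytab explicitly. A sketch of the relevant flink-conf.yaml keys, using the keytab and hdfs principal assumed in this setup (adjust the principal to your own):

security.kerberos.login.use-ticket-cache: true
security.kerberos.login.keytab: /home/jackeryoung/hadoop/etc/hadoop/hadoop.keytab
security.kerberos.login.principal: hdfs/hadoop.com@HADOOP.COM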

Copy the flink directory from master to the same location on the other machines:

scp -r /home/jackeryoung/flink root@slave1:/home/jackeryoung
scp -r /home/jackeryoung/flink root@slave2:/home/jackeryoung

On master, start a Flink YARN session in the background:

nohup ./yarn-session.sh -n 3 -jm 1024m -tm 8192m -s 8 &

Once it is up, the JobManager web UI can be opened; its address is printed in the log. A running application also shows up in the YARN web UI:
(Screenshot: the running Flink session application shown in the YARN web UI.)
Run a job:

./flink run -c com.xiaoying.core.LogAnalysis /home/jackeryoung/LoganalySrv.jar
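
To confirm the job was accepted, the running jobs in the session can be listed; the flink client picks up the YARN session from the properties file written by yarn-session.sh:

$ ./flink list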
