Spark Kerberos Issues: A Summary

Our company recently added Kerberos + KMS to the Hadoop cluster. Since Kerberized Hadoop clusters are rarely used by companies in China, and even foreign sites turn up almost no solutions for Spark + Kerberos problems, I am recording the solutions here for others' reference.

Related scripts and commands

  • spark-yarn issue
    As of 2.1.1 there is no longer an assembly jar; it has been replaced by the smaller jars under the jars/ directory. Zip the jars under the Spark installation and upload the archive to HDFS with the hdfs command:
    cd jars/
    zip spark2.1.1-hadoop2.7.3.zip ./*

    Then reference it in spark-defaults.conf (or via --conf; a programmatic alternative is sketched below), either as an archive:
    spark.yarn.archive hdfs://fake-name/user/spark/lib/spark2.1.1-hadoop2.7.3.zip
    or as the individual jars:
    spark.yarn.jars hdfs://fake-name/user/spark/lib/*.jar
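
    The same property can also be set on the SparkConf before the context is created; a minimal sketch (class and app names are illustrative, the HDFS path is the placeholder from above):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class ArchiveConfExample {
    public static void main(String[] args) {
        // Equivalent to the spark-defaults.conf entry above; must be set
        // before the context is created so the YARN client sees it.
        SparkConf conf = new SparkConf()
                .setAppName("archive-conf-example")
                .set("spark.yarn.archive",
                     "hdfs://fake-name/user/spark/lib/spark2.1.1-hadoop2.7.3.zip");
        JavaSparkContext sc = new JavaSparkContext(conf);
        sc.stop();
    }
}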

  • kinit -k -t /home/key/spark.keytab spark/[email protected]

  • /home/hadoop/spark/bin/spark-submit --class xxx.BailiWeeklyAggApp --master yarn --deploy-mode client --keytab /home/key/spark.keytab --principal spark/[email protected] --conf spark.hadoop.fs.hdfs.impl.disable.cache=true /home/mei/spark-lib-1.0-SNAPSHOT.jar
    (--keytab and --principal are surfaced to the application as spark.yarn.keytab and spark.yarn.principal, which the initKerberos code below reads.)

  • nohup java -Dspring.profiles.active=test -Dlogging.file=/home/mei/spark-tmp.log -Dserver.port=9000 -Dspring.application.name=spark-kb -jar /home/mei/spark-service.jar > /dev/null 2>&1 &

  • streaming:
    export HADOOP_CONF_DIR=/home/hadoop/current_hadoop/etc/hadoop
    nohup /home/hadoop/spark/bin/spark-submit --class xxx.BailiWeeklyAggApp --master yarn --deploy-mode client --keytab /home/key/spark.keytab --principal spark/[email protected] --conf spark.hadoop.fs.hdfs.impl.disable.cache=true /home/mei/spark-lib-1.0-SNAPSHOT.jar > streaming.log 2>&1 &

  • Checking live NameNode status
    curl 'http://namenode1.hadoop.com:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus'
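
    If you need the same check from code (for example inside the monitoring service launched above), a minimal sketch with HttpURLConnection, reusing the JMX endpoint from the curl command (class name is illustrative):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class NameNodeStatusCheck {
    public static void main(String[] args) throws Exception {
        // Same JMX query as the curl command above
        URL url = new URL("http://namenode1.hadoop.com:50070/jmx"
                + "?qry=Hadoop:service=NameNode,name=NameNodeStatus");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line); // JSON describing the NameNode HA state
            }
        }
    }
}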

Kerberos-related problems:

  • When reading HDFS files (or anywhere that depends on HDFS under the hood), the config files from the HDFS conf directory must be added as resources. The core code snippet:
// Required imports; assumes the enclosing class provides a SparkContext
// field `sparkContext` and a `logger`.
import java.net.InetSocketAddress;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.HAUtil;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.spark.SparkConf;

public static void initKerberos(SparkConf sparkConf, String env) {
        if (sparkContext != null) {

            try {
                // This block may be optional and can be removed if the project
                // does not operate on HDFS files directly.
                Configuration configuration = new Configuration(false); // false: do not load default resources

                logger.info(Arrays.toString(sparkConf.getAll()));

                // --keytab/--principal passed to spark-submit surface as these keys
                String keytab = sparkConf.get("spark.yarn.keytab");
                String principal = sparkConf.get("spark.yarn.principal");

                // Cluster configs shipped with the app, e.g. test_conf/core-site.xml
                configuration.addResource(env.concat("_conf/core-site.xml"));
                configuration.addResource(env.concat("_conf/hdfs-site.xml"));
                configuration.addResource(env.concat("_conf/yarn-site.xml"));
                logger.info(configuration.toString());

                UserGroupInformation.setConfiguration(configuration);
                UserGroupInformation.loginUserFromKeytab(principal, keytab);
                logger.info(UserGroupInformation.getCurrentUser().toString());
                logger.info(UserGroupInformation.getLoginUser().toString());

                FileSystem fs = FileSystem.get(configuration);

                // Resolve and log the active NameNode of the HA pair
                InetSocketAddress active = HAUtil.getAddressOfActive(fs);
                logger.info("active:" + active.getHostName() + "/" + active.getAddress().getHostAddress() + ":" + active.getPort());

            } catch (Exception e) {
                logger.error(e.getMessage(), e);
            }

            // Make the same cluster configs visible to Spark's own Hadoop configuration
            sparkContext.hadoopConfiguration().addResource(env.concat("_conf/core-site.xml"));
            sparkContext.hadoopConfiguration().addResource(env.concat("_conf/hdfs-site.xml"));
            sparkContext.hadoopConfiguration().addResource(env.concat("_conf/yarn-site.xml"));
            logger.info("spark-hdfs-configuration" + sparkContext.hadoopConfiguration().toString());
        }
    }
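
    A hypothetical call site (class and app names are illustrative, not from the original project): create the context first so sparkContext is non-null, then initialize Kerberos:

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;

public class BailiWeeklyAggBootstrap {
    static SparkContext sparkContext; // the field initKerberos() checks
    // ... logger and the initKerberos(...) method from above ...

    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf().setAppName("baili-weekly-agg");
        sparkContext = new SparkContext(sparkConf);
        // "test" selects test_conf/core-site.xml etc. from the classpath
        initKerberos(sparkConf, "test");
    }
}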
  • Kerberos ticket (TGT) expiry and renewal:
    • Does the login need to be refreshed manually?
      The relevant calls are UserGroupInformation.loginUserFromKeytabAndReturnUGI() to log in from the keytab,
      and checkTGTAndReloginFromKeytab() on the returned UGI to relogin near expiry; a sketch of a periodic refresh is shown below.
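
    A minimal sketch of a manual refresh loop, assuming the keytab/principal from the commands above and that UserGroupInformation.setConfiguration() has already been called with a Kerberos-enabled config (as in initKerberos); the one-hour interval is an arbitrary choice, and when --keytab/--principal are passed to spark-submit on YARN, Spark's own renewal may make this unnecessary:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.security.UserGroupInformation;

public class KerberosRefresher {
    public static void main(String[] args) throws Exception {
        // Assumes UserGroupInformation.setConfiguration(...) was already called
        // with a Kerberos-enabled Configuration (see initKerberos above).
        // Principal and keytab are the ones from the spark-submit examples.
        final UserGroupInformation ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
                "spark/[email protected]", "/home/key/spark.keytab");

        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                // Relogins only if the TGT is close to expiry; otherwise a no-op
                ugi.checkTGTAndReloginFromKeytab();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }, 1, 1, TimeUnit.HOURS); // interval is an arbitrary choice
    }
}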
  • KMS issue:
    org.apache.hadoop.ipc.RemoteException: No crypto protocol versions provided by the client are supported. Client provided: [] NameNode supports: [CryptoProtocolVersion{description='Unknown', version=1, unknownValue=null}, CryptoProtocolVersion{description='Encryption zones', version=2, unknownValue=null}]

    • https://issues.apache.org/jira/browse/HADOOP-13132?attachmentOrder=desc (this JIRA covers the ClassCastException in LoadBalancingKMSClientProvider that appears in the trace below)

  • The corresponding failure as it appears in the YARN client log:
    18/03/06 15:39:29 INFO yarn.Client:
    client token: N/A
    diagnostics: Application application_1504683065445_0174 failed 2 times due to AM Container for appattempt_1504683065445_0174_000002 exited with exitCode: -1000
    For more detailed output, check application tracking page:http://test71.hadoop.com:8088/cluster/app/application_1504683065445_0174
    Then, click on links to logs of each attempt.
    Diagnostics: org.apache.hadoop.security.authentication.client.AuthenticationException cannot be cast to java.security.GeneralSecurityException
    java.lang.ClassCastException: org.apache.hadoop.security.authentication.client.AuthenticationException cannot be cast to java.security.GeneralSecurityException
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:189)
    at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388)
    at org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1381)
    at org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:1451)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:305)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:299)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:299)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:364)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:267)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
