Replacing Spark with 2.4.5 on HDP 2.6.5 and Integrating CarbonData 2.0.1

Contents

  • 1. Replacing the Spark version
    • Method 1
    • Method 2
    • Problems
      • Spark jobs fail to start
  • Integrating Spark with CarbonData
    • Download CarbonData from the official site
    • Building
      • Prerequisites
      • Build command
    • Installing and configuring CarbonData on Spark on YARN
      • Prerequisites
      • Deployment
      • Running queries with the CarbonData Thrift Server

1. Replacing the Spark version

The CarbonData release used in this project is 2.0.1, which requires Spark 2.4.5, so the Spark that ships with HDP has to be replaced.

Method 1

1) Go to /usr/hdp/2.6.5.0-292/spark2/ and back up the jars/ directory as jars_bak/.

2) Download the spark-2.4.5-bin-hadoop2.7 tarball from the official site and copy all of its dependency jars into a new jars/ directory:

cd /usr/hdp/2.6.5.0-292/spark2/
mv jars/ jars_bak/
cd /opt/spark-2.4.5-bin-hadoop2.7/
cp -r jars/ /usr/hdp/2.6.5.0-292/spark2/
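A quick sanity check after the copy is to confirm that the spark-core jar in the HDP directory now carries the 2.4.5 version. The helper below is our own sketch, not part of HDP or Spark:

```shell
# check_spark_version JARS_DIR VERSION
# Succeeds (exit 0) iff a spark-core jar of the given version exists
# in JARS_DIR; hypothetical helper for a post-swap sanity check.
check_spark_version() {
  local jars_dir=$1 version=$2
  ls "$jars_dir"/spark-core_*-"$version".jar >/dev/null 2>&1
}

# Expected to succeed after the jar swap:
# check_spark_version /usr/hdp/2.6.5.0-292/spark2/jars 2.4.5
```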

Method 2

  1. Install rpmrebuild and rpm-build.
    rpmrebuild download link: https://sourceforge.net/projects/rpmrebuild/files/rpmrebuild/2.12-1/

rpm-build can be installed directly with yum: yum install rpm-build

  2. Create the build directories and extract the rpmrebuild tarball into /data/rpmbuild:
mkdir -p /data/rpmbuild/BUILDROOT
mkdir -p /data/rpmbuild/SPECS
cd /data/rpmbuild
echo "%_topdir /data/rpmbuild" >> ~/.rpmmacros
tar -zxvf rpmrebuild-2.14.tar.gz

  3. Extract the SPEC file from the installed package.
    Find the name of the installed RPM:
rpm -qa | grep spark2
./rpmrebuild.sh -s SPECS/spark2.spec spark2_2_6_5_0_292-2.3.0.2.6.5.0-292.noarch
  4. Replace or modify files from the RPM.
    Extract the original RPM package:
rpm2cpio spark2_2_6_5_0_292-2.3.0.2.6.5.0-292.noarch.rpm | cpio -idv

Then replace or modify the extracted files as needed.

  5. Edit spark2.spec.
    vim spark2.spec
    Only the jars entries need to be changed.
    After editing, they read:
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/activation-1.1.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/aircompressor-0.10.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/antlr-2.7.7.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/antlr4-runtime-4.7.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/antlr-runtime-3.4.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/aopalliance-1.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/aopalliance-repackaged-2.4.0-b34.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/apacheds-i18n-2.0.0-M15.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/apacheds-kerberos-codec-2.0.0-M15.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/apache-log4j-extras-1.2.17.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/api-asn1-api-1.0.0-M20.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/api-util-1.0.0-M20.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/arpack_combined_all-0.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/arrow-format-0.10.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/arrow-memory-0.10.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/arrow-vector-0.10.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/automaton-1.11-8.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/avro-1.8.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/avro-ipc-1.8.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/avro-mapred-1.8.2-hadoop2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/bonecp-0.8.0.RELEASE.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/breeze_2.11-0.13.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/breeze-macros_2.11-0.13.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/calcite-avatica-1.2.0-incubating.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/calcite-core-1.2.0-incubating.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/calcite-linq4j-1.2.0-incubating.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/chill_2.11-0.9.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/chill-java-0.9.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-beanutils-1.9.4.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-cli-1.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-codec-1.10.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-collections-3.2.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-compiler-3.0.9.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-compress-1.8.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-configuration-1.6.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-crypto-1.0.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-dbcp-1.4.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-digester-1.8.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-httpclient-3.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-io-2.4.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-lang-2.6.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-lang3-3.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-logging-1.1.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-math3-3.4.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-net-3.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/commons-pool-1.5.4.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/compress-lzf-1.0.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/core-1.1.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/curator-client-2.7.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/curator-framework-2.7.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/curator-recipes-2.7.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/datanucleus-api-jdo-3.2.6.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/datanucleus-core-3.2.10.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/datanucleus-rdbms-3.2.9.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/derby-10.12.1.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/eigenbase-properties-1.1.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/flatbuffers-1.2.0-3f79e055.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/generex-1.0.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/gson-2.2.4.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/guava-14.0.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/guice-3.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/guice-servlet-3.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hadoop-annotations-2.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hadoop-auth-2.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hadoop-client-2.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hadoop-common-2.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hadoop-hdfs-2.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hadoop-mapreduce-client-app-2.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hadoop-mapreduce-client-common-2.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hadoop-mapreduce-client-core-2.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hadoop-mapreduce-client-jobclient-2.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hadoop-mapreduce-client-shuffle-2.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hadoop-yarn-api-2.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hadoop-yarn-client-2.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hadoop-yarn-common-2.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hadoop-yarn-server-common-2.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hadoop-yarn-server-web-proxy-2.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hive-beeline-1.2.1.spark2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hive-cli-1.2.1.spark2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hive-exec-1.2.1.spark2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hive-jdbc-1.2.1.spark2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hive-metastore-1.2.1.spark2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hk2-api-2.4.0-b34.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hk2-locator-2.4.0-b34.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hk2-utils-2.4.0-b34.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/hppc-0.7.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/htrace-core-3.1.0-incubating.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/httpclient-4.5.6.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/httpcore-4.4.10.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/ivy-2.4.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jackson-annotations-2.6.7.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jackson-core-2.6.7.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jackson-core-asl-1.9.13.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jackson-databind-2.6.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jackson-dataformat-yaml-2.6.7.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jackson-jaxrs-1.9.13.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jackson-mapper-asl-1.9.13.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jackson-module-jaxb-annotations-2.6.7.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jackson-module-paranamer-2.7.9.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jackson-module-scala_2.11-2.6.7.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jackson-xc-1.9.13.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/janino-3.0.9.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/JavaEWAH-0.3.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/javassist-3.18.1-GA.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/javax.annotation-api-1.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/javax.inject-1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/javax.inject-2.4.0-b34.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/javax.servlet-api-3.1.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/javax.ws.rs-api-2.0.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/javax.ws.rs-api-2.1.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/javolution-5.5.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jaxb-api-2.2.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jcl-over-slf4j-1.7.16.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jdo-api-3.0.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jersey-client-1.19.4.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jersey-client-2.22.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jersey-common-2.22.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jersey-container-servlet-2.22.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jersey-container-servlet-core-2.22.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jersey-core-1.19.4.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jersey-guava-2.22.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jersey-media-jaxb-2.22.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jersey-server-2.22.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jetty-6.1.26.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jetty-util-6.1.26.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jline-2.14.6.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/joda-time-2.9.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jodd-core-3.5.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jpam-1.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/json4s-ast_2.11-3.5.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/json4s-core_2.11-3.5.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/json4s-jackson_2.11-3.5.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/json4s-scalap_2.11-3.5.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jsp-api-2.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jsr305-1.3.9.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jta-1.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jtransforms-2.4.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/jul-to-slf4j-1.7.16.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/kryo-shaded-4.0.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/kubernetes-client-4.6.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/kubernetes-model-4.6.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/kubernetes-model-common-4.6.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/leveldbjni-all-1.8.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/libfb303-0.9.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/libthrift-0.9.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/log4j-1.2.17.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/logging-interceptor-3.12.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/lz4-java-1.4.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/machinist_2.11-0.6.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/macro-compat_2.11-1.1.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/mesos-1.4.0-shaded-protobuf.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/metrics-core-3.1.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/metrics-graphite-3.1.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/metrics-json-3.1.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/metrics-jvm-3.1.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/minlog-1.3.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/netty-3.9.9.Final.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/netty-all-4.1.42.Final.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/objenesis-2.5.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/okhttp-3.12.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/okio-1.15.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/opencsv-2.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/orc-core-1.5.5-nohive.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/orc-mapreduce-1.5.5-nohive.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/orc-shims-1.5.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/oro-2.0.8.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/osgi-resource-locator-1.0.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/paranamer-2.8.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/parquet-column-1.10.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/parquet-common-1.10.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/parquet-encoding-1.10.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/parquet-format-2.4.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/parquet-hadoop-1.10.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/parquet-hadoop-bundle-1.6.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/parquet-jackson-1.10.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/protobuf-java-2.5.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/py4j-0.10.7.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/pyrolite-4.13.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/RoaringBitmap-0.7.45.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/scala-compiler-2.11.12.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/scala-library-2.11.12.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/scala-parser-combinators_2.11-1.1.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/scala-reflect-2.11.12.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/scala-xml_2.11-1.0.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/shapeless_2.11-2.3.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/shims-0.7.45.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/slf4j-api-1.7.16.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/slf4j-log4j12-1.7.16.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/snakeyaml-1.15.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/snappy-0.2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/snappy-java-1.1.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-catalyst_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-core_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-graphx_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-hive_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-hive-thriftserver_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-kubernetes_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-kvstore_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-launcher_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-mesos_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-mllib_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-mllib-local_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-network-common_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-network-shuffle_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-repl_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-sketch_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-sql_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-streaming_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-tags_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-tags_2.11-2.4.5-tests.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-unsafe_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spark-yarn_2.11-2.4.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spire_2.11-0.13.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/spire-macros_2.11-0.13.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/ST4-4.0.4.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/stax-api-1.0.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/stax-api-1.0-2.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/stream-2.7.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/stringtemplate-3.2.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/super-csv-2.2.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/univocity-parsers-2.7.3.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/validation-api-1.1.0.Final.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/xbean-asm6-shaded-4.8.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/xercesImpl-2.9.1.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/xmlenc-0.52.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/xz-1.5.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/zjsonpatch-0.3.0.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/zookeeper-3.4.6.jar"
%attr(0644, root, root) "/usr/hdp/2.6.5.0-292/spark2/jars/zstd-jni-1.3.2-2.jar"

This completes the upgrade.
The CarbonData jars can also be added at this point, of course.

  6. Build the RPM:
rpmbuild -bb SPECS/spark2.spec

The generated RPM is placed under /data/rpmbuild/RPMS/.

Problems

Spark jobs fail to start

The error reported on YARN:

Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
/hadoop/yarn/local/usercache/root/appcache/application_1592908057656_0006/container_e11_1592908057656_0006_02_000001/launch_container.sh: line 29: /usr/hdp/2.6.5.0-292/spark2/carbonlib/*:$PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:/etc/hadoop/conf:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure:$PWD/__spark_conf__/__hadoop_conf__: bad substitution
Failing this attempt. Failing the application.
  • Cause:

The ${hdp.version} placeholder is never resolved, so launch_container.sh ends up with the literal string in its classpath and the shell aborts with "bad substitution".
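The "bad substitution" text comes from bash itself: a shell variable name may not contain a dot, so when the unresolved ${hdp.version} placeholder reaches launch_container.sh, the whole classpath line is rejected. The failure is easy to reproduce locally:

```shell
# bash refuses ${hdp.version} because '.' is not legal in a variable
# name; this is the same error launch_container.sh runs into.
out=$(bash -c 'echo /usr/hdp/${hdp.version}/hadoop/lib' 2>&1 || true)
echo "$out"   # prints an error mentioning "bad substitution"
```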

  • Fixes:

Option 1:
In Ambari, edit the mapreduce.application.classpath property under MapReduce2 → Configs → Advanced mapred-site.
Before:

$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure

After:

$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/2.6.5.0-292/hadoop/lib/hadoop-lzo-0.6.0.2.6.5.0-292.jar:/etc/hadoop/conf/secure
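The edit in option 1 is nothing more than a literal replacement of the placeholder. For illustration only (the real change is made in the Ambari UI, not with sed), the transformation on the part that actually differs looks like:

```shell
# Substitute every ${hdp.version} placeholder with the concrete
# HDP version string, mirroring the manual edit above.
fragment='/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar'
echo "$fragment" | sed 's/\${hdp\.version}/2.6.5.0-292/g'
# → /usr/hdp/2.6.5.0-292/hadoop/lib/hadoop-lzo-0.6.0.2.6.5.0-292.jar
```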

Option 2:
Add the following property to MapReduce2 → Configs → Custom mapred-site:

hdp.version=2.6.5.0-292
# the value is the HDP version

Integrating Spark with CarbonData

Download CarbonData from the official site

https://github.com/apache/carbondata
Download apache-carbondata-2.0.1-bin-spark2.4.5-hadoop2.7.2.jar directly, or build it yourself.

Building

Prerequisites

  • Unix-like environment (Linux, Mac OS X)
  • Git
  • Apache Maven (version 3.3 or later recommended)
  • Oracle Java 8

Build command

CarbonData can be built against any of its supported Spark versions; Spark 2.4.5 is used here:

mvn -DskipTests -Pspark-2.4 -Dspark.version=2.4.5 clean package
  • Notes:
    If you are working in a Windows environment, remember to add -Pwindows when building the project.
    The MV (materialized view) feature is not compiled by default; to use it, add -Pmv when building the project.

Installing and configuring CarbonData on Spark on YARN

This section describes how to install CarbonData on a "Spark on YARN" cluster.

Prerequisites

  • Hadoop HDFS and YARN are installed and running.
  • Spark is installed and running on all the client machines.
  • The CarbonData user has permission to access HDFS.

Deployment

The following steps apply only to the driver node. (The driver node is the one on which the Spark context is started.)

  1. Build the CarbonData project, take the assembly jar from ./assembly/target/scala-2.1x/apache-carbondata_xxx.jar, and copy it into the $SPARK_HOME/carbonlib folder.

Note: create the carbonlib folder if it does not already exist under $SPARK_HOME.

  2. Copy the ./conf/carbon.properties.template file from the CarbonData repository into the $SPARK_HOME/conf/ folder and rename it to carbon.properties.

  3. Create a tar.gz archive of the carbonlib folder and move the archive into the carbonlib folder itself:

cd $SPARK_HOME
tar -zcvf carbondata.tar.gz carbonlib/
mv carbondata.tar.gz carbonlib/
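The shape of the resulting archive matters: spark.yarn.dist.archives later extracts it in each executor's working directory, and the classpath entry carbondata.tar.gz/carbonlib/* assumes a top-level carbonlib/ entry inside it. The step can be rehearsed in a scratch directory (the directory and jar names here are dummies; the real run operates on $SPARK_HOME/carbonlib):

```shell
# Rehearse the packaging step and list the archive contents:
# every entry must sit under carbonlib/.
work=$(mktemp -d) && cd "$work"
mkdir carbonlib && touch carbonlib/apache-carbondata-2.0.1.jar   # dummy jar
tar -zcf carbondata.tar.gz carbonlib/
mv carbondata.tar.gz carbonlib/
tar -tzf carbonlib/carbondata.tar.gz
# → carbonlib/
# → carbonlib/apache-carbondata-2.0.1.jar
```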
  4. Configure the properties in the table below in the $SPARK_HOME/conf/spark-defaults.conf file.

| Property | Description | Value |
|---|---|---|
| spark.master | Set this value to run Spark on YARN. | Set yarn-client to run Spark in YARN client mode. |
| spark.yarn.dist.files | Comma-separated list of files to be placed in the working directory of each executor. | $SPARK_HOME/conf/carbon.properties |
| spark.yarn.dist.archives | Comma-separated list of archives to be extracted into the working directory of each executor. | $SPARK_HOME/carbonlib/carbondata.tar.gz |
| spark.executor.extraJavaOptions | A string of extra JVM options to pass to executors. Note: multiple values can be given, separated by spaces. | -Dcarbon.properties.filepath=carbon.properties |
| spark.executor.extraClassPath | Extra classpath entries to append to the classpath of executors. Note: if SPARK_CLASSPATH is defined in spark-env.sh, comment it out and append its value to spark.executor.extraClassPath. | carbondata.tar.gz/carbonlib/* |
| spark.driver.extraClassPath | Extra classpath entries to prepend to the classpath of the driver. Note: if SPARK_CLASSPATH is defined in spark-env.sh, comment it out and append its value to spark.driver.extraClassPath. | $SPARK_HOME/carbonlib/* |
| spark.driver.extraJavaOptions | A string of extra JVM options to pass to the driver, for example GC settings or other logging. | -Dcarbon.properties.filepath=$SPARK_HOME/conf/carbon.properties |

The values used on this cluster:

spark.driver.extraClassPath=/usr/hdp/2.6.5.0-292/spark2/carbonlib/*
spark.driver.extraJavaOptions=-Dcarbon.properties.filepath=/usr/hdp/2.6.5.0-292/spark2/conf/carbon.properties
spark.executor.extraClassPath=carbondata.tar.gz/carbonlib/*
spark.executor.extraJavaOptions=-Dcarbon.properties.filepath=carbon.properties
spark.yarn.dist.archives=/usr/hdp/2.6.5.0-292/spark2/carbonlib/carbondata.tar.gz
spark.yarn.dist.files=/usr/hdp/2.6.5.0-292/spark2/conf/carbon.properties

  5. Verify the installation:
    ./bin/spark-shell \
      --master yarn-client \
      --driver-memory 1G \
      --executor-memory 2G \
      --executor-cores 2

Notes:

  1. The property "carbon.storelocation" is deprecated in CarbonData 2.0. Only users who already used this property in earlier versions can keep using it in CarbonData 2.0.
  2. Make sure the user that launches the driver and executors has permission on the CarbonData jars and files.
  3. If you use Spark with Hive 1.1.x, you need to add the carbondata assembly jar and the carbondata-hive jar to the "spark.sql.hive.metastore.jars" parameter in the spark-defaults.conf file.

Running queries with the CarbonData Thrift Server

  1. Submit the Spark job:
spark-submit \
	--master yarn \
	--deploy-mode client  \
	--class org.apache.carbondata.spark.thriftserver.CarbonThriftServer  \
	--num-executors 20 \
	--driver-memory 64G \
	--executor-memory 16G \
	--executor-cores 4 \
	--queue s1 \
	--conf spark.yarn.executor.memoryOverhead=4G \
	$SPARK_HOME/carbonlib/apache-carbondata-2.0.1-bin-spark2.4.5-hadoop2.7.2.jar \
	hdfs://host1:8020/carbon/spark2.4.5/carbon.store 
  2. Connect to the CarbonData Thrift Server with Beeline:
cd $SPARK_HOME
./sbin/start-thriftserver.sh
./bin/beeline -u  jdbc:hive2://host1:10000  -n root
  3. Create a table to test:
create table test(id int) stored as carbondata;
# In the NameNode UI, look under the HDFS store path passed to spark-submit:
# if hdfs://host1:8020/carbon/spark2.4.5/carbon.store now contains a test directory,
# the table was created successfully.
