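The CDH 5.14 cluster installed earlier ships with Spark 1.6.0 by default. Spark 2.x can be installed on the same cluster and run alongside Spark 1.6.
At the time of writing, the newest Spark 2.x release supported by CDH is Spark 2.3.0, while the latest Apache Spark release is 2.3.1. CDH lags slightly behind upstream, but in practice this rarely matters.
The steps for installing Spark 2.3 on CDH follow.
Cloudera's official installation and upgrade instructions: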
https://www.cloudera.com/documentation/spark2/latest/topics/spark2_installing.html
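Download the CSD package from:
http://archive.cloudera.com/spark2/csd/
SPARK2_ON_YARN-2.3.0.cloudera3.jar
Download the parcel files from:
http://archive.cloudera.com/spark2/parcels/2.3.0.cloudera3/
SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809-el6.parcel
SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809-el6.parcel.sha1
manifest.json
Note: choose the files that match your operating system and CDH version.
My system is CentOS 6.7, so I chose the el6 parcel, and I used the cloudera3 release for both the CSD and the parcel.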
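Before copying the files to the cluster, it can be worth checking the parcel against its published checksum. The sketch below assumes that the .sha1 file holds the hex digest as its first field and that sha1sum is available. verify_parcel is a hypothetical helper name. It also writes a .sha copy, because the parcel repository listing later in this post uses that extension.

```shell
# Sketch: verify a downloaded parcel against its .sha1 file, then create
# a .sha copy next to it for the parcel repository.
verify_parcel() {
    parcel="$1"
    # Assumption: the digest is the first whitespace-separated field.
    expected=$(awk '{print $1; exit}' "${parcel}.sha1")
    actual=$(sha1sum "$parcel" | awk '{print $1}')
    if [ "$expected" != "$actual" ]; then
        echo "checksum mismatch for $parcel" >&2
        return 1
    fi
    # Keep the original .sha1 file and write a .sha copy beside it.
    cp "${parcel}.sha1" "${parcel}.sha"
    echo "checksum ok: $parcel"
}
```

For example: verify_parcel SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809-el6.parcel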
● CDH Versions
● Cloudera Manager Versions
● JDK1.8+
● Scala 2.11, Python 2.7 or higher, Python 3.4 or higher
# chown cloudera-scm:cloudera-scm SPARK2_ON_YARN-2.3.0.cloudera3.jar
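The chown above applies to the CSD jar once it sits in Cloudera Manager's CSD directory. A minimal sketch of that staging step follows. It assumes the default CSD directory /opt/cloudera/csd and a file mode of 644. stage_csd is a hypothetical helper name, and the optional second argument exists only so the function can be tried against a scratch directory.

```shell
# Hypothetical helper: copy a CSD jar into the CSD directory and fix its
# ownership and mode. /opt/cloudera/csd is assumed to be the CSD directory.
stage_csd() {
    jar="$1"
    csd_dir="${2:-/opt/cloudera/csd}"
    mkdir -p "$csd_dir" || return 1
    cp "$jar" "$csd_dir/" || return 1
    target="$csd_dir/$(basename "$jar")"
    chmod 644 "$target" || return 1
    # Change ownership only when running as root on a host that has the user.
    if [ "$(id -u)" -eq 0 ] && id cloudera-scm >/dev/null 2>&1; then
        chown cloudera-scm:cloudera-scm "$target" || return 1
    fi
    echo "$target"
}
```

For example: stage_csd SPARK2_ON_YARN-2.3.0.cloudera3.jar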
[root@hadoop0 parcel-repo]# ls
SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809-el6.parcel
SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809-el6.parcel.sha
manifest.json
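Note that the .parcel.sha1 file has been renamed to .parcel.sha, the name Cloudera Manager expects. If a manifest.json file is already present in the directory, rename the old one before uploading the new one.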
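Renaming the old manifest can be done by hand. The sketch below shows one way to do it with a timestamped name so that earlier copies are never overwritten. backup_manifest and the .bak naming scheme are hypothetical choices, not something Cloudera Manager requires.

```shell
# Hypothetical helper: move an existing manifest.json aside before a new
# one is uploaded into the same parcel repository directory.
backup_manifest() {
    repo_dir="$1"
    if [ -f "$repo_dir/manifest.json" ]; then
        backup="$repo_dir/manifest.json.$(date +%Y%m%d%H%M%S).bak"
        mv "$repo_dir/manifest.json" "$backup" || return 1
        # Print the backup path so the caller can record it.
        echo "$backup"
    fi
}
```

The function prints nothing and succeeds when there is no manifest.json to move.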
Spark 2.x requires JDK 1.8 or later. The CDH 5.14 installation defaulted to JDK 1.7.0_67, so the JDK has to be upgraded. Otherwise the Spark2 installation later fails with:
Java version 1.8 is required for Spark 2.3.
The JDK installation and the fix for this error are described in detail below.
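The JDK requirement can be checked up front rather than discovered during the service installation. The following is a rough sketch. The function names are hypothetical, and the parsing assumes the usual first line printed by java -version, for example java version "1.7.0_67".

```shell
# Hypothetical helper: reduce a Java version string to its major number,
# handling both the old "1.8.0_181" scheme and the newer "11.0.2" scheme.
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 ;;
    esac
}

# Print the version string reported by the java binary under a JAVA_HOME.
# "java -version" writes to stderr, hence the 2>&1.
java_version_of() {
    "$1/bin/java" -version 2>&1 |
        sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed only when the version string meets the Java 8 requirement.
meets_spark2_jdk() {
    [ "$(java_major "$1")" -ge 8 ]
}
```

For example, meets_spark2_jdk "$(java_version_of /usr/java/jdk1.7.0_67-cloudera)" returns a non-zero status on a host that still uses the default JDK.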
# service cloudera-scm-agent restart
In Cloudera Manager, go to Hosts -> Parcels. The new Spark2 parcel is listed there:
2.3.0.cloudera3-1.cdh5.13.3.p0.458809
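Then click Download, Distribute, and Activate in turn.
Click Add Service and select the Spark2 service.
Select a set of dependencies.
Assign roles:
Leave the encryption options unselected for now (the default):
Continue to the next step of the installation.
The installation completes.
Once it has finished, start the Spark2 service.
The service starts normally.
Log in to one of the nodes where Spark2 is installed (hadoop[1-8]).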
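Once logged in, a quick way to confirm the client is usable is to ask it for its version. The sketch below assumes that the Spark2 parcel puts spark2-submit on the PATH of the node. require_cmd and spark2_version are hypothetical helper names.

```shell
# Hypothetical helper: succeed only if the named command is on PATH.
require_cmd() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "$1 not found on PATH" >&2
        return 1
    }
}

# Print the Spark version if the Spark 2 launcher is available.
spark2_version() {
    require_cmd spark2-submit && spark2-submit --version
}
```

Running spark2_version on a node without the parcel reports the missing command instead of failing silently.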
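The rest of this post resolves the JDK version error that appears when installing Spark2.
Problem: the following error appears while adding the Spark2 service.
Log:
Thu Aug 30 14:34:08 CST 2018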
using /usr/java/jdk1.7.0_67-cloudera as JAVA_HOME
using 5 as CDH_VERSION
using /var/run/cloudera-scm-agent/process/ccdeploy_spark2-conf_etcspark2conf.cloudera.spark2_on_yarn_3511819582822760396 as CONF_DIR
using spark2-conf as DIRECTORY_NAME
using /etc/spark2/conf.cloudera.spark2_on_yarn as DEST_PATH
using spark2-conf as ALT_NAME
using /etc/spark2/conf as ALT_LINK
using 51 as PRIORITY
using scripts/control.sh as RUNNER_PROGRAM
using client as RUNNER_ARGS
using /usr/sbin/update-alternatives as UPDATE_ALTERNATIVES
Deploying service client configs to /etc/spark2/conf.cloudera.spark2_on_yarn
invoking optional deploy script scripts/control.sh
/var/run/cloudera-scm-agent/process/ccdeploy_spark2-conf_etcspark2conf.cloudera.spark2_on_yarn_3511819582822760396/spark2-conf /var/run/cloudera-scm-agent/process/ccdeploy_spark2-conf_etcspark2conf.cloudera.spark2_on_yarn_3511819582822760396
Thu Aug 30 14:34:08 CST 2018: Running Spark2 CSD control script...
Thu Aug 30 14:34:08 CST 2018: Detected CDH_VERSION of [5]
Java version 1.8 is required for Spark 2.3.
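The line that matters in this log is the one reporting JAVA_HOME, which shows that the deployment still ran on JDK 1.7. A small sketch for pulling that path out of a saved copy of the log is shown here. java_home_from_log is a hypothetical helper name, and the pattern assumes the "using ... as JAVA_HOME" wording seen above.

```shell
# Hypothetical helper: print the JAVA_HOME path recorded in a client
# config deployment log read from standard input.
java_home_from_log() {
    sed -n 's/^using \(.*\) as JAVA_HOME$/\1/p'
}
```

For example: java_home_from_log < deploy.log, where deploy.log is a placeholder name for the saved log.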
Solution: install JDK 1.8 and point Cloudera Manager at it.
[root@hadoop1 ~]# rpm -ivh jdk-8u181-linux-x64.rpm
warning: jdk-8u181-linux-x64.rpm: Header V3 RSA/SHA256 Signature, key ID ec551f03: NOKEY
Preparing... ########################################### [100%]
1:jdk1.8 ########################################### [100%]
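If several hosts need the new JDK, the same rpm command can be repeated per host. The sketch below only prints the commands so they can be reviewed first. It assumes passwordless SSH as root, that the rpm file has already been copied to each host, and the hadoopN host naming used in this post. jdk_install_cmds is a hypothetical helper name.

```shell
# Hypothetical helper: print one install command per host for review.
# "ssh -n" keeps ssh from reading stdin, which matters if the printed
# list is later piped into a shell.
jdk_install_cmds() {
    first="$1"; last="$2"; rpm_file="$3"
    i="$first"
    while [ "$i" -le "$last" ]; do
        echo "ssh -n root@hadoop$i rpm -ivh $rpm_file"
        i=$((i + 1))
    done
}
```

For example, jdk_install_cmds 1 8 jdk-8u181-linux-x64.rpm prints eight commands, one for each of hadoop1 through hadoop8.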
[root@hadoop0 ~]# vi /etc/default/cloudera-scm-server
export JAVA_HOME=/usr/java/jdk1.8.0_181-amd64
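The same edit can be scripted instead of made in vi. This is a minimal sketch, assuming GNU sed (for the -i option) and a path without the "|" character. set_java_home is a hypothetical helper name.

```shell
# Hypothetical helper: set "export JAVA_HOME=..." in a defaults file,
# replacing an existing export line or appending one if none exists.
set_java_home() {
    file="$1"; java_home="$2"
    if grep -q '^export JAVA_HOME=' "$file"; then
        # "|" is the sed delimiter because the path contains "/".
        sed -i "s|^export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$file"
    else
        echo "export JAVA_HOME=$java_home" >> "$file"
    fi
}
```

For example: set_java_home /etc/default/cloudera-scm-server /usr/java/jdk1.8.0_181-amd64. Running it twice leaves a single export line in the file.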
In Cloudera Manager, go to Hosts and select a host.
On the Advanced page, add the new JAVA_HOME directory.
# service cloudera-scm-server restart
After this, adding the Spark2 service again succeeds.
References:
https://blog.csdn.net/u010936936/article/details/73650417
https://blog.csdn.net/chenguangchun1993/article/details/78903463