Step 1: Passwordless SSH login
For details, see the separate note on SSH login within a LAN (a minimal sketch follows the hosts entries below).
Add the host entries
vim /etc/hosts
#IP and the corresponding hostname
202.4.136.218 master
202.4.136.186 node1
202.4.136.15 node2
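A minimal sketch of the passwordless setup (run as the same user on master; the exact user name depends on your setup):
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa    #generate a key pair without a passphrase
ssh-copy-id master    #master also runs worker daemons here, so copy the key to it as well
ssh-copy-id node1
ssh-copy-id node2
ssh node1 hostname    #should print node1 without asking for a password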
Step 2: Download the required software
1.java
2.scala
3.hadoop
4.spark
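A hedged install sketch (package names, version numbers and archive names below are assumptions; use whatever you actually downloaded). Hadoop and Spark are unpacked to the paths used in Step 3:
sudo apt-get install -y openjdk-8-jdk scala    #Java 8 and Scala from the distro repos
sudo tar -xzf hadoop-2.x.y.tar.gz -C /usr/local && sudo mv /usr/local/hadoop-2.x.y /usr/local/hadoop
sudo tar -xzf spark-2.x.y-bin-hadoop2.x.tgz -C /usr/local && sudo mv /usr/local/spark-2.x.y-bin-hadoop2.x /usr/local/spark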
Step 3: Configure environment variables
Make sure the software downloaded in Step 2 sits at the paths below. PYSPARK_PYTHON is set to avoid errors caused by the driver and the executors using different Python versions.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export SPARK_HOME=/usr/local/spark
export PATH=$PATH:$SPARK_HOME/sbin:$SPARK_HOME/bin
export PATH=$PATH:$HADOOP_HOME/etc/hadoop
export PYSPARK_PYTHON=/home/sparknode/anaconda3/bin/python
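These exports go into whatever profile file you use (~/.bashrc or /etc/profile; the exact file is an assumption). Reload it and spot-check:
source ~/.bashrc
java -version
hadoop version
spark-submit --version
$PYSPARK_PYTHON --version    #should print the anaconda3 python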
Step 4: Hadoop and Spark configuration files
Hadoop: core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, slaves, hadoop-env.sh
#core-site.xml (each <property> below goes inside the file's <configuration> block)
<property><name>fs.defaultFS</name><value>hdfs://master:9000</value></property>
<property><name>hadoop.tmp.dir</name><value>/usr/local/hadoop/tmp</value></property>
#hdfs-site.xml
<property><name>dfs.namenode.secondary.http-address</name><value>master:50090</value></property>
<property><name>dfs.replication</name><value>2</value></property>
<property><name>dfs.namenode.name.dir</name><value>file:/usr/local/hadoop/tmp/dfs/name</value></property>
<property><name>dfs.datanode.data.dir</name><value>file:/usr/local/hadoop/tmp/dfs/data</value></property>
<property><name>dfs.permissions</name><value>true</value></property>
#mapred-site.xml
<property><name>mapreduce.framework.name</name><value>yarn</value></property>
#yarn-site.xml
<property><name>yarn.resourcemanager.hostname</name><value>master</value></property>
<property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
<property><name>yarn.log-aggregation-enable</name><value>true</value></property>
<property><name>yarn.log-aggregation.retain-seconds</name><value>604800</value></property>
<property><name>yarn.nodemanager.pmem-check-enabled</name><value>true</value></property>
<property><name>yarn.nodemanager.vmem-check-enabled</name><value>true</value></property>
#vim slaves
#comment out localhost
#localhost
master
node1
node2
#hadoop-env.sh
#The main point is to add JAVA_HOME here; if it is missing from this file, starting Hadoop will still complain that JAVA_HOME cannot be found, even if it is already set in the environment variables.
export HADOOP_IDENT_STRING=$USER
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_COMMON_LIB_NATIVE_DIR="/usr/local/hadoop/lib/native/"
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=/usr/local/hadoop/lib/"
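These files are edited on master, but every node needs the same copies (the same applies to the yarn-site.xml change in pit 3 below). A sketch, assuming Hadoop sits at /usr/local/hadoop on all nodes:
scp -r /usr/local/hadoop/etc/hadoop/* node1:/usr/local/hadoop/etc/hadoop/
scp -r /usr/local/hadoop/etc/hadoop/* node2:/usr/local/hadoop/etc/hadoop/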
Once the files are configured, format the namenode:
hadoop namenode -format
(If a new node is added, only the new node needs to be formatted. If the other nodes are re-formatted as well, make sure the clusterID stays consistent, otherwise the datanodes cannot be found. A blunt workaround is to delete the tmp directory under /usr/local/hadoop -- only if nothing in tmp still matters -- and it has to be deleted on every node, otherwise the nodes that kept tmp will have a different clusterID. The clusterID can also be edited by hand in /usr/local/hadoop/tmp/dfs/name/current/VERSION. I have not tried running -format for a new node while leaving tmp untouched; feel free to try.)
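To compare clusterIDs by hand, look at the VERSION files under the dfs.namenode.name.dir / dfs.datanode.data.dir paths configured above:
grep clusterID /usr/local/hadoop/tmp/dfs/name/current/VERSION    #on master
grep clusterID /usr/local/hadoop/tmp/dfs/data/current/VERSION    #on each datanode; must match the namenode's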
Start Hadoop
start-all.sh
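A quick sanity check with jps (roughly what to expect; since master is also listed in slaves, it runs the worker daemons too):
jps
#on master: NameNode, SecondaryNameNode, ResourceManager, DataNode, NodeManager
#on node1/node2: DataNode, NodeManager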
-----------------------------------------------------------------------------------------------------
Spark: spark-env.sh, spark-defaults.conf
#spark-env.sh
export SCALA_HOME=/usr/share/scala
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
SPARK_MASTER_IP=master
SPARK_LOCAL_DIRS=/usr/local/spark
#SPARK_DRIVER_MEMORY=6g
#SPARK_DRIVER_CORES=8
export LD_LIBRARY_PATH=/usr/local/hadoop/lib/native/:$LD_LIBRARY_PATH
#spark-defaults.conf
spark.yarn.jars hdfs:///usr/local/spark/spark_jars/*
spark-defaults.conf was only added after hitting an error when running Spark on YARN (see pit 4 at the end). The path is an HDFS path, so upload /usr/local/spark/jars to HDFS first.
Start Spark
Because Spark's start-all.sh clashes with Hadoop's start-all.sh, it is best to rename /usr/local/spark/sbin/start-all.sh to spark-all.sh (see the sketch below), then start the cluster with:
spark-all.sh
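The rename and start, as a sketch (both sbin directories are already on PATH from Step 3):
mv /usr/local/spark/sbin/start-all.sh /usr/local/spark/sbin/spark-all.sh
spark-all.sh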
---------------------------------------------------------------------------------------------------
Default UI ports
Hadoop NameNode   50070   master:50070
YARN              8088    master:8088
Spark cluster     8080    master:8080
Spark job         4040    master:4040
-----------------------------------------------------------------------------------------------------------
Pits I have hit (updated as they come up):
1. After startup, jps looks fine, but the Hadoop UI does not show the worker nodes; a restart may fix it.
2. After adding a node, running hadoop -format made the clusterIDs inconsistent and the node misbehaved, as discussed above.
3. The core count and memory size that the YARN UI reports do not match the real cluster.
YARN assumes 8 cores and 8 GB per machine by default; if that is not your hardware, yarn-site.xml needs to be changed.
Add the following (change the values to your machines' real sizes; memory is in MB, and the file must be edited on every machine):
<property><name>yarn.nodemanager.resource.memory-mb</name><value>4096</value></property>
<property><name>yarn.nodemanager.resource.cpu-vcores</name><value>4</value></property>
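After editing yarn-site.xml on every node, restart YARN and check the resources again:
stop-yarn.sh
start-yarn.sh
yarn node -list    #or look at master:8088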
4. spark-submit --master yarn --deploy-mode cluster **.py fails when running Spark on the YARN cluster
#error output
Exception in thread "main" org.apache.spark.SparkException: Application application_1543628881761_0001 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1165)
at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1520)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
The cause is that the jar files cannot be found.
Upload the jars directory from the Spark install dir to HDFS and add that HDFS path to spark-defaults.conf:
#spark-defaults.conf
spark.yarn.jars hdfs:///usr/local/spark/spark_jars/*
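A sketch of the upload; the HDFS target matches the spark.yarn.jars value above:
hdfs dfs -mkdir -p /usr/local/spark/spark_jars
hdfs dfs -put /usr/local/spark/jars/* /usr/local/spark/spark_jars/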
5. The YARN UI reports unhealthy nodes
Cause: the disk Hadoop is installed on is probably full. Hadoop seems to check disk space automatically every two minutes or so; once a disk is more than 90% used the logs can no longer be written and errors follow.
Fix: move the log directory to a disk with enough free space.
#edit hadoop-env.sh and add the log path
export HADOOP_LOG_DIR=/<new log path>
#also edit yarn-env.sh and add the log path
export YARN_LOG_DIR=
Note: also open up permissions on the new directory, otherwise master has no access to it:
sudo chmod -R 777 <directory path>
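A sketch of the whole fix (the new log path is a placeholder; pick your own):
df -h    #confirm which disk is over 90% used
sudo mkdir -p /data/hadoop-logs    #placeholder path
sudo chmod -R 777 /data/hadoop-logs
#point HADOOP_LOG_DIR / YARN_LOG_DIR at it as above, copy the files to all nodes, then restart hadoop and yarn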