Spark Fully Distributed Environment Setup

  1. Configure the environment variables in /etc/profile:
	#SPARK_HOME
	export SPARK_HOME=/home/bduser/opt/module/spark-2.1.3
	export PATH=$PATH:$SPARK_HOME/bin
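After saving /etc/profile, apply it in the current shell and check that Spark is on the PATH; a quick sanity check, assuming the installation path above:

	source /etc/profile
	echo $SPARK_HOME          # should print /home/bduser/opt/module/spark-2.1.3
	spark-submit --version    # should report Spark 2.1.3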
  2. For the Spark on YARN setup, modify the spark-env.sh file:
	export SPARK_MASTER_IP=my121
	export JAVA_HOME=/opt/module/jdk1.8.0_172
	export HADOOP_HOME=/opt/module/hadoop-2.7.6
	export SCALA_HOME=/opt/module/scala-2.11.8
	export HADOOP_CONF_DIR=/opt/module/hadoop-2.7.6/etc/hadoop
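spark-env.sh has to be identical on every node. A minimal sync sketch, assuming passwordless SSH as user bduser and the same Spark directory layout on my122 and my123:

	# assumes passwordless SSH is already configured between the nodes
	scp $SPARK_HOME/conf/spark-env.sh bduser@my122:$SPARK_HOME/conf/
	scp $SPARK_HOME/conf/spark-env.sh bduser@my123:$SPARK_HOME/conf/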
  3. Edit the slaves configuration file and add the address (IP or hostname) of every worker node:
	my121
	my122
	my123
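With the configuration in place, the standalone daemons can be started from the master node and then checked with jps; a sketch, assuming HDFS and YARN have already been started:

	$SPARK_HOME/sbin/start-all.sh   # starts a Master here and a Worker on every host listed in slaves
	jps                             # each node should show a Worker process; the master node also shows Master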
  4. If all node processes start successfully, you can test the on-YARN cluster by running the SparkPi example program. The command below can be executed on any node:
./bin/spark-submit --master yarn --deploy-mode client --class org.apache.spark.examples.SparkPi examples/jars/spark-examples_2.11-2.1.3.jar

If the output contains Pi is roughly 3.144155720778604, the run succeeded (the result is an estimate, so the value differs slightly on every execution).
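In client mode the driver log is verbose, so the result line is easy to miss; one way to filter it (just a convenience, not required):

	./bin/spark-submit --master yarn --deploy-mode client \
	  --class org.apache.spark.examples.SparkPi \
	  examples/jars/spark-examples_2.11-2.1.3.jar 2>&1 | grep "Pi is roughly"
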
  5. Open the web UI.
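Assuming the default ports, the submitted application shows up in the YARN ResourceManager UI, and the standalone Master has its own UI:

	http://my121:8088    # YARN ResourceManager web UI (applications submitted with --master yarn)
	http://my121:8080    # Spark standalone Master web UI (workers listed in slaves)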

The following error may occur:


 hdfs://my121:9000/user/bduser/.sparkStaging/application_1543306777138_0001
18/11/27 16:22:05 ERROR spark.SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
	at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85)
	at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
	at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2323)
	at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:876)
	at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:868)
	at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
	at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:744)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Solution:
YARN kills containers (including the ApplicationMaster) that exceed its virtual-memory limit, which is a common cause of this error. Open the Hadoop configuration file yarn-site.xml and add the property below to disable the virtual-memory check:



<property>
	<name>yarn.nodemanager.vmem-check-enabled</name>
	<value>false</value>
</property>
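The change only takes effect after yarn-site.xml is synced to every node and YARN is restarted; a sketch, assuming passwordless SSH and the HADOOP_HOME path from spark-env.sh:

	scp /opt/module/hadoop-2.7.6/etc/hadoop/yarn-site.xml bduser@my122:/opt/module/hadoop-2.7.6/etc/hadoop/
	scp /opt/module/hadoop-2.7.6/etc/hadoop/yarn-site.xml bduser@my123:/opt/module/hadoop-2.7.6/etc/hadoop/
	/opt/module/hadoop-2.7.6/sbin/stop-yarn.sh    # run on the ResourceManager node
	/opt/module/hadoop-2.7.6/sbin/start-yarn.sh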

While editing the file I apparently pressed something by accident, and vi kept showing yellow highlighting (the leftover highlight from a search).
Fix: run the following in vi command mode to clear the search highlighting:

:noh
