SparkCore (5): Configuring and Testing Spark on Standalone

1. Goal

Run Spark applications on Standalone, the cluster resource manager that ships with Spark. Like YARN, it is a distributed resource management framework.

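2. Standalone architecture

        Worker: the per-node service; it manages the resources of its own node and launches executors
        Master: manages the cluster's resources and allocates them to applications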

3. Configuration

(1) Prerequisite: Spark already runs successfully in local mode, with spark-env.sh configured as follows

JAVA_HOME=/opt/jdk1.8.0_151
SCALA_HOME=/opt/modules/scala-2.11.8

HADOOP_CONF_DIR=/opt/modules/apache/hadoop-2.7.3/etc/hadoop
SPARK_LOCAL_IP=bigdata.ibeifeng.com
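
Before starting any daemons, it can help to confirm that the directories referenced in spark-env.sh actually exist on the node. The following is a minimal sketch and not part of the original setup; the helper name check_env_dirs is hypothetical, and it only checks that each path is a directory.

```shell
# Hypothetical helper: report which of the given paths are not directories.
# Returns 0 when every path exists, 1 otherwise.
check_env_dirs() {
  local missing=0
  local dir
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "missing: $dir"
      missing=1
    fi
  done
  return "$missing"
}

# Usage with the values from spark-env.sh above (run on the Spark node):
#   check_env_dirs /opt/jdk1.8.0_151 /opt/modules/scala-2.11.8 /opt/modules/apache/hadoop-2.7.3/etc/hadoop
```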

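(2) Add the Master and Worker settings to spark-env.sh

(a) Virtual machine (SPARK_MASTER_IP is the older name of this setting; since Spark 2.0 it is deprecated in favor of SPARK_MASTER_HOST)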

SPARK_MASTER_IP=bigdata.ibeifeng.com
SPARK_MASTER_PORT=7070
SPARK_MASTER_WEBUI_PORT=8080
SPARK_WORKER_CORES=2
SPARK_WORKER_MEMORY=2g
SPARK_WORKER_PORT=7071
SPARK_WORKER_WEBUI_PORT=8081
SPARK_WORKER_INSTANCES=2
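
With SPARK_WORKER_INSTANCES=2, two Worker processes start on the same host, so they cannot share one port. As far as I can tell from the sbin/start-slave.sh script shipped with Spark 2.x, instance N gets the base port plus N - 1. The sketch below reproduces only that arithmetic; the function name worker_ports is hypothetical.

```shell
# Hypothetical helper: print "instance rpc_port webui_port" for each Worker instance,
# assuming instance N uses base port + (N - 1).
worker_ports() {
  base_port=$1
  base_webui=$2
  instances=$3
  i=1
  while [ "$i" -le "$instances" ]; do
    echo "$i $((base_port + i - 1)) $((base_webui + i - 1))"
    i=$((i + 1))
  done
}

# Values from the virtual machine configuration above.
worker_ports 7071 8081 2
# 1 7071 8081
# 2 7072 8082
```

Each instance is given SPARK_WORKER_CORES and SPARK_WORKER_MEMORY separately, so with these settings the node offers 2 x 2 = 4 cores and 2 x 2g = 4g to the cluster in total.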

(b) Server

SPARK_MASTER_HOST=hadoop
SPARK_WORKER_CORES=2
SPARK_WORKER_MEMORY=2g
SPARK_WORKER_INSTANCES=1
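
The server configuration sets no SPARK_MASTER_PORT, so the Master listens on the standalone default port 7077, which is why the spark-shell command in step 4 connects to spark://hadoop:7077. Below is a small sketch of how the master URL follows from these variables; master_url is a hypothetical helper, not a Spark command.

```shell
# Hypothetical helper: build the spark:// URL from the same values used in spark-env.sh.
# Falls back to 7077, the standalone Master's default port, when no port is given.
master_url() {
  host=$1
  port=${2:-7077}
  echo "spark://${host}:${port}"
}

master_url hadoop                      # server: spark://hadoop:7077
master_url bigdata.ibeifeng.com 7070   # VM: spark://bigdata.ibeifeng.com:7070
```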

(3) Configure the slaves file

mv slaves.template slaves

Add the worker hosts:

(a) Virtual machine

# A Spark Worker will be started on each of the machines listed below.
bigdata.ibeifeng.com

(b) Server

# A Spark Worker will be started on each of the machines listed below.
hadoop
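
start-all.sh reaches every host in the slaves file over SSH, so each entry must be resolvable and reachable from the master node without a password prompt. To see which hosts will actually be used, comments and blank lines can be filtered out. This is a minimal sketch with a hypothetical helper name, list_worker_hosts; the sed expression follows how I understand Spark's own scripts to read the file.

```shell
# Hypothetical helper: print the host names in a slaves file,
# dropping '#' comments and empty lines.
list_worker_hosts() {
  sed 's/#.*$//;/^[[:space:]]*$/d' "$1"
}

# Usage (from the Spark conf directory):
#   list_worker_hosts slaves
```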

(4) Start the services

sbin/start-all.sh

Output:

(a) Server

starting org.apache.spark.deploy.master.Master, logging to /opt/modules/spark-2.1.0-bin-2.6.0-cdh5.7.0/logs/spark-root-org.apache.spark.deploy.master.Master-1-hadoop.out
hadoop: starting org.apache.spark.deploy.worker.Worker, logging to /opt/modules/spark-2.1.0-bin-2.6.0-cdh5.7.0/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-hadoop.out

The Master and Worker logs are written to /opt/modules/spark-2.1.0-bin-2.6.0-cdh5.7.0/logs/spark-root-org.apache.spark.deploy.master.Master-1-hadoop.out and /opt/modules/spark-2.1.0-bin-2.6.0-cdh5.7.0/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-hadoop.out respectively.
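
Besides reading the logs, running jps on the node should list a Master and a Worker process. The sketch below checks jps-style output for both; check_daemons is a hypothetical helper and reads the listing from standard input so it can be tried without a cluster.

```shell
# Hypothetical helper: succeed only if the jps-style listing on stdin
# contains both a Master and a Worker process.
check_daemons() {
  listing=$(cat)
  printf '%s\n' "$listing" | grep -qw 'Master' || { echo "Master not running"; return 1; }
  printf '%s\n' "$listing" | grep -qw 'Worker' || { echo "Worker not running"; return 1; }
  echo "Master and Worker are running"
}

# Usage on the node:
#   jps | check_daemons
```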

 

4. Testing

(1) Start spark-shell

(a) Virtual machine

bin/spark-shell --master spark://bigdata.ibeifeng.com:7070

(b) Server

bin/spark-shell --master spark://hadoop:7077
Output:
Spark context available as 'sc' (master = spark://hadoop:7077, app id = app-20190116000819-0001).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.0
      /_/
         
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_151)
Type in expressions to have them evaluated.
Type :help for more information.
scala> 

 

(2) Test top N

val lines = sc.textFile("/README.md")    // this path is on HDFS
val words = lines.flatMap(line => line.split(" "))
val words2 = words.map(word => (word,1))
val wordCountRDD= words2.reduceByKey(_ + _)
wordCountRDD.sortBy(t => -t._2).take(10)

(Test succeeded.)
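
To cross-check the Spark result, the same top-N word count can be run locally on a copy of the file with standard Unix tools. This is only a rough sketch: the function name top_words is hypothetical, and unlike split(" ") in the Scala code it drops empty tokens, so the empty-string entry that Spark counts for blank lines and repeated spaces will not appear here.

```shell
# Hypothetical helper: print the n most frequent space-separated words in a file
# as "word count" lines, most frequent first (n defaults to 10).
top_words() {
  file=$1
  n=${2:-10}
  tr -s ' ' '\n' < "$file" \
    | sed '/^$/d' \
    | sort \
    | uniq -c \
    | sort -k1,1nr \
    | head -n "$n" \
    | awk '{print $2, $1}'
}

# Usage, after copying the file out of HDFS:
#   hdfs dfs -get /README.md ./README.md
#   top_words ./README.md 10
```

Words with equal counts may come out in a different order than in Spark, since neither side defines an order for ties.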
