1. spark-submit: upload the jar to the cluster, then run the Spark job from the /bin directory via spark-submit.
Format:
spark-submit --master <Spark master URL> --class <fully qualified main class> <jar path> <arguments>
Example: run the SparkPi test program that ships with Spark to estimate the value of pi (the trailing 500 is the number of slices to split the computation into):
./spark-submit --master spark://node3:7077 --class org.apache.spark.examples.SparkPi /usr/local/spark-2.1.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.1.0.jar 500
Output:
Pi is roughly 3.1414508628290174
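For your own application the format is the same, and resource options can be added as needed. A minimal sketch (the class com.example.WordCount, the jar path, and the input argument are hypothetical, not from this walkthrough):

./spark-submit --master spark://node3:7077 \
  --class com.example.WordCount \
  --executor-memory 1g \
  --total-executor-cores 4 \
  /usr/local/tmp_files/wordcount.jar hdfs://bigdata111:9000/word.txt

--executor-memory and --total-executor-cores are standard spark-submit options that cap the memory per executor and the total number of cores the job takes from a standalone cluster.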
2. spark-shell: a REPL, i.e. an interactive command-line tool; the shell itself also runs as a Spark application.
2.1 Local mode: does not connect to a Spark cluster; runs locally, used for testing.
Start command: bin/spark-shell with no arguments after it, which means local mode:
[root@bigdata111 bin]# ./spark-shell
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
19/06/18 17:52:17 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/06/18 17:52:27 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
19/06/18 17:52:27 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
19/06/18 17:52:29 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
Spark context Web UI available at http://192.168.226.111:4040
Spark context available as 'sc' (master = local[*], app id = local-1560851538355).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.0
      /_/
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_181)
Type in expressions to have them evaluated.
Type :help for more information.
scala>
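Once the prompt appears, 'sc' is ready to use even in local mode, which makes it convenient for quick tests. A small sanity check (the res number may differ in your session):

scala> sc.parallelize(1 to 100).sum
res0: Double = 5050.0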
2.2 Cluster mode
Start command: bin/spark-shell --master spark://.....
After startup:
[root@bigdata111 spark-2.1.0-bin-hadoop2.7]# ./bin/spark-shell --master spark://bigdata111:7077
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
19/06/18 22:47:54 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/06/18 22:48:07 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
Spark context Web UI available at http://192.168.226.111:4040
Spark context available as 'sc' (master = spark://bigdata111:7077, app id = app-20190618224755-0000).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.0
      /_/
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_181)
Type in expressions to have them evaluated.
Type :help for more information.
scala>
Notes:
Spark context available as 'sc' (master = spark://bigdata111:7077, app id = app-20190618224755-0000).
Spark session available as 'spark'.
Spark session: introduced in Spark 2.0; through the session you can access all Spark components (Core, SQL, ...).
Both objects, 'spark' and 'sc', can be used directly, as shown below.
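For instance, the Spark SQL component is reachable straight from the 'spark' object (a minimal sketch; the query itself is made up for illustration):

scala> spark.sql("select 1 + 1 as result").show()
+------+
|result|
+------+
|     2|
+------+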
Example: develop a WordCount program in the Spark shell
(*) Read a local file and print the result to the screen.
Note: this example only works if the cluster has a single worker and the local file is on the same server as that worker, because the file is read by the executor, not the driver.
scala> sc.textFile("/usr/local/tmp_files/test_WordCount.txt").flatMap(_.split(" ")).map((_,1)).reduceByKey(_+_).collect
Result:
res0: Array[(String, Int)] = Array((is,1), (love,2), (capital,1), (Beijing,2), (China,2), (hehehehehe,1), (I,2), (of,1), (the,1))
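The same chain can also be entered step by step to see what each transformation does (equivalent to the one-liner above):

scala> val lines = sc.textFile("/usr/local/tmp_files/test_WordCount.txt")  // one RDD element per line
scala> val words = lines.flatMap(_.split(" "))                             // split each line into words
scala> val pairs = words.map((_, 1))                                       // pair each word with a count of 1
scala> val counts = pairs.reduceByKey(_ + _)                               // sum the counts per word
scala> counts.collect                                                      // bring the result to the driver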
(*) Read a file from HDFS, run WordCount on it, and write the result back to HDFS:
scala> sc.textFile("hdfs://bigdata111:9000/word.txt").flatMap(_.split(" ")).map((_,1)).reduceByKey(_+_).saveAsTextFile("hdfs://bigdata111:9000/result")
Note: here the path passed to textFile() is an HDFS address.
After the Spark job finishes, the result is stored in the /result directory on HDFS.
To view it:
[root@bigdata111 opt]# hdfs dfs -ls /result/
Found 3 items
-rw-r--r-- 3 root supergroup 0 2019-06-18 23:02 /result/_SUCCESS
-rw-r--r-- 3 root supergroup 73 2019-06-18 23:02 /result/part-00000
-rw-r--r-- 3 root supergroup 22 2019-06-18 23:02 /result/part-00001
[root@bigdata111 opt]# hdfs dfs -cat /result/*
(shuai,1)
(are,1)
(b,1)
(best,1)
(zouzou,1)
(word,1)
(hello,1)
(world,1)
(you,1)
(a,1)
(the,1)
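Each part-XXXXX file holds one partition of the result RDD (two partitions here, hence two files). To get a single output file, the number of result partitions can be set explicitly via reduceByKey's second argument. A sketch, assuming a fresh output path /result1 (saveAsTextFile fails if the path already exists):

scala> sc.textFile("hdfs://bigdata111:9000/word.txt").flatMap(_.split(" ")).map((_,1)).reduceByKey(_+_, 1).saveAsTextFile("hdfs://bigdata111:9000/result1")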