If you don't have a server available, you can set up a standalone Scala development environment on your local machine, install an IDE, and then develop Scala programs locally.
JDK 1.8 download link
Scala 2.10.5 download link
Spark 1.6.1 download link
Hadoop 2.6.0 download link
Scala IDE 2.12 download link
Note:
One thing to watch out for: the Scala version here must be 2.10.5, because Spark itself bundles Scala 2.10.5. When you import the Spark jars in the IDE, a standalone Scala installation of any other version will trigger a Scala version mismatch error, which is a real headache.
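If you do hit the mismatch, a quick way to confirm which Scala library is actually on the classpath is to print its version at runtime. A minimal check (the object name is just for illustration):
object VersionCheck {
  def main(args: Array[String]) {
    // Version string of the scala-library jar on the classpath, e.g. "version 2.10.5"
    println(scala.util.Properties.versionString)
  }
}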
To run Scala programs on the Linux side, or to execute code through the interactive Scala shell, you first need to set up a Scala environment on Linux. The deployment steps follow.
Scala 2.11.6 download link. After downloading, upload the archive to the /opt directory on the Linux machine.
# cd /opt
# tar -zxf scala-2.11.6.tgz
# mv scala-2.11.6 scala
# vim /etc/profile
export SCALA_HOME=/opt/scala
export PATH=$PATH:$SCALA_HOME/bin
# source /etc/profile # reload the profile so the changes take effect
# scala -version
Scala code runner version 2.11.6 -- Copyright 2002-2013, LAMP/EPFL
# scala
Welcome to Scala version 2.11.6 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_71).
Type in expressions to have them evaluated.
Type :help for more information.
scala> 9*9
res0: Int = 81
Note: a JDK must be installed before Scala can be used.
JDK package download link
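With the JDK in place, the REPL itself can confirm which JVM Scala is running on; the exact value of course depends on your installation:
scala> System.getProperty("java.version")
res0: String = 1.7.0_71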
Once the local Scala development environment is set up, you can use the IDE to program in Scala and build small tools. The snippet below touches on some simple syntax to get familiar with the language.
package epoint.com.cn.test001

object test001 {
  def main(args: Array[String]) {
    val msg = "hello world"
    val greetStrings = Array("i", "like", "scala")
    println(msg)
    println(max(5, 6))
    greet()
    printargs(greetStrings)
    // ::: concatenates two lists into a new list; the originals are not mutated
    val oneTwo = List(1, 2)
    val threeFour = List(3, 4)
    val oneTwoThreeFour = oneTwo ::: threeFour
    println(oneTwo + " and " + threeFour + " were not mutated.")
    println("Thus, " + oneTwoThreeFour + " is a new list")
    // Tuple elements are accessed by position: _1, _2, ...
    val pair = (99, "Luftballons")
    println(pair._1)
    println(pair._2)
    // jetSet is a var bound to an immutable Set; += rebinds it to a new Set
    var jetSet = Set("Boeing", "Airbus")
    jetSet += "Lear"
    println(jetSet.contains("Boeing"))
    val romanNumeral = Map(1 -> "I", 2 -> "II",
      3 -> "III", 4 -> "IV", 5 -> "V")
    println(romanNumeral(4))
  }

  def max(x: Int, y: Int): Int = {
    if (x > y) x
    else y
  }

  def greet() = println("xubin nihao")

  def printargs(args: Array[String]) {
    var i = 0
    while (i < args.length) {
      println(args(i))
      i += 1
    }
  }
}
Output:
hello world
6
xubin nihao
i
like
scala
List(1, 2) and List(3, 4) were not mutated.
Thus, List(1, 2, 3, 4) is a new list
99
Luftballons
true
IV
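The while loop in printargs is deliberately imperative; in everyday Scala the same traversal is usually written with foreach. A minimal equivalent (same behavior, same signature):
def printargs(args: Array[String]): Unit = {
  // foreach applies println to each element in order
  args.foreach(println)
}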
Spark is an extremely powerful compute engine written in Scala. It computes in memory and provides convenient facilities for graph computation, stream processing, machine learning, and interactive queries, so programming Spark in Scala is well worth learning. Below is a simple example in which Spark connects to MySQL and reads data.
If you followed the post on setting up a Scala development environment on Windows, the Spark environment is in fact already deployed, so you can write Spark programs in Scala directly.
package epoint.com.cn.test001

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

object SparkConnMysql {
  def main(args: Array[String]) {
    println("Hello, world!")
    // Run Spark locally inside the IDE
    val conf = new SparkConf()
    conf.setAppName("wow,my first spark app")
    conf.setMaster("local")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    // JDBC connection details for the MySQL database
    val url = "jdbc:mysql://192.168.114.67:3306/user"
    val table = "user"
    // Each option() call records a setting on the same reader
    val reader = sqlContext.read.format("jdbc")
    reader.option("url", url)
    reader.option("dbtable", table)
    reader.option("driver", "com.mysql.jdbc.Driver")
    reader.option("user", "root")
    reader.option("password", "11111")
    // load() fetches the table schema and returns a DataFrame
    val df = reader.load()
    df.show()
  }
}
Run output:
Hello, world!
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/D:/spark1.6/lib/spark-assembly-1.6.1-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/D:/spark1.6/lib/spark-examples-1.6.1-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/D:/kettle7.1/inceptor-driver.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/11/21 11:43:53 INFO SparkContext: Running Spark version 1.6.1
17/11/21 11:43:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/11/21 11:43:56 INFO SecurityManager: Changing view acls to: lenovo
17/11/21 11:43:56 INFO SecurityManager: Changing modify acls to: lenovo
17/11/21 11:43:56 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(lenovo); users with modify permissions: Set(lenovo)
17/11/21 11:43:59 INFO Utils: Successfully started service 'sparkDriver' on port 55824.
17/11/21 11:43:59 INFO Slf4jLogger: Slf4jLogger started
17/11/21 11:43:59 INFO Remoting: Starting remoting
17/11/21 11:43:59 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.114.67:55837]
17/11/21 11:43:59 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 55837.
17/11/21 11:43:59 INFO SparkEnv: Registering MapOutputTracker
17/11/21 11:43:59 INFO SparkEnv: Registering BlockManagerMaster
17/11/21 11:43:59 INFO DiskBlockManager: Created local directory at C:\Users\lenovo\AppData\Local\Temp\blockmgr-16383e3c-7cb6-43c7-b300-ccc1a1561bb4
17/11/21 11:43:59 INFO MemoryStore: MemoryStore started with capacity 1129.9 MB
17/11/21 11:44:00 INFO SparkEnv: Registering OutputCommitCoordinator
17/11/21 11:44:00 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/11/21 11:44:00 INFO SparkUI: Started SparkUI at http://192.168.114.67:4040
17/11/21 11:44:00 INFO Executor: Starting executor ID driver on host localhost
17/11/21 11:44:00 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 55844.
17/11/21 11:44:00 INFO NettyBlockTransferService: Server created on 55844
17/11/21 11:44:00 INFO BlockManagerMaster: Trying to register BlockManager
17/11/21 11:44:00 INFO BlockManagerMasterEndpoint: Registering block manager localhost:55844 with 1129.9 MB RAM, BlockManagerId(driver, localhost, 55844)
17/11/21 11:44:00 INFO BlockManagerMaster: Registered BlockManager
17/11/21 11:44:05 INFO SparkContext: Starting job: show at SparkConnMysql.scala:25
17/11/21 11:44:05 INFO DAGScheduler: Got job 0 (show at SparkConnMysql.scala:25) with 1 output partitions
17/11/21 11:44:05 INFO DAGScheduler: Final stage: ResultStage 0 (show at SparkConnMysql.scala:25)
17/11/21 11:44:05 INFO DAGScheduler: Parents of final stage: List()
17/11/21 11:44:05 INFO DAGScheduler: Missing parents: List()
17/11/21 11:44:05 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at show at SparkConnMysql.scala:25), which has no missing parents
17/11/21 11:44:06 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 5.2 KB, free 5.2 KB)
17/11/21 11:44:06 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 2.5 KB, free 7.7 KB)
17/11/21 11:44:06 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:55844 (size: 2.5 KB, free: 1129.9 MB)
17/11/21 11:44:06 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
17/11/21 11:44:06 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at show at SparkConnMysql.scala:25)
17/11/21 11:44:06 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
17/11/21 11:44:06 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0,PROCESS_LOCAL, 1922 bytes)
17/11/21 11:44:06 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
17/11/21 11:44:06 INFO JDBCRDD: closed connection
17/11/21 11:44:06 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 3472 bytes result sent to driver
17/11/21 11:44:06 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 224 ms on localhost (1/1)
17/11/21 11:44:06 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/11/21 11:44:06 INFO DAGScheduler: ResultStage 0 (show at SparkConnMysql.scala:25) finished in 0.261 s
17/11/21 11:44:06 INFO DAGScheduler: Job 0 finished: show at SparkConnMysql.scala:25, took 1.467252 s
+---+----+----+------------+------------------+---------+-------+
| id|name| age| phone| email|startdate|enddate|
+---+----+----+------------+------------------+---------+-------+
| 11| 徐心三| 24| 2423424| 2423424@qq.com| null| null|
| 33| 徐心七| 23| 23232323| 13131@qe| null| null|
| 55| 徐彬| 22| 15262301036|徐彬757661238@ww.com| null| null|
| 44| 徐成|3333| 23423424332| 2342423@qq.com| null| null|
| 66| 徐心四| 23|242342342423| 徐彬23424@qq.com| null| null|
| 11| 徐心三| 24| 2423424| 2423424@qq.com| null| null|
| 33| 徐心七| 23| 23232323| 13131@qe| null| null|
| 55| 徐彬| 22| 15262301036|徐彬757661238@ww.com| null| null|
| 44| 徐成|3333| 23423424332| 2342423@qq.com| null| null|
| 66| 徐心四| 23|242342342423| 徐彬23424@qq.com| null| null|
| 88| 徐心八| 123| 131231312| 123123@qeqe| null| null|
| 99| 徐心二| 23| 13131313| 1313133@qeq.com| null| null|
|121| 徐心五| 13| 123131231| 1231312@qq.com| null| null|
|143| 徐心九| 23| 234234| 徐彬234@wrwr| null| null|
+---+----+----+------------+------------------+---------+-------+
only showing top 14 rows
17/11/21 11:44:06 INFO SparkContext: Invoking stop() from shutdown hook
17/11/21 11:44:06 INFO SparkUI: Stopped Spark web UI at http://192.168.114.67:4040
17/11/21 11:44:06 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/11/21 11:44:06 INFO MemoryStore: MemoryStore cleared
17/11/21 11:44:06 INFO BlockManager: BlockManager stopped
17/11/21 11:44:06 INFO BlockManagerMaster: BlockManagerMaster stopped
17/11/21 11:44:06 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/11/21 11:44:06 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
17/11/21 11:44:06 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
17/11/21 11:44:06 INFO SparkContext: Successfully stopped SparkContext
17/11/21 11:44:07 INFO ShutdownHookManager: Shutdown hook called
17/11/21 11:44:07 INFO ShutdownHookManager: Deleting directory C:\Users\lenovo\AppData\Local\Temp\spark-7877d903-f8f7-4efb-9e0c-7a11ac147153
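Once the DataFrame is loaded, you are not limited to show(). In Spark 1.6 a DataFrame can be registered as a temporary table and queried with plain SQL; the sketch below is illustrative (the temp table name and the query are assumptions, not taken from the run above):
// Register the DataFrame under a name visible to this SQLContext
df.registerTempTable("user")
// Query it with ordinary SQL; the result is another DataFrame
val adults = sqlContext.sql("SELECT name, age FROM user WHERE age > 20")
adults.show()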