Reading Key Spark Source Code

The SparkConf class

/**
 * Configuration for a Spark application. Used to set various Spark parameters as key-value pairs.
 *
 * Most of the time, you would create a SparkConf object with `new SparkConf()`, which will load
 * values from any `spark.*` Java system properties set in your application as well. In this case,
 * parameters you set directly on the SparkConf object take priority over system properties.
 *
 * For unit tests, you can also call `new SparkConf(false)` to skip loading external settings and
 * get the same configuration no matter what the system properties are.
 *
 * All setter methods in this class support chaining. For example, you can write
 * `new SparkConf().setMaster("local").setAppName("My app")`.
 *
 * @param loadDefaults whether to also load values from Java system properties
 *
 * @note Once a SparkConf object is passed to Spark, it is cloned and can no longer be modified
 * by the user. Spark does not support modifying the configuration at runtime.
 */

Instantiating a SparkContext requires a SparkConf as a parameter. SparkConf describes the configuration of the whole Spark application, and its setters can be chained, e.g.:
new SparkConf().setMaster("local").setAppName("TestApp")
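
To make the loading and precedence behavior concrete, here is a minimal sketch; the config key spark.executor.memory and the app name are only illustrative choices, not anything mandated by the source above:

import org.apache.spark.{SparkConf, SparkContext}

object ConfExample {
  def main(args: Array[String]): Unit = {
    // Chained setters: each call returns the SparkConf itself.
    val conf = new SparkConf()            // also loads any spark.* JVM system properties
      .setMaster("local[*]")
      .setAppName("TestApp")
      .set("spark.executor.memory", "2g") // a direct set overrides a system property of the same name

    val sc = new SparkContext(conf)       // conf is cloned here; later changes have no effect
    try {
      println(sc.getConf.get("spark.app.name")) // prints "TestApp"
    } finally {
      sc.stop()
    }
  }
}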

Part of the SparkConf source is shown below:
// Holds the key-value configuration entries
private val settings = new ConcurrentHashMap[String, String]()

// By default, any JVM system property starting with "spark." is loaded
if (loadDefaults) {
  // Load any spark.* system properties
  for ((key, value) <- Utils.getSystemProperties if key.startsWith("spark.")) {
    set(key, value)
  }
}

/** Set a configuration variable. */
def set(key: String, value: String): SparkConf = {
  if (key == null) {
    throw new NullPointerException("null key")
  }
  if (value == null) {
    throw new NullPointerException("null value for " + key)
  }
  logDeprecationWarning(key)
  settings.put(key, value)
  // Every set returns the SparkConf itself, which is what enables chained calls
  this
}
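
Because set returns this, calls compose naturally. A small usage sketch, assuming only the set method above; the keys are ordinary Spark settings chosen purely for illustration:

val conf = new SparkConf(false) // skip loading spark.* system properties (handy in tests)
conf
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.ui.enabled", "false")

// A null key or value is rejected up front:
// conf.set(null, "x")  // throws NullPointerException("null key")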

The SparkContext class

/**
 * Main entry point for Spark functionality. A SparkContext represents the connection to a Spark
 * cluster, and can be used to create RDDs, accumulators and broadcast variables on that cluster.
 *
 * Only one SparkContext may be active per JVM. You must `stop()` the active SparkContext before
 * creating a new one.
 */
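
A minimal sketch of the one-active-SparkContext-per-JVM rule; the object and app names are illustrative:

import org.apache.spark.{SparkConf, SparkContext}

object SingleContextExample {
  def main(args: Array[String]): Unit = {
    val first = new SparkContext(new SparkConf().setMaster("local").setAppName("first"))
    first.stop() // stop the active context before creating another one in this JVM

    val second = new SparkContext(new SparkConf().setMaster("local").setAppName("second"))
    second.stop()
  }
}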
