Learning notes on Spark's built-in demos

  1. Ways to build an RDD
    (1) Build from a file on HDFS: provide the path of a file (or folder) that has already been uploaded to HDFS.
SparkSession spark = SparkSession
      .builder()
      .appName("JavaHdfsLR")
      // note: better not to hard-code master here; this setting decides which
      // mode the Spark cluster runs in (pass it via spark-submit instead)
      .master("local")
      .getOrCreate();
JavaRDD<String> rdd = spark.read().textFile("filePath").javaRDD();

(2) Build from a List

    JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());

    JavaRDD<String> lines = jsc.parallelize(new ArrayList<String>()); // any java.util.List<String> works here
  2. JavaRDD -> JavaRDD (transformations)
    A one-to-one transformation can use this function: map(new Function<T, R>()).
    JavaRDD.map(new Function<T, R>()) takes an input of type T and returns an output of type R; the conversion is done inside the function.
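
    For example, a minimal sketch (assuming the JavaRDD<String> lines built above; mapping each line to its length is just an illustration):

    // requires: import org.apache.spark.api.java.function.Function;
    // map: each input element of type T produces exactly one output element of type R
    JavaRDD<Integer> lengths = lines.map(new Function<String, Integer>() {
        @Override
        public Integer call(String s) {
            return s.length();
        }
    });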

There is also another function: flatMap(FlatMapFunction<T, R>), where each input element may map to zero or more output elements.
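
A minimal sketch of flatMap (again assuming the lines RDD from above; splitting each line into words on spaces is just an illustration):

    // requires: import java.util.Arrays; import java.util.Iterator;
    // requires: import org.apache.spark.api.java.function.FlatMapFunction;
    // flatMap: each input element may produce zero, one, or many output elements
    JavaRDD<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
        @Override
        public Iterator<String> call(String s) {
            return Arrays.asList(s.split(" ")).iterator();
        }
    });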
