Secondary sort with Spark's sortByKey

The basic idea is to define a custom key class with its own ordering, use a map transformation so that each record's key is an instance of that class, and then call the sortByKey operator. The basic implementation is as follows:

1. Define the custom key class

// Composite key: compares by `first`, breaking ties with `second`.
// Serializable is required because keys are shipped between Spark tasks.
class SecondSortByKeyScala(val first: String, val second: Int)
  extends Ordered[SecondSortByKeyScala] with Serializable {

  override def compare(that: SecondSortByKeyScala): Int = {
    val cmp = this.first.compareTo(that.first)
    if (cmp == 0) this.second.compareTo(that.second) else cmp
  }
}
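Because the class extends Ordered, the comparison logic can be sanity-checked without Spark: the standard library's sorted picks it up through the implicit Ordered-to-Ordering conversion. A minimal, self-contained sketch (the class body is repeated so the snippet compiles on its own; toString and the helper sortedLabels are added here only for display and are not part of the original code):

```scala
// Same composite key as above; toString added for readable output.
class SecondSortByKeyScala(val first: String, val second: Int)
  extends Ordered[SecondSortByKeyScala] with Serializable {
  override def compare(that: SecondSortByKeyScala): Int = {
    val cmp = this.first.compareTo(that.first)
    if (cmp == 0) this.second.compareTo(that.second) else cmp
  }
  override def toString: String = s"$first,$second"
}

object OrderingCheck {
  // Build keys from (name, score) pairs and return them in ascending order.
  def sortedLabels(pairs: Seq[(String, Int)]): Seq[String] =
    pairs.map { case (f, s) => new SecondSortByKeyScala(f, s) }
      .sorted                    // uses the Ordered instance
      .map(_.toString)

  def main(args: Array[String]): Unit =
    // ascending by name, then by score
    println(sortedLabels(Seq(("xiao1", 98), ("xiao", 76), ("xiao", 56))))
}
```

Records with the same name end up adjacent and ordered by score, which is exactly the "secondary" part of the sort.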

2. Spark driver code

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("spark1").master("local[1]").getOrCreate()
val sc = spark.sparkContext

val list = Array("xiao,76", "xiao,56", "xiao1,98", "xiao1,65",
  "xiao2,24", "xiao2,98", "xiao3,77", "xiao3,56", "xiao3,96")
val rdd = sc.parallelize(list)

// Key each line by the composite object; keep the original line as the value.
val sortStartValue = rdd.map { x =>
  val fields = x.split(",")
  (new SecondSortByKeyScala(fields(0), fields(1).toInt), x)
}

// false => sort descending on the composite key
val rddsortbeing = sortStartValue.sortByKey(false)
rddsortbeing.foreach(x => println(x._2))
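As an aside, the custom class is not strictly required for this particular ordering: Scala's built-in Ordering for tuples is already lexicographic, so a plain (String, Int) key passed to sortByKey would sort the same way. A non-Spark sketch of that tuple ordering (parseKey and sortedDesc are illustrative names, not part of any API):

```scala
object TupleOrderDemo {
  // Parse a "name,score" line into a (String, Int) key.
  def parseKey(line: String): (String, Int) = {
    val f = line.split(",")
    (f(0), f(1).toInt)
  }

  // Descending sort, mirroring sortByKey(false): tuples compare
  // field by field, so name is compared first, then score.
  def sortedDesc(lines: Seq[String]): Seq[String] =
    lines.sortBy(parseKey)(Ordering[(String, Int)].reverse)

  def main(args: Array[String]): Unit =
    println(sortedDesc(Seq("xiao,76", "xiao,56", "xiao1,98")))
}
```

A dedicated key class still pays off when the two fields should sort in different directions (e.g. name ascending but score descending), which a plain tuple ordering cannot express.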

3. Output

With master local[1] everything runs in a single partition, so foreach prints the records in key order. Sorted descending (sortByKey(false)), first by name and then by score, the run should print:

xiao3,96
xiao3,77
xiao3,56
xiao2,98
xiao2,24
xiao1,98
xiao1,65
xiao,76
xiao,56