RDD take and takeOrdered methods

http://stackoverflow.com/questions/33563575/on-sparks-rdds-take-and-takeordered-methods

In order to explain how ordering works we create an RDD with integers from 0 to 99:
val myRdd = sc.parallelize(Seq.range(0, 100))

We can now perform:
myRdd.take(5)

Which will extract the first 5 elements of the RDD, and we obtain an Array[Int] containing the first 5 integers of myRdd: '0 1 2 3 4' (no ordering function is applied; these are simply the elements in the first 5 positions).
The takeOrdered(5) operation works in a similar way: it also returns 5 elements of the RDD as an Array[Int], but they are the first 5 according to an ordering, and we have the opportunity to specify the ordering criteria:
myRdd.takeOrdered(5)(Ordering[Int].reverse)

This will extract the first 5 elements according to the specified ordering. In our case the result will be: '99 98 97 96 95'
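To make the difference concrete, here is a minimal sketch as a standalone program. It assumes spark-core is on the classpath; the local master, the app name, and the small unsorted `sample` RDD are illustrative choices that are not part of the original answer.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object TakeVsTakeOrdered {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master and an arbitrary app name, for a standalone run.
    val conf = new SparkConf().setMaster("local[2]").setAppName("take-vs-takeOrdered")
    val sc = new SparkContext(conf)
    try {
      val myRdd = sc.parallelize(Seq.range(0, 100))

      // take: the first elements in partition order, no sorting involved.
      println(myRdd.take(5).mkString(" "))                               // 0 1 2 3 4
      // takeOrdered with the implicit Ordering[Int]: the five smallest elements.
      println(myRdd.takeOrdered(5).mkString(" "))                        // 0 1 2 3 4
      // takeOrdered with a reversed ordering: the five largest elements.
      println(myRdd.takeOrdered(5)(Ordering[Int].reverse).mkString(" ")) // 99 98 97 96 95
      // top(n) returns the n largest elements in descending order.
      println(myRdd.top(5).mkString(" "))                                // 99 98 97 96 95

      // Hypothetical unsorted data, where take and takeOrdered visibly differ.
      val sample = sc.parallelize(Seq(5, 3, 9, 1, 7, 2), numSlices = 2)
      println(sample.take(3).mkString(" "))        // 5 3 9
      println(sample.takeOrdered(3).mkString(" ")) // 1 2 3
    } finally {
      sc.stop()
    }
  }
}
```

On the already sorted myRdd, take(5) and takeOrdered(5) happen to return the same values; the unsorted sample shows that only takeOrdered actually sorts.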
If you have a more complex data structure in your RDD, you may want to supply your own ordering function, mapping each element to the Int it should be compared by:
myRdd.takeOrdered(5)(Ordering[Int].reverse.on { x => ??? })

Which will extract the first 5 elements of your RDD according to your custom ordering function, returned as an array of the RDD's element type (Array[Int] for myRdd).
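As a hedged illustration of such a custom ordering, the sketch below fills in the mapping function for a made-up record type. The `Employee` case class, its fields, the sample data, and the local Spark configuration are all hypothetical and not part of the original answer; spark-core is assumed to be on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical record type, used only for illustration.
case class Employee(name: String, salary: Int)

object TakeOrderedCustom {
  // Highest salary first: reverse the Int ordering, then derive an
  // Ordering[Employee] by mapping each employee to its salary.
  val bySalaryDesc: Ordering[Employee] = Ordering[Int].reverse.on[Employee](_.salary)

  def main(args: Array[String]): Unit = {
    // Assumption: a local master and an arbitrary app name, for a standalone run.
    val conf = new SparkConf().setMaster("local[2]").setAppName("takeOrdered-custom")
    val sc = new SparkContext(conf)
    try {
      val employees = sc.parallelize(Seq(
        Employee("Ann", 120),
        Employee("Bob", 95),
        Employee("Cid", 150),
        Employee("Dee", 80)))

      // The result type follows the RDD's element type: Array[Employee].
      val topTwo: Array[Employee] = employees.takeOrdered(2)(bySalaryDesc)
      println(topTwo.map(_.name).mkString(" ")) // Cid Ann
    } finally {
      sc.stop()
    }
  }
}
```

An equivalent way to build the same ordering is `Ordering.by[Employee, Int](_.salary).reverse`; which form reads better is a matter of taste.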
