Transformation operators, action operators, and operators that cause a shuffle

The transformations listed in the official documentation are:

map
filter
flatMap
mapPartitions
mapPartitionsWithIndex
sample
union
intersection
distinct
groupByKey
reduceByKey
aggregateByKey
sortByKey
join
cogroup
cartesian
pipe
coalesce
repartition
repartitionAndSortWithinPartitions
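
To make the difference between element-wise and partition-wise transformations concrete, here is a small plain-Python sketch. It is not Spark code: it assumes an RDD can be modelled as a list of partitions (plain lists), and the helpers `toy_map`, `toy_filter`, `toy_flat_map` and `toy_map_partitions` are hypothetical stand-ins for map, filter, flatMap and mapPartitions.

```python
from typing import Callable, Iterable, Iterator, List

# Toy model (an assumption, not Spark's implementation):
# an "RDD" is just a list of partitions, each partition a list of elements.
Partitions = List[list]

def toy_map(parts: Partitions, f: Callable) -> Partitions:
    """Like map: apply f to every element, one output per input."""
    return [[f(x) for x in p] for p in parts]

def toy_filter(parts: Partitions, pred: Callable) -> Partitions:
    """Like filter: keep only the elements for which pred is true."""
    return [[x for x in p if pred(x)] for p in parts]

def toy_flat_map(parts: Partitions, f: Callable[..., Iterable]) -> Partitions:
    """Like flatMap: f returns zero or more outputs per input, flattened."""
    return [[y for x in p for y in f(x)] for p in parts]

def toy_map_partitions(parts: Partitions,
                       f: Callable[[Iterator], Iterable]) -> Partitions:
    """Like mapPartitions: f is called once per partition with an iterator."""
    return [list(f(iter(p))) for p in parts]

rdd = [[1, 2], [3, 4, 5]]                                # two partitions
doubled = toy_map(rdd, lambda x: x * 2)                  # [[2, 4], [6, 8, 10]]
evens = toy_filter(rdd, lambda x: x % 2 == 0)            # [[2], [4]]
expanded = toy_flat_map(rdd, lambda x: [x, x])           # [[1, 1, 2, 2], [3, 3, 4, 4, 5, 5]]
sums = toy_map_partitions(rdd, lambda it: [sum(it)])     # [[3], [12]]
```

None of these four helpers moves data between partitions, which is why the corresponding Spark operators do not appear in the shuffle list further down.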

The actions listed in the official documentation are:

reduce
collect
count
first
take
takeSample
takeOrdered
saveAsTextFile
saveAsSequenceFile
saveAsObjectFile
countByKey
foreach
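
The practical difference between the two lists is that transformations are lazy, while actions trigger the actual computation. The following plain-Python sketch illustrates the idea under stated assumptions: `LazyRDD` is a hypothetical toy class, not a Spark API, and it only models a single partition.

```python
class LazyRDD:
    """Toy stand-in for an RDD (hypothetical): it stores a recipe, not data."""

    def __init__(self, compute):
        # compute is a zero-argument function that returns a fresh iterator
        self._compute = compute

    # Transformations: return a new LazyRDD and run nothing.
    def map(self, f):
        return LazyRDD(lambda: (f(x) for x in self._compute()))

    def filter(self, pred):
        return LazyRDD(lambda: (x for x in self._compute() if pred(x)))

    # Actions: run the whole recipe and return a plain value.
    def collect(self):
        return list(self._compute())

    def count(self):
        return sum(1 for _ in self._compute())


calls = []  # records every invocation of the mapped function

def traced_square(x):
    calls.append(x)
    return x * x

source = LazyRDD(lambda: iter(range(1, 6)))
squares = source.map(traced_square).filter(lambda v: v > 4)

calls_before_action = len(calls)        # 0: only the recipe exists so far
result = squares.collect()              # [9, 16, 25]
calls_after_first_action = len(calls)   # 5: the action ran the pipeline
total = squares.count()                 # 3, and the pipeline runs again
```

In this toy model the second action recomputes everything from the source. An uncached RDD in Spark behaves the same way, which is the usual reason for caching a dataset that several actions reuse.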

Operators in Spark that cause a shuffle

Deduplication

distinct

Aggregation

reduceByKey
groupBy
groupByKey
aggregateByKey
combineByKey

Sorting

sortByKey
sortBy

Repartitioning

coalesce (only when called with shuffle = true; by default it merges partitions without a shuffle)
repartition

Set or table operations

intersection
subtract
subtractByKey
join
leftOuterJoin
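
What the operators above have in common is that records with the same key (or the same value, for distinct and intersection) may start out in different partitions and must be brought together. The sketch below shows this for a reduceByKey-style aggregation in plain Python. It is a simplified model, not Spark's implementation: `toy_reduce_by_key` is a hypothetical helper, partitions are plain lists, and keys are routed with `hash(key) % num_out`, loosely mirroring hash partitioning.

```python
import functools
from collections import defaultdict

def toy_reduce_by_key(parts, func, num_out):
    """Toy reduceByKey over a list of partitions of (key, value) pairs."""
    # 1) Map side: combine values per key inside each input partition,
    #    so less data has to cross partition boundaries.
    combined = []
    for p in parts:
        local = {}
        for k, v in p:
            local[k] = func(local[k], v) if k in local else v
        combined.append(local)

    # 2) Shuffle: route every partial result to the output partition
    #    chosen by the key's hash. This is the step that moves data.
    buckets = [defaultdict(list) for _ in range(num_out)]
    for local in combined:
        for k, partial in local.items():
            buckets[hash(k) % num_out][k].append(partial)

    # 3) Reduce side: merge the partial results for each key.
    return [{k: functools.reduce(func, vs) for k, vs in b.items()}
            for b in buckets]

pairs = [[("a", 1), ("b", 2), ("a", 3)],   # input partition 0
         [("b", 4), ("c", 5)]]             # input partition 1
out = toy_reduce_by_key(pairs, lambda x, y: x + y, num_out=3)

# Which output partition a key lands in depends on its hash,
# but each key ends up in exactly one partition with its full total.
merged = {k: v for part in out for k, v in part.items()}   # a=4, b=6, c=5
```

The routing in step 2 is where the cost comes from: in a real cluster it means writing intermediate data and transferring it over the network, which is why shuffle-producing operators are usually the first place to look when tuning a job.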
