PySpark RDD: groupBy, groupByKey, cogroup, and groupWith Usage
一、groupBy()

Signature: `groupBy(f, numPartitions=None, partitionFunc=<function portable_hash>)` — returns an RDD of grouped items.

Code:

```python
rdd = sc.parallelize([1, 42, 3, 4, 5, 1, 4, 5, 0])
res = rdd.groupBy(lambda x: x % 2).collect()
print(res)
```

Each group value in the result is an iterable, not a list. To get the concrete values out of the iterators:

```python
for x, y in res:
    print(x, list(y))
```
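To make the grouping semantics concrete without a Spark cluster, here is a minimal plain-Python sketch of what `rdd.groupBy(lambda x: x % 2)` computes (the helper `group_by` is a hypothetical stand-in for illustration, not a Spark API):

```python
from collections import defaultdict

def group_by(data, key_func):
    """Mimic RDD.groupBy locally: route each element to the bucket
    named by the key function's return value."""
    groups = defaultdict(list)
    for item in data:
        groups[key_func(item)].append(item)
    # Return (key, values) pairs sorted by key for stable display.
    return sorted(groups.items())

data = [1, 42, 3, 4, 5, 1, 4, 5, 0]
print(group_by(data, lambda x: x % 2))
# [(0, [42, 4, 4, 0]), (1, [1, 3, 5, 1, 5])]
```

The even numbers (`x % 2 == 0`) land under key `0` and the odd ones under key `1`, matching the pairs the PySpark code above collects (Spark wraps each value list in a `ResultIterable` instead of a plain list).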