Spark: time difference between reduceByKey(func) and reduceByKey(func, numPartitions)


import time

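# Build (i, i) pairs on the driver. The original upper bound of 10000000000
# is too large to hold in a Python list, so a smaller N is assumed here.
N = 10000000
t = [(i, i) for i in range(1, N)]

# `sc` is the SparkContext provided by the pyspark shell.
tsc = sc.parallelize(t)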
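def fun1(d):
    # reduceByKey with the default number of partitions.
    # It is a lazy transformation, so this times only its definition.
    t1 = time.time()
    d.reduceByKey(lambda x, y: x * y)
    t2 = time.time()
    return t2 - t1


def fun2(d):
    # Same reduction with numPartitions=10; also lazy, no job is run.
    t1 = time.time()
    d.reduceByKey(lambda x, y: x * y, 10)
    t2 = time.time()
    return t2 - t1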


>>> fun1(tsc)
0.033590078353881836
>>> fun2(tsc)
0.03184199333190918
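
Both timings are around 0.03 s and nearly identical. This is probably because reduceByKey only defines a new RDD and does not run a job until an action is called. To compare the two variants on real work, one option is to force evaluation inside the timed region. The following is a minimal sketch, not the original author's method: `time_reduce` is a hypothetical helper name, and the choice of `count()` as the action is an assumption.

```python
import time


def time_reduce(rdd, num_partitions=None):
    """Time reduceByKey plus an action that forces it to run.

    Hypothetical helper: `rdd` is assumed to be a pair RDD such as `tsc`.
    num_partitions=None lets Spark pick the default partition count.
    """
    start = time.perf_counter()
    reduced = rdd.reduceByKey(lambda x, y: x * y, num_partitions)
    reduced.count()  # action: triggers the shuffle and the reduction
    return time.perf_counter() - start
```

In the pyspark shell this could be called as `time_reduce(tsc)` and `time_reduce(tsc, 10)`. The measured time then includes the shuffle, so the difference between partition counts should become visible. Repeating each call a few times helps smooth out warm-up effects.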
