python_day12_flatMap

The flatMap operator works like map, but it also removes one level of nesting from the results.
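The difference can be sketched in plain Python, without Spark. The helper names `local_map` and `local_flat_map` below are hypothetical and only imitate the behavior on an ordinary list; they are not part of PySpark.

```python
from itertools import chain


def local_map(func, items):
    """Apply func to each item; one result per input item."""
    return [func(item) for item in items]


def local_flat_map(func, items):
    """Apply func to each item, then concatenate the resulting iterables (one level only)."""
    return list(chain.from_iterable(func(item) for item in items))


lines = ["a b", "c d e"]  # illustrative data, not from the tutorial

mapped = local_map(lambda s: s.split(" "), lines)
flattened = local_flat_map(lambda s: s.split(" "), lines)

print(mapped)     # [['a', 'b'], ['c', 'd', 'e']]
print(flattened)  # ['a', 'b', 'c', 'd', 'e']

# Only the outermost level is removed: inner lists survive.
pairs = local_flat_map(lambda n: [[n, n * 2]], [1, 2])
print(pairs)      # [[1, 2], [2, 4]]
```

In other words, a flat map behaves like a map followed by a single flattening step, which is why the function passed to it has to return something iterable.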

from pyspark import SparkConf, SparkContext
import os

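# Point PySpark at the Python interpreter to use (adjust this path for your machine)
os.environ['PYSPARK_PYTHON'] = "D:/dev/python/python3.10.4/python.exe"
# Run locally, using all available cores
conf = SparkConf().setMaster("local[*]").setAppName("test_spark")
sc = SparkContext(conf=conf)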

Prepare an RDD

rdd_1 = sc.parallelize(["java python c", "云纹 术学 武理", "With great power comes great responsibility"])

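Extract the individual words from the RDD data
a. Using map (each string becomes its own list of words, so the result is nested)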

rdd_2 = rdd_1.map(lambda x: x.split(" "))
print(rdd_2.collect())

# Stop the SparkContext (if part b runs in the same script, call this only once, at the very end)
sc.stop()


b. Using flatMap (the per-string word lists are merged into one flat list)

rdd_2 = rdd_1.flatMap(lambda x: x.split(" "))
print(rdd_2.collect())

# Stop the SparkContext
sc.stop()
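As a rough local check that needs no Spark installation, the two results for the same three strings can be reproduced with plain Python. This is only an emulation under the assumption that `collect()` returns the elements in their original order; the variable names `sentences`, `nested`, and `flat` are introduced here for illustration.

```python
from itertools import chain

# Same three strings as rdd_1 in the tutorial
sentences = ["java python c", "云纹 术学 武理", "With great power comes great responsibility"]

# Emulates rdd_1.map(lambda x: x.split(" ")).collect(): one list per input string
nested = [s.split(" ") for s in sentences]

# Emulates rdd_1.flatMap(lambda x: x.split(" ")).collect(): a single flat list
flat = list(chain.from_iterable(nested))

print(nested)
print(flat)
print(len(nested), len(flat))  # 3 12
```

The map version keeps three elements (one list per string), while the flatMap version yields twelve separate words.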

