PySpark: iterate over a table's data and return the values of one column

from pyspark.sql import SparkSession

# Create the SparkSession

spark = SparkSession.builder.appName("example").getOrCreate()

# Read the table

example_table = spark.read.table("example_table")

# Select the column to return (collect() brings every row back to the driver)

column_name = "column_name"
data = example_table.select(column_name).collect()

# Convert the collected Row objects into a plain Python list

data_list = [row[column_name] for row in data]
print(data_list)

# Stop the SparkSession

spark.stop()
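
Because collect() loads the whole column into driver memory at once, a large table can exhaust the driver. The sketch below is one possible alternative, not part of the original snippet: it wraps the same select-and-extract step in a helper that streams rows with toLocalIterator() and can cap the row count with limit(). The names column_values, limit and run_example are hypothetical, and the small in-memory DataFrame in run_example stands in for example_table.

```python
def column_values(df, column_name, limit=None):
    """Return the values of one column of a DataFrame as a Python list.

    Hypothetical helper: `df` is assumed to be a PySpark DataFrame.
    """
    selected = df.select(column_name)
    if limit is not None:
        # Cap the number of rows pulled back to the driver.
        selected = selected.limit(limit)
    # toLocalIterator() fetches rows partition by partition instead of
    # materializing every Row on the driver in a single collect() call.
    return [row[column_name] for row in selected.toLocalIterator()]


def run_example():
    """Demo on a tiny in-memory DataFrame (requires PySpark to be installed)."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("example").getOrCreate()
    try:
        # Stand-in data; in the original snippet this would be
        # spark.read.table("example_table").
        df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "column_name"])
        print(column_values(df, "column_name"))
    finally:
        spark.stop()
```

Call run_example() in an environment where PySpark is available. The resulting list still lives entirely in driver memory, so for very large tables passing a limit, or keeping the work inside Spark, is usually the safer choice.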
