SPARK 重命名DataFrame的列名

val df = Seq((2L, "a", "foo", 3.0)).toDF
df.printSchema
// root
//  |-- _1: long (nullable = false)
//  |-- _2: string (nullable = true)
//  |-- _3: string (nullable = true)
//  |-- _4: double (nullable = false)

最简单的办法toDF方法

val schemas= Seq("id", "x1", "x2", "x3")
val dfRenamed = df.toDF(schemas: _*)

dfRenamed.printSchema
// root
// |-- id: long (nullable = false)
// |-- x1: string (nullable = true)
// |-- x2: string (nullable = true)
// |-- x3: double (nullable = false)

如果要重命名单个列,可以使用以下任select一项alias

df.select($"_1".alias("x1"))

可以很容易到多列:

val lookup = Map("_1" -> "foo", "_3" -> "bar")

df.select(df.columns.map(c => col(c).as(lookup.getOrElse(c, c))): _*)

或者withColumnRenamed

df.withColumnRenamed("_1", "x1")

 

你可能感兴趣的:(spark)