Working with DataFrame Column Names in Spark

  1. Get the names of all columns in a df
    df.columns.toList
// Runs as-is in spark-shell; elsewhere, import spark.implicits._ first so toDF is available
val data1 = Seq(
  ("1", "ming", "hlj"),
  ("2", "tian", "jl"),
  ("3", "wang", "ln"),
  ("4", "qi", "bj"),
  ("5", "sun", "tj")
).toDF("useid", "name", "live")
 
data1.columns.toList
Result: List[String] = List(useid, name, live)
  2. Get the types of all columns in a df and convert them to a Map
    df.dtypes.toMap

Result (for a df with the columns id, name, and age):
scala.collection.immutable.Map[String,String] = Map(id -> IntegerType, name -> StringType, age -> IntegerType)
  3. Rename columns (the examples below use PySpark)
# Using the selectExpr method
color_df2 = color_df.selectExpr('color as color2','length as length2')
color_df2.show()

# Using the withColumnRenamed method
color_df2 = color_df.withColumnRenamed('color','color2')\
                    .withColumnRenamed('length','length2')
color_df2.show()

# Using the alias method (select returns only the aliased column)
color_df.select(color_df.color.alias('color2')).show()
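
When many columns need renaming, writing each selectExpr string by hand gets tedious. Below is a minimal sketch of generating those strings from a mapping. The helper name build_rename_exprs and the renames dict are hypothetical and not part of Spark. The column list stands in for color_df.columns so that the snippet runs without a Spark session.

```python
def build_rename_exprs(columns, renames):
    """Build selectExpr strings, keeping column order and renaming where requested.

    columns: list of existing column names (e.g. color_df.columns)
    renames: dict mapping old name -> new name (hypothetical helper input)
    """
    exprs = []
    for name in columns:
        new_name = renames.get(name)
        if new_name is None or new_name == name:
            # Column is not renamed: pass it through unchanged
            exprs.append(name)
        else:
            exprs.append(f"{name} as {new_name}")
    return exprs


# Stand-in for color_df.columns, assumed to match the example above
columns = ["color", "length"]
renames = {"color": "color2", "length": "length2"}

exprs = build_rename_exprs(columns, renames)
print(exprs)
```

With a real DataFrame the list can be unpacked into the call, as in color_df.selectExpr(*exprs). Column names containing spaces or other special characters would need backtick quoting, which this sketch does not handle.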

Reference: https://zhuanlan.zhihu.com/p/34901683
