Spark SQL, regex replacement, regexp_extract

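import org.apache.spark.sql.functions.{col, regexp_extract}
// Assumption: simpleColors is not defined in this snippet; this list is inferred from the SQL pattern below.
val simpleColors = Seq("black", "white", "red", "green", "blue")
// Builds the pattern (BLACK|WHITE|RED|GREEN|BLUE); capture group 1 is the matched color.
val regexString1 = simpleColors.map(_.toUpperCase).mkString("(", "|", ")")
df.select(regexp_extract(col("Description"), regexString1, 1).as("color_clean"),
  col("Description")).show(2)
// Same query in SQL; assumes df is registered as the temp view dfTable.
spark.sql("""select regexp_extract(Description, '(BLACK|WHITE|RED|GREEN|BLUE)', 1),
  Description from dfTable""").show(2)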
