Exception in thread "main" org.apache.spark.sql.AnalysisException: "to_account_date" is not a numeric column

The following Spark DataFrame code:

```scala
df.groupBy("name").min("date")
```

fails with the following error:

Exception in thread "main" org.apache.spark.sql.AnalysisException: "to_account_date" is not a numeric column. Aggregation function can only be applied on a numeric column.;
	at org.apache.spark.sql.RelationalGroupedDataset$$anonfun$3.apply(RelationalGroupedDataset.scala:103)
	at org.apache.spark.sql.RelationalGroupedDataset$$anonfun$3.apply(RelationalGroupedDataset.scala:100)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
	at org.apache.spark.sql.RelationalGroupedDataset.aggregateNumericColumns(RelationalGroupedDataset.scala:100)
	at org.apache.spark.sql.RelationalGroupedDataset.min(RelationalGroupedDataset.scala:286)
	at com.bdt.doep.dhe.ba.huitui.Achievement$.load(Achievement.scala:65)
	at com.bdt.doep.dhe.ba.huitui.Main.run(Main.scala:27)
	at com.bdt.doep.dhe.ba.huitui.Main$.main(Main.scala:84)
	at com.bdt.doep.dhe.ba.huitui.Main.main(Main.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

The error message is clear: the min shortcut (and the other shortcut aggregations) after groupBy only works on numeric columns.
Knowing that, there are currently two known ways to fix it:
Option 1: use agg with a column-name-to-function pair

```scala
df.groupBy("name").agg("date" -> "min")
```
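For completeness, here is a minimal runnable sketch of Option 1; the sample data, column values, and the local SparkSession setup are made up for illustration:

```scala
import org.apache.spark.sql.SparkSession

object MinDateExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("MinDateExample") // hypothetical app name
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data: "date" is a string column, not numeric
    val df = Seq(
      ("alice", "2020-01-03"),
      ("alice", "2020-01-01"),
      ("bob",   "2020-02-05")
    ).toDF("name", "date")

    // agg("date" -> "min") resolves the aggregate function by name and is
    // not restricted to numeric columns, so it works on the string column
    df.groupBy("name").agg("date" -> "min").show()

    spark.stop()
  }
}
```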

Option 2: cast the column to a numeric type
Convert the column to a numeric type first, then the min shortcut works as before.
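A minimal sketch of Option 2, reusing the df from the sketch above and assuming the date strings follow a yyyy-MM-dd pattern (the pattern is an assumption; adjust it to your data):

```scala
import org.apache.spark.sql.functions.{col, unix_timestamp}

// Convert the string date to epoch seconds (LongType); once the column
// is numeric, the groupBy(...).min(...) shortcut works as expected.
val numericDf = df.withColumn("date_ts", unix_timestamp(col("date"), "yyyy-MM-dd"))
numericDf.groupBy("name").min("date_ts").show()
```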

If other approaches come up later, I'll keep updating this post.
