scala-sparkML study notes: struct<type:tinyint,size:int,indices:array<int>,values:array<double>> type

Error type:

CSV data source does not support struct<type:tinyint,size:int,indices:array<int>,values:array<double>> data type.

 

predictPredict.select("user_id", "probability", "label").coalesce(1) 
          .write.format("com.databricks.spark.csv").mode("overwrite") 
          .option("header", "true").option("delimiter","\t").option("nullValue", Const.NULL) 
          .save(fileName.predictResultFile + day) 

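Selecting the probability column of predictPredict and saving it fails with the error "'`probability`' is of struct<type:tinyint,size:int,indices:array<int>,values:array<double>> type", because the column holds a DenseVector, which cannot be written directly to a CSV file. It can be worked around in the two ways below. (The main idea is to pick out the element of the DenseVector that holds the predicted probability of class 1, which is a double.)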

 

        /*
        import org.apache.spark
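
A minimal sketch of the idea described above, not necessarily the code the original listing contained. It assumes Spark 3.0 or later (the first variant alone should also work on Spark 2.x), a probability column of type `org.apache.spark.ml.linalg.Vector`, and a binary classifier whose class-1 probability sits at index 1. The names `ProbabilityToCsv`, `withUdf`, and `withVectorToArray` are hypothetical:

```scala
import org.apache.spark.ml.functions.vector_to_array
import org.apache.spark.ml.linalg.Vector
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical helper object; the names are illustrative, not from the original post.
object ProbabilityToCsv {

  // Variant A: a UDF that reads element 1 of the ML vector as a Double.
  // Assumes a binary classifier, so index 1 holds the class-1 probability.
  private val positiveProb = udf((v: Vector) => v(1))

  def withUdf(predictions: DataFrame): DataFrame =
    predictions.withColumn("probability", positiveProb(col("probability")))

  // Variant B (assumes Spark 3.0+): convert the vector to array<double>,
  // then take element 1 with getItem.
  def withVectorToArray(predictions: DataFrame): DataFrame =
    predictions.withColumn(
      "probability",
      vector_to_array(col("probability")).getItem(1))
}
```

Either variant replaces the vector with a plain double column, so a call such as `ProbabilityToCsv.withUdf(predictPredict)` followed by the same `select(...).write` chain shown earlier should be accepted by the CSV writer.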
