SparkML

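SparkML_lr_train: reads the train table produced by the Python preprocessing step, trains a model on it, and saves the trained model.
SparkML_lr_predict: loads the trained model and reads the test table produced by the Python preprocessing step to run predictions. The predictions are written to normal_data, and the value of stream_is_normal is updated for each row by id.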

Submitting the Spark jobs

bin/spark-submit \
--class SparkML_lr_train \
--master yarn \
--deploy-mode cluster \
./SparkML_lr_train1.jar \
10


bin/spark-submit \
--class SparkML_lr_train \
--master yarn \
--deploy-mode client \
./SparkML_lr_train4.jar \
10


bin/spark-submit \
--class SparkML_lr_predict \
--master yarn \
--deploy-mode client \
./SparkML_lr_predict.jar \
10


bin/spark-submit \
--class lr_train \
--master yarn \
--deploy-mode client \
./lr_train.jar \
10


bin/spark-submit \
--class lr_predict \
--master yarn \
--deploy-mode client \
./lr_predict.jar \
10
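
Because the prediction job loads the model that the training job saves, the two submissions have to run in that order. The following is a minimal sketch of a wrapper that does this, not part of the original project: the names submit_job, run_pipeline, DRY_RUN and SPARK_SUBMIT are hypothetical, and client deploy mode is assumed because most of the commands above use it. The trailing argument (10 above) is simply passed through to the main class; the source does not say how the jobs use it.

```shell
# Hypothetical wrapper around the spark-submit commands shown above.
# Set DRY_RUN=1 to print the command instead of running it.
# Note: plain POSIX sh has no "local", so the helper variables are global.
submit_job() {
  main_class="$1"
  jar="$2"
  app_arg="${3:-10}"   # passed through to the application's main method

  # Rebuild the positional parameters as the full command line.
  set -- "${SPARK_SUBMIT:-bin/spark-submit}" \
    --class "$main_class" \
    --master yarn \
    --deploy-mode client \
    "$jar" "$app_arg"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Train first; run prediction only if the training job exits successfully.
run_pipeline() {
  submit_job lr_train ./lr_train.jar 10 &&
    submit_job lr_predict ./lr_predict.jar 10
}
```

Running run_pipeline with DRY_RUN=1 prints both commands without contacting the cluster, which is a cheap way to check class and jar names before a real submission.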


Start Hadoop (via the startup script)
hdp.sh start
Start Spark (from the command line)
sbin/start-all.sh
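
Before submitting, it can help to confirm that the daemons actually came up. The sketch below is one possible check under stated assumptions: missing_daemons is a hypothetical helper, it expects the output of the JDK's jps tool on stdin (one "<pid> <MainClass>" pair per line), and the daemon names to pass depend on how the cluster is laid out.

```shell
# Hypothetical helper: prints every expected daemon name that does not
# appear as the second field of the jps output supplied on stdin.
missing_daemons() {
  jps_output="$(cat)"
  for name in "$@"; do
    # Exact match on the class-name column, so "NameNode" does not
    # match "SecondaryNameNode".
    if ! printf '%s\n' "$jps_output" |
        awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      echo "$name"
    fi
  done
}

# Example use on a single-node setup (the daemon list is an assumption):
# jps | missing_daemons NameNode DataNode ResourceManager NodeManager
```

An empty result means every listed daemon was found; on a multi-node cluster each node runs only some of these processes, so the list has to be adjusted per node.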

bin/spark-submit \
--class SparkSQL_lr_train \
--master yarn \
--deploy-mode client \
./SparkSQL_lr_train.jar \
10

bin/spark-submit \
--class lr_train \
--master yarn \
--deploy-mode client \
./lr_train.jar \
10
