Spark integration with TensorFlowOnSpark: testing MNIST in standalone mode

WeChat official account (SZBigdata-Club): future blog posts will also be published on the official account.
1. The account will regularly push technical documents, learning videos, technical books, datasets, and more.
2. Reader contributions are welcome.
3. Company HR recruiters can contact me privately; send me your job postings and I will push them on the account.

Technical discussion group: 59701880 (Shenzhen/Guangzhou Hadoop friends group)

Prerequisites

Set up a TensorFlow environment

Download the TensorFlowOnSpark code


 

```shell
git clone https://github.com/yahoo/TensorFlowOnSpark.git
cd TensorFlowOnSpark
export TFoS_HOME=$(pwd)
```

Install Spark

TensorFlowOnSpark ships a script that downloads Spark for you; run it directly:


 

```shell
${TFoS_HOME}/scripts/local-setup-spark.sh
rm spark-1.6.0-bin-hadoop2.6.tar
export SPARK_HOME=$(pwd)/spark-1.6.0-bin-hadoop2.6
export PATH=${SPARK_HOME}/bin:${PATH}
```

Install TensorFlow and TensorFlowOnSpark

Here we install tensorflow and tensorflowonspark with pip. The latest tensorflow release at the time of writing is 1.2.x, but my tests use version 0.12.1; you can pin a specific version with ==${version}.


 

```shell
sudo pip install tensorflow==0.12.1
sudo pip install tensorflowonspark
```

Download the MNIST data


 

```shell
mkdir ${TFoS_HOME}/mnist
pushd ${TFoS_HOME}/mnist
curl -O "http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz"
curl -O "http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz"
curl -O "http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz"
curl -O "http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz"
popd
```

Start a standalone Spark cluster


 

```shell
export MASTER=spark://$(hostname):7077
export SPARK_WORKER_INSTANCES=2
export CORES_PER_WORKER=1
export TOTAL_CORES=$((${CORES_PER_WORKER}*${SPARK_WORKER_INSTANCES}))
${SPARK_HOME}/sbin/start-master.sh; ${SPARK_HOME}/sbin/start-slave.sh -c $CORES_PER_WORKER -m 3G ${MASTER}
```

Test pyspark, tensorflow, and tensorflowonspark


 

```shell
pyspark
>>> import tensorflow as tf
>>> from tensorflowonspark import TFCluster
>>> exit()
```
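The same sanity check can be scripted rather than typed into the shell. A small helper of my own (not part of TensorFlowOnSpark) that reports which modules fail to import; run it through spark-submit if you want to verify the worker environments as well:

```python
import importlib

def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing

# On this setup you would check:
#   missing_modules(["tensorflow", "tensorflowonspark"])
print(missing_modules(["json", "no_such_module"]))  # ['no_such_module']
```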

Convert the MNIST archives with Spark


 

```shell
cd ${TFoS_HOME}
# rm -rf examples/mnist/csv
${SPARK_HOME}/bin/spark-submit \
--master ${MASTER} \
${TFoS_HOME}/examples/mnist/mnist_data_setup.py \
--output examples/mnist/csv \
--format csv
ls -lR examples/mnist/csv
```
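mnist_data_setup.py reads the gzipped IDX files downloaded above and rewrites them as CSV records that Spark can partition across workers. The IDX layout itself is simple: a big-endian header (a magic number encoding the element type and the number of dimensions, then one 32-bit size per dimension) followed by the raw unsigned bytes. A minimal sketch of that parsing step, using a synthetic buffer instead of the real files (the function `parse_idx` is my own name, not from the example script; for the real files, read the bytes via `gzip.open` first):

```python
import struct

def parse_idx(data):
    """Parse an IDX buffer: big-endian header followed by raw bytes.

    Returns (dims, payload): the dimension sizes from the header and
    the flat unsigned-byte payload.
    """
    # Magic number: two zero bytes, a type code (0x08 = unsigned byte),
    # and the number of dimensions.
    zero, dtype, ndim = struct.unpack_from(">HBB", data, 0)
    assert zero == 0 and dtype == 0x08, "not an unsigned-byte IDX buffer"
    dims = struct.unpack_from(">" + "I" * ndim, data, 4)
    offset = 4 + 4 * ndim
    return dims, data[offset:]

# Synthetic example: one "2x2 image" (the real train set header is 60000x28x28).
header = struct.pack(">HBB", 0, 0x08, 3) + struct.pack(">III", 1, 2, 2)
payload = bytes([0, 128, 255, 64])
dims, pixels = parse_idx(header + payload)
print(dims)          # (1, 2, 2)
print(list(pixels))  # [0, 128, 255, 64]
```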

Run distributed MNIST training (using feed_dict)


 

```shell
# rm -rf mnist_model
${SPARK_HOME}/bin/spark-submit \
--master ${MASTER} \
--py-files ${TFoS_HOME}/examples/mnist/spark/mnist_dist.py \
--conf spark.cores.max=${TOTAL_CORES} \
--conf spark.task.cpus=${CORES_PER_WORKER} \
--conf spark.executorEnv.JAVA_HOME="$JAVA_HOME" \
${TFoS_HOME}/examples/mnist/spark/mnist_spark.py \
--cluster_size ${SPARK_WORKER_INSTANCES} \
--images examples/mnist/csv/train/images \
--labels examples/mnist/csv/train/labels \
--format csv \
--mode train \
--model mnist_model
ls -l mnist_model
```

Run distributed MNIST inference (using feed_dict)


 

```shell
# rm -rf predictions
${SPARK_HOME}/bin/spark-submit \
--master ${MASTER} \
--py-files ${TFoS_HOME}/examples/mnist/spark/mnist_dist.py \
--conf spark.cores.max=${TOTAL_CORES} \
--conf spark.task.cpus=${CORES_PER_WORKER} \
--conf spark.executorEnv.JAVA_HOME="$JAVA_HOME" \
${TFoS_HOME}/examples/mnist/spark/mnist_spark.py \
--cluster_size ${SPARK_WORKER_INSTANCES} \
--images examples/mnist/csv/test/images \
--labels examples/mnist/csv/test/labels \
--mode inference \
--format csv \
--model mnist_model \
--output predictions
less predictions/part-00000
```

The predictions look like this:


 

```
2017-02-10T23:29:17.009563 Label: 7, Prediction: 7
2017-02-10T23:29:17.009677 Label: 2, Prediction: 2
2017-02-10T23:29:17.009721 Label: 1, Prediction: 1
2017-02-10T23:29:17.009761 Label: 0, Prediction: 0
2017-02-10T23:29:17.009799 Label: 4, Prediction: 4
2017-02-10T23:29:17.009838 Label: 1, Prediction: 1
2017-02-10T23:29:17.009876 Label: 4, Prediction: 4
2017-02-10T23:29:17.009914 Label: 9, Prediction: 9
2017-02-10T23:29:17.009951 Label: 5, Prediction: 6
2017-02-10T23:29:17.009989 Label: 9, Prediction: 9
2017-02-10T23:29:17.010026 Label: 0, Prediction: 0
```
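Each line pairs a timestamp with the true label and the predicted digit (note the one miss above: label 5 predicted as 6), so computing overall accuracy is just line parsing. A quick sketch, assuming exactly the `Label: X, Prediction: Y` format shown above:

```python
import re

LINE_RE = re.compile(r"Label: (\d+), Prediction: (\d+)")

def accuracy(lines):
    """Fraction of prediction lines where the label equals the prediction."""
    pairs = [m.groups() for m in map(LINE_RE.search, lines) if m]
    correct = sum(1 for label, pred in pairs if label == pred)
    return correct / len(pairs)

# In practice you would read the lines from predictions/part-00000.
sample = [
    "2017-02-10T23:29:17.009563 Label: 7, Prediction: 7",
    "2017-02-10T23:29:17.009951 Label: 5, Prediction: 6",
]
print(accuracy(sample))  # 0.5
```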

 

Shut down the Spark cluster


 

```shell
${SPARK_HOME}/sbin/stop-slave.sh; ${SPARK_HOME}/sbin/stop-master.sh
```

Original link:
https://github.com/yahoo/TensorFlowOnSpark/wiki/GetStarted_standalone

 
