Using Spark on a Kubernetes Cluster

Grant permissions

Give the default service account admin rights in the default namespace, so the driver pod can create and manage executor pods:

kubectl create rolebinding default-admin --clusterrole=admin --serviceaccount=default:default --namespace=default
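To confirm the binding took effect, `kubectl auth can-i` can check the two verbs spark-submit relies on in cluster mode (a quick sanity check, assuming the default namespace as above):

```shell
# Verify that the default service account may create and watch pods;
# both commands should answer "yes" once the rolebinding is in place.
kubectl auth can-i create pods --as=system:serviceaccount:default:default
kubectl auth can-i watch pods --as=system:serviceaccount:default:default
```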

Cluster mode test

Start the proxy

# kubectl proxy
Starting to serve on 127.0.0.1:8001
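Before submitting through the proxy, it is worth checking that the API server is actually reachable over it. The `/version` endpoint is a convenient probe (assumes the proxy is running on the default port as above):

```shell
# The proxy exposes the API server over plain HTTP on localhost;
# hitting /version should return a JSON blob with the server version.
curl http://127.0.0.1:8001/version
```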

spark-pi

bin/spark-submit \
    --master k8s://http://127.0.0.1:8001 \
    --deploy-mode cluster \
    --name spark-pi \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.executor.instances=1 \
    --conf spark.kubernetes.container.image=zhixingheyitian/spark:spark2.4.1 \
    /opt/spark/examples/jars/spark-examples_2.11-2.4.1.jar
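While the submission runs, the driver pod can be watched from another terminal. Its name embeds a timestamp, so selecting by the `spark-role` label (visible in the pod status output below) is more convenient than guessing the name:

```shell
# Follow the Spark driver pod through Pending -> Running -> Succeeded.
kubectl get pods -l spark-role=driver --watch
```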

Note that the master URL uses http, not https, because kubectl proxy serves plain HTTP on localhost.
Using https here produces the following error:

Caused by: javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?

response

19/05/14 10:32:29 INFO LoggingPodStatusWatcherImpl: State changed, new state: 
	 pod name: spark-pi-1557801148384-driver
	 namespace: default
	 labels: spark-app-selector -> spark-1285a340b30f42349cca621b3015768f, spark-role -> driver
	 pod uid: 84e3af6b-75f0-11e9-ba83-001e67d8acca
	 creation time: 2019-05-14T02:32:29Z
	 service account name: default
	 volumes: spark-local-dir-1, spark-conf-volume, default-token-qqr9h
	 node name: sr535
	 start time: 2019-05-14T02:26:46Z
	 container images: zhixingheyitian/spark:spark2.4.1
	 phase: Pending
	 status: [ContainerStatus(containerID=null, image=zhixingheyitian/spark:spark2.4.1, imageID=, lastState=ContainerState(running=null, terminated=null, waiting=null, additionalProperties={}), name=spark-kubernetes-driver, ready=false, restartCount=0, state=ContainerState(running=null, terminated=null, waiting=ContainerStateWaiting(message=null, reason=ContainerCreating, additionalProperties={}), additionalProperties={}), additionalProperties={})]
19/05/14 10:32:29 INFO Client: Waiting for application spark-pi to finish...
19/05/14 10:32:31 INFO LoggingPodStatusWatcherImpl: State changed, new state: 
	 pod name: spark-pi-1557801148384-driver
	 namespace: default
	 labels: spark-app-selector -> spark-1285a340b30f42349cca621b3015768f, spark-role -> driver
	 pod uid: 84e3af6b-75f0-11e9-ba83-001e67d8acca
	 creation time: 2019-05-14T02:32:29Z
	 service account name: default
	 volumes: spark-local-dir-1, spark-conf-volume, default-token-qqr9h
	 node name: sr535
	 start time: 2019-05-14T02:26:46Z
	 container images: zhixingheyitian/spark:spark2.4.1
	 phase: Running
	 status: [ContainerStatus(containerID=docker://5850465eae76f274593987938f680a1d80eb2abf5fabd534f10d919adf01602c, image=zhixingheyitian/spark:spark2.4.1, imageID=docker-pullable://zhixingheyitian/spark@sha256:8f08f90c68a5c4806f6ba57d1eb675c239d6fd96ca7c83cec1b6fe0d1ff25d06, lastState=ContainerState(running=null, terminated=null, waiting=null, additionalProperties={}), name=spark-kubernetes-driver, ready=true, restartCount=0, state=ContainerState(running=ContainerStateRunning(startedAt=2019-05-14T02:26:48Z, additionalProperties={}), terminated=null, waiting=null, additionalProperties={}), additionalProperties={})]
19/05/14 10:32:40 INFO LoggingPodStatusWatcherImpl: State changed, new state: 
	 pod name: spark-pi-1557801148384-driver
	 namespace: default
	 labels: spark-app-selector -> spark-1285a340b30f42349cca621b3015768f, spark-role -> driver
	 pod uid: 84e3af6b-75f0-11e9-ba83-001e67d8acca
	 creation time: 2019-05-14T02:32:29Z
	 service account name: default
	 volumes: spark-local-dir-1, spark-conf-volume, default-token-qqr9h
	 node name: sr535
	 start time: 2019-05-14T02:26:46Z
	 container images: zhixingheyitian/spark:spark2.4.1
	 phase: Succeeded
	 status: [ContainerStatus(containerID=docker://5850465eae76f274593987938f680a1d80eb2abf5fabd534f10d919adf01602c, image=zhixingheyitian/spark:spark2.4.1, imageID=docker-pullable://zhixingheyitian/spark@sha256:8f08f90c68a5c4806f6ba57d1eb675c239d6fd96ca7c83cec1b6fe0d1ff25d06, lastState=ContainerState(running=null, terminated=null, waiting=null, additionalProperties={}), name=spark-kubernetes-driver, ready=false, restartCount=0, state=ContainerState(running=null, terminated=ContainerStateTerminated(containerID=docker://5850465eae76f274593987938f680a1d80eb2abf5fabd534f10d919adf01602c, exitCode=0, finishedAt=2019-05-14T02:26:57Z, message=null, reason=Completed, signal=null, startedAt=2019-05-14T02:26:48Z, additionalProperties={}), waiting=null, additionalProperties={}), additionalProperties={})]
19/05/14 10:32:40 INFO LoggingPodStatusWatcherImpl: Container final statuses:


	 Container name: spark-kubernetes-driver
	 Container image: zhixingheyitian/spark:spark2.4.1
	 Container state: Terminated
	 Exit code: 0

Submitting spark-pi directly, without the proxy

cluster-info

# kubectl cluster-info
Kubernetes master is running at https://10.0.2.131:6443
KubeDNS is running at https://10.0.2.131:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
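Instead of copying the master address out of `cluster-info` by hand, it can be read straight from the active kubeconfig (a small convenience sketch; the jsonpath assumes a single-cluster config):

```shell
# Derive the k8s:// master URL for spark-submit from kubeconfig.
MASTER=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')
echo "k8s://$MASTER"
```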

run

bin/spark-submit \
    --master k8s://https://10.0.2.131:6443 \
    --deploy-mode cluster \
    --name spark-pi \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.executor.instances=1 \
    --conf spark.kubernetes.container.image=zhixingheyitian/spark:spark2.4.1 \
    /opt/spark/examples/jars/spark-examples_2.11-2.4.1.jar

If http is used here instead, the following error occurs:

19/05/14 10:40:43 WARN Utils: Kubernetes master URL uses HTTP instead of HTTPS.
19/05/14 10:40:44 WARN WatchConnectionManager: Exec Failure: HTTP 400, Status: 400 - Client sent an HTTP request to an HTTPS server.

java.net.ProtocolException: Expected HTTP 101 response but was '400 Bad Request'
	at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216)
	at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183)
	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141)
	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

View the driver logs

# kubectl logs -f spark-pi-1557801409338-driver
++ id -u
+ myuid=0
++ id -g
+ mygid=0
+ set +e
++ getent passwd 0
+ uidentry=root:x:0:0:root:/root:/bin/ash
+ set -e
+ '[' -z root:x:0:0:root:/root:/bin/ash ']'
+ SPARK_K8S_CMD=driver
+ case "$SPARK_K8S_CMD" in
+ shift 1
+ SPARK_CLASSPATH=':/opt/spark/jars/*'
+ env
+ grep SPARK_JAVA_OPT_
+ sort -t_ -k4 -n
+ sed 's/[^=]*=\(.*\)/\1/g'
+ readarray -t SPARK_EXECUTOR_JAVA_OPTS
+ '[' -n '' ']'
+ '[' -n '' ']'
+ PYSPARK_ARGS=
+ '[' -n '' ']'
+ R_ARGS=
+ '[' -n '' ']'
+ '[' '' == 2 ']'
+ '[' '' == 3 ']'
+ case "$SPARK_K8S_CMD" in
+ CMD=("$SPARK_HOME/bin/spark-submit" --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
+ exec /sbin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=10.32.0.8 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.examples.SparkPi spark-internal
19/05/14 02:31:11 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/05/14 02:31:11 INFO SparkContext: Running Spark version 2.4.1
19/05/14 02:31:11 INFO SparkContext: Submitted application: Spark Pi
19/05/14 02:31:11 INFO SecurityManager: Changing view acls to: root
19/05/14 02:31:11 INFO SecurityManager: Changing modify acls to: root
19/05/14 02:31:11 INFO SecurityManager: Changing view acls groups to: 
19/05/14 02:31:11 INFO SecurityManager: Changing modify acls groups to: 
19/05/14 02:31:11 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
19/05/14 02:31:11 INFO Utils: Successfully started service 'sparkDriver' on port 7078.
19/05/14 02:31:11 INFO SparkEnv: Registering MapOutputTracker
19/05/14 02:31:11 INFO SparkEnv: Registering BlockManagerMaster
19/05/14 02:31:11 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
19/05/14 02:31:11 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
19/05/14 02:31:11 INFO DiskBlockManager: Created local directory at /var/data/spark-eabbba32-543b-41da-8b32-4d77d0b053ff/blockmgr-8b6d4829-7558-4e9f-92cd-34bc24063f99
19/05/14 02:31:11 INFO MemoryStore: MemoryStore started with capacity 413.9 MB
19/05/14 02:31:11 INFO SparkEnv: Registering OutputCommitCoordinator
19/05/14 02:31:12 INFO Utils: Successfully started service 'SparkUI' on port 4040.
19/05/14 02:31:12 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://spark-pi-1557801409338-driver-svc.default.svc:4040
19/05/14 02:31:12 INFO SparkContext: Added JAR file:///opt/spark/examples/jars/spark-examples_2.11-2.4.1.jar at spark://spark-pi-1557801409338-driver-svc.default.svc:7078/jars/spark-examples_2.11-2.4.1.jar with timestamp 1557801072208
19/05/14 02:31:13 INFO ExecutorPodsAllocator: Going to request 1 executors from Kubernetes.
19/05/14 02:31:13 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 7079.
19/05/14 02:31:13 INFO NettyBlockTransferService: Server created on spark-pi-1557801409338-driver-svc.default.svc:7079
19/05/14 02:31:13 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
19/05/14 02:31:13 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, spark-pi-1557801409338-driver-svc.default.svc, 7079, None)
19/05/14 02:31:13 INFO BlockManagerMasterEndpoint: Registering block manager spark-pi-1557801409338-driver-svc.default.svc:7079 with 413.9 MB RAM, BlockManagerId(driver, spark-pi-1557801409338-driver-svc.default.svc, 7079, None)
19/05/14 02:31:13 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, spark-pi-1557801409338-driver-svc.default.svc, 7079, None)
19/05/14 02:31:13 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, spark-pi-1557801409338-driver-svc.default.svc, 7079, None)
19/05/14 02:31:17 INFO KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.32.0.10:55182) with ID 1
19/05/14 02:31:17 INFO KubernetesClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
19/05/14 02:31:17 INFO BlockManagerMasterEndpoint: Registering block manager 10.32.0.10:35701 with 413.9 MB RAM, BlockManagerId(1, 10.32.0.10, 35701, None)
19/05/14 02:31:17 INFO SparkContext: Starting job: reduce at SparkPi.scala:38
19/05/14 02:31:17 INFO DAGScheduler: Got job 0 (reduce at SparkPi.scala:38) with 2 output partitions
19/05/14 02:31:17 INFO DAGScheduler: Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
19/05/14 02:31:17 INFO DAGScheduler: Parents of final stage: List()
19/05/14 02:31:17 INFO DAGScheduler: Missing parents: List()
19/05/14 02:31:17 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
19/05/14 02:31:18 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1936.0 B, free 413.9 MB)
19/05/14 02:31:18 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1256.0 B, free 413.9 MB)
19/05/14 02:31:18 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on spark-pi-1557801409338-driver-svc.default.svc:7079 (size: 1256.0 B, free: 413.9 MB)
19/05/14 02:31:18 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1161
19/05/14 02:31:18 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34) (first 15 tasks are for partitions Vector(0, 1))
19/05/14 02:31:18 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
19/05/14 02:31:18 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 10.32.0.10, executor 1, partition 0, PROCESS_LOCAL, 7885 bytes)
19/05/14 02:31:18 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.32.0.10:35701 (size: 1256.0 B, free: 413.9 MB)
19/05/14 02:31:18 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, 10.32.0.10, executor 1, partition 1, PROCESS_LOCAL, 7885 bytes)
19/05/14 02:31:18 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 790 ms on 10.32.0.10 (executor 1) (1/2)
19/05/14 02:31:19 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 42 ms on 10.32.0.10 (executor 1) (2/2)
19/05/14 02:31:19 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
19/05/14 02:31:19 INFO DAGScheduler: ResultStage 0 (reduce at SparkPi.scala:38) finished in 1.062 s
19/05/14 02:31:19 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 1.139292 s
Pi is roughly 3.142155710778554
19/05/14 02:31:19 INFO SparkUI: Stopped Spark web UI at http://spark-pi-1557801409338-driver-svc.default.svc:4040
19/05/14 02:31:19 INFO KubernetesClusterSchedulerBackend: Shutting down all executors
19/05/14 02:31:19 INFO KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Asking each executor to shut down
19/05/14 02:31:19 WARN ExecutorPodsWatchSnap

Submitting a Spark job from a non-master node requires configuring a token; otherwise the submission fails as follows:

# bin/spark-submit \
>     --master k8s://https://10.0.2.131:6443 \
>     --deploy-mode cluster \
>     --name spark-pi \
>     --class org.apache.spark.examples.SparkPi \
>     --conf spark.executor.instances=1 \
>     --conf spark.kubernetes.container.image=zhixingheyitian/spark:spark2.4.1 \
>     /opt/spark/examples/jars/spark-examples_2.11-2.4.1.jar
19/05/14 10:38:52 WARN client.Config: Error reading service account token from: [/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
19/05/14 10:38:53 WARN internal.WatchConnectionManager: Exec Failure
javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
	at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
	at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949)
	at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302)
	at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296)
	at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1509)
	at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216)
	at sun.security.ssl.Handshaker.processLoop(Handshaker.java:979)
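One way to fix this PKIX failure is to pass a service-account token and the cluster CA certificate explicitly via Spark's submission-side authentication settings (`spark.kubernetes.authenticate.submission.oauthToken` and `...caCertFile`). A sketch, assuming the default token secret shown in the pod status above and a kubeadm-style CA path; both will differ per cluster:

```shell
# Extract the bearer token from the default service-account secret
# (secret name and CA path are cluster-specific assumptions).
TOKEN=$(kubectl describe secret default-token-qqr9h | grep '^token:' | awk '{print $2}')

bin/spark-submit \
    --master k8s://https://10.0.2.131:6443 \
    --deploy-mode cluster \
    --name spark-pi \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.executor.instances=1 \
    --conf spark.kubernetes.authenticate.submission.oauthToken=$TOKEN \
    --conf spark.kubernetes.authenticate.submission.caCertFile=/etc/kubernetes/pki/ca.crt \
    --conf spark.kubernetes.container.image=zhixingheyitian/spark:spark2.4.1 \
    /opt/spark/examples/jars/spark-examples_2.11-2.4.1.jar
```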

Client mode

Unset Hadoop environment variables to rule out interference:

unset HADOOP_HOME
unset HADOOP_CONF_DIR

run

bin/spark-sql \
    --master k8s://https://10.0.2.131:6443 \
    --name spark-sql \
    --conf spark.executor.instances=1 \
    --conf spark.kubernetes.container.image=zhixingheyitian/spark:spark2.4.1-oap-aep-executor-centos
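In client mode the driver runs on the submitting machine and executors connect back to it, so the driver address must be reachable from the pod network. If executors fail to register, pinning the driver host explicitly can help (a sketch; 10.0.2.131 is assumed to be the submitting machine's address):

```shell
bin/spark-sql \
    --master k8s://https://10.0.2.131:6443 \
    --name spark-sql \
    --conf spark.executor.instances=1 \
    --conf spark.driver.host=10.0.2.131 \
    --conf spark.kubernetes.container.image=zhixingheyitian/spark:spark2.4.1-oap-aep-executor-centos
```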

response

19/05/14 15:10:22 INFO OapIndexInfo: 
host 10.32.0.8 executor id: 1
partition file: hdfs://bdpe101:9000/tpcds_parquet_gzip/tpcds_1g_parquet/store_sales/part-00000-a0ceba57-9659-45cd-a1ad-687011e9c91e-c000.gz.parquet use OAP index: false
19/05/14 15:10:29 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 12409 ms on 10.32.0.8 (executor 1) (1/1)
19/05/14 15:10:29 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
19/05/14 15:10:29 INFO DAGScheduler: ResultStage 0 (processCmd at CliDriver.java:376) finished in 12.564 s
19/05/14 15:10:29 INFO DAGScheduler: Job 0 finished: processCmd at CliDriver.java:376, took 12.636331 s
2451897	50904	5695	38271	442082	2693	46104	4	26	1	91	47.2	65.13	31.26	0.0	2844.66	4295.2	5926.83	170.67	0.0	2844.66	3015.33	-1450.54
2451897	50904	1871	38271	442082	2693	46104	4	263	1	75	83.79	93.84	3.75	0.0	281.25	6284.25	7038.0	14.06	0.0	281.25	295.31	-6003.0
2451897	50904	3533	38271	442082	2693	46104	4	179	1	15	90.79	128.01	24.32	240.76	364.8	1361.85	1920.15	8.68	240.76	124.04	132.72	-1237.81
2451897	50904	14989	38271	442082	2693	46104	4	89	1	33	28.98	40.57	23.53	0.0	776.49	956.34	1338.81	7.76	0.0	776.49	784.25	-179.85
2451897	50904	7928	38271	442082	2693	46104	4	30	1	14	87.61	145.43	84.34	0.0	1180.76	1226.54	2036.02	11.8	0.0	1180.76	1192.56	-45.78
2451897	50904	15233	38271	442082	2693	46104	4	72	1	70	96.43	109.93	47.26	0.0	3308.2	6750.1	7695.1	165.41	0.0	3308.2	3473.61	-3441.9
2451897	NULL	9497	38271	NULL	NULL	NULL	NULL	NULL	1	NULL	NULL	NULL	21.32	NULL	NULL	2389.86	4492.62	NULL	NULL	527.67	538.22	-1862.19
NULL	50904	15193	38271	442082	NULL	46104	NULL	NULL	1	24	NULL	9.8	NULL	NULL	NULL	NULL	NULL	NULL	NULL	NULL	NULL	NULL
2452129	31625	1219	95293	605087	4481	40275	1	53	2	40	4.75	8.02	0.88	0.0	35.2	190.0	320.8	1.4	0.0	35.2	36.6	-154.8
2452129	31625	16745	95293	605087	4481	40275	1	130	2	99	70.57	141.14	77.62	0.0	7684.38	6986.43	13972.86	537.9	0.0	7684.38	8222.28	697.95
Time taken: 16.377 seconds, Fetched 10 row(s)
19/05/14 15:10:29 INFO SparkSQLCLIDriver: Time taken: 16.377 seconds, Fetched 10 row(s)
