In cluster deploy-mode, the Spark job is submitted to the Spark cluster and the driver is executed on one of the worker nodes.
Our current Spark service uses client deploy-mode, where the driver is executed on the local machine (Jupyter, web shell, Zeppelin).
Supported languages: Scala and Java. Python is currently NOT supported (as of Spark v2.4.4) in Spark standalone clusters.
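As a point of reference, a cluster-mode submission of a Scala/Java job could look like the sketch below. The master URL, application class, and jar path are all placeholders rather than values taken from this environment, and the command is only printed here because actually running it requires a live cluster.

```shell
# Hypothetical spark-submit invocation for cluster deploy-mode on a
# standalone cluster. Master URL, class name, and jar path are placeholders.
SPARK_MASTER="spark://spark-master.default-tenant.svc:7077"
SUBMIT_CMD="spark-submit \
  --master ${SPARK_MASTER} \
  --deploy-mode cluster \
  --class com.example.MyApp \
  local:///opt/jobs/my-app.jar"

# Printed rather than executed, since submission needs a running master.
echo "${SUBMIT_CMD}"
```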
Edit deployment.yaml so that the value of the environment variable NAMESPACE is default-tenant:
env:
  - name: NAMESPACE
    value: default-tenant
kubectl apply -f rbac.yaml
kubectl apply -f deployment.yaml
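After applying both manifests, it is worth confirming that the operator rolled out. The deployment name and namespace below are assumptions (the operator may be named differently in your deployment.yaml), so the command is printed rather than executed:

```shell
# Hedged sketch: the deployment name below is an assumption based on the
# annotation the operator watches; check metadata.name in deployment.yaml.
NAMESPACE="default-tenant"
ROLLOUT_CMD="kubectl -n ${NAMESPACE} rollout status deployment/k8s-pod-headless-service-operator"
echo "${ROLLOUT_CMD}"
```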
Modify the spark-worker deployment to include the v3io-jars in its classpath,
and add an annotation to the created pods for the k8s-pod-headless-service-operator.
kubectl -n default-tenant edit deployment <spark-worker-deployment name>
To include the v3io-jars you need to edit the command and args under ‘spec.template.spec.containers’
The annotation should be added under ‘spec.template.metadata’
It should look like:

spec:
  template:
    metadata:
      annotations:
        srcd.host/create-headless-service: "true"
    spec:
      containers:
      - args:
        - cp /igz/java/libs/v3io-*.jar /spark/jars; /bin/bash /etc/config/v3io/v3io-spark.sh
        command:
        - /bin/bash
        - -c
Verify by checking that the new worker pods were deployed successfully.
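A sketch of this verification step; the app=spark-worker label selector is an assumption (use the labels your worker deployment actually carries), so the commands are printed rather than run:

```shell
# Hedged sketch: the label selector is an assumption; adjust it to the
# labels on your spark-worker deployment.
NAMESPACE="default-tenant"
PODS_CMD="kubectl -n ${NAMESPACE} get pods -l app=spark-worker"
# The new pods should carry the srcd.host/create-headless-service annotation.
ANNOTATIONS_CMD="kubectl -n ${NAMESPACE} get pods -l app=spark-worker -o jsonpath='{.items[*].metadata.annotations}'"
echo "${PODS_CMD}"
echo "${ANNOTATIONS_CMD}"
```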
Next, modify the spark-master deployment to expose the master ports and add the same annotation:
kubectl -n default-tenant edit deployment <spark-master-deployment name>
The ports section to modify is under ‘spec.template.spec.containers’
The annotation should be added under ‘spec.template.metadata’
It should look like:

spec:
  template:
    metadata:
      annotations:
        srcd.host/create-headless-service: "true"
    spec:
      containers:
      - ports:
        - containerPort: 6066
          protocol: TCP
        - containerPort: 7077
          protocol: TCP
        - containerPort: 8088
          protocol: TCP
Verify by checking that the new master pod was deployed successfully.
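Similarly for the master, a sketch of the check; the app=spark-master label selector and the expectation that the operator creates a headless service for the pod are both assumptions, so the commands are printed rather than run:

```shell
# Hedged sketch: the label selector is an assumption; the second command
# checks that the operator created a headless service for the master pod.
NAMESPACE="default-tenant"
POD_CMD="kubectl -n ${NAMESPACE} get pods -l app=spark-master"
SVC_CMD="kubectl -n ${NAMESPACE} get services"
echo "${POD_CMD}"
echo "${SVC_CMD}"
```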