36 changes: 36 additions & 0 deletions docs/modules/spark-k8s/examples/getting_started/application.yaml
@@ -0,0 +1,36 @@
---
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
metadata:
name: pyspark-pi # <1>
namespace: default
spec:
sparkImage: # <2>
productVersion: 3.5.7
mode: cluster # <3>
mainApplicationFile: local:///stackable/spark/examples/src/main/python/pi.py # <4>
job: # <5>
config:
resources:
cpu:
min: "1"
max: "2"
memory:
limit: "1Gi"
driver: # <6>
config:
resources:
cpu:
min: "1"
max: "2"
memory:
limit: "1Gi"
executor: # <7>
replicas: 1
config:
resources:
cpu:
min: "1"
max: "2"
memory:
limit: "1Gi"
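Once the manifest above is saved as `application.yaml`, it could be submitted and inspected roughly as follows (a sketch, assuming `kubectl` is configured against a cluster where the Stackable Spark-k8s operator is installed):

```shell
# Submit the SparkApplication custom resource defined above
kubectl apply -f application.yaml

# Inspect it; the resource name and namespace come from the manifest's metadata,
# the full resource name from the manifest's apiVersion and kind
kubectl -n default get sparkapplications.spark.stackable.tech pyspark-pi
```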
@@ -47,36 +47,7 @@ esac

echo "Creating a Spark Application..."
# tag::install-sparkapp[]
kubectl apply -f - <<EOF
---
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
metadata:
name: pyspark-pi
namespace: default
spec:
sparkImage:
productVersion: 3.5.7
mode: cluster
mainApplicationFile: local:///stackable/spark/examples/src/main/python/pi.py
driver:
config:
resources:
cpu:
min: "1"
max: "2"
memory:
limit: "1Gi"
executor:
replicas: 1
config:
resources:
cpu:
min: "1"
max: "2"
memory:
limit: "1Gi"
EOF
kubectl apply -f application.yaml
# end::install-sparkapp[]

sleep 15
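The script above pauses for a fixed 15 seconds before continuing. A possibly more robust alternative (not part of this change, and the label selector is an assumption about how the operator labels the Pods it creates) would be to wait for the application's Pods explicitly:

```shell
# Block until the Pods belonging to pyspark-pi are ready, up to 5 minutes
kubectl wait --for=condition=Ready pod \
  -l app.kubernetes.io/instance=pyspark-pi \
  --timeout=300s
```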
22 changes: 13 additions & 9 deletions docs/modules/spark-k8s/pages/getting_started/first_steps.adoc
@@ -12,23 +12,27 @@ A Spark application is made up of three components:
* Driver: the driver starts the designated number of executors and removes them when the job is completed.
* Executor(s): responsible for executing the job itself

Create a `SparkApplication`:
Create a Spark application by running:

[source,bash]
----
include::example$getting_started/getting_started.sh[tag=install-sparkapp]
----

Where:
The application manifest references the application file to be started, along with its configuration and the resources it needs.

* `metadata.name` contains the name of the SparkApplication
* `spec.version`: SparkApplication version (1.0). This can be freely set by the users and is added by the operator as label to all workload resources created by the application.
* `spec.sparkImage`: the image used by the job, driver and executor pods. This can be a custom image built by the user or an official Stackable image. Available official images are stored in the Stackable https://oci.stackable.tech/[image registry,window=_blank]. Information on how to browse the registry can be found xref:contributor:project-overview.adoc#docker-images[here,window=_blank].
* `spec.mode`: only `cluster` is currently supported
* `spec.mainApplicationFile`: the artifact (Java, Scala or Python) that forms the basis of the Spark job.
[source,yaml]
----
include::example$getting_started/application.yaml[]
----
<1> `metadata.name` contains the name of the SparkApplication
<2> `spec.sparkImage`: the image used by the job, driver and executor pods. This can be a custom image built by the user or an official Stackable image. Available official images are stored in the Stackable https://oci.stackable.tech/[image registry,window=_blank]. Information on how to browse the registry can be found xref:contributor:project-overview.adoc#docker-images[here,window=_blank].
<3> `spec.mode`: only `cluster` is currently supported
<4> `spec.mainApplicationFile`: the artifact (Java, Scala or Python) that forms the basis of the Spark job.
This path is relative to the image, so in this case an example Python script (which calculates the value of pi) is run: it is bundled with the Spark code and therefore already present in the job image
* `spec.driver`: driver-specific settings.
* `spec.executor`: executor-specific settings.
<5> `spec.job`: settings for the job Pod that runs `spark-submit`.
<6> `spec.driver`: driver-specific settings.
<7> `spec.executor`: executor-specific settings.
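The bundled `pi.py` referenced by `spec.mainApplicationFile` estimates pi with a Monte Carlo method, distributing the sampling across the executors. The core idea can be sketched in plain Python (an illustration only; the real example script spreads the same computation over Spark executors rather than running a single loop):

```python
import random


def estimate_pi(num_samples: int, seed: int = 42) -> float:
    """Estimate pi by sampling random points in the unit square and
    counting how many fall inside the quarter circle of radius 1."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle covers pi/4 of the unit square
    return 4.0 * inside / num_samples


print(estimate_pi(100_000))
```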

== Verify that it works
