GCP
Deploy the Quanton Operator on Google Cloud Platform using GKE.
New to GKE? Follow the GKE deployment guide for a step-by-step walkthrough from cluster creation to your first Spark job.
GKE
Google Kubernetes Engine (GKE) is the recommended deployment target for Quanton on GCP. The Quanton Operator runs on your GKE cluster and manages the full Spark job lifecycle via Kubernetes.
Prerequisites
- GKE cluster running Kubernetes >= 1.28 (with container.googleapis.com enabled and billing linked on the project)
- Helm >= 3.x and the gcloud CLI, logged in (gcloud auth login)
- onehouse-values.yaml downloaded from the Onehouse console
- Outbound network access from your cluster to *.onehouse.ai and *.docker.io
If you don't have a cluster yet, follow the GKE deployment guide.
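A quick preflight check can catch a missing CLI or an outdated cluster before you start. The following is a minimal sketch, not part of any Onehouse tooling: the helper name version_ge and the sample version string 1.29.4 are illustrative, and the comparison relies on sort -V being available on your machine.

```shell
#!/usr/bin/env bash
# Preflight sketch for the prerequisites above.
# version_ge and the sample version are illustrative, not Onehouse tooling.

# version_ge A B: succeeds when version A is at least version B.
# Relies on `sort -V` (version sort) being available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report which of the required CLIs are on PATH.
for tool in gcloud kubectl helm; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found:   $tool"
  else
    echo "missing: $tool"
  fi
done

# Example comparison against the documented Kubernetes minimum (1.28).
if version_ge "1.29.4" "1.28"; then
  echo "1.29.4 satisfies >= 1.28"
fi
```

To check a real cluster, replace the sample version with the value reported by gcloud container clusters describe for your cluster (the currentMasterVersion field).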
Step 1: Configure kubectl for GKE
GKE needs an auth plugin and an explicit cluster-admin binding before Helm will work. Run these once after creating your cluster:
# Install the GKE auth plugin (required for kubectl)
gcloud components install gke-gcloud-auth-plugin
# Point kubectl at your cluster
gcloud container clusters get-credentials <cluster-name> --zone <zone>
# Grant yourself cluster-admin (Helm needs this to create CRDs)
kubectl create clusterrolebinding cluster-admin-binding \
  --clusterrole=cluster-admin \
  --user=$(gcloud config get-value account)
Without the cluster-admin binding, the Helm installs in steps 2–3 will fail with forbidden errors. If the binding command itself fails with a permissions error, grant yourself roles/container.admin on the project first — see Step 6 of the GKE deployment guide for the exact command.
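Before running Helm, you can ask the API server whether the binding took effect. The sketch below wraps kubectl auth can-i in a hypothetical helper, check_install_permissions. The list of resources is an assumption about what the charts create, not something taken from the charts themselves.

```shell
# check_install_permissions: hypothetical helper that asks the API server
# whether the current user may create some cluster-scoped resources.
# The resource list is an assumption, not derived from the Helm charts.
check_install_permissions() {
  local resource answer
  for resource in customresourcedefinitions clusterroles clusterrolebindings; do
    # `kubectl auth can-i` prints "yes" or "no" and exits non-zero on "no".
    answer=$(kubectl auth can-i create "$resource" 2>/dev/null) || true
    echo "create ${resource}: ${answer:-unknown}"
  done
}
```

Run check_install_permissions after get-credentials. Any line ending in "no" suggests the cluster-admin binding is missing or was created for a different account.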
Step 2: Install the Spark Operator
The Quanton Operator builds on top of the kubeflow Spark Operator. Install it first:
helm repo add spark-operator https://kubeflow.github.io/spark-operator
helm repo update
helm install spark-operator spark-operator/spark-operator \
  --namespace spark-operator \
  --create-namespace \
  --set "spark.jobNamespaces={default}"
Verify it's running:
kubectl get pods -n spark-operator
Step 3: Install the Quanton Operator
helm upgrade --install quanton-operator oci://registry-1.docker.io/onehouseai/quanton-operator \
  --namespace quanton-operator \
  --create-namespace \
  --set "quantonOperator.jobNamespaces={default}" \
  -f /path/to/onehouse-values.yaml
Verify the operator pod is running:
kubectl get pods -n quanton-operator
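In a script, polling kubectl get pods by eye is impractical. One option is to block until both operators report ready. This sketch assumes each operator is installed as a Kubernetes Deployment in its namespace, which the charts do not guarantee. The helper name wait_for_operators and the 180-second timeout are illustrative.

```shell
# wait_for_operators: hypothetical helper that blocks until every Deployment
# in both operator namespaces reports the Available condition.
# Assumes the operators are installed as Deployments.
wait_for_operators() {
  local ns
  for ns in spark-operator quanton-operator; do
    kubectl wait deployment --all \
      --namespace "$ns" \
      --for=condition=Available \
      --timeout=180s || return 1
  done
}
```

If the wait times out, kubectl describe pods in the affected namespace usually shows whether the cause is an image pull failure or an unschedulable pod.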
Step 4: Submit a Spark job
apiVersion: quantonsparkoperator.onehouse.ai/v1beta2
kind: QuantonSparkApplication
metadata:
  name: my-spark-job
  namespace: default
spec:
  sparkApplicationSpec:
    type: Python
    mode: cluster
    image: "dist.onehouse.ai/onehouseai/quanton-spark:release-v1.29.0-al2023"
    mainApplicationFile: "gs://my-bucket/jobs/my_job.py"
    sparkVersion: "3.5.0"
    sparkConf:
      "spark.hadoop.google.cloud.auth.service.account.enable": "true"
    driver:
      cores: 4
      memory: "8192m"
      serviceAccount: spark-operator-spark
    executor:
      cores: 4
      instances: 4
      memory: "8192m"
Save the manifest as my-spark-job.yaml, then apply it:
kubectl apply -f my-spark-job.yaml
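Once the manifest is applied, you will usually want to see whether a driver pod started and what it logged. The sketch below avoids guessing the custom resource's plural name by listing the API group from the manifest's apiVersion. It then finds the driver through the spark-role=driver label that Spark on Kubernetes sets on driver pods. It assumes Quanton launches drivers the same way. The helper name inspect_job is hypothetical.

```shell
# inspect_job [namespace]: hypothetical helper for a first look at a job.
inspect_job() {
  local ns="${1:-default}"
  # Show the resource names served by the Quanton API group, so you know
  # what to pass to `kubectl get` for the custom resource itself.
  kubectl api-resources --api-group=quantonsparkoperator.onehouse.ai
  # Spark on Kubernetes labels driver pods with spark-role=driver.
  kubectl get pods --namespace "$ns" --selector spark-role=driver
  # Tail the most recent driver log lines.
  kubectl logs --namespace "$ns" --selector spark-role=driver --tail=50
}
```

Calling inspect_job with no argument checks the default namespace used in the manifest above.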
GCS access via Workload Identity
Use Workload Identity to authenticate pods to GCS without service account keys.
1. Enable Workload Identity on the cluster. The GKE deployment guide creates a cluster without it — enable it before binding service accounts:
gcloud container clusters update <cluster-name> \
  --zone <zone> \
  --workload-pool=<project>.svc.id.goog
Updating the cluster does not change node pools that already exist. On each existing node pool, also enable the GKE metadata server:
gcloud container node-pools update <pool-name> \
  --cluster <cluster-name> \
  --zone <zone> \
  --workload-metadata=GKE_METADATA
2. Bind the Kubernetes service account to a GCP service account:
gcloud iam service-accounts add-iam-policy-binding \
  spark-gcs@<project>.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:<project>.svc.id.goog[default/spark-operator-spark]"
kubectl annotate serviceaccount spark-operator-spark \
  iam.gke.io/gcp-service-account=spark-gcs@<project>.iam.gserviceaccount.com \
  -n default
The GCP service account needs roles/storage.objectAdmin on your GCS buckets.
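Granting that role per bucket keeps the service account's access narrow. The sketch below uses gcloud storage buckets add-iam-policy-binding. The helper name grant_bucket_access and its two arguments (bucket name and project ID) are illustrative. It assumes the spark-gcs service account from the previous step already exists.

```shell
# grant_bucket_access BUCKET PROJECT: hypothetical helper that grants the
# spark-gcs service account roles/storage.objectAdmin on a single bucket.
grant_bucket_access() {
  local bucket="$1" project="$2"
  gcloud storage buckets add-iam-policy-binding "gs://${bucket}" \
    --member "serviceAccount:spark-gcs@${project}.iam.gserviceaccount.com" \
    --role roles/storage.objectAdmin
}
```

For example, grant_bucket_access my-bucket my-project covers the bucket referenced by mainApplicationFile in Step 4, where my-project stands in for your project ID. Repeat the call for every bucket the job reads or writes.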
Dedicated node pool (optional)
For best performance, run Spark pods on a dedicated node pool:
gcloud container node-pools create spark-pool \
  --cluster=<cluster-name> \
  --zone=<zone> \
  --machine-type=n2-standard-8 \
  --num-nodes=4 \
  --node-labels=workload=spark
Set a matching node selector in onehouse-values.yaml:
quantonOperator:
  nodeSelector:
    workload: spark
Then re-apply the Helm install with the updated values file.
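Re-applying means running the Step 3 command again with the edited values file. The sketch below does that and then lists the labeled nodes and the pods with their node assignments, so you can confirm where pods landed. The helper name reapply_and_check is hypothetical. Checking the default namespace assumes your jobs run there, as in Step 4.

```shell
# reapply_and_check VALUES_FILE: hypothetical helper that re-runs the Step 3
# Helm command, then shows the nodes and pods relevant to the node selector.
reapply_and_check() {
  local values_file="$1"
  helm upgrade --install quanton-operator oci://registry-1.docker.io/onehouseai/quanton-operator \
    --namespace quanton-operator \
    --create-namespace \
    --set "quantonOperator.jobNamespaces={default}" \
    -f "$values_file" || return 1
  # Nodes carrying the label set with --node-labels=workload=spark.
  kubectl get nodes --selector workload=spark
  # The NODE column shows which node each pod was scheduled on.
  kubectl get pods --namespace default --output wide
}
```

Pods scheduled before the change keep their current node, so submit a new job to see the selector take effect.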