Azure
Deploy the Quanton Operator on Microsoft Azure using AKS.
New to AKS? Follow the AKS deployment guide for a step-by-step walkthrough from cluster creation to your first Spark job.
AKS
Azure Kubernetes Service (AKS) is the recommended deployment target for Quanton on Azure. The Quanton Operator runs on your AKS cluster and manages the full Spark job lifecycle via Kubernetes.
Prerequisites
- AKS cluster running Kubernetes >= 1.28
- Helm >= 3.x and kubectl configured for your cluster
- onehouse-values.yaml downloaded from the Onehouse console
- Outbound network access from your cluster to *.onehouse.ai and *.docker.io
If you don't have a cluster yet — or your Azure subscription is fresh and needs Microsoft.ContainerService / Microsoft.Compute provider registration — follow the AKS deployment guide first.
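The version prerequisites above can be checked from a terminal. The following is a minimal sketch, not an official Onehouse tool: `version_ge` and `check_prereqs` are hypothetical helper names, the comparison relies on `sort -V` being available, and the kubelet version of the first node is used as a stand-in for the cluster's Kubernetes version.

```shell
# version_ge A B: succeeds when version A >= version B.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_prereqs: hypothetical helper comparing installed versions with the minimums above.
check_prereqs() {
  local k8s helm_v
  # Kubelet version of the first node, e.g. "v1.29.4" (a proxy for the cluster version).
  k8s=$(kubectl get nodes -o jsonpath='{.items[0].status.nodeInfo.kubeletVersion}') || return 1
  # Helm client version, e.g. "v3.14.0".
  helm_v=$(helm version --template '{{.Version}}') || return 1

  # Strip the leading "v" before comparing.
  version_ge "${k8s#v}" "1.28" || { echo "Kubernetes ${k8s} is older than 1.28" >&2; return 1; }
  version_ge "${helm_v#v}" "3" || { echo "Helm ${helm_v} is older than 3.x" >&2; return 1; }
  echo "OK: Kubernetes ${k8s}, Helm ${helm_v}"
}
```

After defining the functions in your shell, run check_prereqs against the cluster that kubectl currently points to. It does not check for onehouse-values.yaml or outbound network access.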
Step 1: Install the Spark Operator
The Quanton Operator builds on the Kubeflow Spark Operator. Install it first:
helm repo add spark-operator https://kubeflow.github.io/spark-operator
helm repo update
helm install spark-operator spark-operator/spark-operator \
  --namespace spark-operator \
  --create-namespace \
  --set "spark.jobNamespaces={default}"
Verify it's running:
kubectl get pods -n spark-operator
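Pods can take a short while to become ready after a Helm install, so a one-off `kubectl get pods` may show them still starting. As a sketch, the hypothetical helper below blocks until the Deployments in a namespace report Available. It assumes the chart installs the operator as one or more Deployments, and the 180s default timeout is an arbitrary choice.

```shell
# wait_for_operator NAMESPACE [TIMEOUT]
# Hypothetical helper: waits until every Deployment in NAMESPACE is Available,
# then lists the pods. Assumes the operator runs as one or more Deployments.
wait_for_operator() {
  local ns="$1" timeout="${2:-180s}"
  kubectl wait --for=condition=Available deployment --all \
    --namespace "$ns" --timeout="$timeout" || return 1
  kubectl get pods --namespace "$ns"
}
```

For example, `wait_for_operator spark-operator` covers this step, and the same helper can be pointed at the quanton-operator namespace after Step 2.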
Step 2: Install the Quanton Operator
helm upgrade --install quanton-operator oci://registry-1.docker.io/onehouseai/quanton-operator \
  --namespace quanton-operator \
  --create-namespace \
  --set "quantonOperator.jobNamespaces={default}" \
  -f /path/to/onehouse-values.yaml
Verify the operator pod is running and that the image-pull secret was created from your onehouse-values.yaml:
kubectl get pods -n quanton-operator
kubectl get secret -n quanton-operator | grep onehouse
If the registry secret is missing, the onehouseConfig.imagePullSecrets.accessToken field in your values file is empty or invalid — re-download onehouse-values.yaml from the console.
Step 3: Submit a Spark job
Save the following manifest as my-spark-job.yaml:
apiVersion: quantonsparkoperator.onehouse.ai/v1beta2
kind: QuantonSparkApplication
metadata:
  name: my-spark-job
  namespace: default
spec:
  sparkApplicationSpec:
    type: Python
    mode: cluster
    image: "dist.onehouse.ai/onehouseai/quanton-spark:release-v1.29.0-al2023"
    mainApplicationFile: "abfss://my-container@mystorageaccount.dfs.core.windows.net/jobs/my_job.py"
    sparkVersion: "3.5.0"
    sparkConf:
      "spark.hadoop.fs.azure.account.auth.type.mystorageaccount.dfs.core.windows.net": "OAuth"
      "spark.hadoop.fs.azure.account.oauth.provider.type.mystorageaccount.dfs.core.windows.net": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
      "spark.hadoop.fs.azure.account.oauth2.client.id.mystorageaccount.dfs.core.windows.net": "<client-id>"
      "spark.hadoop.fs.azure.account.oauth2.client.secret.mystorageaccount.dfs.core.windows.net": "<client-secret>"
      "spark.hadoop.fs.azure.account.oauth2.client.endpoint.mystorageaccount.dfs.core.windows.net": "https://login.microsoftonline.com/<tenant-id>/oauth2/token"
    driver:
      cores: 4
      memory: "8192m"
      serviceAccount: spark-operator-spark
    executor:
      cores: 4
      instances: 4
      memory: "8192m"
kubectl apply -f my-spark-job.yaml
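Once the manifest is applied, you will usually want to confirm that the job was picked up. The sketch below is one way to do that. `inspect_job` is a hypothetical helper, and it assumes the CRD can be addressed by kubectl as `quantonsparkapplication` (the lowercase kind name). It finds the driver pod through the `spark-role=driver` label that Spark applies to driver pods on Kubernetes.

```shell
# inspect_job [NAME] [NAMESPACE]
# Hypothetical helper for checking on a submitted job.
inspect_job() {
  local name="${1:-my-spark-job}" ns="${2:-default}"
  # Dump the custom resource, including whatever status the operator has recorded.
  kubectl get quantonsparkapplication "$name" --namespace "$ns" -o yaml || return 1
  # List driver pods in the namespace (Spark labels them spark-role=driver).
  kubectl get pods --namespace "$ns" -l spark-role=driver || return 1
  # Show the most recent driver log lines.
  kubectl logs --namespace "$ns" -l spark-role=driver --tail=50
}
```

Calling `inspect_job` with no arguments uses the name and namespace from the manifest above. If several jobs run in the same namespace, the label selector matches all of their drivers, so narrow it as needed.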
ADLS Gen2 access
The sparkConf block above uses literal client ID/secret values for simplicity. For production, use AKS Workload Identity so pods authenticate to ADLS Gen2 without storing credentials in your job spec:
- Enable Workload Identity on the cluster (az aks update --enable-oidc-issuer --enable-workload-identity ...).
- Federate a User-Assigned Managed Identity to the spark-operator-spark service account in the default namespace.
- Grant that identity Storage Blob Data Contributor on your storage account.
- Swap the oauth2.client.secret config for the Workload Identity provider class; see the Hadoop ABFS authentication docs for the exact spark.hadoop.fs.azure.account.oauth.provider.type value.
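The first three steps above can be sketched with the Azure CLI as follows. This is an illustration under stated assumptions, not a tested procedure: `setup_workload_identity` is a hypothetical helper, the resource group, cluster, and identity names are placeholders, and the storage account is assumed to live in the same resource group as the cluster. The Spark-side provider configuration from the last step is not covered here.

```shell
# Placeholder names; override them through the environment before calling the function.
RG="${RG:-my-rg}"
CLUSTER="${CLUSTER:-my-cluster}"
IDENTITY="${IDENTITY:-spark-adls-identity}"   # hypothetical identity name
STORAGE="${STORAGE:-mystorageaccount}"
SA_NAMESPACE="${SA_NAMESPACE:-default}"
SA_NAME="${SA_NAME:-spark-operator-spark}"

setup_workload_identity() {
  local issuer client_id principal_id storage_id

  # 1. Enable the OIDC issuer and Workload Identity, then read the issuer URL.
  az aks update --resource-group "$RG" --name "$CLUSTER" \
    --enable-oidc-issuer --enable-workload-identity || return 1
  issuer=$(az aks show --resource-group "$RG" --name "$CLUSTER" \
    --query "oidcIssuerProfile.issuerUrl" --output tsv) || return 1

  # 2. Create a user-assigned managed identity and federate it to the service account.
  az identity create --resource-group "$RG" --name "$IDENTITY" || return 1
  client_id=$(az identity show --resource-group "$RG" --name "$IDENTITY" \
    --query clientId --output tsv) || return 1
  principal_id=$(az identity show --resource-group "$RG" --name "$IDENTITY" \
    --query principalId --output tsv) || return 1
  az identity federated-credential create \
    --name "${IDENTITY}-fed" --identity-name "$IDENTITY" --resource-group "$RG" \
    --issuer "$issuer" \
    --subject "system:serviceaccount:${SA_NAMESPACE}:${SA_NAME}" \
    --audience api://AzureADTokenExchange || return 1

  # 3. Grant the identity Storage Blob Data Contributor on the storage account.
  #    Assumes the storage account is in the same resource group as the cluster.
  storage_id=$(az storage account show --resource-group "$RG" --name "$STORAGE" \
    --query id --output tsv) || return 1
  az role assignment create --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --role "Storage Blob Data Contributor" --scope "$storage_id" || return 1

  # Link the Kubernetes service account to the identity's client ID.
  kubectl annotate serviceaccount "$SA_NAME" --namespace "$SA_NAMESPACE" \
    "azure.workload.identity/client-id=${client_id}" --overwrite
}
```

AKS Workload Identity only injects credentials into pods that carry the label azure.workload.identity/use: "true", so the driver and executor pods need that label as well. Role assignments can take a few minutes to propagate before ADLS Gen2 requests succeed.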
Dedicated node pool (optional)
For best performance, run Spark pods on a dedicated node pool:
az aks nodepool add \
  --resource-group my-rg \
  --cluster-name my-cluster \
  --name sparkpool \
  --node-vm-size Standard_D8s_v3 \
  --node-count 4 \
  --labels workload=spark
Set a matching node selector in onehouse-values.yaml:
quantonOperator:
  nodeSelector:
    workload: spark
Then re-apply the Helm install with the updated values file.
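As a sketch, re-applying amounts to repeating the Step 2 command with the edited values file, after confirming that the labeled nodes exist. `reapply_quanton` is a hypothetical helper name, and the final command simply shows which nodes the pods in the quanton-operator namespace were scheduled on.

```shell
# reapply_quanton [VALUES_FILE]
# Hypothetical helper: confirm the labeled node pool, then repeat the Step 2 install.
reapply_quanton() {
  local values="${1:-/path/to/onehouse-values.yaml}"
  # Nodes from the new pool should appear under the workload=spark label.
  kubectl get nodes -l workload=spark || return 1
  helm upgrade --install quanton-operator oci://registry-1.docker.io/onehouseai/quanton-operator \
    --namespace quanton-operator \
    --create-namespace \
    --set "quantonOperator.jobNamespaces={default}" \
    -f "$values" || return 1
  # The NODE column shows where each pod is running.
  kubectl get pods --namespace quanton-operator -o wide
}
```

Run it as `reapply_quanton /path/to/onehouse-values.yaml`, passing the path to your updated file.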