Parallel Compilation

Parallel compilation allows Feldera to compile multiple pipelines concurrently by distributing the workload across several compiler server pods. This dramatically reduces total compile time for large numbers of pipelines, especially in production environments.


How Parallel Compilation Works

Feldera deploys the compiler server as a Kubernetes StatefulSet with N replicas. Each replica (pod) acts as a worker responsible for compiling a subset of pipelines. The assignment is deterministic:

  • Each pipeline is assigned to a worker using pipeline_id % N == worker_id, where the worker ID is the pod index (e.g., feldera-compiler-server-1 has worker ID 1).
  • Each pod compiles only the pipelines assigned to it.
  • Worker 0 (the pod with index 0) acts as the leader. All other workers transfer their compiled binaries to the leader over HTTP/HTTPS before marking the program as successfully compiled.
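The assignment rule above can be sketched in a few lines of Python. This is only an illustration, not Feldera's implementation: it assumes pipeline IDs can be treated as non-negative integers (how real pipeline identifiers are reduced to a number is not described here), and the helper names worker_for, pod_name, and is_leader are hypothetical.

```python
def worker_for(pipeline_id: int, replicas: int) -> int:
    """Worker ID for a pipeline under the rule pipeline_id % N == worker_id."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return pipeline_id % replicas


def pod_name(worker_id: int, statefulset: str = "feldera-compiler-server") -> str:
    """Pod name for a worker: the worker ID is the pod index in the StatefulSet."""
    return f"{statefulset}-{worker_id}"


def is_leader(worker_id: int) -> bool:
    """Worker 0 is the leader that receives binaries from the other workers."""
    return worker_id == 0


# Assumption: integer IDs 0..5 stand in for real pipeline IDs.
for pid in range(6):
    wid = worker_for(pid, replicas=3)
    role = "leader" if is_leader(wid) else "worker"
    print(f"pipeline {pid} -> {pod_name(wid)} ({role})")
```

Because the mapping depends only on the pipeline ID and N, every pod can work out which pipelines it owns without coordinating with the others.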

To further accelerate builds, Feldera optionally supports sccache with an S3-compatible backend. This allows workers to share compiled operator artifacts instead of rebuilding identical code.

Note: Autoscaling based on workload is not yet supported. You must set the number of compiler server replicas at install time or scale them manually later.


Configuration

Enabling Parallel Compilation

To enable parallel compilation with 3 compiler server replicas:

Via values.yaml

parallelCompilation:
  # Enable parallel compilation features
  enabled: true
  # Number of compiler server replicas when parallel compilation is enabled
  replicas: 3
  # ...other configuration

Via the Helm command line

helm upgrade --install feldera \
  oci://public.ecr.aws/feldera/feldera-chart --version "${VERSION}" \
  --namespace feldera \
  --set parallelCompilation.enabled=true \
  --set parallelCompilation.replicas=3 \
  # ...other configuration

You should see multiple compiler server pods, for example:

kubectl get pods -n feldera
NAME                            READY   STATUS    RESTARTS   AGE
feldera-compiler-server-0       1/1     Running   0          2m
feldera-compiler-server-1       1/1     Running   0          2m
feldera-compiler-server-2       1/1     Running   0          2m
feldera-api-server-xxx          1/1     Running   0          2m
feldera-kubernetes-runner-xxx   1/1     Running   0          2m
feldera-db-0                    1/1     Running   0          2m

Enabling sccache

The compiler server ships with all dependencies precompiled in its target directory, so only the program generated from the pipeline SQL needs to be compiled.

With a single compiler server, sccache provides no benefit: the dependencies are already present, and because all pipelines are compiled in the same workspace, operators and other compiled artifacts are already shared.

With multiple compiler servers, operators compiled on one server should be reusable by the others, and sccache achieves that.

Example: Pipeline A uses operator xx and is assigned to pod 0. Pod 0 builds xx. Later pipeline B, assigned to pod 1, also needs xx. Without sccache, pod 1 rebuilds xx from scratch. With sccache (S3/MinIO backend), pod 1 fetches the cached object files, avoiding a full rebuild.
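The example above can be mimicked with a toy model. This is a sketch under simplifying assumptions, not how sccache or the compiler server is actually implemented: a Python set stands in for the S3/MinIO bucket, operators are plain strings, and the CompilerPod class and rebuilds_on_pod1 function are hypothetical names.

```python
class CompilerPod:
    """Toy model of one compiler server pod (not Feldera's or sccache's real code)."""

    def __init__(self, name, shared_cache=None):
        self.name = name
        self.workspace = set()            # artifacts already built or fetched locally
        self.shared_cache = shared_cache  # stands in for the S3/MinIO bucket; None = no sccache
        self.rebuilds = 0                 # operators compiled from scratch on this pod

    def compile(self, operators):
        for op in operators:
            if op in self.workspace:
                continue                      # already present in this pod's workspace
            if self.shared_cache is not None and op in self.shared_cache:
                self.workspace.add(op)        # cache hit: fetch instead of rebuilding
                continue
            self.rebuilds += 1                # cache miss: build from scratch
            self.workspace.add(op)
            if self.shared_cache is not None:
                self.shared_cache.add(op)     # publish the artifact for other pods


def rebuilds_on_pod1(use_sccache: bool) -> int:
    """Replay the example: pipeline A on pod 0, then pipeline B on pod 1."""
    cache = set() if use_sccache else None
    pod0 = CompilerPod("pod-0", cache)
    pod1 = CompilerPod("pod-1", cache)
    pod0.compile(["xx"])   # pipeline A builds operator xx on pod 0
    pod1.compile(["xx"])   # pipeline B needs the same operator on pod 1
    return pod1.rebuilds


print(rebuilds_on_pod1(use_sccache=False))  # 1: pod 1 rebuilds xx
print(rebuilds_on_pod1(use_sccache=True))   # 0: pod 1 reuses the cached xx
```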

1. Provision S3 Credentials

Use either IRSA (IAM Roles for Service Accounts) or a Kubernetes secret with S3 credentials. sccache uses these credentials to access the cache bucket.

  • IRSA: The compiler server checks for AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE.

  • Kubernetes Secret: Create a secret containing your S3 access keys:

    kubectl create secret generic sccache-s3-secret -n feldera \
    --from-literal=access_key_id="your-access-key" \
    --from-literal=secret_access_key="your-secret-key"

The secret must define keys access_key_id and secret_access_key. You can configure the secret name in values.yaml.

2. Configure sccache in values.yaml

parallelCompilation:
  enabled: true
  replicas: 3
  # sccache configuration for sharing compilation artifacts between compiler servers
  sccache:
    # Enable sccache for compilation artifact caching (optional, recommended)
    enabled: true
    # S3 backend configuration for sccache
    s3:
      # S3 bucket name for cache storage
      bucket: "sccache-bucket"
      # Use SSL for S3 connections
      # Set to true to use HTTPS/TLS
      useSSL: false
      # Key prefix for cache objects used by sccache
      keyPrefix: "sccache"
      # AWS region of the bucket
      region: "us-east-1"
      # Custom URL (<ip>:<port>) of the server you want to use, such as MinIO.
      # Defaults to ${BUCKET}.s3-{REGION}.amazonaws.com for AWS S3 if not set.
      # endpoint: "minio.extra.svc.cluster.local:9000"
      #
      # Server-side encryption (optional)
      # serverSideEncryption: false
      #
      # Existing secret containing S3 credentials
      # The secret must have keys: access_key_id and secret_access_key
      # If IRSA is set up, you don't need to specify existingSecret;
      # credentials are configured automatically via the AWS_ROLE_ARN and
      # AWS_WEB_IDENTITY_TOKEN_FILE environment variables.
      # existingSecret: "sccache-s3-secret"

Troubleshooting & FAQs

  • Resource requirements:

Ensure your cluster nodes have enough resources to run the desired number of compiler server replicas.

  • Pipeline stuck in a status:

If a pipeline is assigned to a worker pod that is not yet running or is unhealthy, it will not be compiled until that pod is available and running. Verify that all compiler server pods are running.

  • SystemError: Failed to upload binary:

If a pipeline reports this status, pod N failed to upload its binary to <_>-compiler-server-0.

You can check whether <_>-compiler-server-0 is healthy using the /cluster_healthz endpoint. Adjust the binary upload configuration to your needs; for example, if your upgrades take a while, set the retry count and backoff interval high enough that the leader pod has time to come up and receive the binary.

  • error: process didn't exit successfully: sccache .. rustc -vV:

Check the Errors tab in the web console (enable Verbatim errors if required) to see the full error explaining why sccache failed.

Common causes include a misconfigured S3 bucket, endpoint, or credentials.

  • Scaling with kubectl: If you scale the compiler server StatefulSet using kubectl without restarting, the compiler server detects the change and panics with SCALING DETECTED: StatefulSet has X replicas but compiler was started with Y workers. This triggers a restart to ensure correct work distribution.
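To see why the replica count must be consistent across workers, note that the modulo rule depends on N, so changing N reassigns many pipelines. The sketch below is an illustration inferred from the assignment rule, not Feldera code; it assumes integer pipeline IDs, and moved_pipelines is a hypothetical helper.

```python
def moved_pipelines(pipeline_ids, old_replicas, new_replicas):
    """Pipelines whose worker changes when the replica count changes."""
    return [
        pid for pid in pipeline_ids
        if pid % old_replicas != pid % new_replicas
    ]


# Assumption: integer IDs 0..11 stand in for real pipeline IDs.
ids = range(12)
moved = moved_pipelines(ids, old_replicas=3, new_replicas=4)
print(f"{len(moved)} of {len(ids)} pipelines change worker")  # 9 of 12
```

If some workers still used the old N while others used the new one, a pipeline could be claimed by two pods or by none, which is the situation the restart avoids.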

Configuration Options Reference

| Key | Description | Default | Example |
|---|---|---|---|
| parallelCompilation.enabled | Enable parallel compilation | false | true |
| parallelCompilation.replicas | Number of compiler server pods | 1 | 3 |
| parallelCompilation.sccache.enabled | Enable sccache build cache | false | true |
| parallelCompilation.sccache.s3.bucket | S3/MinIO bucket for cache | "sccache-bucket" | "feldera-sccache" |
| parallelCompilation.sccache.s3.useSSL | Use SSL/TLS for S3/MinIO | false | true |
| parallelCompilation.sccache.s3.region | Bucket region | "us-east-1" | "us-east-1" |
| parallelCompilation.sccache.s3.keyPrefix | Cache object key prefix | "sccache" | "sccache" or "" |

Optional Configurations

| Key | Description | Example |
|---|---|---|
| parallelCompilation.sccache.s3.existingSecret | Secret for S3 credentials (omit if using IRSA) | sccache-s3-secret |
| parallelCompilation.sccache.s3.serverSideEncryption | Enable server-side encryption with S3-managed keys (SSE-S3) | false |
| parallelCompilation.sccache.s3.endpoint | Custom endpoint (e.g. MinIO) | minio.mydomain.com:9000 |