Synchronizing checkpoints to object store

Experimental feature

Synchronizing checkpoints to an object store is a highly experimental feature.

Feldera allows synchronizing pipeline checkpoints to an object store and restoring them at startup.

Storage configuration

Checkpoint sync is supported only with the file backend and uses rclone under the hood to interact with S3-compatible object stores.

Here is a sample configuration:

"storage": {
  "backend": {
    "name": "file",
    "config": {
      "sync": {
        "bucket": "BUCKET_NAME/DIRECTORY_NAME",
        "provider": "AWS",
        "access_key": "ACCESS_KEY",
        "secret_key": "SECRET_KEY",
        "start_from_checkpoint": "latest",
        "fail_if_no_checkpoint": false,
        "flags": ["--s3-server-side-encryption", "aws:kms"]
      }
    }
  }
}

sync configuration fields

| Field | Type | Default | Description |
|---|---|---|---|
| endpoint | string | | The S3-compatible object store endpoint (e.g., http://localhost:9000 for MinIO). |
| bucket * | string | | The bucket name and optional prefix to store checkpoints (e.g., mybucket/checkpoints). |
| region | string | us-east-1 | The region of the bucket. Leave empty for MinIO. If provider is AWS, and no region is specified, us-east-1 is used. |
| provider * | string | | The S3 provider identifier. Must match rclone’s list. Case-sensitive. Use "Other" if unsure. |
| access_key | string | | S3 access key. Not required if using environment-based auth (e.g., IRSA). |
| secret_key | string | | S3 secret key. Not required if using environment-based auth. |
| start_from_checkpoint | string | | Checkpoint UUID to resume from, or latest to restore from the latest checkpoint. |
| fail_if_no_checkpoint | boolean | false | When true the pipeline will fail to initialize if fetching the specified checkpoint fails. When false, the pipeline will start from scratch instead. Ignored if start_from_checkpoint is not set. |
| transfers | integer (u8) | 20 | Number of concurrent file transfers. |
| checkers | integer (u8) | 20 | Number of parallel checkers for verification. |
| ignore_checksum | boolean | false | Skip checksum verification after transfer and only check the file size. Might improve throughput. |
| multi_thread_streams | integer (u8) | 10 | Number of streams for multi-threaded downloads. |
| multi_thread_cutoff | string | 100M | File size threshold to enable multi-threaded downloads (e.g., 100M, 1G). Supported suffixes: k, M, G, T. |
| upload_concurrency | integer (u8) | 10 | Number of concurrent chunks to upload during multipart uploads. |
| flags | array[string] | | Extra flags to pass to rclone. |

⚠️ Incorrect or conflicting flags may break behavior. See rclone flags and S3 flags.

*Fields marked with an asterisk are required.
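The constraints in the table can be checked before a configuration is submitted. The following is a minimal sketch, not part of Feldera: the function name validate_sync_config and the error messages are hypothetical, the 0..255 range follows from the u8 type, and the size pattern is an assumption based on the suffixes listed above.

```python
import re

# Required fields and u8-typed fields, taken from the table above.
REQUIRED_FIELDS = ("bucket", "provider")
U8_FIELDS = ("transfers", "checkers", "multi_thread_streams", "upload_concurrency")
# Assumed shape of a size value: digits, optional fraction, optional k/M/G/T suffix.
SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?[kMGT]?$")


def validate_sync_config(sync):
    """Return a list of problems found in a `sync` mapping (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not sync.get(field):
            problems.append(f"missing required field: {field}")
    for field in U8_FIELDS:
        if field in sync:
            value = sync[field]
            # bool is a subclass of int in Python, so reject it explicitly.
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if not is_int or not 0 <= value <= 255:
                problems.append(f"{field} must be an integer in 0..255")
    cutoff = sync.get("multi_thread_cutoff")
    if cutoff is not None and not SIZE_PATTERN.match(str(cutoff)):
        problems.append("multi_thread_cutoff must look like 100M or 1G")
    flags = sync.get("flags", [])
    if not isinstance(flags, list) or not all(isinstance(f, str) for f in flags):
        problems.append("flags must be a list of strings")
    return problems
```

A check like this only catches structural mistakes; it cannot tell whether the provider string matches rclone's list or whether the extra flags are valid.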

S3 permissions

The following minimum permissions are required to be available on the bucket being written to:

  • ListBucket
  • DeleteObject
  • GetObject
  • PutObject
  • PutObjectACL
  • CreateBucket

Example policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::USER_SID:user/USER_NAME"
      },
      "Action": [
        "s3:ListBucket",
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:PutObject",
        "s3:PutObjectAcl"
      ],
      "Resource": [
        "arn:aws:s3:::BUCKET_NAME/*",
        "arn:aws:s3:::BUCKET_NAME"
      ]
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListAllMyBuckets",
      "Resource": "arn:aws:s3:::*"
    }
  ]
}

For more details, refer to rclone S3 permissions.
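When the same policy is needed for several buckets, it can be generated instead of edited by hand. This is a small sketch that reproduces the example policy above; the function name build_bucket_policy and its parameters are hypothetical, and the bucket name is assumed to be passed without any directory prefix.

```python
import json


def build_bucket_policy(bucket_name, principal_arn):
    """Build the example policy above for one bucket and one IAM principal.

    `bucket_name` is the bare bucket name (no DIRECTORY_NAME prefix);
    `principal_arn` looks like arn:aws:iam::USER_SID:user/USER_NAME.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": [
                    "s3:ListBucket",
                    "s3:DeleteObject",
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:PutObjectAcl",
                ],
                # Object-level actions need the /* resource,
                # ListBucket needs the bucket itself.
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}/*",
                    f"arn:aws:s3:::{bucket_name}",
                ],
            },
            {
                "Effect": "Allow",
                "Action": "s3:ListAllMyBuckets",
                "Resource": "arn:aws:s3:::*",
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder values, same as in the example policy.
    policy = build_bucket_policy("BUCKET_NAME", "arn:aws:iam::USER_SID:user/USER_NAME")
    print(json.dumps(policy, indent=2))
```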

IRSA

To use IRSA (IAM Roles for Service Accounts), omit the access_key and secret_key fields. Credentials are then loaded from the environment.

Buckets with server-side encryption

If the bucket has server-side encryption enabled, pass the --s3-server-side-encryption flag, followed by its value, in the flags field.

Example:

"sync": {
  "bucket": "BUCKET_NAME/DIRECTORY_NAME",
  "provider": "AWS",
  "start_from_checkpoint": "latest",
  "flags": ["--s3-server-side-encryption", "aws:kms"]
}
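The two variations described above (environment-based credentials and server-side encryption) can be combined in one helper. The sketch below is illustrative only: build_sync_config and its parameter names are hypothetical, and it assumes that leaving out both key fields is what selects environment-based auth, as stated in the IRSA section.

```python
def build_sync_config(bucket, provider, access_key=None, secret_key=None,
                      start_from_checkpoint="latest", sse=None):
    """Assemble a `sync` mapping.

    Omitting both keys leaves credentials to the environment (e.g., IRSA).
    `sse` is the value passed after --s3-server-side-encryption, e.g. "aws:kms".
    """
    # Supplying only one half of the key pair is almost certainly a mistake.
    if (access_key is None) != (secret_key is None):
        raise ValueError("access_key and secret_key must be given together")
    sync = {
        "bucket": bucket,
        "provider": provider,
        "start_from_checkpoint": start_from_checkpoint,
    }
    if access_key is not None:
        sync["access_key"] = access_key
        sync["secret_key"] = secret_key
    if sse is not None:
        # The flag and its value are separate list items, as in the example.
        sync["flags"] = ["--s3-server-side-encryption", sse]
    return sync
```

Calling build_sync_config("BUCKET_NAME/DIRECTORY_NAME", "AWS", sse="aws:kms") yields the same mapping as the example above.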

Triggering a checkpoint sync

A sync operation can be triggered by making a POST request to:

curl -X POST http://localhost/v0/pipelines/{PIPELINE_NAME}/checkpoint/sync

This initiates the sync and returns the UUID of the checkpoint being synced:

{"checkpoint_uuid":"019779b4-8760-75f2-bdf0-71b825e63610"}
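The same request can be issued from a script. Below is a minimal sketch using only the Python standard library; the helper names (sync_url, parse_sync_response, trigger_sync) are hypothetical, the base URL mirrors the curl example, and no authentication is attempted, which may not match a given deployment.

```python
import json
import urllib.parse
import urllib.request


def sync_url(base_url, pipeline_name):
    """Build the checkpoint sync URL for a pipeline."""
    name = urllib.parse.quote(pipeline_name, safe="")
    return f"{base_url.rstrip('/')}/v0/pipelines/{name}/checkpoint/sync"


def parse_sync_response(body):
    """Extract the checkpoint UUID from the JSON response body."""
    return json.loads(body)["checkpoint_uuid"]


def trigger_sync(base_url, pipeline_name):
    """POST to the sync endpoint and return the UUID being synced."""
    request = urllib.request.Request(sync_url(base_url, pipeline_name), method="POST")
    with urllib.request.urlopen(request) as response:
        return parse_sync_response(response.read())
```

Keeping the returned UUID is useful, because the status response identifies checkpoints by UUID.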

Checking sync status

The status of the sync operation can be checked by making a GET request to:

curl http://localhost/v0/pipelines/{PIPELINE_NAME}/checkpoint/sync_status

Response examples

In Progress:

{"success":null,"failure":null}

Success:

{"success":"019779b4-8760-75f2-bdf0-71b825e63610","failure":null}

Failure:

{
  "success": null,
  "failure": {
    "uuid": "019779c1-8317-7a71-bd78-7b971f4a3c43",
    "error": "Error pushing checkpoint to object store: ... SignatureDoesNotMatch ..."
  }
}
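The three response shapes above can be folded into a small polling loop. This is a sketch under assumptions rather than a documented client: the names sync_state and wait_for_sync are hypothetical, the status is assumed to be already decoded from JSON, and matching on the checkpoint UUID is an assumption about how to tell whether a reported success or failure refers to the sync that was just triggered.

```python
import time


def sync_state(status, checkpoint_uuid):
    """Classify a decoded sync_status response for one checkpoint.

    Returns (state, detail) where state is "success", "failure" or
    "in_progress", and detail is the error message on failure.
    """
    if status.get("success") == checkpoint_uuid:
        return "success", None
    failure = status.get("failure")
    if failure and failure.get("uuid") == checkpoint_uuid:
        return "failure", failure.get("error")
    return "in_progress", None


def wait_for_sync(fetch_status, checkpoint_uuid, interval_s=1.0,
                  max_polls=60, sleep=time.sleep):
    """Poll until the checkpoint succeeds, fails, or max_polls is reached.

    `fetch_status` is any callable returning the decoded status dict,
    for example a wrapper around a GET to .../checkpoint/sync_status.
    """
    for _ in range(max_polls):
        state, detail = sync_state(fetch_status(), checkpoint_uuid)
        if state != "in_progress":
            return state, detail
        sleep(interval_s)
    return "timeout", None
```

Passing the fetch function in as a parameter keeps the loop independent of any particular HTTP client and makes it easy to exercise with canned responses.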