Troubleshooting

This guide covers issues Feldera Enterprise users and operators might run into in production, and steps to remedy them.

Common Error Messages

Delta Lake Connection Errors

Error: Table metadata is invalid: Number of checkpoint files '0' is not equal to number of checkpoint metadata parts 'None'

Solution: This usually happens when the Delta table uses features that delta-rs does not support, such as liquid clustering or deletion vectors. Check the table properties and set the checkpoint policy to "classic":

ALTER TABLE my_table SET TBLPROPERTIES (
    'checkpointPolicy' = 'classic'
)

Out-of-Memory Errors

Error: The pipeline container has restarted. This was likely caused by an Out-Of-Memory (OOM) crash.

Feldera runs each pipeline in a separate container with configurable memory limits. Here are some knobs to control memory usage:

  1. Adjust the pipeline’s memory reservation and limit:

    "resources": {
        "memory_mb_min": 32000,
        "memory_mb_max": 32000
    }
  2. Limit the number of records each connector buffers with the max_queued_records setting:

    "max_queued_records": 100000
  3. Ensure that storage is enabled (it's on by default):

    "storage": {
        "backend": {
            "name": "default"
        },
        "min_storage_bytes": null,
        "compression": "default",
        "cache_mib": null
    }
  4. Optimize your SQL queries to avoid expensive cross-products. Use functions like NOW() sparingly on large relations.
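Before tuning these settings, it can help to confirm that the restart really was an OOM kill. The sketch below assumes the pipeline runs as a Kubernetes pod (as in the eviction section of this guide) and that you have saved the output of `kubectl get pod <pod-name> -o json`. It only inspects the standard `status.containerStatuses[].lastState.terminated.reason` field; the container name and values in the sample are hypothetical.

```python
import json


def oom_killed_containers(pod):
    """Return the names of containers whose last termination reason was OOMKilled.

    `pod` is the parsed JSON of a single pod, as printed by
    `kubectl get pod <pod-name> -o json`.
    """
    names = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        # `lastState` is empty if the container has never been restarted.
        terminated = cs.get("lastState", {}).get("terminated") or {}
        if terminated.get("reason") == "OOMKilled":
            names.append(cs.get("name", "<unnamed>"))
    return names


# Hypothetical pod status, trimmed to the fields the function reads.
sample_pod = json.loads("""
{
  "status": {
    "containerStatuses": [
      {
        "name": "pipeline",
        "restartCount": 1,
        "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}
      }
    ]
  }
}
""")

print(oom_killed_containers(sample_pod))
```

An empty result suggests the restart had another cause, in which case raising the memory limit is unlikely to help.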

Kubernetes evictions

Error: The pipeline becomes UNAVAILABLE with no errors in the logs.

Solution: Configure resource reservations and limits for the pipeline.

Kubernetes may evict Pipeline pods under node resource pressure. To confirm, run:

kubectl describe pod pipeline-<pipeline-id>-0

and look for

Status: Failed
Reason: Evicted

You can also view the eviction event in your cluster monitoring stack (e.g. Datadog).
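The same check can be scripted against the JSON form of the pod, for example the output of `kubectl get pod pipeline-<pipeline-id>-0 -o json`. This is a minimal sketch: the function name is hypothetical, and it relies only on the standard `status.phase` and `status.reason` pod fields that correspond to the `Status` and `Reason` lines above.

```python
def is_evicted(pod):
    """Return True if the parsed pod JSON describes an evicted pod.

    An evicted pod reports phase "Failed" with reason "Evicted".
    """
    status = pod.get("status", {})
    return status.get("phase") == "Failed" and status.get("reason") == "Evicted"


# Hypothetical pod statuses, trimmed to the fields the function reads.
evicted_pod = {"status": {"phase": "Failed", "reason": "Evicted"}}
running_pod = {"status": {"phase": "Running"}}

print(is_evicted(evicted_pod))
print(is_evicted(running_pod))
```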

Evictions typically happen only when running Feldera in shared Kubernetes clusters. The pods to evict are determined by Kubernetes Quality-of-Service classes.

By default, Feldera pipelines do not reserve any CPU or memory resources, which puts their pods in the BestEffort QoS class and makes them the first candidates for eviction. To move them to a higher QoS class:

  1. Burstable class: reserve a minimum amount of memory and CPU:

    "resources": {
        "cpu_cores_min": 16,
        "memory_mb_min": 32000
    }
  2. Guaranteed class: set the minimum and maximum to the same value for both CPU and memory:

    "resources": {
        "cpu_cores_min": 16,
        "cpu_cores_max": 16,
        "memory_mb_min": 32000,
        "memory_mb_max": 32000
    }
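To check which class a given "resources" block would land in, the Kubernetes QoS rules can be sketched as a small function. This assumes that the `*_min` fields become the pod's resource requests and the `*_max` fields its limits, and that the pipeline pod has a single container; this guide does not confirm either assumption, and the function name is hypothetical.

```python
def qos_class(resources):
    """Estimate the Kubernetes QoS class for a Feldera "resources" block.

    Assumption: *_min maps to Kubernetes requests and *_max to limits.
    """
    cpu_min = resources.get("cpu_cores_min")
    cpu_max = resources.get("cpu_cores_max")
    mem_min = resources.get("memory_mb_min")
    mem_max = resources.get("memory_mb_max")

    # No requests and no limits at all: BestEffort.
    if all(v is None for v in (cpu_min, cpu_max, mem_min, mem_max)):
        return "BestEffort"

    # Kubernetes defaults a missing request to the limit when a limit is set.
    cpu_request = cpu_min if cpu_min is not None else cpu_max
    mem_request = mem_min if mem_min is not None else mem_max

    # Guaranteed requires CPU and memory limits, each equal to its request.
    if (cpu_max is not None and mem_max is not None
            and cpu_request == cpu_max and mem_request == mem_max):
        return "Guaranteed"

    # Anything else with at least one request or limit is Burstable.
    return "Burstable"


print(qos_class({}))
print(qos_class({"cpu_cores_min": 16, "memory_mb_min": 32000}))
print(qos_class({"cpu_cores_min": 16, "cpu_cores_max": 16,
                 "memory_mb_min": 32000, "memory_mb_max": 32000}))
```

The three calls correspond to the default configuration and the two options listed above.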

Rust Compilation Errors

Error: No space left on device during Rust compilation

Solution: Ensure the compiler-server has sufficient disk space. The default is 20 GiB, configured via the compilerPvcStorageSize value in the Helm chart.

Diagnosing Performance Issues

When investigating pipeline performance, Feldera support will typically request:

  1. Performance Tab: screenshots of the Performance tab in the UI, showing memory usage, record counts, and processing times.

  2. Pipeline Logs Tab: any warnings and errors.

  3. Circuit Profile: from the circuit profile API.

  4. Heap Profile: from the heap usage API.