Pipeline Settings

Pipelines come with a set of configuration settings to toggle features, tune performance, and help with operations. If you're on the Enterprise Edition, you will likely need to configure the resources section to tune the CPU, memory, and storage resources used by the Pipeline depending on your infrastructure needs. Other than that, users rarely need to deviate from the supplied defaults.

Editing configuration

You can edit all pipeline settings when the pipeline is Stopped with storage cleared, and a limited subset when its storage is in use.

Web Console
Python SDK
Feldera CLI
HTTP API

Press the gear button in the top right corner of the code editor to access the dialog where you can edit the runtime and program configuration JSON.

Configure pipeline in web-console

You can use RuntimeConfig.from_dict() to set the runtime configuration of a pipeline.

Example: Runtime configuration of a Pipeline

You can toggle storage for a pipeline with:

  fda set-config {pipeline_name} storage [true|false]

Include only the runtime_config or program_config fields in the body of a PATCH /v0/pipelines/{pipeline_name} request, for example:

  curl -X PATCH "http://localhost:8080/v0/pipelines/feature-engineering" -H "Content-Type: application/json" -d '{"runtime_config":{"workers":8,"fault_tolerance":{"model":"none","checkpoint_interval_secs":60},"resources":{"memory_mb_max":16000}},"program_config":{"profile":"dev"}}' -s -o /dev/null
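The same PATCH request can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper name `build_patch_request` is hypothetical, and the base URL and pipeline name are simply the ones from the curl example above. It builds the request without sending it, so the body can be inspected first.

```python
import json
import urllib.request


def build_patch_request(base_url, pipeline_name, runtime_config=None, program_config=None):
    """Build (but do not send) a PATCH request for a pipeline's configuration.

    Hypothetical helper: only the fields that are passed in are included
    in the body, mirroring the "include only runtime_config or
    program_config" rule described above.
    """
    body = {}
    if runtime_config is not None:
        body["runtime_config"] = runtime_config
    if program_config is not None:
        body["program_config"] = program_config
    if not body:
        raise ValueError("pass runtime_config, program_config, or both")

    url = f"{base_url}/v0/pipelines/{pipeline_name}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )


# Same payload as the curl example above.
request = build_patch_request(
    "http://localhost:8080",
    "feature-engineering",
    runtime_config={
        "workers": 8,
        "fault_tolerance": {"model": "none", "checkpoint_interval_secs": 60},
        "resources": {"memory_mb_max": 16000},
    },
    program_config={"profile": "dev"},
)

# To actually send it (requires a reachable Feldera instance):
# urllib.request.urlopen(request)
```

Keeping construction separate from sending makes it easy to log or review the JSON body before it is applied to a pipeline.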

Runtime configuration

important

Make sure to appropriately size resource limits (memory and storage), the number of worker threads and the storage backend to utilize available cluster resources.

checkpoint_during_suspend	boolean Default: true Deprecated: setting this true or false does not have an effect anymore.
clock_resolution_usecs	integer or null <int64> >= 0 Default: 1000000 Real-time clock resolution in microseconds. This parameter controls the execution of queries that use the `NOW()` function. The output of such queries depends on the real-time clock and can change over time without any external inputs. If the query uses `NOW()`, the pipeline will update the clock value and trigger incremental recomputation at most each `clock_resolution_usecs` microseconds. If the query does not use `NOW()`, then clock value updates are suppressed and the pipeline ignores this setting. It is set to 1 second (1,000,000 microseconds) by default.
cpu_profiler	boolean Default: true Enable CPU profiler. The default value is `true`.
dev_tweaks	object Default: {} Optional settings for tweaking Feldera internals. The available key-value pairs change from one version of Feldera to another, so users should not depend on particular settings being available, or on their behavior.
fault_tolerance	object Default: {"model":"none","checkpoint_interval_secs":60} Fault-tolerance configuration. If the runtime configuration omits `fault_tolerance`, the default applies and fault tolerance is disabled. If the runtime configuration includes `fault_tolerance` but does not specify a `model`, the default model is used, which enables fault tolerance.
http_workers	integer or null <int64> >= 0 Default: null Sets the number of available runtime threads for the http server. In most cases, this does not need to be set explicitly and the default is sufficient. Can be increased in case the pipeline HTTP API operations are a bottleneck. If not specified, the default is set to `workers`.
init_containers	any or null Specification of additional (sidecar) containers.
io_workers	integer or null <int64> >= 0 Default: null Sets the number of available runtime threads for async IO tasks. This affects some networking and file I/O operations especially adapters and ad-hoc queries. In most cases, this does not need to be set explicitly and the default is sufficient. Can be increased in case ingress, egress or ad-hoc queries are a bottleneck. If not specified, the default is set to `workers`.
logging	string or null Default: null Log filtering directives. If set to a valid tracing-subscriber filter, this controls the log messages emitted by the pipeline process. Otherwise, or if the filter has invalid syntax, messages at "info" severity and higher are written to the log and all others are discarded.
max_buffering_delay_usecs	integer <int64> >= 0 Default: 0 Maximal delay in microseconds to wait for `min_batch_size_records` to get buffered by the controller, defaults to 0.
max_parallel_connector_init	integer or null <int64> >= 0 Default: null The maximum number of connectors initialized in parallel during pipeline startup. At startup, the pipeline must initialize all of its input and output connectors. Depending on the number and types of connectors, this can take a long time. To accelerate the process, multiple connectors are initialized concurrently. This option controls the maximum number of connectors that can be initialized in parallel. The default is 10.
min_batch_size_records	integer <int64> >= 0 Default: 0 Minimal input batch size. The controller delays pushing input records to the circuit until at least `min_batch_size_records` records have been received (total across all endpoints) or `max_buffering_delay_usecs` microseconds have passed since at least one input record has been buffered. Defaults to 0.
pin_cpus	Array of integers[ items >= 0 ] Default: [] Optionally, a list of CPU numbers for CPUs to which the pipeline may pin its worker threads. Specify at least twice as many CPU numbers as workers. CPUs are generally numbered starting from 0. The pipeline might not be able to honor CPU pinning requests. CPU pinning can make pipelines run faster and perform more consistently, as long as different pipelines running on the same machine are pinned to different CPUs.
provisioning_timeout_secs	integer or null <int64> >= 0 Default: null Timeout in seconds for the `Provisioning` phase of the pipeline. Setting this value will override the default of the runner.
resources	object Default: {"cpu_cores_min":null,"cpu_cores_max":null,"memory_mb_min":null,"memory_mb_max":null,"storage_mb_max":null,"storage_class":null}
storage	object or null Default: {"backend":{"name":"default"},"min_storage_bytes":null,"min_step_storage_bytes":null,"compression":"default","cache_mib":null} Storage configuration for a pipeline.
tracing	boolean Default: false Enable pipeline tracing.
tracing_endpoint_jaeger	string Default: "127.0.0.1:6831" Jaeger tracing endpoint to send tracing information to.
workers	integer <int32> >= 0 Default: 8 Number of DBSP worker threads. Each DBSP "foreground" worker thread is paired with a "background" thread for LSM merging, making the total number of threads twice the specified number. The typical sweet spot for the number of workers is between 4 and 16. Each worker increases overall memory consumption for data structures used during a step.
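Several of the settings above interact: `http_workers` and `io_workers` fall back to `workers`, each worker is paired with a background thread, `pin_cpus` should list at least twice as many CPUs as workers, and input is held back until `min_batch_size_records` or `max_buffering_delay_usecs` is reached. The sketch below models these rules as plain Python functions over a runtime configuration dictionary. The function names are hypothetical and this is only an illustration of the rules as described in the table, not Feldera's implementation.

```python
def effective_threads(runtime_config):
    """Derive thread counts from a runtime configuration dict (illustrative only)."""
    workers = runtime_config.get("workers", 8)  # documented default: 8
    http_workers = runtime_config.get("http_workers")
    io_workers = runtime_config.get("io_workers")
    return {
        "workers": workers,
        # Each foreground worker is paired with a background merge thread.
        "dbsp_threads": 2 * workers,
        # Unset (null) values default to `workers`.
        "http_workers": workers if http_workers is None else http_workers,
        "io_workers": workers if io_workers is None else io_workers,
    }


def check_pin_cpus(runtime_config):
    """Return a list of problems with `pin_cpus`; an empty list means none found."""
    workers = runtime_config.get("workers", 8)
    pin_cpus = runtime_config.get("pin_cpus", [])
    problems = []
    if pin_cpus and len(pin_cpus) < 2 * workers:
        problems.append(
            f"pin_cpus lists {len(pin_cpus)} CPUs, but {workers} workers "
            f"call for at least {2 * workers}"
        )
    if any(cpu < 0 for cpu in pin_cpus):
        problems.append("CPU numbers must be >= 0")
    return problems


def ready_to_push(buffered_records, usecs_since_first_buffered, runtime_config):
    """Model of the batching rule: push once enough records or enough time has passed."""
    min_batch = runtime_config.get("min_batch_size_records", 0)
    max_delay = runtime_config.get("max_buffering_delay_usecs", 0)
    if buffered_records == 0:
        return False  # nothing buffered yet, nothing to push
    return buffered_records >= min_batch or usecs_since_first_buffered >= max_delay


config = {"workers": 4, "pin_cpus": [0, 1, 2, 3]}
threads = effective_threads(config)
problems = check_pin_cpus(config)
```

With this example configuration, `threads["dbsp_threads"]` is 8 and `problems` contains one entry, because four pinned CPUs are fewer than the eight called for by four workers.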

Program configuration

The "optimized" compilation profile (default) should be used when running production pipelines where performance is important.

cache	boolean Default: true If `true` (default), when a prior compilation with the same checksum already exists, its output (i.e., the binary) is reused. Set `false` to always trigger a new compilation, which might take longer and can overwrite an existing binary.
profile	string or null Default: null Enum: "dev" "unoptimized" "optimized" Enumeration of possible compilation profiles that can be passed to the Rust compiler as an argument via `cargo build --profile <>`. A compilation profile affects among other things the compilation speed (how long till the program is ready to be run) and runtime speed (the performance while running).
runtime_version	string or null Default: null Override runtime version of the pipeline being executed. Warning: This setting is experimental and may change in the future. Requires the platform to run with the unstable feature `runtime_version` enabled. Should only be used for testing purposes, and requires network access. A runtime version can be specified in the form of a version or SHA taken from the `feldera/feldera` repository main branch. Examples: `v0.96.0` or `f4dcac0989ca0fda7d2eb93602a49d007cb3b0ae`. A platform of version `0.x.y` may be capable of running future and past runtimes with versions `>=0.x.y` and `<=0.x.y` until breaking API changes happen; the exact bounds for each platform version are unspecified until we reach a stable version. Compatibility is only guaranteed if platform and runtime version are exact matches. Note that any enterprise features are currently considered to be part of the platform. If not set (null), the runtime version will be the same as the platform version.
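As a small illustration of the three program configuration fields, the sketch below assembles a `program_config` dictionary and rejects profile names outside the documented enum. The helper `make_program_config` is hypothetical and only reflects the field names, defaults, and allowed values listed above.

```python
ALLOWED_PROFILES = ("dev", "unoptimized", "optimized")


def make_program_config(profile=None, cache=True, runtime_version=None):
    """Assemble a program_config dict (hypothetical helper, illustrative only).

    Defaults mirror the table above: profile=null, cache=true,
    runtime_version=null.
    """
    if profile is not None and profile not in ALLOWED_PROFILES:
        raise ValueError(
            f"unknown profile {profile!r}; expected one of {ALLOWED_PROFILES}"
        )
    return {
        "profile": profile,
        "cache": cache,
        "runtime_version": runtime_version,
    }


# A fast-to-compile configuration for local iteration.
dev_config = make_program_config(profile="dev")
```

The resulting dictionary can be used as the `program_config` value in the body of the PATCH request described under Editing configuration.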
