
Pipeline Settings

Pipelines come with a set of configuration settings to toggle features, tune performance, and help with operations. If you're on the Enterprise Edition, you will likely need to configure the resources section to tune the CPU, memory, and storage resources used by the Pipeline depending on your infrastructure needs. Other than that, users rarely need to deviate from the supplied defaults.

Editing configuration

You can edit all pipeline settings when the pipeline is Stopped with storage cleared, and a limited subset when its storage is in use.

Press the gear button in the top right corner of the code editor to access the dialog where you can edit the runtime and program configuration JSON.

Configure pipeline in web-console

Runtime configuration

important

Make sure to appropriately size resource limits (memory and storage), the number of worker threads and the storage backend to utilize available cluster resources.
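As a rough illustration of the sizing advice above, the sketch below assembles a runtime configuration in Python and serializes it to JSON for pasting into the configuration dialog. The key names are taken from the defaults listed on this page; the numeric values are placeholder assumptions for a hypothetical 8-core, 16 GB node and are not recommendations.

```python
import json

# Hypothetical sizing for an 8-core, 16 GB node; every number is a placeholder.
runtime_config = {
    "workers": 8,
    "resources": {
        "cpu_cores_min": 4,
        "cpu_cores_max": 8,
        "memory_mb_min": 8000,
        "memory_mb_max": 16000,
        "storage_mb_max": 50000,
        "storage_class": None,
    },
    # Storage section left at the defaults listed on this page.
    "storage": {
        "backend": {"name": "default"},
        "min_storage_bytes": None,
        "min_step_storage_bytes": None,
        "compression": "default",
        "cache_mib": None,
    },
}

# Python's None is serialized as JSON null.
runtime_json = json.dumps(runtime_config, indent=2)
print(runtime_json)
```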

checkpoint_during_suspend
boolean
Default: true

Deprecated: setting this to true or false no longer has any effect.

clock_resolution_usecs
integer or null <int64> >= 0
Default: 1000000

Real-time clock resolution in microseconds.

This parameter controls the execution of queries that use the NOW() function. The output of such queries depends on the real-time clock and can change over time without any external inputs. If the query uses NOW(), the pipeline will update the clock value and trigger incremental recomputation at most each clock_resolution_usecs microseconds. If the query does not use NOW(), then clock value updates are suppressed and the pipeline ignores this setting.

It is set to 1 second (1,000,000 microseconds) by default.
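Because the setting is expressed in microseconds, it is easy to be off by a factor of 1,000. The following minimal sketch converts a resolution given in seconds into a configuration fragment; the helper name clock_resolution_config is hypothetical and not part of any Feldera API.

```python
import json

USECS_PER_SEC = 1_000_000

def clock_resolution_config(seconds: float) -> dict:
    """Return a runtime-config fragment with the resolution in microseconds."""
    usecs = round(seconds * USECS_PER_SEC)
    if usecs < 0:
        raise ValueError("clock_resolution_usecs must be >= 0")
    return {"clock_resolution_usecs": usecs}

# Example: refresh NOW() at most four times per second.
fragment = clock_resolution_config(0.25)
print(json.dumps(fragment))
```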

cpu_profiler
boolean
Default: true

Enable CPU profiler.

The default value is true.

dev_tweaks
object
Default: {}

Optional settings for tweaking Feldera internals.

The available key-value pairs change from one version of Feldera to another, so users should not depend on particular settings being available, or on their behavior.

fault_tolerance
object
Default: {"model":"none","checkpoint_interval_secs":60}

Fault-tolerance configuration.

The default FtConfig (via FtConfig::default) disables fault tolerance. This is the configuration that one gets if RuntimeConfig omits the fault tolerance configuration.

The default value for FtConfig::model enables fault tolerance, as Some(FtModel::default()). This is the configuration that one gets if RuntimeConfig includes a fault tolerance configuration but does not specify a particular model.
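The two defaulting rules above are easy to confuse, so here is a small sketch of how they might be applied when inspecting a runtime configuration. It assumes the section is stored under the key fault_tolerance and that model, when present, is a string with "none" meaning disabled; the function name is hypothetical.

```python
def fault_tolerance_enabled(runtime_config: dict) -> bool:
    """Apply the two defaulting rules described above."""
    ft = runtime_config.get("fault_tolerance")  # assumed key name
    if ft is None:
        # Section omitted: the default FtConfig disables fault tolerance.
        return False
    if "model" not in ft:
        # Section present without a model: the default model enables it.
        return True
    # Assumes "model" is a string when present; "none" disables.
    return ft["model"] != "none"

omitted = fault_tolerance_enabled({})
no_model = fault_tolerance_enabled(
    {"fault_tolerance": {"checkpoint_interval_secs": 60}}
)
explicit_none = fault_tolerance_enabled({"fault_tolerance": {"model": "none"}})
print(omitted, no_model, explicit_none)
```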

http_workers
integer or null <int64> >= 0
Default: null

Sets the number of available runtime threads for the HTTP server.

In most cases, this does not need to be set explicitly and the default is sufficient. It can be increased if pipeline HTTP API operations are a bottleneck.

If not specified, the default is set to workers.

init_containers
any or null

Specification of additional (sidecar) containers.

io_workers
integer or null <int64> >= 0
Default: null

Sets the number of available runtime threads for async IO tasks.

This affects some networking and file I/O operations, especially adapters and ad-hoc queries.

In most cases, this does not need to be set explicitly and the default is sufficient. It can be increased if ingress, egress, or ad-hoc queries are a bottleneck.

If not specified, the default is set to workers.
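The fallback described for http_workers and io_workers can be sketched as follows. This is only an illustration of the documented rule (an unset value falls back to workers), not the pipeline's actual implementation; the function name is hypothetical.

```python
def effective_thread_counts(runtime_config: dict) -> dict:
    """Resolve http_workers and io_workers, falling back to workers."""
    workers = runtime_config.get("workers", 8)  # 8 is the default listed on this page
    http = runtime_config.get("http_workers")
    io = runtime_config.get("io_workers")
    return {
        "workers": workers,
        "http_workers": workers if http is None else http,
        "io_workers": workers if io is None else io,
    }

# Only io_workers is raised; http_workers follows workers.
resolved = effective_thread_counts({"workers": 4, "io_workers": 16})
print(resolved)
```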

logging
string or null
Default: null

Log filtering directives.

If set to a valid tracing-subscriber filter, this controls the log messages emitted by the pipeline process. Otherwise, or if the filter has invalid syntax, messages at "info" severity and higher are written to the log and all others are discarded.

max_buffering_delay_usecs
integer <int64> >= 0
Default: 0

Maximum delay, in microseconds, to wait for min_batch_size_records to be buffered by the controller. Defaults to 0.

max_parallel_connector_init
integer or null <int64> >= 0
Default: null

The maximum number of connectors initialized in parallel during pipeline startup.

At startup, the pipeline must initialize all of its input and output connectors. Depending on the number and types of connectors, this can take a long time. To accelerate the process, multiple connectors are initialized concurrently. This option controls the maximum number of connectors that can be initialized in parallel.

If not set (null), the default is 10.
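To make the idea of bounded parallel initialization concrete, the sketch below uses a Python thread pool whose size plays the role of max_parallel_connector_init. It is an analogy only: the connector names and the init_one callback are hypothetical, and treating a limit of 0 as 1 is an assumption, since this page does not say how 0 is handled.

```python
from concurrent.futures import ThreadPoolExecutor

DEFAULT_MAX_PARALLEL_INIT = 10  # default stated on this page

def init_connectors(names, init_one, max_parallel=None):
    """Run init_one for every connector, at most max_parallel at a time."""
    limit = DEFAULT_MAX_PARALLEL_INIT if max_parallel is None else max_parallel
    limit = max(1, limit)  # assumption: never go below one concurrent init
    with ThreadPoolExecutor(max_workers=limit) as pool:
        # pool.map returns results in the same order as names.
        return list(pool.map(init_one, names))

def fake_init(name):
    # Stand-in for real connector setup (network handshakes, file opens, ...).
    return f"{name}: ready"

results = init_connectors(["kafka_in", "s3_in", "delta_out"], fake_init, max_parallel=2)
print(results)
```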

min_batch_size_records
integer <int64> >= 0
Default: 0

Minimal input batch size.

The controller delays pushing input records to the circuit until at least min_batch_size_records records have been received (total across all endpoints) or max_buffering_delay_usecs microseconds have passed since at least one input record has been buffered. Defaults to 0.
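The interaction between min_batch_size_records and max_buffering_delay_usecs can be sketched as a single predicate. This is a simplified model of the rule stated above, not the controller's code; the function and argument names other than the two settings are hypothetical.

```python
def should_flush(buffered_records: int,
                 usecs_since_first_buffered: int,
                 min_batch_size_records: int = 0,
                 max_buffering_delay_usecs: int = 0) -> bool:
    """Decide whether buffered input should be pushed to the circuit."""
    if buffered_records == 0:
        return False  # nothing buffered, nothing to push
    if buffered_records >= min_batch_size_records:
        return True  # batch is large enough
    # Batch is still small: push only once the delay has elapsed.
    return usecs_since_first_buffered >= max_buffering_delay_usecs

# With both settings at their default of 0, any buffered record is pushed at once.
print(should_flush(1, 0))
# 50 of 100 records after 1 ms, with a 5 ms limit: keep waiting.
print(should_flush(50, 1_000, min_batch_size_records=100, max_buffering_delay_usecs=5_000))
```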

pin_cpus
Array of integers[ items >= 0 ]
Default: []

Optionally, a list of CPU numbers for CPUs to which the pipeline may pin its worker threads. Specify at least twice as many CPU numbers as workers. CPUs are generally numbered starting from 0. The pipeline might not be able to honor CPU pinning requests.

CPU pinning can make pipelines run faster and perform more consistently, as long as different pipelines running on the same machine are pinned to different CPUs.
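A quick sanity check for a pin_cpus list might look like the sketch below. It only encodes the two constraints stated on this page (CPU numbers are non-negative, and there should be at least twice as many as workers); the helper name is hypothetical, and the check cannot tell whether the pipeline will actually be able to honor the request.

```python
def check_pin_cpus(workers: int, pin_cpus: list) -> list:
    """Return a list of human-readable problems; empty means the list looks fine."""
    problems = []
    if not pin_cpus:
        return problems  # pinning is optional; the default is an empty list
    if any(cpu < 0 for cpu in pin_cpus):
        problems.append("CPU numbers must be >= 0")
    needed = 2 * workers
    if len(pin_cpus) < needed:
        problems.append(
            f"need at least {needed} CPUs for {workers} workers, got {len(pin_cpus)}"
        )
    return problems

print(check_pin_cpus(4, list(range(8))))
print(check_pin_cpus(4, [0, 1, 2, 3]))
```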

provisioning_timeout_secs
integer or null <int64> >= 0
Default: null

Timeout in seconds for the Provisioning phase of the pipeline. Setting this value will override the default of the runner.

resources
object
Default: {"cpu_cores_min":null,"cpu_cores_max":null,"memory_mb_min":null,"memory_mb_max":null,"storage_mb_max":null,"storage_class":null}

storage
object or null
Default: {"backend":{"name":"default"},"min_storage_bytes":null,"min_step_storage_bytes":null,"compression":"default","cache_mib":null}

Storage configuration for a pipeline.

tracing
boolean
Default: false

Enable pipeline tracing.

tracing_endpoint_jaeger
string
Default: "127.0.0.1:6831"

Jaeger tracing endpoint to send tracing information to.

workers
integer <int32> >= 0
Default: 8

Number of DBSP worker threads.

Each DBSP "foreground" worker thread is paired with a "background" thread for LSM merging, making the total number of threads twice the specified number.

The typical sweet spot for the number of workers is between 4 and 16. Each worker increases overall memory consumption for data structures used during a step.

Program configuration

The "optimized" compilation profile (default) should be used when running production pipelines where performance is important.

cache
boolean
Default: true

If true (default), when a prior compilation with the same checksum already exists, its output (i.e., the binary) is reused. Set to false to always trigger a new compilation, which might take longer and can overwrite an existing binary.

profile
string or null
Default: null
Enum: "dev" "unoptimized" "optimized"

Enumeration of possible compilation profiles that can be passed to the Rust compiler as an argument via cargo build --profile <>. A compilation profile affects, among other things, the compilation speed (how long until the program is ready to run) and the runtime speed (the performance while running).

runtime_version
string or null
Default: null

Override runtime version of the pipeline being executed.

Warning: This option is experimental and may change in the future. It should only be used for CI/testing purposes, and it requires network access.

A runtime version can be specified in the form of a version or SHA taken from the feldera/feldera repository main branch.

Examples: v0.96.0 or f4dcac0989ca0fda7d2eb93602a49d007cb3b0ae

A platform of version 0.x.y may be capable of running future and past runtimes with versions >=0.x.y and <=0.x.y until breaking API changes happen. The exact bounds for each platform version are unspecified until we reach a stable version. Compatibility is only guaranteed if the platform and runtime versions match exactly.

Note that any enterprise features are currently considered to be part of the platform.

If not set (null), the runtime version will be the same as the platform version.
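The three program settings can be assembled and checked with a small helper such as the one sketched below. The profile names come from the enum listed above; the two regular expressions are assumptions inferred from the example values (a v-prefixed three-part version or a 40-character lowercase hex SHA), and the function name is hypothetical.

```python
import json
import re

PROFILES = {"dev", "unoptimized", "optimized"}
# Assumed shapes, inferred from the examples on this page.
VERSION_RE = re.compile(r"v\d+\.\d+\.\d+")
SHA_RE = re.compile(r"[0-9a-f]{40}")

def program_config(profile=None, cache=True, runtime_version=None) -> dict:
    """Build a program-config dict, rejecting obviously malformed values."""
    if profile is not None and profile not in PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    if runtime_version is not None and not (
        VERSION_RE.fullmatch(runtime_version) or SHA_RE.fullmatch(runtime_version)
    ):
        raise ValueError(f"unexpected runtime_version: {runtime_version!r}")
    return {"profile": profile, "cache": cache, "runtime_version": runtime_version}

# None means "use the platform version" for runtime_version.
config = program_config(profile="optimized", runtime_version="v0.96.0")
print(json.dumps(config))
```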