Pipeline Metrics

This reference lists all of the metrics that Feldera exports through its `/metrics` endpoint in Prometheus exposition format. It is generated automatically from the documentation embedded in the Prometheus output.

All of the metrics exported by a particular Feldera pipeline are labeled with the pipeline's UUID as `pipeline` and its name as `pipeline_name`. Some metrics have additional labels, as documented below.
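
As an illustration of how these labels appear in scraped output, the following minimal sketch parses sample lines in the Prometheus text exposition format and extracts the `pipeline` and `pipeline_name` labels. The sample text, UUID, pipeline name, and value are made up for the example, and the parser is deliberately simplified (it does not unescape label values); a production setup would normally use a Prometheus client library instead.

```python
import re

# Hypothetical sample in Prometheus text exposition format. The UUID,
# pipeline name, and value are invented for illustration.
SAMPLE = '''\
# TYPE dbsp_steps_total counter
dbsp_steps_total{pipeline="0197a1c2-0000-7000-8000-000000000001",pipeline_name="demo"} 42
'''

# Metric name, optional {label set}, then the sample value.
LINE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
# One label pair: key="value" (escaped characters are matched but not unescaped).
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Yield (name, labels, value) for each sample line, skipping comments."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        yield match.group("name"), labels, float(match.group("value"))


samples = list(parse_samples(SAMPLE))
for name, labels, value in samples:
    print(name, labels["pipeline_name"], value)
```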

See Monitoring and Profiling for a guide to setting up Prometheus and Grafana with Feldera. The Feldera template dashboard is a sample Grafana dashboard for Feldera.

Process Metrics

These metrics report statistics for a running Feldera pipeline process. When a pipeline process is killed and restarts from a checkpoint, the new process's metrics are for it alone, not cumulative with any previous instantiations.

These metrics are intended to match the standard Prometheus definitions.

Name	Type	Description
`process_cpu_seconds_total`	counter	Total user and system CPU time spent in seconds.
`process_max_fds`	gauge	Maximum number of open file descriptors.
`process_open_fds`	gauge	Number of open file descriptors.
`process_resident_memory_bytes`	gauge	Resident set size in bytes.
`process_start_time_seconds`	counter	Start time of the process in seconds since the Unix epoch.
`process_threads`	gauge	Number of OS threads in the process.
`process_virtual_memory_bytes`	gauge	Virtual memory size in bytes.
`process_virtual_memory_max_bytes`	gauge	Maximum amount of virtual memory available in bytes.
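
Because process counters start over when a pipeline process restarts, a consumer that wants total growth across restarts has to handle values that drop. The sketch below shows one common way to do that, similar in spirit to how PromQL's `increase` treats counter resets. The function name and the sample values are hypothetical.

```python
def increase(samples):
    """Sum the growth of a counter over successive samples.

    Any drop in value is treated as a restart from zero, so the
    new value is counted in full.
    """
    total = 0.0
    prev = None
    for value in samples:
        if prev is not None:
            if value >= prev:
                total += value - prev
            else:
                # The counter went backwards: a new process started from zero.
                total += value
        prev = value
    return total


# Hypothetical scrapes of process_cpu_seconds_total with one restart
# between the third and fourth samples.
cpu_seconds = [10.0, 12.5, 15.0, 1.0, 3.0]
total_cpu = increase(cpu_seconds)
print(total_cpu)
```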

Feldera Metrics

These metrics report statistics for Feldera operations.

Name	Type	Description
`feldera_checkpoint_delay_seconds`	histogram	Sub-duration of `feldera_checkpoint_runtime_seconds` during which pipeline execution was blocked.
`feldera_checkpoint_records_processed_total`	counter	Total number of records that had been processed when the most recent checkpoint successfully committed.
`feldera_checkpoint_runtime_seconds`	histogram	Time to run checkpoint operations, in seconds, including time that the pipeline could continue executing along with the checkpoint.
`feldera_checkpoint_written_bytes`	histogram	Amount of data written to storage during checkpoints, in bytes.
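
Since `feldera_checkpoint_delay_seconds` is described as a sub-duration of `feldera_checkpoint_runtime_seconds`, the ratio of the two gives a rough idea of how much of checkpointing blocks execution. The sketch below assumes the histograms expose the usual Prometheus `_sum` series; the function name and input values are hypothetical.

```python
def blocked_fraction(delay_sum, runtime_sum):
    """Return the share of checkpoint time during which execution was blocked.

    delay_sum:   value of feldera_checkpoint_delay_seconds_sum (assumed series)
    runtime_sum: value of feldera_checkpoint_runtime_seconds_sum (assumed series)
    Returns None when no checkpoint time has been recorded.
    """
    if runtime_sum <= 0:
        return None
    return delay_sum / runtime_sum


# Hypothetical values: 1.5 s blocked out of 6.0 s of checkpoint runtime.
fraction = blocked_fraction(1.5, 6.0)
print(fraction)
```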

DBSP Metrics

These metrics report statistics for DBSP, the low-level mechanism on which Feldera is built.

Name	Type	Description
`compaction_stall_duration_seconds`	counter	Time in seconds a worker was stalled waiting for more merges to complete.
`dbsp_operator_checkpoint_latency_seconds`	histogram	Latency of individual operator checkpoint operations in seconds. (Because checkpoints run in parallel across workers, these will not add up to `feldera_checkpoint_runtime_seconds`.)
`dbsp_runtime_elapsed_seconds_total`	counter	Time elapsed while the pipeline is executing a step, multiplied by the number of foreground and background threads, in seconds.
`dbsp_step_latency_seconds`	histogram	Latency of DBSP steps over the last 60 seconds or 1000 steps, whichever is less, in seconds.
`dbsp_steps_total`	counter	Total number of DBSP steps executed.
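
Histogram metrics such as `dbsp_step_latency_seconds` are exported as cumulative buckets, from which a quantile can be estimated by linear interpolation, much as PromQL's `histogram_quantile` does. The following is a simplified sketch; the bucket bounds and counts are invented and do not reflect the bucket layout Feldera actually uses.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by bound,
             ending with the +Inf bucket.
    Returns None if the histogram is empty.
    """
    total = buckets[-1][1]
    if total == 0:
        return None
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # Cannot interpolate inside +Inf: report the highest finite bound.
                return lower_bound
            in_bucket = count - lower_count
            if in_bucket == 0:
                return upper_bound
            # Assume observations are spread evenly within the bucket.
            return lower_bound + (upper_bound - lower_bound) * (rank - lower_count) / in_bucket
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical cumulative buckets: (le, count).
step_latency = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
p95 = estimate_quantile(0.95, step_latency)
print(p95)
```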

Record Processing

These metrics report overall counts of records as they pass through the pipeline. They accumulate across checkpoint and resume.

Name	Type	Description
`output_buffered_batches`	gauge	Number of batches of records currently buffered by the output connector.
`records_input_buffered`	gauge	Total amount of data currently buffered by all endpoints, in records.
`records_input_buffered_bytes`	gauge	Total amount of data currently buffered by all endpoints, in bytes.
`records_input_bytes_total`	counter	Total amount of data received from all connectors, in bytes.
`records_input_total`	counter	Total amount of data received from all connectors, in records.
`records_late_total`	counter	Number of records dropped due to LATENESS annotations.
`records_processed_bytes_total`	counter	Total amount of input processed by the pipeline, in bytes.
`records_processed_total`	counter	Total amount of input processed by the pipeline, in records.
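
Because these counters keep accumulating across checkpoint and resume, throughput can be derived from the difference between two scrapes. A minimal sketch, with hypothetical timestamps and counter values:

```python
def throughput(prev, curr):
    """Return records processed per second between two scrapes.

    prev, curr: (timestamp_seconds, records_processed_total) tuples.
    Returns None if the scrapes are not in increasing time order.
    """
    elapsed = curr[0] - prev[0]
    if elapsed <= 0:
        return None
    return (curr[1] - prev[1]) / elapsed


# Hypothetical scrapes taken 10 seconds apart.
rate = throughput((100.0, 5000), (110.0, 8000))
print(rate)
```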

Storage Performance

These metrics report the performance of storage, which allows Feldera to work with data larger than memory.

Name	Type	Description
`files_created_total`	counter	Total number of files created.
`files_deleted_total`	counter	Total number of files deleted.
`storage_byte_seconds_total`	counter	Storage usage integrated over time during this run of the pipeline, in bytes × seconds.
`storage_cache_usage_bytes`	gauge	The number of bytes of memory currently in use for caching data on storage.
`storage_cache_usage_limit_bytes_total`	counter	The limit for the number of bytes of memory for caching data on storage.
`storage_read_block_bytes`	histogram	Sizes in bytes of blocks read from storage.
`storage_read_latency_seconds`	histogram	Read latency for storage blocks in seconds.
`storage_sync_latency_seconds`	histogram	Sync latency in seconds.
`storage_usage_bytes`	gauge	The number of bytes of storage currently in use.
`storage_write_block_bytes`	histogram	Sizes in bytes of blocks written to storage.
`storage_write_latency_seconds`	histogram	Write latency for storage blocks in seconds.
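
To make the bytes × seconds unit of `storage_byte_seconds_total` concrete, the sketch below integrates a series of storage usage samples as a step function and divides by the elapsed time to get average usage. It only illustrates the unit; how Feldera computes the integral internally is not specified here, and the samples are invented.

```python
def byte_seconds(samples):
    """Integrate storage usage over time as a step function.

    samples: list of (timestamp_seconds, usage_bytes), sorted by time.
    Each usage value is held constant until the next sample.
    """
    total = 0.0
    for (t0, usage), (t1, _) in zip(samples, samples[1:]):
        total += usage * (t1 - t0)
    return total


# Hypothetical usage: 100 bytes for 10 s, then 300 bytes for 20 s.
usage = [(0.0, 100), (10.0, 300), (30.0, 300)]
integral = byte_seconds(usage)
average_bytes = integral / (usage[-1][0] - usage[0][0])
print(integral, average_bytes)
```

Dividing the counter by the length of the run in the same way gives the mean storage footprint, which can be easier to reason about than the instantaneous `storage_usage_bytes` gauge.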

Pipeline Status

These metrics report the status of the pipeline.

Name	Type	Description
`pipeline_complete`	counter	Transitions from 0 to 1 when the pipeline completes.
`pipeline_start_time_seconds`	counter	Start time of the pipeline in seconds since the Unix epoch. This will be earlier than `process_start_time_seconds` if the pipeline resumed from a checkpoint. This will be zero if the pipeline resumed from a checkpoint produced by a pipeline too old to record its start time.
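
The relationship between `pipeline_start_time_seconds` and `process_start_time_seconds` described above can be used to tell whether a pipeline resumed from a checkpoint. A minimal sketch; the function name, the returned strings, and the timestamps are hypothetical, and the final case assumes that a pipeline that did not resume reports a start time no earlier than its process.

```python
def resume_status(pipeline_start, process_start):
    """Classify a pipeline from its two start-time metrics."""
    if pipeline_start == 0:
        # Resumed from a checkpoint too old to have recorded a start time.
        return "resumed, original start unknown"
    if pipeline_start < process_start:
        return "resumed from checkpoint"
    # Assumption: not earlier than the process start means no resume.
    return "started fresh"


# Hypothetical timestamps in seconds since the Unix epoch.
status = resume_status(1_700_000_000, 1_700_003_600)
print(status)
```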

Input Connectors

These metrics are per-input connector, labeled with `endpoint` set to the name of the input connector, which is either the name assigned in the SQL program or automatically generated as `unnamed-<number>`, where `<number>` counts starting from 1 for the first connector for a given table.

These metrics accumulate across checkpoint and resume.

Byte counters are approximate in two cases. For some input connectors, such as those that use columnar formats, bytes are difficult to attribute accurately to individual records, so Feldera approximates. Feldera also approximates the attribution of bytes to records when it processes only some of the records in a batch in a DBSP step. This approximation is corrected when the remainder of the batch is processed in a subsequent step, so it is invisible to users unless a pause or checkpoint happens mid-batch.

Name	Type	Description
`input_connector_buffered_records`	gauge	Amount of data currently buffered by an input connector, in records.
`input_connector_buffered_records_bytes`	gauge	Amount of data currently buffered by an input connector, in bytes.
`input_connector_bytes_total`	counter	Total number of bytes received by an input connector.
`input_connector_completion_latency_seconds`	histogram	Time between when the connector receives new data and when the pipeline processes this data, computes output updates, and sends these updates to all output connectors, over the last 600 seconds or 10,000 samples.
`input_connector_errors_parse_total`	counter	Total number of errors encountered parsing records received by the input connector.
`input_connector_errors_transport_total`	counter	Total number of errors encountered by the input connector at the transport layer.
`input_connector_extra_memory_bytes`	gauge	Additional memory used by an input connector beyond that used for buffered records.
`input_connector_processing_latency_seconds`	histogram	Time between when the connector receives new data and when the pipeline processes this data and computes output updates, over the last 600 seconds or 10,000 samples.
`input_connector_records_total`	counter	Total number of records received by an input connector.
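
Going by the descriptions above, `input_connector_completion_latency_seconds` covers everything that `input_connector_processing_latency_seconds` does plus the time to send updates to all output connectors, so the difference of their means is a rough estimate of output delivery time. The sketch below assumes the histograms expose the usual Prometheus `_sum` and `_count` series and that both cover comparable samples; the function names and values are hypothetical.

```python
def histogram_mean(hist_sum, hist_count):
    """Return the mean observation of a histogram, or None if it is empty."""
    if hist_count == 0:
        return None
    return hist_sum / hist_count


def output_delivery_estimate(completion, processing):
    """Estimate mean time spent delivering output, in seconds.

    completion, processing: (sum, count) pairs taken from the _sum and
    _count series of the two latency histograms for one endpoint.
    """
    mean_completion = histogram_mean(*completion)
    mean_processing = histogram_mean(*processing)
    if mean_completion is None or mean_processing is None:
        return None
    return mean_completion - mean_processing


# Hypothetical values for a single input connector.
estimate = output_delivery_estimate((50.0, 100), (30.0, 100))
print(estimate)
```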

Output Connectors

These metrics are per-output connector, labeled with `endpoint` set to the name of the output connector, which is either the name assigned in the SQL program or automatically generated as `unnamed-<number>`, where `<number>` counts starting from 1 for the first connector for a given view.

These metrics accumulate across checkpoint and resume.

Name	Type	Description
`output_connector_buffered_records`	gauge	Number of records currently buffered by the output connector.
`output_connector_bytes_total`	counter	Total number of bytes of records sent by the output connector.
`output_connector_errors_encode_total`	counter	Total number of errors encountered encoding records to send.
`output_connector_errors_transport_total`	counter	Total number of errors encountered at the transport layer sending records.
`output_connector_extra_memory_bytes`	gauge	Additional memory used by an output connector beyond that used for buffered records.
`output_connector_records_total`	counter	Total number of records sent by the output connector.

Checkpoint Synchronization

These metrics report the status of checkpoint synchronization.

Name	Type	Description
`checkpoint_sync_pull_duration_seconds`	histogram	Time taken to pull a checkpoint from object store in seconds.
`checkpoint_sync_pull_failures`	counter	Number of failures when pulling a checkpoint.
`checkpoint_sync_pull_success`	counter	Number of checkpoints pulled successfully.
`checkpoint_sync_pull_transfer_speed_bytes_per_second`	histogram	Transfer speed when pulling a checkpoint, in bytes per second.
`checkpoint_sync_pull_transferred_bytes`	histogram	Bytes transferred when pulling a checkpoint.
`checkpoint_sync_push_duration_seconds`	histogram	Time taken to push a checkpoint to object store in seconds.
`checkpoint_sync_push_failures`	counter	Number of failures when pushing a checkpoint.
`checkpoint_sync_push_success`	counter	Number of checkpoints pushed successfully.
`checkpoint_sync_push_transfer_speed_bytes_per_second`	histogram	Transfer speed when pushing a checkpoint, in bytes per second.
`checkpoint_sync_push_transferred_bytes`	histogram	Bytes transferred when pushing a checkpoint.
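
The success and failure counters can be combined into a success ratio, which is often a convenient alerting signal. A minimal sketch with hypothetical counter values:

```python
def success_ratio(successes, failures):
    """Return the fraction of attempts that succeeded, or None if none were made."""
    attempts = successes + failures
    if attempts == 0:
        return None
    return successes / attempts


# Hypothetical values of checkpoint_sync_push_success and
# checkpoint_sync_push_failures.
push_ratio = success_ratio(18, 2)
print(push_ratio)
```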
