Pipeline Metrics

This reference lists all of the metrics that Feldera exports through its /metrics endpoint in Prometheus exposition format. It is automatically generated using the documentation embedded in Prometheus output.

Every metric exported by a Feldera pipeline carries a pipeline label whose value is the pipeline's UUID. Some metrics have additional labels, as documented below.
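As a rough illustration of what a consumer of the /metrics endpoint sees, the sketch below parses a small piece of exposition text and collects the pipeline label. The sample text, UUID, and values are made up, and the parser is a simplification that ignores escaped quotes in label values and optional timestamps.

```python
import re

# Matches `name{label="v",...} value` or `name value`. This is a simplified
# reading of the Prometheus text format, not a complete parser.
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # skip anything this simplified pattern cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hypothetical scrape output; the UUID and counts are invented for the example.
EXAMPLE = '''\
# TYPE records_processed_total counter
records_processed_total{pipeline="0195f3a2-0000-7000-8000-000000000001"} 1500
# TYPE input_connector_records_total counter
input_connector_records_total{pipeline="0195f3a2-0000-7000-8000-000000000001",endpoint="unnamed-1"} 1600
'''

samples = parse_metrics(EXAMPLE)
pipelines = {labels["pipeline"] for _, labels, _ in samples}
```

In a real deployment a Prometheus server or an established client library would normally do this parsing; the sketch only shows where the pipeline and endpoint labels appear.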

See Monitoring and Profiling for a guide to setting up Prometheus and Grafana with Feldera. The Feldera template dashboard is a sample Grafana dashboard for Feldera.

Process Metrics

These metrics report statistics for a running Feldera pipeline process. When a pipeline process is killed and restarts from a checkpoint, the new process's metrics are for it alone, not cumulative with any previous instantiations.

These metrics are intended to match the standard Prometheus definitions.

| Name | Type | Description |
| --- | --- | --- |
| process_cpu_seconds_total | counter | Total user and system CPU time spent in seconds. |
| process_max_fds | gauge | Maximum number of open file descriptors. |
| process_open_fds | gauge | Number of open file descriptors. |
| process_resident_memory_bytes | gauge | Resident set size in bytes. |
| process_start_time_seconds | counter | Start time of the process in seconds since the Unix epoch. |
| process_threads | gauge | Number of OS threads in the process. |
| process_virtual_memory_bytes | gauge | Virtual memory size in bytes. |
| process_virtual_memory_max_bytes | gauge | Maximum amount of virtual memory available in bytes. |
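Because process metrics start over when the pipeline process restarts, a consumer that computes deltas itself has to tolerate a counter going backwards. The sketch below shows one common convention, the one Prometheus's own rate functions follow: a drop is treated as a reset to zero. The function names and sample values are illustrative only.

```python
def counter_increase(prev, curr):
    """Increase between two counter samples.

    If the counter went down, assume the process restarted and the counter
    began again from zero, so the whole current value counts as the increase.
    """
    return curr if curr < prev else curr - prev


def cpu_utilization(prev_cpu_seconds, curr_cpu_seconds, interval_seconds):
    """Average number of CPU cores used between two scrapes."""
    return counter_increase(prev_cpu_seconds, curr_cpu_seconds) / interval_seconds


# Two hypothetical pairs of process_cpu_seconds_total samples, 10 s apart.
steady = cpu_utilization(100.0, 125.0, 10.0)        # no restart in between
after_restart = cpu_utilization(125.0, 4.0, 10.0)   # process restarted
```

Any CPU time the old process used between the last scrape and its exit is lost under this convention, so the result after a restart is a lower bound.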

Feldera Metrics

These metrics report statistics for Feldera operations.

| Name | Type | Description |
| --- | --- | --- |
| feldera_checkpoint_latency_seconds | histogram | Latency of overall checkpoint operations in seconds. |
| feldera_checkpoint_records_processed_total | counter | Total number of records that had been processed when the most recent checkpoint successfully committed. |
| feldera_checkpoint_written_bytes | histogram | Amount of data written to storage during checkpoints, in bytes. |

DBSP Metrics

These metrics report statistics for DBSP, the low-level mechanism on which Feldera is built.

| Name | Type | Description |
| --- | --- | --- |
| compaction_stall_duration_seconds | counter | Time in seconds a worker was stalled waiting for more merges to complete. |
| dbsp_operator_checkpoint_latency_seconds | histogram | Latency of individual operator checkpoint operations in seconds. (Because checkpoints run in parallel across workers, these will not add up to feldera_checkpoint_latency_seconds.) |
| dbsp_step_latency_seconds | histogram | Latency of DBSP steps in seconds, over the last 60 seconds or 1000 steps, whichever is less. |
| dbsp_steps_total | counter | Total number of DBSP steps executed. |
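Histogram metrics such as dbsp_step_latency_seconds are usually summarized as a mean or a quantile. The sketch below assumes the standard Prometheus histogram layout (cumulative buckets keyed by upper bound, plus a sum and a count) and estimates a quantile by linear interpolation inside the matching bucket, similar in spirit to PromQL's histogram_quantile. The bucket bounds and counts are invented for the example.

```python
import math


def histogram_mean(total_sum, total_count):
    """Mean observation, from the histogram's _sum and _count samples."""
    return total_sum / total_count if total_count else math.nan


def histogram_quantile(q, buckets):
    """Estimate quantile q from cumulative buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    bound and ending with the +Inf bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound) or count == prev_count:
                return prev_bound  # cannot interpolate; use the lower edge
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical buckets: 50 steps took <= 0.1 s, 90 took <= 0.5 s, all 100 <= 1 s.
buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
median = histogram_quantile(0.5, buckets)
p75 = histogram_quantile(0.75, buckets)
mean = histogram_mean(25.0, 100)
```

The estimate is only as fine as the bucket boundaries, and for dbsp_step_latency_seconds it describes the recent window the metric covers rather than the pipeline's whole lifetime.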

Record Processing

These metrics report overall counts of records as they pass through the pipeline. They accumulate across checkpoint and resume.

| Name | Type | Description |
| --- | --- | --- |
| output_buffered_batches | gauge | Number of batches of records currently buffered by the output connector. |
| records_input_buffered | gauge | Total amount of data currently buffered by all endpoints, in records. |
| records_input_buffered_bytes | gauge | Total amount of data currently buffered by all endpoints, in bytes. |
| records_input_bytes_total | counter | Total amount of data received from all connectors, in bytes. |
| records_input_total | counter | Total amount of data received from all connectors, in records. |
| records_late_total | counter | Number of records dropped due to LATENESS annotations. |
| records_processed_bytes_total | counter | Total amount of input processed by the pipeline, in bytes. |
| records_processed_total | counter | Total amount of input processed by the pipeline, in records. |
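The processed counters lend themselves to a throughput calculation between two scrapes. The sketch below assumes each scrape has already been reduced to a plain dictionary of metric name to value; the snapshot contents and the 15-second interval are made up.

```python
def throughput(prev, curr, interval_seconds):
    """Records per second and bytes per second processed between two snapshots."""
    records = curr["records_processed_total"] - prev["records_processed_total"]
    nbytes = (curr["records_processed_bytes_total"]
              - prev["records_processed_bytes_total"])
    return records / interval_seconds, nbytes / interval_seconds


# Hypothetical snapshots taken 15 seconds apart.
prev = {"records_processed_total": 1_000, "records_processed_bytes_total": 64_000}
curr = {"records_processed_total": 4_000, "records_processed_bytes_total": 256_000}

records_per_s, bytes_per_s = throughput(prev, curr, interval_seconds=15)
```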

Storage Performance

These metrics report the performance of storage, which allows Feldera to work with data larger than memory.

| Name | Type | Description |
| --- | --- | --- |
| files_created_total | counter | Total number of files created. |
| files_deleted_total | counter | Total number of files deleted. |
| storage_read_block_bytes | histogram | Sizes in bytes of blocks read from storage. |
| storage_read_latency_seconds | histogram | Read latency for storage blocks in seconds. |
| storage_sync_latency_seconds | histogram | Sync latency in seconds. |
| storage_write_block_bytes | histogram | Sizes in bytes of blocks written to storage. |
| storage_write_latency_seconds | histogram | Write latency for storage blocks in seconds. |

Pipeline Status

These metrics report the status of the pipeline.

| Name | Type | Description |
| --- | --- | --- |
| pipeline_complete | counter | Transitions from 0 to 1 when the pipeline completes. |
| pipeline_start_time_seconds | counter | Start time of the pipeline in seconds since the Unix epoch. This will be earlier than process_start_time_seconds if the pipeline resumed from a checkpoint. This will be zero if the pipeline resumed from a checkpoint produced by a pipeline too old to record its start time. |
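The relationship between pipeline_start_time_seconds and process_start_time_seconds can be turned into a small classification. This is a sketch: the function name and return strings are illustrative, and the final branch assumes that a pipeline start time not earlier than the process start time means no checkpoint was involved, which the text above implies but does not state outright.

```python
def classify_start(pipeline_start_seconds, process_start_seconds):
    """Describe how the running process came up, from the two start-time metrics."""
    if pipeline_start_seconds == 0:
        # Checkpoint came from a pipeline too old to record its start time.
        return "resumed from checkpoint (start time not recorded)"
    if pipeline_start_seconds < process_start_seconds:
        return "resumed from checkpoint"
    # Assumption: no earlier pipeline start means this process started fresh.
    return "started without a checkpoint"


# Hypothetical Unix timestamps.
resumed = classify_start(1_700_000_000, 1_700_003_600)
legacy = classify_start(0, 1_700_003_600)
fresh = classify_start(1_700_003_601, 1_700_003_600)
```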

Input Connectors

These metrics are reported per input connector. Each carries an endpoint label set to the name of the input connector, which is either the name assigned in the SQL program or an automatically generated name of the form unnamed-<number>, where <number> counts starting from 1 for the first connector for a given table.

These metrics accumulate across checkpoint and resume.

Byte counters are approximate in two cases. For some input connectors, such as those that use columnar formats, bytes are difficult to attribute accurately to records, so Feldera approximates. Feldera also approximates the attribution of bytes to records when it processes only some of the records in a batch in a DBSP step. This second approximation is corrected when the remainder of the batch is processed in a subsequent step, so it is invisible to users unless a pause or checkpoint happens mid-batch.

| Name | Type | Description |
| --- | --- | --- |
| input_connector_buffered_records | gauge | Amount of data currently buffered by an input connector, in records. |
| input_connector_buffered_records_bytes | gauge | Amount of data currently buffered by an input connector, in bytes. |
| input_connector_bytes_total | counter | Total number of bytes received by an input connector. |
| input_connector_errors_parse_total | counter | Total number of errors encountered parsing records received by the input connector. |
| input_connector_errors_transport_total | counter | Total number of errors encountered by the input connector at the transport layer. |
| input_connector_records_total | counter | Total number of records received by an input connector. |
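Since every input connector metric carries an endpoint label, a natural first step when inspecting them is to group samples by that label. The sketch below assumes samples are already available as (name, labels, value) tuples; the connector name orders-kafka and all values are hypothetical.

```python
from collections import defaultdict


def by_endpoint(samples, prefix="input_connector_"):
    """Group connector metrics into {endpoint: {metric_name: value}}."""
    grouped = defaultdict(dict)
    for name, labels, value in samples:
        if name.startswith(prefix) and "endpoint" in labels:
            grouped[labels["endpoint"]][name] = value
    return dict(grouped)


# Hypothetical samples for one auto-named and one explicitly named connector.
samples = [
    ("input_connector_records_total", {"endpoint": "unnamed-1"}, 1600.0),
    ("input_connector_errors_parse_total", {"endpoint": "unnamed-1"}, 0.0),
    ("input_connector_records_total", {"endpoint": "orders-kafka"}, 900.0),
    ("input_connector_errors_parse_total", {"endpoint": "orders-kafka"}, 3.0),
]

grouped = by_endpoint(samples)
# Endpoints that have reported at least one parse error.
failing = sorted(
    endpoint
    for endpoint, metrics in grouped.items()
    if metrics.get("input_connector_errors_parse_total", 0) > 0
)
```

Passing prefix="output_connector_" applies the same grouping to output connector metrics, which use the same endpoint label.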

Output Connectors

These metrics are reported per output connector. Each carries an endpoint label set to the name of the output connector, which is either the name assigned in the SQL program or an automatically generated name of the form unnamed-<number>, where <number> counts starting from 1 for the first connector for a given view.

These metrics accumulate across checkpoint and resume.

| Name | Type | Description |
| --- | --- | --- |
| output_connector_buffered_records | gauge | Number of records currently buffered by the output connector. |
| output_connector_bytes_total | counter | Total number of bytes of records sent by the output connector. |
| output_connector_errors_encode_total | counter | Total number of errors encountered encoding records to send. |
| output_connector_errors_transport_total | counter | Total number of errors encountered at the transport layer sending records. |
| output_connector_records_total | counter | Total number of records sent by the output connector. |