Pipeline Metrics
This reference lists all of the metrics that Feldera exports through its `/metrics` endpoint in Prometheus exposition format. It is automatically generated using the documentation embedded in Prometheus output.
All of the metrics exported by a particular Feldera pipeline are labeled with the pipeline's UUID as `pipeline`. Some metrics have additional labels, as documented below.
See Monitoring and Profiling for a guide to setting up Prometheus and Grafana with Feldera. The Feldera template dashboard is a sample Grafana dashboard for Feldera.
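As a rough illustration of working with this output outside Prometheus, the following sketch parses sample lines in the exposition format and keeps only those carrying a given `pipeline` label. The sample text, the pipeline identifier `example-pipeline-uuid`, and the helper names `parse_samples` and `for_pipeline` are assumptions made for the example, not part of Feldera. The parser handles only plain `name{labels} value` lines and ignores timestamps.

```python
import re

# Matches a sample line such as: name{label="value",...} 1.5
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# Matches one label="value" pair inside the braces.
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')

# Hypothetical scrape output; the pipeline identifier is a placeholder.
SAMPLE = '''# HELP records_input_total Total number of input records received from all connectors.
# TYPE records_input_total counter
records_input_total{pipeline="example-pipeline-uuid"} 1200
records_processed_total{pipeline="example-pipeline-uuid"} 1150
input_connector_records_total{pipeline="example-pipeline-uuid",endpoint="unnamed-1"} 1200
'''

def parse_samples(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the # HELP / # TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples

def for_pipeline(samples, pipeline_id):
    """Keep only the samples whose `pipeline` label equals pipeline_id."""
    return [s for s in samples if s[1].get("pipeline") == pipeline_id]
```

For anything beyond a quick script, an established Prometheus client library is likely a better choice than a hand-written parser.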
Process Metrics
These metrics report statistics for a running Feldera pipeline process. When a pipeline process is killed and restarts from a checkpoint, the new process's metrics are for it alone, not cumulative with any previous instantiations.
These metrics are intended to match the standard Prometheus definitions.
Name | Type | Description |
---|---|---|
process_cpu_seconds_total | counter | Total user and system CPU time spent in seconds. |
process_max_fds | gauge | Maximum number of open file descriptors. |
process_open_fds | gauge | Number of open file descriptors. |
process_resident_memory_bytes | gauge | Resident set size in bytes. |
process_start_time_seconds | counter | Start time of the process since the Unix epoch in seconds. |
process_threads | gauge | Number of OS threads in the process. |
process_virtual_memory_bytes | gauge | Virtual memory size in bytes. |
process_virtual_memory_max_bytes | gauge | Maximum amount of virtual memory available in bytes. |
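Because process metrics start over when a pipeline process restarts, a consumer that sums a counter such as `process_cpu_seconds_total` across scrapes needs to tolerate resets. The sketch below shows one common way to do that, similar in spirit to how Prometheus treats counter resets: any drop in value is taken as a restart from zero. The function name `counter_increase` and the scrape values are hypothetical.

```python
def counter_increase(values):
    """Total increase over successive scrapes of a counter, tolerating resets."""
    total = 0.0
    previous = None
    for value in values:
        if previous is not None:
            if value >= previous:
                total += value - previous
            else:
                # A drop means the process restarted and the counter began
                # again from zero, so the whole new value counts as increase.
                total += value
        previous = value
    return total

# Hypothetical scrapes of process_cpu_seconds_total with one restart
# between the third and fourth scrape.
cpu_seconds = counter_increase([10.0, 25.0, 40.0, 3.0, 8.0])
```

Any increase that happened between the last scrape before a restart and the restart itself is lost, so the result is a lower bound.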
Feldera metrics
These metrics report statistics for Feldera operations.
Name | Type | Description |
---|---|---|
feldera_checkpoint_latency_seconds | histogram | Latency of overall checkpoint operations in seconds |
feldera_checkpoint_records_processed_total | counter | Total number of records that had been processed when the most recent checkpoint successfully committed. |
feldera_checkpoint_written_bytes | histogram | Amount of data written to storage during checkpoints, in bytes. |
DBSP metrics
These metrics report statistics for DBSP, the low-level mechanism on which Feldera is built.
Name | Type | Description |
---|---|---|
compaction_stall_duration_seconds | counter | Time in seconds a worker was stalled waiting for more merges to complete. |
dbsp_operator_checkpoint_latency_seconds | histogram | Latency of individual operator checkpoint operations in seconds. (Because checkpoints run in parallel across workers, these latencies will not add up to `feldera_checkpoint_latency_seconds`.) |
dbsp_step_latency_seconds | histogram | Latency of DBSP steps in seconds, over the last 60 seconds or 1000 steps, whichever is less. |
dbsp_steps_total | counter | Total number of DBSP steps executed. |
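In the Prometheus exposition format, a histogram is exported as cumulative `_bucket` series labeled with an upper bound `le`, plus `_sum` and `_count` series. The sketch below estimates a quantile from such buckets by linear interpolation within the matching bucket, which is the same general approach as PromQL's `histogram_quantile`. The function name and the bucket boundaries and counts are made up for illustration; they are not the boundaries Feldera uses.

```python
import math

def histogram_quantile(q, buckets):
    """Estimate the q-quantile (0 < q <= 1) from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper_bound, ending with the +Inf bucket.
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in the range (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # The quantile falls in the +Inf bucket; the highest finite
                # bound is the best available estimate.
                return lower_bound
            # Assume observations are spread evenly within the bucket.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound

# Hypothetical cumulative buckets for a latency histogram, in seconds.
BUCKETS = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
p70 = histogram_quantile(0.7, BUCKETS)
```

The estimate is only as precise as the bucket boundaries allow, so it should be read as an approximation.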
Record Processing
These metrics report overall counts of records as they pass through the pipeline. They accumulate across checkpoint and resume.
Name | Type | Description |
---|---|---|
output_buffered_batches | gauge | Number of batches of records currently buffered by the output connector. |
records_input_buffered | gauge | Total number of records currently buffered by all endpoints. |
records_input_total | counter | Total number of input records received from all connectors. |
records_late_total | counter | Number of records dropped due to LATENESS annotations. |
records_processed_total | counter | Total number of input records processed by the pipeline. |
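Two derived values are often useful here: processing throughput between two scrapes, and the number of records received but not yet processed. The sketch below computes both from raw counter values. The function names and numbers are hypothetical, and treating the difference between `records_input_total` and `records_processed_total` as a backlog is an assumption based on the descriptions above, so it should be taken as a rough indicator only.

```python
def throughput(previous, current):
    """Records processed per second between two scrapes.

    Each argument is a (timestamp_seconds, records_processed_total) pair.
    """
    (t0, n0), (t1, n1) = previous, current
    if t1 <= t0:
        raise ValueError("scrapes must be in increasing time order")
    return (n1 - n0) / (t1 - t0)

def unprocessed(records_input_total, records_processed_total):
    """Records received by connectors but not yet processed (assumed meaning)."""
    return records_input_total - records_processed_total

# Hypothetical scrapes taken 10 seconds apart.
rate = throughput((100.0, 1000), (110.0, 1500))
backlog = unprocessed(1200, 1150)
```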
Storage Performance
These metrics report the performance of storage, which allows Feldera to work with data larger than memory.
Name | Type | Description |
---|---|---|
files_created_total | counter | Total number of files created. |
files_deleted_total | counter | Total number of files deleted. |
storage_read_block_bytes | histogram | Sizes in bytes of blocks read from storage. |
storage_read_latency_seconds | histogram | Read latency for storage blocks in seconds |
storage_sync_latency_seconds | histogram | Sync latency in seconds |
storage_write_block_bytes | histogram | Sizes in bytes of blocks written to storage. |
storage_write_latency_seconds | histogram | Write latency for storage blocks in seconds |
Pipeline Status
These metrics report the status of the pipeline.
Name | Type | Description |
---|---|---|
pipeline_complete | counter | Transitions from 0 to 1 when pipeline completes. |
Input Connectors
These metrics are per-input connector, labeled with `endpoint` set to the name of the input connector, which is either the name assigned in the SQL program or automatically generated as `unnamed-<number>`, where `<number>` counts starting from 1 for the first connector for a given table.
Name | Type | Description |
---|---|---|
input_connector_buffered_records | gauge | Number of records currently buffered by an input connector. |
input_connector_bytes_total | counter | Total number of bytes received by an input connector. |
input_connector_errors_parse_total | counter | Total number of errors encountered parsing records received by the input connector. |
input_connector_errors_transport_total | counter | Total number of errors encountered by the input connector at the transport layer. |
input_connector_records_total | counter | Total number of records received by an input connector. |
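As one way to use the `endpoint` label, the sketch below groups samples by pipeline and connector and reports parse errors per received record. It expects samples as `(name, labels, value)` tuples; that shape, the function name, and the sample values and connector names are assumptions for the example. Whether records that fail to parse are also counted in `input_connector_records_total` is not stated here, so the ratio should be treated as approximate.

```python
from collections import defaultdict

def parse_error_ratio(samples):
    """Map (pipeline, endpoint) to parse errors per received record."""
    errors = defaultdict(float)
    records = defaultdict(float)
    for name, labels, value in samples:
        endpoint = labels.get("endpoint")
        if endpoint is None:
            continue  # Not a per-connector metric.
        key = (labels.get("pipeline"), endpoint)
        if name == "input_connector_errors_parse_total":
            errors[key] += value
        elif name == "input_connector_records_total":
            records[key] += value
    # Skip connectors that have not received any records yet.
    return {key: errors[key] / records[key] for key in records if records[key] > 0}

# Hypothetical samples for two input connectors of one pipeline.
EXAMPLE = [
    ("input_connector_records_total", {"pipeline": "p1", "endpoint": "orders-kafka"}, 100.0),
    ("input_connector_errors_parse_total", {"pipeline": "p1", "endpoint": "orders-kafka"}, 5.0),
    ("input_connector_records_total", {"pipeline": "p1", "endpoint": "unnamed-1"}, 40.0),
    ("records_input_total", {"pipeline": "p1"}, 140.0),
]
ratios = parse_error_ratio(EXAMPLE)
```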
Output Connectors
These metrics are per-output connector, labeled with `endpoint` set to the name of the output connector, which is either the name assigned in the SQL program or automatically generated as `unnamed-<number>`, where `<number>` counts starting from 1 for the first connector for a given view.
Name | Type | Description |
---|---|---|
output_connector_buffered_records | gauge | Number of records currently buffered by the output connector. |
output_connector_bytes_total | counter | Total number of bytes of records sent by the output connector. |
output_connector_errors_encode_total | counter | Total number of errors encountered encoding records to send. |
output_connector_errors_transport_total | counter | Total number of errors encountered at the transport layer sending records. |
output_connector_records_total | counter | Total number of records sent by the output connector. |