S3 input connector

note

This page describes configuration options specific to the S3 input connector. See top-level connector documentation for general information about configuring input and output connectors.

The S3 input connector is used to load data from an S3 bucket to a Feldera table. It can be configured to load a single object or multiple objects selected based on a common S3 prefix. By setting the endpoint_url, you can also read from non-AWS services that offer S3 compatible APIs.

tip

When accessing an S3 bucket that stores data in the Delta Lake or Iceberg format, consider using the Delta Lake connector or the Iceberg connector instead.

The S3 input connector supports fault tolerance.

Configuration options

| Property | Type | Default | Description |
|----------|------|---------|-------------|
| `aws_access_key_id` | string | | AWS Access Key ID. This property must be specified unless `no_sign_request` is set to `true`. |
| `aws_secret_access_key` | string | | AWS Secret Access Key. This property must be specified unless `no_sign_request` is set to `true`. |
| `no_sign_request` | bool | `false` | Do not sign requests. This is equivalent to the `--no-sign-request` flag in the AWS CLI. |
| `key` | string | | Read a single object specified by a key. Either this property or the `prefix` property must be set. |
| `prefix` | string | | Read all objects whose keys match a prefix. Set to an empty string to read all objects in the bucket. Either this property or the `key` property must be set. |
| `region`* | string | | AWS region. |
| `bucket_name`* | string | | S3 bucket name. |
| `endpoint_url` | string | | The endpoint URL used to communicate with this service. Explicitly set it to connect to non-AWS services. For example, use `https://storage.googleapis.com` to interact with Google Cloud Storage. |
| `max_concurrent_fetches` | integer | 8 | Controls the number of S3 objects fetched in parallel. Increasing this value can improve throughput by enabling greater concurrency, but higher concurrency may lead to timeouts or increased memory usage due to in-memory buffering. Recommended range: 1–10. |

*Fields marked with an asterisk are required.
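As a sketch of how these options combine, the following connector reads every object under a common prefix with increased fetch parallelism. The bucket name and prefix here are hypothetical placeholders, not values from this guide:

```sql
CREATE TABLE vendor (
    id BIGINT NOT NULL PRIMARY KEY,
    name VARCHAR,
    address VARCHAR
) WITH ('connectors' = '[{
    "transport": {
        "name": "s3_input",
        "config": {
            "prefix": "vendors/2024/",
            "no_sign_request": true,
            "bucket_name": "example-bucket",
            "region": "us-west-1",
            "max_concurrent_fetches": 10
        }
    },
    "format": { "name": "json" }
}]');
```

Because `prefix` and `key` are mutually exclusive, only one of them appears in the config.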

Support for AWS IAM roles for service accounts (IRSA)

To use AWS IAM roles for service accounts (IRSA), omit the aws_access_key_id and aws_secret_access_key fields. Feldera will then pick up credentials from the environment variables.
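A minimal sketch of an IRSA-based connector simply leaves out both credential fields; the bucket name below is a hypothetical placeholder:

```sql
CREATE TABLE vendor (
    id BIGINT NOT NULL PRIMARY KEY,
    name VARCHAR,
    address VARCHAR
) WITH ('connectors' = '[{
    "transport": {
        "name": "s3_input",
        "config": {
            "key": "vendor.json",
            "bucket_name": "example-bucket",
            "region": "us-west-1"
        }
    },
    "format": { "name": "json" }
}]');
```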

Examples

Populate a table from a JSON file in a public S3 bucket:

CREATE TABLE vendor (
id BIGINT NOT NULL PRIMARY KEY,
name VARCHAR,
address VARCHAR
) WITH ('connectors' = '[{
"transport": {
"name": "s3_input",
"config": {
"key": "vendor.json",
"no_sign_request": true,
"bucket_name": "feldera-basics-tutorial",
"region": "us-west-1"
}
},
"format": { "name": "json" }
}]');

Populate a table from a JSON file, using access key-based authentication:

CREATE TABLE vendor (
id BIGINT NOT NULL PRIMARY KEY,
name VARCHAR,
address VARCHAR
) WITH ('connectors' = '[{
"transport": {
"name": "s3_input",
"config": {
"key": "vendor.json",
"aws_access_key_id": "YOUR_ACCESS_KEY_ID",
"aws_secret_access_key": "YOUR_SECRET_ACCESS_KEY",
"bucket_name": "feldera-basics-tutorial",
"region": "us-west-1"
}
},
"format": { "name": "json" }
}]');

To connect to Google Cloud Storage, explicitly set the endpoint_url. You also need to create an HMAC key to obtain an access ID and secret, and grant the corresponding principal access to the bucket.

CREATE TABLE vendor (
id BIGINT NOT NULL PRIMARY KEY,
name VARCHAR,
address VARCHAR
) WITH ('connectors' = '[{
"transport": {
"name": "s3_input",
"config": {
"key": "vendor.json",
"aws_access_key_id": "YOUR_ACCESS_KEY_ID",
"aws_secret_access_key": "YOUR_SECRET_ACCESS_KEY",
"bucket_name": "my-bucket",
"region": "us-west1",
"endpoint_url": "https://storage.googleapis.com"
}
},
"format": { "name": "json" }
}]');
tip

Refer to the secret references guide to externalize AWS access keys via Kubernetes.

Additional resources

For more information, see: