Amazon S3 Tables

Amazon S3 Tables provides tabular data storage on top of S3, allowing applications to treat S3 like a SQL data store. This works by writing Parquet files to S3 and updating the catalog so that S3 Tables knows the schema of those files.

Connection Settings

Access Key

Access key of an IAM user that has read and write permissions for S3 and S3 Tables.

Secret Key

Secret key that IAM issued together with the access key.

Region

Region of the S3 table bucket (e.g. us-east-1).

Input Settings

Inputs are not currently supported. To read from S3 Tables, the Athena JDBC driver can be used with the JDBC connection.

Output Settings

Table Bucket ARN

The full ARN of the table bucket to write to. For example: arn:aws:s3tables:us-west-2:000000000000:bucket/test
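As a minimal sketch, the ARN can be split into its parts before it is entered, for instance to confirm that its region agrees with the Region setting. The ARN shape is assumed from the example above, and `parse_table_bucket_arn` is a hypothetical helper, not part of the connection.

```python
import re

# Assumed ARN shape, taken from the example above:
# arn:<partition>:s3tables:<region>:<12-digit account id>:bucket/<bucket name>
_ARN_RE = re.compile(
    r"^arn:(?P<partition>[^:]+):s3tables:(?P<region>[^:]+):"
    r"(?P<account>\d{12}):bucket/(?P<bucket>.+)$"
)

def parse_table_bucket_arn(arn):
    """Split a table bucket ARN into its parts, or raise ValueError."""
    match = _ARN_RE.match(arn)
    if match is None:
        raise ValueError(f"not a table bucket ARN: {arn!r}")
    return match.groupdict()

parts = parse_table_bucket_arn("arn:aws:s3tables:us-west-2:000000000000:bucket/test")
print(parts["region"], parts["bucket"])  # us-west-2 test
```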

Namespace

The namespace to write to in the table bucket. A table bucket can contain multiple namespaces, but each table belongs to exactly one namespace.

Table

The name of the table to write to. Table names must follow S3 Tables naming restrictions (e.g. no uppercase letters).

Create

When Off is selected, the table and namespace must already exist. When Create is selected, the table and namespace are created if they don’t exist. When Create & Update is selected, the table and namespace are created if they don’t exist, and the table schema is updated if new attributes appear in the write.
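The three settings can be summarized as a small decision function. This is only a sketch of the behavior described above, not the connection's implementation: `plan_write`, the mode identifiers, and the step strings are hypothetical, and it assumes that unknown attributes are simply skipped unless Create & Update is selected.

```python
def plan_write(mode, table_exists, new_columns):
    """Return the steps a write would take under a given Create setting.

    mode: "off", "create", or "create_and_update" (hypothetical identifiers).
    table_exists: whether the namespace and table are already present.
    new_columns: payload attributes that are not in the table schema.
    """
    steps = []
    if not table_exists:
        if mode == "off":
            # Off requires the namespace and table to exist beforehand.
            raise LookupError("table or namespace does not exist and Create is Off")
        steps.append("create namespace and table")
    if new_columns and mode == "create_and_update":
        # Only Create & Update changes the schema of an existing table.
        steps.append("add columns: " + ", ".join(sorted(new_columns)))
    steps.append("insert rows")
    return steps

print(plan_write("create", True, {"col3"}))
print(plan_write("create_and_update", True, {"col3"}))
```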

Output Examples

The S3 Tables connection writes out tabular data and matches the data to the table schema by name.

For example, assume the table already exists with two columns, “col1” and “col2”, and the payload is as follows.

```json
{
  "col1": 1.23,
  "col2": "hello world",
  "col3": null
}
```

The write above matches col1 and col2 by name and inserts a single row, [1.23, “hello world”], into the table. The col3 attribute is not written because the table has no column with that name. Continuing the example, assume the following payload is written to the same table, but with Create & Update enabled.

```json
[
  {
    "col1": 1.23,
    "col2": "hello world"
  },
  {
    "col3": false
  }
]
```

This adds a “col3” column with a data type of boolean to the table, then inserts the first row as [1.23, “hello world”, null] and the second row as [null, null, false].
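Both examples can be reproduced with a short sketch of name-based matching. It is an illustration under stated assumptions rather than the connection's actual logic: `infer_type`, `write_rows`, and the type names are hypothetical, and the sketch assumes that an attribute whose value is null does not add a column because no type can be inferred from it.

```python
def infer_type(value):
    """Map a JSON value to a column type name (hypothetical type names)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "double"
    if isinstance(value, str):
        return "string"
    return None  # null or unsupported: no type can be inferred

def write_rows(schema, payload, update_schema=False):
    """Match records to columns by name; optionally add columns for new attributes.

    schema: dict of column name -> type name, in column order (mutated on update).
    payload: a single record (dict) or a list of records.
    Returns the rows as lists ordered like the schema.
    """
    records = payload if isinstance(payload, list) else [payload]
    if update_schema:
        for record in records:
            for name, value in record.items():
                inferred = infer_type(value)
                if name not in schema and inferred is not None:
                    schema[name] = inferred
    # Attributes without a matching column are skipped; missing ones become None.
    return [[record.get(name) for name in schema] for record in records]

schema = {"col1": "double", "col2": "string"}
first = {"col1": 1.23, "col2": "hello world", "col3": None}
print(write_rows(schema, first))
# [[1.23, 'hello world']]

second = [{"col1": 1.23, "col2": "hello world"}, {"col3": False}]
print(write_rows(schema, second, update_schema=True))
# [[1.23, 'hello world', None], [None, None, False]]
```

After the second call, `schema` also contains `col3` with the type `boolean`, which mirrors the schema update described above.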