S3 Tables
Amazon S3 Tables provides tabular data storage on top of S3, allowing applications to treat S3 like a SQL data store. The connection writes Parquet files to S3 and updates the catalog so that S3 Tables knows the schema of those files.
Connection Settings
Authentication Type
| Setting | Description |
|---|---|
| Token | Enter an IAM access key and secret key that have permission to write to S3. See the AWS IAM Best Practices section below. |
| Assume EC2 IAM Role | If the Intelligence Hub is running on an EC2 instance with an IAM role attached, the connection automatically assumes that role. No credentials are required. See the AWS IAM Best Practices section below. |
Region
AWS Region of the S3 table bucket (e.g. us-east-1)
Input Settings
Inputs are not currently supported. However, the Athena JDBC driver can be used with the JDBC connection to read from S3 Tables.
Output Settings
Table Bucket ARN
The full ARN of the table bucket to write to. For example: arn:aws:s3tables:us-west-2:000000000000:bucket/test
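As a quick sanity check before entering the value, the ARN can be split into its parts. The sketch below is only an illustration: the pattern is inferred from the example ARN above, it handles only the standard "aws" partition, and the names `TableBucketArn` and `parse_table_bucket_arn` are hypothetical helpers, not part of the Intelligence Hub or any AWS SDK.

```python
import re
from typing import NamedTuple


class TableBucketArn(NamedTuple):
    region: str
    account_id: str
    bucket: str


# Assumed shape, inferred from the example ARN in this section:
#   arn:aws:s3tables:<region>:<12-digit account id>:bucket/<bucket name>
_ARN_RE = re.compile(
    r"arn:aws:s3tables:(?P<region>[a-z0-9-]+):(?P<account_id>\d{12})"
    r":bucket/(?P<bucket>[a-z0-9][a-z0-9-]*)"
)


def parse_table_bucket_arn(arn: str) -> TableBucketArn:
    """Split a table bucket ARN into region, account ID, and bucket name."""
    match = _ARN_RE.fullmatch(arn.strip())
    if match is None:
        raise ValueError(f"not a table bucket ARN: {arn!r}")
    return TableBucketArn(**match.groupdict())


parsed = parse_table_bucket_arn("arn:aws:s3tables:us-west-2:000000000000:bucket/test")
print(parsed.region, parsed.bucket)
```

The parsed region can then be compared against the Region connection setting to catch a mismatch early.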
Namespace
The namespace to write to in the table bucket. A table bucket can contain multiple namespaces, but each table belongs to exactly one namespace.
Table
The name of the table to write to. Table names must follow the S3 Tables naming restrictions (e.g. no uppercase letters).
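A name can be checked before it is entered in the configuration. The sketch below assumes the rules AWS documents for S3 Tables table names (1 to 255 characters; lowercase letters, digits, and underscores only; beginning and ending with a letter or digit). Treat those rules as an assumption and confirm them against the current AWS documentation; `is_valid_table_name` is a hypothetical helper.

```python
import re

# Assumed rule set: lowercase letters, digits, and underscores,
# starting and ending with a letter or digit.
_NAME_RE = re.compile(r"[a-z0-9](?:[a-z0-9_]*[a-z0-9])?")


def is_valid_table_name(name: str) -> bool:
    """Return True if the name satisfies the assumed naming rules."""
    return 1 <= len(name) <= 255 and _NAME_RE.fullmatch(name) is not None


for candidate in ("line1_temps", "Line1_Temps", "_temps"):
    print(candidate, is_valid_table_name(candidate))
```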
Create
When Off is selected, the table and namespace must already exist. When Create is selected, the table and namespace are created if they don’t exist. When Create & Update is selected, the table and namespace are created if they don’t exist, and the table schema is updated if new attributes appear in the write.
Output Examples
The S3 Tables connection writes out tabular data and matches the data to the table schema by name.
For example, assume the table already exists with two columns, “col1” and “col2”, and the following payload is written.
{
  "col1": 1.23,
  "col2": "hello world",
  "col3": null
}
This write matches col1 and col2 by name and inserts a single row, [1.23, “hello world”], into the table. The col3 attribute has no matching column, so it does not appear in the row. Continuing the example, assume the following payload is written to the same table with Create & Update enabled.
[
  {
    "col1": 1.23,
    "col2": "hello world"
  },
  {
    "col3": false
  }
]
This adds a “col3” column to the table with a data type of boolean, then inserts the first row as [1.23, “hello world”, null] and the second row as [null, null, false].
AWS IAM Best Practices
Please see the AWS documentation on IAM best practices. HighByte strongly recommends following the principle of least privilege when granting permissions to the IAM role or user used by the connection.
https://docs.aws.amazon.com/AmazonS3/latest/userguide/security-iam.html
https://docs.aws.amazon.com/AmazonS3/latest/userguide/security-best-practices.html
It is also recommended that users periodically rotate their IAM credentials and manually update the Intelligence Hub configuration with the new credentials.
The following IAM permissions are used by the S3 Tables Connection.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3tables:*"
      ],
      "Resource": "*"
    },
    {
      "Sid": "PassRoleToS3TablesReplication",
      "Effect": "Allow",
      "Action": [
        "iam:PassRole"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "iam:PassedToService": [
            "replication.s3tables.amazonaws.com"
          ]
        }
      }
    }
  ]
}
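In line with the least-privilege recommendation, the first statement of the policy above could be narrowed from all resources to a single table bucket and its tables. The sketch below generates such a variant. It is an assumption, not a tested HighByte policy: `scoped_policy` is a hypothetical helper, the `<bucket ARN>/table/*` resource pattern should be checked against the AWS documentation, and whether the connection works with this narrower scope is not stated here and should be verified in a test environment.

```python
import json


def scoped_policy(table_bucket_arn: str) -> dict:
    """Build a variant of the policy above limited to one table bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3tables:*"],
                # Assumed resource scoping: the bucket itself plus its tables.
                "Resource": [table_bucket_arn, f"{table_bucket_arn}/table/*"],
            },
            {
                # Kept unchanged from the policy above.
                "Sid": "PassRoleToS3TablesReplication",
                "Effect": "Allow",
                "Action": ["iam:PassRole"],
                "Resource": "*",
                "Condition": {
                    "StringEquals": {
                        "iam:PassedToService": ["replication.s3tables.amazonaws.com"]
                    }
                },
            },
        ],
    }


policy = scoped_policy("arn:aws:s3tables:us-west-2:000000000000:bucket/test")
print(json.dumps(policy, indent=2))
```

The action list could be narrowed further once the specific `s3tables` actions the connection needs are known.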