From c9bca86139b972ef802e26b9d96a591480c83b41 Mon Sep 17 00:00:00 2001
From: HarshCasper
Date: Mon, 27 Apr 2026 16:16:23 +0530
Subject: [PATCH] Document Athena S3 Tables query support

---
 src/content/docs/aws/services/athena.mdx   | 162 +++++++++++++++++++++
 src/content/docs/aws/services/s3tables.mdx |   7 +
 2 files changed, 169 insertions(+)

diff --git a/src/content/docs/aws/services/athena.mdx b/src/content/docs/aws/services/athena.mdx
index a27d6751..066ab1de 100644
--- a/src/content/docs/aws/services/athena.mdx
+++ b/src/content/docs/aws/services/athena.mdx
@@ -218,6 +218,168 @@ s3://mybucket/prefix/metadata/snap-9068645333036463050-1-2f8d3628-bb13-4081-b5a9
 s3://mybucket/prefix/temp/
 ```
 
+## S3 Tables
+
+LocalStack Athena can query [S3 Tables](/aws/services/s3tables/) through Glue federated catalogs, mirroring the AWS workflow that bridges S3 Tables, Glue, and Athena into a single query path.
+This lets you point Athena at a table bucket and run SQL against the Iceberg tables it manages without copying data into a separate warehouse.
+
+The flow is the same as on AWS:
+
+1. Create a table bucket and namespaces in S3 Tables.
+2. Register a Glue federated catalog (conventionally named `s3tablescatalog`) that delegates metadata to S3 Tables.
+3. Register an Athena data catalog with `Type=GLUE` whose `catalog-id` parameter points to a specific table bucket via the federated catalog (`s3tablescatalog/`).
+4. Reference the Athena data catalog in `QueryExecutionContext` when running queries.
+
+### Create S3 Tables resources
+
+Create a table bucket and a namespace in S3 Tables.
+The bucket holds your Iceberg tables and the namespace organizes them.
+
+```bash
+awslocal s3tables create-table-bucket --name athena-doc-bucket
+```
+
+```bash title="Output"
+{
+    "arn": "arn:aws:s3tables:us-east-1:000000000000:bucket/athena-doc-bucket"
+}
+```
+
+```bash
+awslocal s3tables create-namespace \
+    --table-bucket-arn arn:aws:s3tables:us-east-1:000000000000:bucket/athena-doc-bucket \
+    --namespace sales
+```
+
+```bash title="Output"
+{
+    "tableBucketARN": "arn:aws:s3tables:us-east-1:000000000000:bucket/athena-doc-bucket",
+    "namespace": [
+        "sales"
+    ]
+}
+```
+
+### Register a Glue federated catalog
+
+Register a Glue catalog that federates to S3 Tables using the [`CreateCatalog`](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-catalog-Catalogs.html#aws-glue-api-catalog-CreateCatalog) API.
+The catalog name `s3tablescatalog` matches the AWS convention used by Athena, EMR, and Redshift.
+
+```bash
+awslocal glue create-catalog \
+    --name s3tablescatalog \
+    --catalog-input '{
+        "FederatedCatalog": {
+            "Identifier": "arn:aws:s3tables:us-east-1:000000000000:bucket/*",
+            "ConnectionName": "aws:s3tables"
+        }
+    }'
+```
+
+You can verify the federated catalog with:
+
+```bash
+awslocal glue get-catalogs
+```
+
+### Register an Athena data catalog
+
+Register an Athena data catalog that points at a specific table bucket using the [`CreateDataCatalog`](https://docs.aws.amazon.com/athena/latest/APIReference/API_CreateDataCatalog.html) API.
+The `catalog-id` parameter follows the format `s3tablescatalog/` so that Athena routes queries through the federated catalog path.
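The `catalog-id` value can be derived from the table bucket ARN returned when the bucket was created. The following is a minimal sketch, assuming the segment after `s3tablescatalog/` is the table bucket name (as in `s3tablescatalog/athena-doc-bucket`). The variable names are illustrative and are not part of any LocalStack or AWS tooling:

```bash
# Illustrative only: these variable names are hypothetical.
table_bucket_arn="arn:aws:s3tables:us-east-1:000000000000:bucket/athena-doc-bucket"

# Keep the text after the last "/" of the ARN, which is the table bucket name.
bucket_name="${table_bucket_arn##*/}"

# Assumption: the catalog-id is "s3tablescatalog/" followed by the bucket name.
catalog_id="s3tablescatalog/${bucket_name}"

# Print the key=value pair in the form expected by --parameters.
echo "catalog-id=${catalog_id}"
```

The printed string has the same `key=value` shape that `awslocal athena create-data-catalog` accepts through `--parameters`.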
+
+```bash
+awslocal athena create-data-catalog \
+    --name s3tables-catalog \
+    --type GLUE \
+    --parameters "catalog-id=s3tablescatalog/athena-doc-bucket"
+```
+
+Confirm the data catalog status:
+
+```bash
+awslocal athena get-data-catalog --name s3tables-catalog
+```
+
+```bash title="Output"
+{
+    "DataCatalog": {
+        "Name": "s3tables-catalog",
+        "Type": "GLUE",
+        "Parameters": {
+            "catalog-id": "s3tablescatalog/athena-doc-bucket"
+        },
+        "Status": "CREATE_COMPLETE"
+    }
+}
+```
+
+### Resolve metadata through the catalog
+
+Once the data catalog is registered, Athena resolves S3 Tables namespaces as databases and S3 Tables as tables.
+List the databases exposed by the federated catalog:
+
+```bash
+awslocal athena list-databases --catalog-name s3tables-catalog
+```
+
+```bash title="Output"
+{
+    "DatabaseList": [
+        {
+            "Name": "sales",
+            "Parameters": {
+                "createdBy": "000000000000",
+                "ownerAccountId": "000000000000"
+            }
+        }
+    ]
+}
+```
+
+You can also describe a single namespace with [`GetDatabase`](https://docs.aws.amazon.com/athena/latest/APIReference/API_GetDatabase.html):
+
+```bash
+awslocal athena get-database \
+    --catalog-name s3tables-catalog \
+    --database-name sales
+```
+
+### Run queries via the federated catalog
+
+To query S3 Tables data from Athena, reference the data catalog name in the `QueryExecutionContext`.
+The `Catalog` field maps to the Athena data catalog you registered, and `Database` maps to the S3 Tables namespace:
+
+```bash
+awslocal athena start-query-execution \
+    --query-string "CREATE TABLE orders (id int, customer string, amount double) TBLPROPERTIES ('table_type' = 'ICEBERG')" \
+    --query-execution-context "Catalog=s3tables-catalog,Database=sales" \
+    --result-configuration "OutputLocation=s3://athena-doc-output/results/"
+```
+
+Insert and read data using the same `QueryExecutionContext`:
+
+```bash
+awslocal athena start-query-execution \
+    --query-string "INSERT INTO orders VALUES (1, 'alice', 100.0), (2, 'bob', 250.5)" \
+    --query-execution-context "Catalog=s3tables-catalog,Database=sales" \
+    --result-configuration "OutputLocation=s3://athena-doc-output/results/"
+```
+
+```bash
+awslocal athena start-query-execution \
+    --query-string "SELECT * FROM orders ORDER BY id" \
+    --query-execution-context "Catalog=s3tables-catalog,Database=sales" \
+    --result-configuration "OutputLocation=s3://athena-doc-output/results/"
+```
+
+You can also use the catalog-id reference (`s3tablescatalog/`) directly in `QueryExecutionContext.Catalog` if you prefer not to register a named Athena data catalog.
+
+:::note
+Query execution against the federated catalog routes through Trino's Iceberg connector inside the LocalStack bigdata container.
+The first query may take several minutes while LocalStack downloads and starts the bigdata dependencies.
+Subsequent queries reuse the running services.
+:::
+
 ## Client configuration
 
 You can configure the Athena service in LocalStack with various clients, such as [PyAthena](https://github.com/laughingman7743/PyAthena/), [awswrangler](https://github.com/aws/aws-sdk-pandas), among others!
diff --git a/src/content/docs/aws/services/s3tables.mdx b/src/content/docs/aws/services/s3tables.mdx
index f5952fcf..4b472186 100644
--- a/src/content/docs/aws/services/s3tables.mdx
+++ b/src/content/docs/aws/services/s3tables.mdx
@@ -164,6 +164,13 @@ awslocal s3tables list-tables \
 }
 ```
 
+## Querying S3 Tables from Athena
+
+LocalStack [Athena](/aws/services/athena/) can query S3 Tables data through a Glue federated catalog.
+Once you register a federated `s3tablescatalog` in Glue and add a matching Athena data catalog, you can run SQL against your S3 Tables namespaces and tables directly from Athena.
+
+See [S3 Tables in the Athena documentation](/aws/services/athena/#s3-tables) for the full workflow.
+
 ## API Coverage