Store tabular data at scale in S3
Amazon S3 Tables deliver the first cloud object store with built-in Apache Iceberg support and streamline storing tabular data at scale. Continual table optimization automatically scans and rewrites table data in the background, achieving up to 3x faster query performance compared to unmanaged Iceberg tables. These performance optimizations will continue to improve over time. Additionally, S3 Tables include optimizations specific to Iceberg workloads that deliver up to 10x higher transactions per second compared to Iceberg tables stored in general purpose S3 buckets. For more details on S3 Tables’ query performance improvements, refer to the blog.
Because S3 Tables support the Apache Iceberg standard, you can query your tabular data with popular AWS and third-party query engines, including Amazon Athena, Amazon Redshift, Amazon EMR, and Apache Spark. Use S3 Tables to store tabular data such as daily purchase transactions, streaming sensor data, or ad impressions as an Iceberg table in S3, and use automatic table maintenance to optimize performance and cost as your data evolves. Read the blog to learn more.
Benefits
How it works
S3 Tables provide purpose-built S3 storage for structured data in the Apache Parquet format. Within a table bucket, you can create tables as first-class resources directly in S3. These tables can be secured with table-level permissions defined in either identity- or resource-based policies, and are accessible by applications or tooling that support the Apache Iceberg standard. When you create a table in your table bucket, the underlying data in S3 is stored as Parquet data, and S3 maintains the metadata necessary to make that Parquet data queryable by your applications. Table buckets include a client library that query engines use to navigate and update the Iceberg metadata of tables in your table bucket. This library, in conjunction with updated S3 APIs for table operations, allows multiple clients to safely read and write data to your tables. Over time, S3 automatically optimizes the underlying Parquet data by rewriting, or "compacting," your objects. Compaction optimizes your data on S3 to improve query performance and minimize costs. Read the user guide to learn more.
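As a rough illustration of how a query engine might be pointed at a table bucket through the client library, the sketch below builds the Apache Spark catalog properties and a CREATE TABLE statement in plain Python. It is a minimal sketch under stated assumptions, not a definitive setup: the catalog name, namespace, table name, columns, and table bucket ARN are hypothetical placeholders, and the property names and class names should be checked against the user guide for the client library version you use.

```python
# Sketch only: builds configuration values, it does not contact AWS or start Spark.
# Assumption: property and class names below follow the commonly documented
# Iceberg-on-Spark setup for S3 Tables; verify them against the user guide.


def s3tables_spark_conf(catalog_name: str, table_bucket_arn: str) -> dict[str, str]:
    """Return Spark properties that register an Iceberg catalog backed by a table bucket."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        # Register the catalog as an Iceberg Spark catalog.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        # Delegate metadata operations to the S3 Tables client library.
        f"{prefix}.catalog-impl": "software.amazon.s3tables.iceberg.S3TablesCatalog",
        # The table bucket ARN acts as the catalog warehouse.
        f"{prefix}.warehouse": table_bucket_arn,
        # Enable Iceberg SQL extensions in the Spark session.
        "spark.sql.extensions": (
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions"
        ),
    }


def create_table_sql(
    catalog: str, namespace: str, table: str, columns: list[tuple[str, str]]
) -> str:
    """Return a Spark SQL statement that creates an Iceberg table in the catalog."""
    column_list = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS {catalog}.{namespace}.{table} "
        f"({column_list}) USING iceberg"
    )


# Hypothetical placeholder values, for illustration only.
EXAMPLE_ARN = "arn:aws:s3tables:us-east-1:111122223333:bucket/example-table-bucket"

conf = s3tables_spark_conf("s3tablesbucket", EXAMPLE_ARN)
sql = create_table_sql(
    "s3tablesbucket",
    "sales",
    "daily_purchases",
    [("purchase_id", "string"), ("amount", "double"), ("purchased_at", "timestamp")],
)

for key, value in conf.items():
    print(f"{key}={value}")
print(sql)
```

In a real job, each key and value in `conf` would typically be passed to the Spark session builder, and the SQL string would be run through that session. Keeping the configuration in a small function makes it easy to reuse the same catalog settings across jobs that read from and write to the same table bucket.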
Customers
Genesys
Genesys is a global cloud leader in AI-Powered Experience Orchestration. Through advanced AI, digital and workforce engagement management capabilities, Genesys helps more than 8,000 organizations in over 100 countries to provide personalized, empathetic customer and employee experiences while benefiting from improved business agility and outcomes.
SnapLogic
SnapLogic is a pioneer in AI-led integration. The SnapLogic Platform for Generative Integration accelerates digital transformation across the enterprise to design, deploy, and manage AI agents and integration that automate tasks, make real-time decisions, and integrate effortlessly into existing workflows.
Zus Health
Zus is a shared health data platform designed to accelerate healthcare data interoperability by providing easy-to-use patient data via API, embedded components, and direct EHR integrations.