Replicating with PGD v1.7

Integrate Postgres Distributed (PGD) and Postgres Analytics Accelerator (PGAA) to manage the complete data lifecycle within a unified architecture. PGD provides the high-availability transactional backbone for hot data, while PGAA delivers the vectorized engine and cloud-native handlers for cold data in object storage.

This collaboration creates two distinct storage tiers:

  • Hot tier (PGD): Local PostgreSQL storage for current operational data requiring sub-millisecond latency and full ACID compliance.
  • Cold tier (PGAA): Cost-effective object storage using columnar formats like Apache Iceberg® or Delta Lake, optimized for high-speed analytical queries.

Together, they allow you to seamlessly transition data from high-performance transactional storage to a cost-effective data lake, all while maintaining transparent access through a unified Postgres interface. This architecture supports a variety of high-impact use cases:

  • Regulatory compliance: Offload aged data and audit trails to maintain lean, high-performance systems while ensuring long-term retention.
  • Time-series & IoT: Balance real-time dashboards using hot recent data with cost-effective cold historical analytics for sensor and event streams.
  • Financial services: Manage massive volumes of required historical transaction records without inflating transactional storage costs.
  • Retail & seasonal business cycles: Use adaptive policies to keep peak-season order data hot for customer service while tiering older records to cold storage.

Understanding table types

Across the hot tier (local SSD, sub-millisecond latency) and the cold tier (object storage, columnar format), every table is one of the following three types:

Table type | Storage location | Behavior
Heap | Local disk only | Standard transactional table in the hot tier.
Hybrid Analytics and Transactional Processing (HTAP) | Local + object storage | A heap table mapped to an analytical copy. PGD replicates all changes in near real-time.
PGAA | Object storage only | A purely analytical cold table, available in the Delta or Iceberg format.

Tiered storage strategies

Three methods are available for moving data between tiers. Choose among them based on your operational needs.

Implementing tiered tables

This method leverages PGD AutoPartition to automate the lifecycle of high-volume data. A heap table is transformed into a partitioned structure where:

  • Hot partitions: Recent data remains in heap tables (or HTAP tables if enable_replication is enabled) for active processing.
  • Cold partitions: Once partitions cross an age threshold, PGD bulk-copies them to object storage, converts them to PGAA tables, and reclaims local disk space while keeping the data queryable.
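The age-threshold rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not PGD's actual logic: the Partition class, the classify function, the partition names, and the one-year threshold are all hypothetical, and the sketch assumes a partition becomes cold once its entire time range is older than the threshold.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Partition:
    name: str
    upper_bound: date  # exclusive upper bound of the partition's time range


def classify(partitions, today, offload_after):
    """Split partition names into (hot, cold) by comparing age to a threshold.

    Assumption for this sketch: a partition is cold once even its newest
    possible row is older than `offload_after`.
    """
    cutoff = today - offload_after
    hot = [p.name for p in partitions if p.upper_bound > cutoff]
    cold = [p.name for p in partitions if p.upper_bound <= cutoff]
    return hot, cold


# Hypothetical monthly partitions of an orders table.
partitions = [
    Partition("orders_2023_01", date(2023, 2, 1)),
    Partition("orders_2024_06", date(2024, 7, 1)),
    Partition("orders_2024_12", date(2025, 1, 1)),
]

hot, cold = classify(
    partitions,
    today=date(2025, 1, 15),
    offload_after=timedelta(days=365),
)
print("hot:", hot)    # partitions that would stay on local disk
print("cold:", cold)  # partitions that would move to object storage
```

With these inputs, only orders_2023_01 falls behind the cutoff, so it is the one partition classified as cold.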

Replicating to analytics

This method continuously synchronizes a table with an analytical copy in object storage. A heap table is converted into an HTAP table. To the user, it appears as a standard transactional table, but an analytical copy is maintained in the background.

To ensure near real-time synchronization without taxing your transactional throughput, PGAA uses Iceberg Merge-on-Read (MoR). This records changes in small "delete files" rather than rewriting entire data blocks, keeping the data lake fresh with minimal overhead.
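To make the Merge-on-Read idea concrete, here is a toy Python model. It is a simplified sketch, not PGAA's or Iceberg's implementation: the MergeOnReadTable class and its methods are hypothetical, and it assumes position-style deletes, where each delete file lists (data file, row position) pairs. The point it shows is that an update or delete only appends a small delete file, and the merge happens when the table is read.

```python
class MergeOnReadTable:
    """Toy model: immutable data files plus small delete files."""

    def __init__(self):
        self.data_files = []    # each entry: list of (key, value) rows, never rewritten
        self.delete_files = []  # each entry: set of (file_index, position) pairs

    def append(self, rows):
        """Write new rows as a new data file."""
        self.data_files.append(list(rows))

    def _live(self):
        """Yield (location, key, value) for rows not masked by a delete file."""
        deleted = set().union(*self.delete_files)
        for file_index, rows in enumerate(self.data_files):
            for position, (key, value) in enumerate(rows):
                if (file_index, position) not in deleted:
                    yield (file_index, position), key, value

    def delete(self, key):
        """Record a deletion in a new delete file; data files stay untouched."""
        positions = {loc for loc, k, _ in self._live() if k == key}
        if positions:
            self.delete_files.append(positions)

    def update(self, key, value):
        """An update is a delete of the old row version plus an append."""
        self.delete(key)
        self.append([(key, value)])

    def scan(self):
        """Merge data files and delete files at read time."""
        return {key: value for _, key, value in self._live()}


table = MergeOnReadTable()
table.append([(1, "pending"), (2, "pending"), (3, "pending")])
table.update(2, "shipped")
table.delete(3)

print(table.scan())               # current view after merging
print(table.data_files[0])        # original data file, still unmodified
print(len(table.delete_files))    # number of small delete files written
```

The trade-off in this model is that writes stay cheap while each read has to apply the accumulated delete files.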

Offloading to analytics

This method offers a one-time, surgical way to immediately reclaim local disk space for HTAP tables, leaving only the copy in object storage (a PGAA table). The local heap copy is truncated, and the table's access method is changed to PGAA, so the table now reads from object storage.

If you need to bring your data back for transactional writes, you can restore the table from object storage into the local Postgres heap.

Method | Purpose | Use case | Origin table | Destination table
Tiered tables | Automated lifecycle management | High-volume time-series data where historical retention is critical and automated. | Heap | Heap or HTAP (hot), PGAA (cold)
Replication | Continuous synchronization | Replicating a transactional table in near real-time for general analytics. | Heap | HTAP
Offload | Selective archiving | One-time, surgical removal and archival of a specific table. | HTAP | PGAA
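The origin and destination table types summarized above can be read as a small state machine. The Python sketch below is only a mental model: the TableType enum, the method labels ("replicate", "offload", "restore"), and the transition function are hypothetical names, not PGD or PGAA APIs. Tiered tables are left out because they act on individual partitions rather than on a whole table.

```python
from enum import Enum


class TableType(Enum):
    HEAP = "heap"
    HTAP = "htap"
    PGAA = "pgaa"


# (method, origin type) -> destination type, following the summary above
# plus the restore path described for offloaded tables.
TRANSITIONS = {
    ("replicate", TableType.HEAP): TableType.HTAP,
    ("offload", TableType.HTAP): TableType.PGAA,
    ("restore", TableType.PGAA): TableType.HEAP,
}


def transition(method, current):
    """Return the table type that results from applying a method."""
    try:
        return TRANSITIONS[(method, current)]
    except KeyError:
        raise ValueError(
            f"{method!r} is not valid for a {current.value} table"
        ) from None


state = TableType.HEAP
for method in ("replicate", "offload", "restore"):
    state = transition(method, state)
    print(method, "->", state.value)
```

In this model, offloading applies only to an HTAP table, so a plain heap table must first be replicated before it can be offloaded.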

Metadata management

PGAA supports two methods for managing object store metadata when integrating with PGD: