Data Archival in Pega: Common Issues and How to Fix Them

If you’re on Pega Cloud, setting up data archival is actually straightforward. You define the archival policy for the case type, set the dataarchival/batchPipelineEnabled DSS to true, and enable the pyPegaArchiverUsingPipeline job scheduler with an appropriate schedule.
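Concretely, the setup described above amounts to one setting plus one scheduler. A sketch only; the schedule shown is an example, so pick one that fits your case volumes:

```
# Dynamic System Setting: turn on the batch archival pipeline
dataarchival/batchPipelineEnabled = true

# Job scheduler to enable: pyPegaArchiverUsingPipeline
# Example schedule: daily, during off-peak hours
```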

Here’s the thing though. In real projects, the struggle usually isn’t the setup. It’s figuring out why archival stalls, runs partially, or fails without telling you much. If that sounds familiar, this article is for you.

The Archival Pipeline at a Glance

At a high level, Pega archival follows a four-step pipeline:

  1. Crawler
    Identifies cases eligible for archival based on retention rules and policies.

  2. Copier
    Copies eligible case data from the primary database (your active Pega DB) to a secondary archival store.

  3. Indexer
    Indexes archived cases into Elasticsearch so the data remains searchable after archival.

  4. Purger
    Removes the archived data from the primary database once copy and indexing succeed.
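The four steps above can be sketched as a simple loop. This is conceptual pseudocode only, not Pega's actual implementation:

```
for each case matching the case type's archival policy:      # Crawler
    copy case data from the primary DB to the archival store  # Copier
    index the archived case into Elasticsearch                # Indexer
    if copy and indexing both succeeded:
        delete the case data from the primary DB              # Purger
```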

Earlier implementations used three separate activities:

  • pzPerformArchive

  • pzPerformIndexing

  • pxPerformPurge

Recent Pega versions consolidate this logic into a single OOTB pipeline activity:

  • pzPerformArchiveUsingPipeline

This consolidation reduces orchestration complexity and is especially beneficial in cloud environments.

Pega Cloud Makes Archival Easier

On Pega Cloud, archival setup is largely simplified:

  • Pega provides a secondary archival database out of the box.

  • When using Pega Cloud File Storage Repository, archived data is stored using managed object storage (backed internally by cloud storage such as Amazon S3).

  • Configuration relies almost entirely on OOTB features rather than custom code.

In this model:

  • Primary database holds active cases.

  • Secondary database holds archived case data.

The setup itself is usually straightforward. Most real-world problems start after the pipeline is enabled.

The Challenge: Troubleshooting Archival Failures

Most archival failures are not caused by incorrect rules. They stem from a lack of visibility.

Enable the Right Logs First

Before you debug anything else, get the logging right. This one step will save you hours.

In Admin Studio → Resources → Log Categories, configure the following loggers.

Native SQL and HTTP Wire Logging

These help identify database-level issues during copy and purge; the wire logger also captures raw HTTP traffic, which is useful when archived data moves to or from a repository over REST.

  • com.pega.pegarules.data.internal.sqlapi.exec.NativeSQLListExecutor

  • org.apache.http.wire

Search and Indexing (SRS)

These are critical for understanding Elasticsearch-related failures.

  • com.pega.platform.search.infrastructure.internal.srs.SRSConnectorImpl

Archival Pipeline Logs

These give you end-to-end visibility across the archival pipeline.

  • Archival-CaseCrawler

  • Archival-CaseCopier

  • Archival-Indexer

  • Archival-Purger

  • Archival-Search

  • Archival-Datastore

  • Archival-General
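On self-managed installations, the same categories can be raised in prlog4j2.xml instead of the Admin Studio UI (on Pega Cloud, prefer the UI). A minimal sketch; with no AppenderRef, these loggers inherit the root appenders:

```xml
<!-- Inside the <Loggers> section of prlog4j2.xml -->
<Logger name="Archival-General" level="debug"/>
<Logger name="Archival-Indexer" level="debug"/>
<Logger name="com.pega.platform.search.infrastructure.internal.srs.SRSConnectorImpl" level="debug"/>
```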

Why This Matters

With these logs enabled, you can clearly see:

  • SQL errors during copy or purge operations

  • Indexing failures that never surface in the UI

  • Pipeline execution issues that otherwise look like silent or partial failures

Without this level of logging, archival troubleshooting quickly turns into guesswork. With it, failures usually explain themselves.

Using PR_METADATA to Understand What Failed

PR_METADATA is one of the most useful and underused tools for debugging archival issues. When the pipeline fails, this table usually knows why.

A Practical Debugging Pattern

  1. Clear existing metadata
    Create a small utility activity with a single step:

    • Method: RDB-Delete

    • Class: Data-ArchivalMetadata

    • RequestType: pzTruncateDataArchivalMetadata

    • Access: ALL

    This gives you a clean slate before testing again.

  2. Run archival manually
    Execute the archival steps manually using the three core activities (pzPerformArchive, pzPerformIndexing, pxPerformPurge). This isolates which phase is failing instead of relying on the scheduled job.

  3. Query the metadata

    SELECT * FROM pr_metadata;
    
    
  4. Inspect the results
    Pay close attention to the pyComment column.
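Beyond a raw SELECT *, grouping on pyComment quickly surfaces recurring failure modes. A sketch, assuming direct SQL access to the table; column casing may vary by database vendor:

```sql
-- Count distinct failure messages recorded by the archival pipeline
SELECT pycomment, COUNT(*) AS occurrences
FROM pr_metadata
WHERE pycomment IS NOT NULL
GROUP BY pycomment
ORDER BY occurrences DESC;
```

If one message dominates the result, fix that failure mode first before rerunning the full pipeline.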

What You’ll Usually Find

PR_METADATA often contains clear, actionable failure details, such as:

  • Indexing failures

  • SQL exceptions

  • Copy or purge errors tied to specific cases

Instead of rerunning jobs blindly and hoping for a different outcome, this approach gives you concrete signals about what failed and where to focus next.

Indexing Is the Most Fragile Step

Across projects, indexing is the most common point of failure in the archival pipeline.

Typical symptoms include:

  • Data copied successfully but not searchable

  • Pipeline completing partially

  • Repeated retries without progress

Most indexing issues fall into one of three buckets:

  • Elasticsearch connectivity or schema mismatches

  • Platform defects in specific product versions

  • Environmental constraints preventing index writes

The key is visibility. With proper SRS and archive logs enabled, these issues become diagnosable rather than mysterious.

Environmental Constraints Can Stop Archival Completely

One issue teams often overlook is environmental health, especially in lower-tier environments.

Examples include:

  • Blob storage limits

  • Insufficient processing capacity

  • Queue or thread starvation

When this happens, the pipeline may stop picking up cases altogether.

What to Watch For

  • PEGA0004 alerts in logs

  • PoisonPill messages indicating the system intentionally halted processing

This is Pega signaling that archival cannot proceed safely until the environment is stabilized. No rule or pipeline change will fix this until the underlying constraint is resolved.
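A quick way to check for both signals at once is a single grep over the rules log. The sample log lines in the here-doc below are invented for illustration; in practice, point grep at your actual PegaRULES log file:

```shell
# Scan for throttling alerts (PEGA0004) and intentional halts (PoisonPill).
# The here-doc stands in for a real PegaRULES log; these lines are invented.
grep -E "PEGA0004|PoisonPill" <<'EOF'
2024-05-01 02:00:01 INFO  Archival-CaseCopier - batch copy completed
2024-05-01 02:00:05 ALERT PEGA0004 - database operation exceeded threshold
2024-05-01 02:00:07 ERROR Archival-General - PoisonPill received; halting pipeline
EOF
```

Only the two matching lines are printed, which is usually enough to tell whether the environment, rather than the rules, is blocking archival.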