Node-Level Data Page Caching Behaviour

Hi everyone,

I have a requirement where I am using a Node-level Data Page to fetch and cache product information from an external API. The Data Page accepts ProductID as a parameter.

I am running into an issue with the cache behavior and need to ensure the following:

  • Empty Responses: If a ProductID is not found, the resulting empty page should not be cached.

  • API Exceptions: If the server returns an error or exception, that failed request should not be cached.

What I have tried: I attempted to handle this by setting a page message on the primary page whenever there is an error or exception. However, when inspecting the clipboard, I can see that Pega still caches the Data Page in its error state. This means subsequent calls for that ProductID instantly fail without retrying the API, and we are consuming memory caching useless pages.

My Questions:

  1. What is the best practice to prevent these failed or empty instances from being cached at the Node level?

  2. Is there a recommended way to instantly evict these specific pages from the cache so the system properly retries the API call on the next request, while keeping memory consumption low?

Any guidance or examples of how you have solved this would be greatly appreciated!

To prevent Pega from caching empty or failed Node-level Data Pages, you must use the Page-Remove method directly inside your data source load activity. Immediately after the API call step, evaluate the response to check if the data is missing or if a server error occurred. If an issue is detected, add a step that executes the Page-Remove method and set its step page to the primary page of the Data Page. By entirely deleting the primary page before the load activity completes, Pega automatically aborts the caching process for that specific parameter instance. This completely prevents the failed state from consuming node memory and guarantees that the very next request for that product will instantly trigger a fresh API retry.

Hi, Thanks for the suggestion.

Are you suggesting that I execute a Page-Remove step on the Primary page within the activity that the node-level Data Page references as its source?

I have tried the following:

  1. Using ‘Primary’ as the Step Page in the Page-Remove step: This triggers a validation error stating that Page-Remove cannot be performed on the symbolic name ‘Primary’:

     This record has 1 error(s) in 1 place(s).
     Step page: Method Page-Remove cannot be performed on a symbolic Page name: Primary

  2. Using Page-Remove without a Step Page: The method executes, but subsequent calls with the same parameters continue to pull from the cache rather than triggering a fresh API call.

The Data Page still returns the cached instance for these parameters instead of re-invoking the source activity.

Any other suggestions?

For a node-level data page, once Pega loads an instance for a parameter set, it can cache that page even if it is empty or contains an error state, so just setting page messages is not enough to stop caching. The practical best practice is to avoid using node scope for data that can legitimately return “not found” or transient API failures unless you also have a deliberate invalidation strategy.

Node scope is best for relatively static reference data, not volatile or failure-prone lookups.

For this scenario, a requestor or thread scope data page, possibly with a short refresh strategy, is often safer because you avoid poisoning a shared node cache with empty/error instances.

If you must keep node scope, then the practical option is to explicitly flush or invalidate the page instance when the load fails or is empty. Pega provides declarative page deletion APIs and functions to remove cached data page instances.

You can choose one of the following three options for a node-level page:

  • deleting all instances of a declarative page with pzDeleteAllInstancesOfDeclarativePage
  • flushing the specific data page instance after load failure
  • using expiration/idle settings so bad pages do not remain too long
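The third option, expiration, can be pictured with a small language-neutral sketch. This is plain Python, not a Pega API and not how Pega implements its refresh settings; `ExpiringCache`, `get_or_load`, and the injectable `clock` are hypothetical names used only for illustration. It shows that a bad entry can still be cached, but only until its time-to-live runs out, after which the next request reloads from the source.

```python
import time
from typing import Any, Callable, Dict, Tuple


class ExpiringCache:
    """Illustrative cache whose entries are reloaded once they are older than a TTL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the behaviour can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: serve from cache, even if the value is "bad"
        value = loader(key)  # missing or expired: call the source again
        self._entries[key] = (now, value)
        return value
```

The trade-off is visible in the sketch: every consumer sees the bad entry until it expires, so a short TTL limits the damage but does not prevent it.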

Trying to flush a node-level page from the data page’s own post-processing may not behave the way you expect, because the page can still remain cached in an error state until expiration. In other words, “flush on error from inside the same load cycle” is not always reliable for node-scoped pages.

Use a node-level page only for successful responses, and route failures through a different pattern. Invoke the connector call in an activity or data transform: if it succeeds, populate the data page; if the result is empty or an error, do not rely on the node page as the source of truth. Use explicit invalidation/flush logic and an expiry strategy, but keep in mind that flushing a failed node-level data page immediately from post-processing is not always reliable.
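As a conceptual illustration of "cache only successful responses", here is a minimal sketch in plain Python. It is not Pega code and does not reflect Pega's internal caching; `SuccessOnlyCache`, `ProductLookupError`, and the loader function are hypothetical names. The idea is that the decision to store an entry is made by the caller of the API, so empty results and exceptions never reach the shared cache.

```python
from typing import Callable, Dict, Optional


class ProductLookupError(Exception):
    """Hypothetical error raised when the external API call fails."""


class SuccessOnlyCache:
    """Caches loader results per key, but only non-empty, successful ones."""

    def __init__(self, loader: Callable[[str], Optional[dict]]) -> None:
        self._loader = loader
        self._entries: Dict[str, dict] = {}

    def get(self, product_id: str) -> Optional[dict]:
        if product_id in self._entries:
            return self._entries[product_id]  # cache hit
        # If the loader raises, the exception propagates and nothing is stored.
        result = self._loader(product_id)
        if result:  # None or an empty dict is returned to the caller but not cached
            self._entries[product_id] = result
        return result
```

With this shape, a "not found" or failed lookup is retried on the next request, while successful lookups are served from the cache.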

Thanks for the detailed explanation, Ravi.

Agreed, performing Page-Remove inside the Data Page context doesn’t evict from the cache.

The pzDeleteAllInstancesOfDeclarativePage function may also clear the cache for other parameterized pages associated with the Data Page, so it won’t work for my use case.

I have implemented the below configuration during invocation to ensure cache eviction is applied only to the specified parameter. This setup has been verified and functions as intended.

I was trying to understand how other architects feel about this behaviour and what they practise in their applications.

I would disagree with this viewpoint. Many business scenarios require API responses to be cached across applications. So, by default transient failures are unavoidable. Static data, including dropdown lists and feature flags sourced from third‑party APIs, are good candidates for caching depending on business needs.

I believe that, in practice, the Pega platform should stop caching once the Data Page’s primary page is invalidated using a Page‑Set‑Messages step, as this is the error‑handling and page‑invalidation mechanism recommended by Pega.

Error detection configuration | Pega Academy

I believe the Pega Engineering team should reconsider the current caching behavior for node‑level Data Pages.

What is your perspective on this?

Thanks for sharing the approach. The parameter-specific invalidation is a good refinement.
I agree that Page-Set-Messages is the correct way to flag the Data Page as invalid from an error-handling perspective, but I still think cache eviction behaviour for node-level pages should be treated separately from message handling.

Page messages are great for marking the load as failed, but node-scoped caching can still retain the instance unless the page is explicitly invalidated or removed through a cache-control path. So I would view this as error detection plus explicit cache management, not error detection alone.

In practice, I think the most reliable pattern is exactly what you’ve done: keep the invalidation targeted to the specific parameter instance, and avoid broad page clearing that affects unrelated cached entries.
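The difference between targeted and broad invalidation can be sketched in plain Python. This is only a model of the idea, not a Pega API: `ParameterizedCache`, `make_key`, `evict`, and `evict_all` are hypothetical names, and the page name `D_Product` in the usage note is an assumed example. Entries are keyed by page name plus parameters, so one parameter instance can be removed without touching its siblings.

```python
from typing import Any, Dict, Tuple

CacheKey = Tuple[str, Tuple[Tuple[str, Any], ...]]


def make_key(page_name: str, params: Dict[str, Any]) -> CacheKey:
    """Build a hashable key from a page name and its parameters."""
    return (page_name, tuple(sorted(params.items())))


class ParameterizedCache:
    """Illustrative cache keyed by (page name, parameters)."""

    def __init__(self) -> None:
        self._entries: Dict[CacheKey, Any] = {}

    def put(self, page_name: str, params: Dict[str, Any], value: Any) -> None:
        self._entries[make_key(page_name, params)] = value

    def contains(self, page_name: str, params: Dict[str, Any]) -> bool:
        return make_key(page_name, params) in self._entries

    def evict(self, page_name: str, params: Dict[str, Any]) -> bool:
        """Targeted: remove only the instance for these exact parameters."""
        key = make_key(page_name, params)
        if key in self._entries:
            del self._entries[key]
            return True
        return False

    def evict_all(self, page_name: str) -> int:
        """Broad: remove every instance of the page, whatever its parameters."""
        doomed = [key for key in self._entries if key[0] == page_name]
        for key in doomed:
            del self._entries[key]
        return len(doomed)
```

For example, after caching `D_Product` for `ProductID` values `P1` and `P2`, calling `evict("D_Product", {"ProductID": "P2"})` leaves the `P1` entry in place, whereas `evict_all("D_Product")` would drop both.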

The platform behaviour you’re describing is not surprising. Pega’s error-handling guidance focuses on detecting and surfacing errors, while cache invalidation is handled through separate mechanisms and can be scope-specific. That means it is reasonable to expect a Data Page to remain cached in an error state even though the page has a message on it.

So I would not expect the platform to stop caching just because a page message exists. I’d argue the current behaviour is consistent with the separation between error state and cache state, even if it’s not always the most intuitive behaviour for architects. :)

I agree that node scope absolutely has a place and should not be avoided just because an API can fail transiently. My point was more about how to treat failures in a node-cached lookup. If the API returns a transient error or a “not found” response, I would still prefer that the cache strategy be explicit, because a failed response cached at node scope can affect every consumer until expiry or invalidation kicks in.

So I’d distinguish between good caching candidates like relatively static reference data and failure states that should be handled differently so they do not disturb the shared cache.

So I’m not against node scope itself. I’m only saying that when the response can be empty or error-prone, the implementation should include a deliberate eviction or retry strategy. That keeps the benefit of shared caching without letting a temporary issue become a shared bad cache entry.