How would you design a Pega application to handle 1M+ cases/day with consistent performance?

Hi,

I wanted to understand best practices for designing a high-volume Pega application (1M+ cases/day) while maintaining consistent performance across different layers.

Specifically, I’m trying to understand how to balance:

  • Data Pages → caching strategy, refresh frequency, avoiding clipboard bloat

  • Report Definitions → handling large datasets, joins, and aggregation efficiently

  • Background processing → Queue Processors, Data Flows, and scaling strategy

Some challenges I’m thinking about:

  • Avoiding performance bottlenecks due to heavy data page usage or large result sets

  • Handling high concurrency with consistent response times

  • Designing efficient data access patterns (RD vs RDB-List vs data pages)

  • Ensuring queue processors and async processing scale properly

  • Managing database load vs Pega layer caching

What are the design patterns you consider in such high-volume systems?

Hello @PoojaPalla

Very good post highlighting important points.

For high-volume Pega applications (1M+ cases/day), tuning individual components like Data Pages or Report Definitions is important, but the real differentiator is overall architecture design.

In addition to the points you mentioned, here are some key design patterns to consider:

  • Case design & lifecycle: Keep cases lightweight, avoid large clipboard footprints, and offload heavy processing to asynchronous layers (Queue Processors/Data Flows).
  • Commit & transaction strategy: Avoid large or frequent commits; design proper transaction boundaries to prevent DB contention.
  • Database design: Ensure proper indexing, exposed columns for reporting, and minimize joins on large datasets. DB performance is often the primary bottleneck at scale.
  • Integration patterns: Prefer async integrations over synchronous calls in critical paths. Use retry, idempotency, and fallback mechanisms.
  • Node specialization: Separate UI, background processing, and integration workloads across nodes to ensure consistent performance.
  • Clipboard & memory management: Minimize page list sizes and avoid unnecessary data duplication.
  • Bulk processing: Use Data Flows or batch processing instead of iterative loops for large datasets.
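The "retry, idempotency, and fallback" part of the integration pattern above can be sketched outside Pega itself. Below is a minimal, illustrative Java sketch of an idempotent consumer with capped retries; the class and method names are hypothetical, not Pega APIs (an actual Queue Processor gives you dedicated configuration for retries and dead-letter handling).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of an idempotent, retrying message consumer.
// All names here are illustrative, not Pega platform APIs.
public class IdempotentConsumer {
    private final Map<String, Boolean> processed = new ConcurrentHashMap<>();
    private final int maxRetries;

    public IdempotentConsumer(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    /** Returns true if the message was processed (now or on an earlier delivery). */
    public boolean handle(String messageId, Runnable work) {
        // Idempotency: skip messages already processed under this ID.
        if (processed.putIfAbsent(messageId, Boolean.TRUE) != null) {
            return true;
        }
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            try {
                work.run();
                return true;
            } catch (RuntimeException e) {
                if (attempt == maxRetries) {
                    // Fallback: un-mark so a later redelivery can retry,
                    // and report failure (e.g. route to a dead-letter queue).
                    processed.remove(messageId);
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        IdempotentConsumer consumer = new IdempotentConsumer(3);
        int[] calls = {0};
        // First delivery runs the work; a duplicate delivery is skipped.
        consumer.handle("case-123", () -> calls[0]++);
        consumer.handle("case-123", () -> calls[0]++);
        System.out.println(calls[0]); // prints 1
    }
}
```

The key property is that a duplicate delivery of the same message ID is a no-op, so retries and redeliveries in the async layer cannot double-process a case.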

A few more things to consider: leverage PDC, PAL, and queue monitoring to proactively identify any bottlenecks.

I'll let other community members add any other areas they can think of to focus on.

Regards

JC


Not my area of expertise by any means, but it also depends on which UI architecture you are on. I think from your other posts @PoojaPalla you are working within Constellation? That changes a little what you can do and control, and it is thankfully more performant due to the reduced server-side interaction.

Like JC, I would be curious about others' experiences here.


The question is broad, but that’s expected for systems operating at this scale. When I hear requirements like 1M+ cases per day, the first thing I challenge is whether “cases” are actually the right abstraction. In Pega there is a clear distinction between data and cases. Cases bring significant built‑in capabilities around lifecycle, auditability, locking, assignments, SLA, and governance. That overhead is essential for true case management, but if those capabilities are not being used deliberately, the cost can be substantial and will directly impact throughput and latency.

The second important point is that, at this level of volume and concurrency, this is no longer just casual business application development; it becomes an engineering problem. There is no single design pattern or best practice that neatly solves data access, caching, reporting, asynchronous processing, and database load at the same time. Each area introduces its own constraints, and optimizing one typically means accepting trade‑offs in another. It is best approached as an obstacle course, where each design decision clears one constraint while introducing the next.

This is also why performance testing cannot be treated as a final validation step. It has to be part of an iterative process, continuously informing design decisions as the application evolves. At this scale, assumptions that look reasonable on paper tend to break under real load, and only repeated testing and adjustment will reveal where the true bottlenecks are.
