When building GenAI-powered agents in Pega, a common and intuitive approach is to expose enterprise data through a data page and use that as a knowledge source.
At first glance, this seems perfectly reasonable.
However, when working with large datasets (e.g., ~60,000 records) and asking agents to perform aggregations and summaries, this approach can produce:
- Inconsistent results
- Poor performance
- Excessive token usage
In contrast, Pega GenAI “Chat With Your Data” takes a fundamentally different approach—one that is both more efficient and more reliable.
This post walks through a real example comparing both patterns.
Note: This testing was performed on Pega Infinity 25.1.2. Behavior and capabilities may evolve in future releases.
The Scenario
We use a dataset of approximately 60,000 customer records and ask a simple question:
“Show me a count of customers by state in the New England region, and break that down by city.”
This is a classic aggregation query—something a database is designed to handle efficiently.
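To make that concrete, here is a minimal sketch of the kind of grouped count a database handles natively. It uses Python's built-in sqlite3 module with a made-up `customer` table and a handful of invented rows. The table name, column names, and sample data are assumptions for illustration only; they are not the Pega schema or the test dataset.

```python
import sqlite3

# Hypothetical sample rows (city, state), invented purely for illustration.
ROWS = [
    ("Boston", "MA"), ("Boston", "MA"), ("Worcester", "MA"),
    ("Hartford", "CT"), ("Nashua", "NH"), ("Burlington", "VT"),
    ("Albany", "NY"),  # outside the region, filtered out by the query
]
NEW_ENGLAND = ("CT", "MA", "ME", "NH", "RI", "VT")


def count_by_state_and_city(rows):
    """Load rows into an in-memory table and let the database do the grouping."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (city TEXT, state TEXT)")
    conn.executemany("INSERT INTO customer VALUES (?, ?)", rows)
    placeholders = ",".join("?" * len(NEW_ENGLAND))
    query = (
        "SELECT state, city, COUNT(*) FROM customer "
        f"WHERE state IN ({placeholders}) "
        "GROUP BY state, city ORDER BY state, city"
    )
    result = conn.execute(query, NEW_ENGLAND).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for state, city, n in count_by_state_and_city(ROWS):
        print(state, city, n)
```

Only the grouped rows leave the database; the individual customer records never need to be handed to a model.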
Approach 1: Data Page–Driven Agent
Agent Design
The agent is configured to:
- Use a data page tool (customerdata)
- Retrieve records and perform aggregation inside the LLM
You are an application-level assistant that helps users explore and understand customer data provided via a Data Page. Use the Data Page context to answer questions, summarize key customer facts (profile, status, recent activity, risks, and notable changes), and explain the “why” behind any insights using only the data you are given. If the user’s question can’t be answered from the available Data Page content, ask a short follow-up question describing exactly what data is missing, or suggest what additional source/tool (if available) would be needed—without guessing or fabricating details.
Use the customerdata tool to get a list of customers.
If you are being asked to get a count, summary, or aggregate value get the list of customers from the customerdata tool then calculate what you were asked for using the list that is returned by the tool.
Tool Invocation Requires Reinforcement
To get reliable behavior at all, the same questions had to be added to:
- Quick Select prompts
- Tool example phrases
Caption: Tool usage had to be explicitly reinforced through both quick-select prompts and tool example phrases. Without this, the agent did not consistently invoke the data page.
Observed Behavior
1. Default Data Page Limitation
Caption: Default data page limits restrict the number of rows returned, meaning aggregation is often performed on incomplete data.
The agent is often not operating on the full dataset.
2. Inconsistent Results Across Runs
The same exact question was asked multiple times against the same agent.
Repeated executions of the same query produce different results, including varying counts and distributions.
Observed issues:
- Totals differ between runs
- State distributions are inconsistent
- City-level breakdowns do not reconcile
Why this happens
The agent is following this pattern:
- Call the data page
- Retrieve a list of records (often incomplete)
- Perform aggregation inside the LLM
Variability in any of the following leads to non-deterministic outputs:
- Returned row subsets
- Tool invocation behavior
- LLM reasoning
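One of these sources of variability, the changing row subset, is easy to reproduce in isolation. The sketch below is a simulation under stated assumptions, not a reproduction of the Pega test: the records are synthetic, the row limit of 500 is an arbitrary stand-in for a data page limit, and random sampling stands in for whatever determines which rows come back. It does not model tool invocation or LLM reasoning at all.

```python
import random
from collections import Counter


def make_records(n=60_000, seed=0):
    """Build synthetic state values; the distribution is invented."""
    rng = random.Random(seed)
    states = ["CT", "MA", "NH", "VT"]
    return [rng.choice(states) for _ in range(n)]


def count_on_subset(records, limit, seed):
    """Aggregate over a truncated subset, as happens when a row limit applies."""
    rng = random.Random(seed)
    subset = rng.sample(records, limit)
    return Counter(subset)


records = make_records()
full = Counter(records)  # what a query over the full dataset would count
run_a = count_on_subset(records, limit=500, seed=1)  # 500 is an assumed limit
run_b = count_on_subset(records, limit=500, seed=2)

print("full :", dict(full))
print("run A:", dict(run_a))
print("run B:", dict(run_b))
```

Each truncated run counts only 500 of the 60,000 rows, so neither can match the full totals, and two runs over different subsets will generally disagree with each other as well.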
3. Token Inefficiency
Caption: The data page approach generates extremely high token usage (~120K tokens) due to retrieving large record sets and reprocessing them in the LLM.
The LLM is effectively doing the job of a database query engine.
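A rough back-of-the-envelope comparison shows why shipping rows to the model is expensive. The sketch below is an estimate only: the four-characters-per-token figure is a common rule of thumb rather than a real tokenizer, and the record shape is invented. It is not expected to reproduce the ~120K figure observed above, only the order-of-magnitude gap between raw rows and an aggregated answer.

```python
import json

CHARS_PER_TOKEN = 4  # rough heuristic, assumed here; not a real tokenizer


def estimate_tokens(payload) -> int:
    """Estimate token count from the length of the JSON-serialised payload."""
    return len(json.dumps(payload)) // CHARS_PER_TOKEN


def make_rows(n):
    """Hypothetical customer records; field names are illustrative."""
    return [
        {"id": i, "name": f"Customer {i}", "city": "Boston", "state": "MA"}
        for i in range(n)
    ]


raw_rows = make_rows(1000)  # what the LLM sees when it must aggregate itself
aggregated = [{"state": "MA", "city": "Boston", "count": 1000}]  # grouped result

raw_tokens = estimate_tokens(raw_rows)
agg_tokens = estimate_tokens(aggregated)
print(f"raw rows  : ~{raw_tokens} tokens")
print(f"aggregated: ~{agg_tokens} tokens")
```

The raw payload grows linearly with the number of rows, while the aggregated payload grows only with the number of groups.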
Root Cause
The issue is architectural:
Retrieve records → Then aggregate in the LLM
This results in:
- Large payload transfers
- High token consumption
- Inconsistent outputs
- Poor scalability
Approach 2: Chat With Your Data
Agent Design (Governed and Constrained)
You are a governed analytics assistant for demo customer data. Your responsibility is to use the “chat with your data” capability to produce summaries and aggregated query results for the Customer data type.
Hard scope restriction: You MUST ONLY access data from the class Demo-AIUseCases-Data-Customer. Do not access or reference any other class, case type, data type, external system, or knowledge source for customer record retrieval.
Default behavior: Prefer aggregated answers (counts, distributions, averages, min/max, percentiles, group-bys, top-N, trends) over returning raw records. If the user asks for “all customers,” “export,” “full list,” or any unbounded record dump, redirect them to request a summarized or aggregated query instead.
Record volume rule: If fulfilling a request would return more than 30 records, you MUST warn the user before returning results and propose a summarized alternative.
Explainability: Clearly state (a) what filters you applied, (b) what aggregation you performed, and (c) what the results mean in plain language.
Context discipline: Keep responses concise and avoid returning large text or large tables.
Important: Constraining the Data Scope
One key difference with Chat With Your Data is that the agent will attempt to determine which data class to query unless explicitly constrained.
Example Risk
If your system contains:
- SMB-Customer
- Consumer-Customer
An unconstrained agent may:
- Select the wrong data class
- Misinterpret the scope of the question
Either mistake leads to incorrect aggregates and misleading results.
Why This Matters
Unlike data pages, which are fixed, Chat With Your Data is flexible by design but requires governance.
That’s why this instruction is critical:
Hard scope restriction: You MUST ONLY access data from the class Demo-AIUseCases-Data-Customer.
Best Practice
- Always explicitly constrain the data class
- Do not rely on naming or inference
- Treat data source selection as a governance decision
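The same idea can be expressed as an explicit allow-list check. This is a conceptual sketch in plain Python, not Pega configuration or a Pega API: the function and exception names are hypothetical, and in the test described here the constraint was applied through the agent instruction itself. The class names are the ones used in this post.

```python
# Conceptual sketch only; names below are hypothetical, not a Pega API.
ALLOWED_CLASSES = frozenset({"Demo-AIUseCases-Data-Customer"})


class ScopeError(ValueError):
    """Raised when a query targets a class outside the approved scope."""


def require_allowed_class(requested_class: str) -> str:
    """Return the class name if it is approved; otherwise refuse explicitly."""
    if requested_class not in ALLOWED_CLASSES:
        raise ScopeError(
            f"{requested_class!r} is not an approved data class; "
            f"allowed: {sorted(ALLOWED_CLASSES)}"
        )
    return requested_class


if __name__ == "__main__":
    print(require_allowed_class("Demo-AIUseCases-Data-Customer"))
    try:
        require_allowed_class("SMB-Customer")  # similar name, wrong scope
    except ScopeError as err:
        print("rejected:", err)
```

The check matches on the exact class name, so a similarly named class such as SMB-Customer is rejected rather than inferred to be close enough.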
Observed Behavior
1. Consistent, Accurate Results
Caption: Chat With Your Data produces consistent, fully reconciled results with clear aggregation logic and structured output.
Example results:
- Total customers: 6,517
- CT: 288
- MA: 5,420
- NH: 527
- VT: 282
City totals correctly roll up to state totals
Results remain consistent across executions
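The reconciliation claim can be checked mechanically. The sketch below uses the state totals reported above; the helper names are illustrative. City-level figures are not listed in this post, so the city roll-up is shown only as a generic function that takes them as input.

```python
from collections import defaultdict

# State totals and grand total as reported in the example results above.
STATE_COUNTS = {"CT": 288, "MA": 5420, "NH": 527, "VT": 282}
REPORTED_TOTAL = 6517


def states_match_total(state_counts, total):
    """True when the per-state counts sum to the reported grand total."""
    return sum(state_counts.values()) == total


def cities_roll_up(city_counts, state_counts):
    """True when city counts, keyed by (state, city), sum to each state total."""
    rolled = defaultdict(int)
    for (state, _city), n in city_counts.items():
        rolled[state] += n
    return dict(rolled) == dict(state_counts)


print(states_match_total(STATE_COUNTS, REPORTED_TOTAL))  # True
```

Running the same two checks against the data page agent's output is a quick way to detect the non-reconciling breakdowns described in Approach 1.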
2. Structured and Explainable Output
The response includes:
- Filters applied
- Aggregation performed
- Structured summaries
- Clear explanation of results
Outputs are both deterministic and explainable
3. Token Efficiency
Caption: Aggregation is executed at the data layer, dramatically reducing token usage and eliminating large payload transfers.
- No full dataset retrieval
- No token spikes
- More efficient execution
Why This Works Better
The difference is architectural:
Data Page Pattern
Retrieve all data → Aggregate in the LLM
Chat With Your Data Pattern
Generate query → Aggregate in the database
Key Takeaways
Data Pages Work Well For
- Record-level access
- Small datasets
- Customer detail exploration
Use Caution with Data Pages For
- Large datasets
- Aggregation queries
- Analytical use cases
Chat With Your Data Is Ideal For
- Aggregations and summaries
- Large datasets
- Consistent, explainable outputs
- Efficient execution
Final Thought
This was not a misconfiguration.
- The agent was clearly instructed
- Tool usage was reinforced
- Example phrases were aligned
Yet the limitations persisted.
The issue is not configuration; it is architecture.
If you ask an LLM to do a database’s job, you will pay for it in tokens, performance, and correctness.
Recommendation
When building GenAI-driven analytics in Pega:
- Be intentional about where computation happens
- Prefer database-driven aggregation over LLM reasoning
- Use Chat With Your Data for scale
- Always apply strict data scoping guardrails



