I’m exploring an approach to build an AI-driven code review utility in Pega and wanted to get some feedback from the community.
The goal is to automate basic code reviews and catch common issues (like hardcoding, Obj-Save usage, guardrail violations, etc.) before manual LSA review.
Here’s the approach I’m thinking of:
1. Fetch rules from a branch / ruleset version
   - Use D_pzRulesForRulesetVersion
   - Extract pzInsKey for all rules
2. Rule extraction
   - Open each rule using its pzInsKey
   - Convert the rule content into XML
3. GenAI analysis
   - Pass the XML to GenAI Connect
   - Ask the AI to review for:
     - Best practices
     - Performance
     - Guardrails
     - Common anti-patterns
4. Output generation
   - Aggregate the responses
   - Generate a downloadable Word/Excel report listing:
     - Issues
     - Recommendations
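As a rough sketch of steps 3 and 4 above, the GenAI analysis and report-generation halves could look like the following. This is illustrative only: the prompt format, the pipe-delimited response convention, and the CSV output (a stand-in for the Word/Excel report) are my assumptions, and the actual GenAI Connect call is not shown.

```python
# Sketch of the GenAI-analysis and output-generation steps.
# The pipe-delimited "severity|issue|recommendation" response format is an
# assumed convention, not a Pega or LLM standard; CSV stands in for Word/Excel.
import csv

REVIEW_CHECKS = ["best practices", "performance", "guardrails", "common anti-patterns"]

def build_prompt(rule_xml: str) -> str:
    """Constrain the model to the four review dimensions listed above."""
    checks = ", ".join(REVIEW_CHECKS)
    return (
        f"Review the following Pega rule for: {checks}.\n"
        "Return one finding per line as: severity|issue|recommendation.\n\n"
        f"{rule_xml}"
    )

def parse_findings(response: str):
    """Turn the pipe-delimited model output into structured report rows."""
    rows = []
    for line in response.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3:  # skip lines that don't match the convention
            rows.append(dict(zip(["severity", "issue", "recommendation"], parts)))
    return rows

def write_report(rows, path="review_report.csv"):
    """Aggregate all findings into a downloadable report file."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["severity", "issue", "recommendation"])
        writer.writeheader()
        writer.writerows(rows)
```

Forcing a constrained response format like this makes aggregation across many rules mechanical, rather than parsing free-form prose from the model.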
Questions / feedback needed:
- Is this a good approach architecturally?
- Is using XML the right input, or is there a better structured way to extract rule logic?
- Are there better ways to fetch and analyze rules across a branch?
- Any limitations or risks I should consider (performance, accuracy, etc.)?
That said, I’m hesitant to use GenAI Connect for the following reasons:
- The XML payload is huge (it grows with the number of rules).
- Based on my observations, the XML may not be a faithful representation of the actual rule logic, although XML works well for data instances.
- A rule can be referenced by one or more other rules, so you’re supposed to send multiple XMLs to give the model the full picture.
- If we reuse rules from reusable layers or the OOTB layer, the reference doesn’t necessarily live in the branch. For example: a SendEmail activity from the OOTB layer.
- We need to ensure the functional context of the branch is provided and defined correctly.
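One way to mitigate the payload-size concern is to batch whole rule XMLs under a token budget instead of sending everything at once. A minimal sketch, assuming a common chars-per-token heuristic (roughly 4 characters per token) rather than a real tokenizer:

```python
# Split a list of rule XMLs into LLM-sized batches, keeping each rule whole.
# The 4-chars-per-token ratio is a rough heuristic, not an exact tokenizer;
# swap in your model's actual tokenizer if accuracy matters.
def chunk_payload(rule_xmls, max_tokens=6000, chars_per_token=4):
    """Group whole rule XMLs into batches under an approximate token budget."""
    budget = max_tokens * chars_per_token
    batches, current, size = [], [], 0
    for xml in rule_xmls:
        if current and size + len(xml) > budget:
            batches.append(current)  # flush the full batch
            current, size = [], 0
        current.append(xml)
        size += len(xml)
    if current:
        batches.append(current)
    return batches
```

Keeping each rule intact within a batch matters more than perfectly even batch sizes: a rule split across two requests loses the context the model needs to review it.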
The strongest aspect of this approach is treating GenAI as an advisory layer that analyzes structured inputs and produces constrained recommendations, not enforcement.
Concerns around XML fidelity and cross-rule context are valid, which is why design-time heuristics should be complemented with runtime evidence such as alerts or PDC data.
I’d be interested to hear how others are tackling AI-driven code review in Pega
Good idea, Pooja. I would recommend JSON instead: it's lightweight, and LLMs tend to parse it more easily. JSON is also more token-efficient than XML, which repeats a closing tag for every element.
I would expect the results to be generic and limited to the rule types; the model might not know whether a rule is in the correct ruleset or class, but it should cover the majority of scenarios.
Let us know how you get along and what you find. Thanks for sharing.
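The token-efficiency point is easy to demonstrate: serialize the same flat rule metadata both ways and compare. The field names below are illustrative, not pulled from an actual rule export.

```python
# Same flat rule metadata as XML vs JSON. Character count is only a proxy
# for tokens, but XML's repeated closing tags make it measurably longer.
import json
import xml.etree.ElementTree as ET

# Illustrative field names, not a real Pega rule export
fields = {"pyActivityName": "SendEmail", "pyRuleSet": "MyApp", "pyClassName": "MyApp-Work"}

root = ET.Element("Rule")
for key, value in fields.items():
    ET.SubElement(root, key).text = value
xml_str = ET.tostring(root, encoding="unicode")
json_str = json.dumps(fields)

print(len(xml_str), len(json_str))  # XML comes out longer for the same content
```

The gap widens with nesting depth and with long element names, both common in rule XML, so the savings compound on real payloads.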
Why can’t we fetch the generated Java code and ask the LLM the same questions? The Java code contains the actual logic, and current Codex or Claude models can reason about it better. It might be a stupid question, but it seems logical.