Approach for AI-based Code Review in Pega – Looking for feedback

Hi everyone,

I’m exploring an approach to build an AI-driven code review utility in Pega and wanted to get some feedback from the community.

The goal is to automate basic code reviews and catch common issues (like hardcoding, Obj-Save usage, guardrail violations, etc.) before manual LSA review.

Here’s the approach I’m thinking of:

  1. Fetch rules from a branch / ruleset version

    • Using D_pzRulesForRulesetVersion

    • Extract pzInsKey for all rules

  2. Rule extraction

    • Open each rule using pzInsKey

    • Convert rule content into XML

  3. GenAI Analysis

    • Pass XML to GenAI Connect

    • Ask AI to review based on:

      • Best practices

      • Performance

      • Guardrails

      • Common anti-patterns

  4. Output generation

    • Aggregate responses

    • Generate a downloadable Word/Excel report

    • Provide:

      • Issues

      • Recommendations
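To make the orchestration concrete, here's a minimal Python sketch of the four steps. The Pega-facing pieces (fetching keys via D_pzRulesForRulesetVersion, opening a rule by pzInsKey, and the GenAI Connect call) are replaced with hypothetical stand-in functions, so this only illustrates the flow, not a working integration:

```python
def fetch_rule_keys(ruleset_version):
    # Stand-in for D_pzRulesForRulesetVersion: returns pzInsKeys (made-up data).
    return [f"RULE-OBJ-ACTIVITY {ruleset_version} DOSOMETHING#{i}" for i in range(3)]

def open_rule_as_xml(ins_key):
    # Stand-in for opening the rule by pzInsKey and serializing it to XML.
    return f"<rule><pzInsKey>{ins_key}</pzInsKey><step>Obj-Save</step></rule>"

def call_genai(xml, criteria):
    # Stand-in for a GenAI Connect call; here just a trivial keyword check
    # so the pipeline shape is runnable without an LLM.
    issues = [c for c in criteria if c.lower() in xml.lower()]
    return {"issues": issues,
            "recommendations": [f"Review use of {i}" for i in issues]}

def review_ruleset(ruleset_version):
    # Steps 1-4: fetch keys, extract XML, analyze, aggregate into a report.
    criteria = ["Obj-Save", "hardcode"]
    report = []
    for key in fetch_rule_keys(ruleset_version):
        xml = open_rule_as_xml(key)
        report.append({"rule": key, **call_genai(xml, criteria)})
    return report

report = review_ruleset("MyApp:01-01-01")
```

The aggregated `report` list is what step 4 would then render into the Word/Excel output.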

Questions / Feedback needed:

  • Is this a good approach architecturally?

  • Is using XML the right input, or is there a better structured way to extract rule logic?

  • Are there better ways to fetch and analyze rules across a branch?

  • Any limitations or risks I should consider (performance, accuracy, etc.)?

  • Has anyone implemented something similar?


Great initiative!

I’d be cautious about using GenAI Connect here, for the following reasons:

  1. The XML payload is huge (it depends on the number of rules).
  2. Based on my observations, the XML may not be a faithful representation of the actual rule logic, though XML works well for data instances.
  3. A rule can be referenced by one or more other rules, so you would need to send multiple XMLs together.
  4. If we reuse rules from reusable layers or the OOTB layer, we don’t necessarily keep the reference in the branch - for example, the SendEmail activity from the OOTB layer.
  5. You need to ensure the context of the functionality is provided and the branch scope is defined correctly.
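For the payload-size concern in particular, one option worth prototyping is batching the rule XMLs so that each GenAI Connect call stays under a size budget. A rough sketch (the function name is mine, and character count is only a crude stand-in for tokens):

```python
def batch_payloads(xml_docs, max_chars=8000):
    """Group rule XMLs into batches under a rough size budget, so each
    LLM call receives a manageable payload. A real implementation would
    use the model's tokenizer rather than raw character counts."""
    batches, current, size = [], [], 0
    for doc in xml_docs:
        if current and size + len(doc) > max_chars:
            batches.append(current)   # flush the full batch
            current, size = [], 0
        current.append(doc)
        size += len(doc)
    if current:
        batches.append(current)       # flush the final partial batch
    return batches

# Five 3000-character rule XMLs split into batches of at most 8000 chars.
batches = batch_payloads(["x" * 3000] * 5, max_chars=8000)
```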

That’s a really valid point, and honestly something I was also starting to feel while testing this out :+1:

Especially the points about XML not truly representing the actual logic, and the context problem across multiple rules - that makes sense.

I’m curious though :backhand_index_pointing_down:

If XML isn’t the most reliable input, what would you suggest as a better approach to represent rule logic for AI?


Great initiative and a thoughtful breakdown.

The strongest aspect of this approach is treating GenAI as an advisory layer that analyzes structured inputs and produces constrained recommendations, not enforcement.

Concerns around XML fidelity and cross-rule context are valid, which is why design-time heuristics should be complemented with runtime evidence such as alerts or PDC data.

I’d be interested to hear how others are tackling AI-driven code review in Pega.

Good idea Pooja. I would recommend JSON instead: it’s more lightweight, so LLMs can parse it more easily, and it’s more token-efficient than XML, which repeats a closing tag for every element.
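To illustrate the token-efficiency point, here’s a quick comparison of the same (made-up) rule metadata serialized as XML versus compact JSON. Character count is only a crude proxy for tokens, but the gap is visible:

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical rule metadata in both formats.
rule = {"ruleName": "SendEmail", "ruleType": "Activity",
        "steps": ["Property-Set", "Obj-Save", "Commit"]}

# XML: every element needs a matching closing tag.
root = ET.Element("rule")
ET.SubElement(root, "ruleName").text = rule["ruleName"]
ET.SubElement(root, "ruleType").text = rule["ruleType"]
steps = ET.SubElement(root, "steps")
for s in rule["steps"]:
    ET.SubElement(steps, "step").text = s
xml_str = ET.tostring(root, encoding="unicode")

# JSON: compact separators, no closing tags.
json_str = json.dumps(rule, separators=(",", ":"))

print(len(xml_str), len(json_str))
```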

I would expect the results to be generic and limited to the rule types; the model might not know whether a rule is in the correct ruleset or class. But it should cover the majority of scenarios.

Let us know how you get along and what you find. Thanks for sharing.

Why can’t we fetch the generated Java code and ask the LLMs the same questions? The Java code will have the actual logic, and current Codex or Claude models can answer questions about it better. It might be a stupid question, but it’s logical :slight_smile: