Don Schuerman’s recent post on stochastic vs. deterministic work framed the theoretical question perfectly. Ryan Easley’s article on testing and monitoring gave us the evaluation framework. Mike Hancock’s model selection guide gave us the toolbox. What I want to add is the practitioner’s decision framework — the one I use when I’m standing in front of a whiteboard with a delivery team asking: “should this step be an agent or a workflow?”
This isn’t about whether AI is good or bad. It’s about placement. Every step in a case lifecycle is a decision point: deterministic automation, human task, or AI agent. Getting the placement wrong is worse than not using AI at all — because a misplaced agent introduces unpredictability exactly where the business needs certainty.
THE THREE QUESTIONS
For every step in a case lifecycle, I ask three questions in order. The first “yes” determines the pattern.
- Is the outcome objectively knowable from the inputs?
If given the same inputs, should the outcome always be the same? If yes → deterministic automation. Decision tables, when rules, data transforms. No AI needed. Examples: regulatory eligibility checks, fee calculations, SLA assignments, routing based on case attributes.
- Is the outcome measurable but uncertain?
Can you define what “correct” looks like but need predictive power to get there? If yes → predictive AI (Pega’s adaptive models, Next Best Action, propensity models). These are probabilistic but measurable — you can track outcomes and the model improves with feedback. Examples: churn prediction, claim fraud scoring, case prioritization.
- Is the outcome subjective, contextual, or creative?
Does the step require interpreting unstructured data, generating natural language, or making judgment calls on ambiguous inputs? If yes → GenAI agent. Examples: document summarization, email drafting, extracting fields from unstructured PDFs, conversational triage.
The critical insight: most case lifecycles are 70–80% deterministic, 10–20% predictive, and 5–10% genuinely agentic. If your architecture flips that ratio, you’ve introduced unnecessary risk.
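To make the triage concrete, here is a minimal Python sketch of the whiteboard logic. The attribute names are mine, not Pega artifacts; the point is only that the questions are asked in order and the first “yes” wins.

```python
from dataclasses import dataclass
from enum import Enum


class Pattern(Enum):
    DETERMINISTIC = "deterministic automation"   # decision tables, when rules
    PREDICTIVE = "predictive AI"                 # adaptive models, NBA
    AGENTIC = "GenAI agent"                      # Agent Steps, Doc Agent


@dataclass
class Step:
    # Hypothetical attributes describing a single case lifecycle step.
    same_inputs_same_outcome: bool      # Q1: objectively knowable from inputs?
    outcome_measurable: bool            # Q2: "correct" definable and trackable?
    needs_unstructured_judgment: bool   # Q3: interpretation, generation, judgment?


def classify(step: Step) -> Pattern:
    """Ask the three questions in order; the first 'yes' determines the pattern."""
    if step.same_inputs_same_outcome:
        return Pattern.DETERMINISTIC
    if step.outcome_measurable:
        return Pattern.PREDICTIVE
    if step.needs_unstructured_judgment:
        return Pattern.AGENTIC
    # Nothing fits cleanly: make it a human task rather than forcing AI in.
    raise ValueError("Unclassified step: route to a human task and revisit")


# Fraud scoring: not deterministic, but measurable -> predictive AI.
print(classify(Step(False, True, False)))   # Pattern.PREDICTIVE
```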
THE PLACEMENT MAP
Here’s how this maps to Pega’s Infinity '25 capabilities:
-- DETERMINISTIC --
Pega Capability: Decision tables, when rules, data transforms, SLA rules
Governance Model: 100% auditable, version-controlled, branch-reviewable
When It Fails: Never (by definition)
-- PREDICTIVE --
Pega Capability: Adaptive models, CDH (Customer Decision Hub), Next Best Action
Governance Model: Measurable via outcome feedback loops, Prediction Studio
When It Fails: Degrades with data drift — detectable via monitoring
-- AGENTIC (GenAI) --
Pega Capability: Agent Steps, GenAI Connect, Doc Agent, Application Agent
Governance Model: Prompt governance, AI Tracer, confidence scoring, human-in-the-loop review
When It Fails: Hallucination, drift, model changes — needs continuous evaluation
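One note on the predictive row: “detectable via monitoring” can be as simple as comparing the live score distribution against the distribution the model was validated on. A rough sketch of a population stability index check; the bin count, the 0.2 alarm threshold, and the stand-in score distributions are all illustrative, not tied to any specific Pega API.

```python
import numpy as np


def population_stability_index(expected: np.ndarray, actual: np.ndarray,
                               bins: int = 10) -> float:
    """Compare two score distributions; PSI above ~0.2 is a common drift alarm."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    exp_counts, _ = np.histogram(expected, bins=edges)
    act_counts, _ = np.histogram(actual, bins=edges)
    # Convert counts to proportions; floor at a tiny value to avoid log(0).
    exp_pct = np.clip(exp_counts / exp_counts.sum(), 1e-6, None)
    act_pct = np.clip(act_counts / act_counts.sum(), 1e-6, None)
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))


# Stand-in data: scores captured at model validation vs. this week's production scores.
baseline = np.random.beta(2, 5, size=10_000)
this_week = np.random.beta(2, 4, size=10_000)

psi = population_stability_index(baseline, this_week)
if psi > 0.2:
    print(f"Score drift detected (PSI = {psi:.3f}): review the model and its inputs")
```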
WORKED EXAMPLE: INSURANCE CLAIM PROCESSING
A typical property damage claim lifecycle might look like this, with the classification I’d give each step:
- Receive FNOL (First Notice of Loss) submission → DETERMINISTIC
Data capture, assignment to queue based on claim type and region. Decision table.
- Extract data from uploaded photos and documents → AGENTIC
Doc Agent reads the adjuster’s report, extracts damage descriptions, amounts, dates. GenAI Connect processes images for damage classification.
- Validate policy coverage → DETERMINISTIC
Policy terms are structured data. A when rule checks coverage limits, exclusions, deductible amounts. Zero ambiguity.
- Assess fraud risk → PREDICTIVE
Adaptive model scores the claim based on historical patterns, claimant history, third-party data. Confidence threshold routes to SIU (Special Investigations Unit) or auto-approves.
- Generate claim summary for adjuster → AGENTIC
Agent Step summarizes the FNOL, extracted document data, and policy validation into a 3-paragraph briefing for the adjuster. Grounded in case data, not internet knowledge.
- Approve/deny claim → DETERMINISTIC + HUMAN
Under $X threshold with low fraud score: auto-approve via decision table. Over threshold: human task with the agent-generated summary as context. (A sketch of this routing follows the example.)
- Generate denial letter → AGENTIC WITH GOVERNANCE
GenAI drafts the letter. Human reviews before sending. Prompt is governed by approved templates. Content is auditable.
The ratio in this example (3 deterministic steps, 1 predictive, 3 agentic) is more agent-heavy than the 70/20/10 rule of thumb because the lifecycle is condensed to seven steps. The point is the placement: the agentic steps are concentrated where unstructured data meets judgment — exactly where they belong.
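The approve/deny step is the pattern I reuse most: a deterministic gate wrapped around a probabilistic input (the fraud score). A minimal sketch, with made-up thresholds and field names standing in for what would really live in a decision table:

```python
from dataclasses import dataclass

AUTO_APPROVE_LIMIT = 5_000.00   # the "$X" threshold; illustrative value only
FRAUD_SCORE_CUTOFF = 0.30       # illustrative cutoff on the adaptive model's score


@dataclass
class Claim:
    amount: float        # validated against policy coverage (step 3)
    fraud_score: float   # 0..1 output of the adaptive model (step 4)


def route_claim(claim: Claim) -> str:
    """Deterministic routing over a probabilistic input: same inputs, same route."""
    if claim.amount <= AUTO_APPROVE_LIMIT and claim.fraud_score < FRAUD_SCORE_CUTOFF:
        return "auto-approve"
    # Over the threshold (or elevated fraud score): human task, with the
    # agent-generated summary from step 5 attached as context.
    return "human-review"


print(route_claim(Claim(amount=3_200.00, fraud_score=0.12)))    # auto-approve
print(route_claim(Claim(amount=18_000.00, fraud_score=0.08)))   # human-review
```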
PRACTICAL GOVERNANCE: WHAT “PREDICTABLE AI” ACTUALLY MEANS IN DELIVERY
Pega’s “Predictable AI” positioning isn’t marketing — it maps directly to architectural decisions:
- Agent Steps have prompt sources and response targets. The prompt is bound to a case field (structured input). The response is written to a case field (structured output). This means the AI’s work is part of the case record — auditable, versioned, traceable.
- Tools are governed. An Agent Step can only use tools you explicitly expose. It can’t discover or invoke arbitrary APIs. This is fundamentally different from autonomous agent frameworks where the agent decides what tools to use.
- Confidence scoring is composable. You can ask the LLM for a confidence score in the prompt, parse it into a case field, and use a decision table to route based on that score (high confidence → auto-proceed, low confidence → human review). The confidence score becomes a deterministic routing signal, wrapping the probabilistic AI output in a governed workflow. (A sketch of this pattern follows the list.)
- AI Tracer gives observability. Every agent invocation, every prompt, every response, every tool call is logged. This feeds into the testing framework Ryan Easley described — golden conversations, replay evaluation, drift detection.
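For the confidence-scoring point above, the shape of the pattern outside of Pega’s rule forms looks roughly like this. It is a sketch with a hypothetical JSON response format; in practice the parsing would sit in a data transform and the threshold in a decision table.

```python
import json

HIGH_CONFIDENCE = 0.85   # illustrative threshold; calibrate against observed outcomes


def parse_agent_response(raw: str) -> tuple[str, float]:
    """Assume the prompt asked the LLM to reply as JSON with an 'answer' and a
    self-reported 'confidence' in [0, 1]; fail closed if it didn't comply."""
    try:
        payload = json.loads(raw)
        return str(payload["answer"]), float(payload["confidence"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return raw, 0.0   # unparseable output is treated as low confidence


def route(confidence: float) -> str:
    """Deterministic routing on the parsed score, decision-table style."""
    return "auto-proceed" if confidence >= HIGH_CONFIDENCE else "human-review"


answer, confidence = parse_agent_response(
    '{"answer": "Coverage applies", "confidence": 0.92}'
)
print(route(confidence))   # auto-proceed
```

One caveat: an LLM’s self-reported confidence is not a calibrated probability. Treat it as a routing signal to be tuned against observed outcomes, which is exactly the calibration question I ask at the end of this post.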
THE ANTI-PATTERNS
In delivery, I watch for these:
- Using an Agent Step for what a decision table can do. If the logic is expressible as if/then rules with structured inputs, don’t introduce LLM latency, cost, and nondeterminism. A decision table evaluates in milliseconds, costs nothing per call, and is 100% predictable.
- Skipping human review on agentic outputs that reach the customer. Any AI-generated content that leaves the system (emails, letters, chat responses) should have a human gate until you’ve built confidence through evaluation metrics.
- Treating the LLM as a system of record. The case is the system of record. The LLM is a processing step. Its output gets written to case fields, validated, and governed by the same rules as any other data.
- Deploying without an evaluation strategy. If you can’t tell me how you’ll detect when the agent’s quality degrades, you’re not ready for production. This is where DeepEval, golden datasets, and the AI Tracer come in. (A bare-bones sketch of a golden-dataset check follows this list.)
- Treating agent placement as a one-time architecture decision. The ratio shifts as data matures. A step that needed an agent at launch may become fully deterministic once patterns are well understood and expressible as rules. Architecture is not a snapshot — it’s a continuous evaluation discipline. Revisit placement every time you add a new data signal or observability finding.
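On the evaluation point: DeepEval and the AI Tracer give you much richer tooling, but the bare-bones version of a golden-dataset check is just “replay known inputs, compare against expected outputs, alert on regression.” A minimal sketch, where call_agent and the file format are hypothetical stand-ins for however you invoke your agent and store your golden cases:

```python
import json
from typing import Callable

PASS_RATE_FLOOR = 0.90   # illustrative: alert if fewer than 90% of golden cases pass


def matches(expected: str, actual: str) -> bool:
    """Naive containment check; swap in semantic similarity or an LLM-as-judge
    metric once this stops being discriminating enough."""
    return expected.strip().lower() in actual.strip().lower()


def run_golden_suite(call_agent: Callable[[str], str],
                     path: str = "golden_cases.json") -> float:
    """Replay every golden input through the agent and report the pass rate."""
    with open(path) as f:
        golden = json.load(f)   # expected shape: [{"input": ..., "expected": ...}, ...]
    passed = sum(matches(case["expected"], call_agent(case["input"]))
                 for case in golden)
    rate = passed / len(golden)
    if rate < PASS_RATE_FLOOR:
        print(f"Quality regression: only {rate:.0%} of golden cases passed")
    return rate
```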
OVER TO YOU
The AI Expert Circle is new and I’m excited to see the quality of discussion already happening. I’d love to hear:
- What’s your ratio? In the workflows you’re building, what percentage of steps are deterministic vs. predictive vs. agentic? Does the 70/20/10 split match your experience?
- Where have you surprised yourself? Have you found a use case where an Agent Step worked better than expected — or worse?
- How are you handling confidence scoring? Are you using LLM-generated confidence scores to route decisions? How are you calibrating the thresholds?
RELATED READING
- Agents for Stochastic vs. Deterministic Work — Don Schuerman
- Mastering Trust: Testing, Monitoring, and Safety — Ryan Easley
- Choosing the Right Model for GenAI Connect — Mike Hancock
- 5 AI Placement Patterns — AI Expert Circle series
- Smart Cases, Faster Decisions: Agent Steps — Constellation 101
- From Blueprint to Agents — Pega Cloud Summit 2026