DocAI with Custom Attachment Lists

Video Walkthrough

https://players.brightcove.net/1900410236/4fVA8Ojzs_default/index.html?videoId=6394493239112

Case management work often requires workers to review, cross-reference, and compare multiple documents before making a decision. Pega’s DocAI capability enables developers to build exactly this kind of experience: users can attach documents to a Pega case, and a GenAI model will automatically extract, compare, and summarize the key information across all of them.

The current configuration for specifying which attachments a GenAI Connect rule should analyze is limited to a single attachment property.

This is limiting when a decision depends on comparing several documents.

Agents in Pega do a really good job of comparing and summarizing documents that a user uploads in the agent conversation.

But it’s not obvious how this works. The agent tracer doesn’t show any tools being called for document analysis, so what is happening here?

The attachments that the user uploads are stored as Data-WorkAttach-File-Temp objects and are referenced as part of the Pega-Autopilot-Conversation work object. Every time the user asks the agent a question, the attachment data for all of the uploaded documents is gathered up behind the scenes and included in the prompt that gets sent to the LLM.

To use a custom list of attachments in a GenAI Connect shape, I implemented a similar feature that works within the context of a case.

GetCaseAttachments Activity

This activity replicates the functionality used in the OOTB Agent. It loops through D_AttachmentList for the case and builds up a JSON representation of the attachments. I tried doing this with a Data Transform, but I could not find an OOTB Data Page that returns the MIME type for a Data-WorkAttach-File.
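The loop this activity performs can be sketched outside Pega as follows. This is only an illustration in Python, not the activity itself: the Attachment type, the output field names (name, mimeType, data), and the base64 encoding of the file contents are all assumptions made for the example, since the actual JSON shape is not described here.

```python
import base64
import json
import mimetypes
from dataclasses import dataclass


@dataclass
class Attachment:
    """Hypothetical stand-in for one entry in the case's attachment list."""
    file_name: str
    content: bytes


def build_attachment_json(attachments):
    """Loop over the attachments and build one JSON string describing them all."""
    entries = []
    for att in attachments:
        # Derive the MIME type from the file name; fall back to a generic type.
        mime_type, _ = mimetypes.guess_type(att.file_name)
        entries.append({
            "name": att.file_name,
            "mimeType": mime_type or "application/octet-stream",
            # Assumed encoding: base64 text so binary files survive inside JSON.
            "data": base64.b64encode(att.content).decode("ascii"),
        })
    return json.dumps(entries)
```

The resulting string plays the role of the value that ends up in the AttachmentData property described further down.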

This could be based on a different attachment list that retrieves only a subset of the case attachments, e.g. recently uploaded files, a specific attachment category, or any other filtering mechanism.
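As a rough sketch of what such filtering might look like, the Python below selects attachments by category and upload time. The AttachmentInfo type and its fields (file_name, category, uploaded_at) are hypothetical names chosen for the example; in Pega the equivalent logic would live in the data page or report that sources the list.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AttachmentInfo:
    """Hypothetical metadata for one case attachment."""
    file_name: str
    category: str
    uploaded_at: datetime


def select_attachments(attachments, category=None, uploaded_since=None):
    """Return only the attachments that match every filter that was supplied."""
    selected = []
    for att in attachments:
        # Skip attachments in a different category when a category is requested.
        if category is not None and att.category != category:
            continue
        # Skip attachments uploaded before the cutoff when a cutoff is given.
        if uploaded_since is not None and att.uploaded_at < uploaded_since:
            continue
        selected.append(att)
    return selected
```

Leaving both filters unset returns the full list, so the same function covers the unfiltered case.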

D_GetAttachmentData Data Page

This is a basic wrapper for the GetCaseAttachments activity that will allow it to be easily called from a Declare Expression.

AttachmentData Property and Declare Expression

This property is available at the Work- level, and the Declare Expression allows it to be declaratively populated from any case.

To use it, reference the AttachmentData property in a GenAI Connect rule instead of checking the box to include attachments.

And at runtime we now get summaries and comparisons of custom attachment lists:

Possible Enhancements

Run a Data Transform after the GenAI Connect to remove the contents of the AttachmentData property so that the document data is not duplicated in the case blob.

Customize the data page that returns the list of attachments to add filtering or provide the user with the ability to choose which attachments are included.
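The first enhancement, clearing the attachment data once the GenAI step has run, follows a set, use, then clear pattern. The Python below is only a sketch of that pattern: the case is modelled as a plain dictionary, and run_with_attachment_data and call_genai are hypothetical names, not Pega APIs. In Pega this would be a Data Transform placed after the GenAI Connect shape.

```python
def run_with_attachment_data(case, attachment_json, call_genai):
    """Set the attachment data, run the GenAI step, then always clear the data."""
    case["AttachmentData"] = attachment_json
    try:
        # call_genai is a hypothetical callable standing in for the GenAI Connect step.
        return call_genai(case)
    finally:
        # Clear the property so the document data is not saved with the case.
        case["AttachmentData"] = ""
```

The finally clause clears the property even if the GenAI step fails, so the case never keeps a copy of the document contents.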

Well done on this. We should definitely shoot for this to be an out-of-the-box configuration.

well done.

Hey, this is super insightful :clap:
Had a quick question - since GenAI Connect already has the ‘Include attachments for analysis’ option, is the main reason for overriding it just the limitation of supporting a single attachment property?
Or does the custom approach give you more control over how multiple documents are passed and structured?

The way that this is set up, it’s just a workaround for not being able to control which documents are included in analysis.