Choosing the right model for GenAI Connect

As I work more with the AI Designer in the Pega Platform, I have been trying to choose the model more deliberately, based on the task that I want the GenAI Connect to accomplish. In a recent demo, I needed the GenAI Connect to read a PDF form that I uploaded in the case flow. Many of the models accept PDF, but I was not getting the results I expected: the GenAI Connect was struggling with the radio buttons. I worked on an improved prompt but ultimately realized that I needed to use a different model. For checkboxes and other visual items, I found that Claude had strong document handling and deciphered them accurately from the PDF form. Below is a list to help narrow down which model to use.

Azure OpenAI – GPT‑4 / GPT‑4‑class

  • Best at
    • Complex reasoning and decision support
    • High‑quality summarization and classification
    • Agent reasoning and multi‑step workflows
    • Strong document understanding
  • Attachment / Input types
    • Text :white_check_mark:
    • PDF :white_check_mark: (via text extraction / document ingestion)
    • Word (DOCX) :white_check_mark:
    • Image :white_check_mark: (vision‑capable GPT‑4 variants)
    • Audio :white_check_mark: (via speech‑to‑text + GenAI orchestration)

Azure OpenAI – GPT‑3.5‑class

  • Best at
    • Cost‑efficient summarization
    • Classification and tagging
    • Structured data extraction
    • Conversational flows
  • Attachment / Input types
    • Text :white_check_mark:
    • PDF :white_check_mark: (text‑only)
    • Word (DOCX) :white_check_mark:
    • Image :cross_mark:
    • Audio :cross_mark:

AWS Bedrock – Amazon Titan

  • Best at
    • Secure, enterprise‑controlled workloads
    • Summarization and text generation
    • Embeddings and semantic search
  • Attachment / Input types
    • Text :white_check_mark:
    • PDF :white_check_mark: (text‑only)
    • Word (DOCX) :white_check_mark:
    • Image :cross_mark:
    • Audio :cross_mark:

AWS Bedrock – Anthropic Claude

  • Best at
    • Long‑context reasoning
    • Large document and policy analysis
    • Safe, compliance‑oriented enterprise responses
  • Attachment / Input types
    • Text :white_check_mark:
    • PDF :white_check_mark: (very strong document handling)
    • Word (DOCX) :white_check_mark:
    • Image :white_check_mark: (Claude vision models, where enabled)
    • Audio :cross_mark:

Google Vertex AI – Gemini

  • Best at
    • Multimodal understanding
    • Reasoning across text + images
    • Conversational and agent‑based use cases
  • Attachment / Input types
    • Text :white_check_mark:
    • PDF :white_check_mark:
    • Word (DOCX) :cross_mark:
    • Image :white_check_mark:
    • Audio :white_check_mark:

Other AWS Bedrock / Google Vertex Foundation Models

  • Best at
    • Use‑case‑specific tasks (summarization, embeddings, classification)
    • Cost‑ or latency‑optimized workloads
  • Attachment / Input types
    • Text :white_check_mark:
    • PDF :white_check_mark:*
    • Word (DOCX) :white_check_mark:*
    • Image :cross_mark:*
    • Audio :cross_mark:*
    • *Depends on the specific foundation model selected
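
If it helps to see the list above as something you can query, here is a minimal sketch in Python. It is only an illustration of the selection logic, not a Pega or provider API: the `SUPPORTED_INPUTS` table and the `candidate_models` function are hypothetical names, and the values are copied from the list above. The "Other" foundation models are left out because their support depends on the specific model.

```python
# Supported input types per model family, copied from the list above.
# Illustrative lookup table only; this is not a Pega or provider API.
SUPPORTED_INPUTS = {
    "Azure OpenAI GPT-4-class": {"text", "pdf", "docx", "image", "audio"},
    "Azure OpenAI GPT-3.5-class": {"text", "pdf", "docx"},
    "AWS Bedrock Amazon Titan": {"text", "pdf", "docx"},
    "AWS Bedrock Anthropic Claude": {"text", "pdf", "docx", "image"},
    "Google Vertex AI Gemini": {"text", "pdf", "image", "audio"},
}


def candidate_models(required_inputs):
    """Return the model families that support every required input type."""
    required = {item.lower() for item in required_inputs}
    return sorted(
        name
        for name, supported in SUPPORTED_INPUTS.items()
        if required <= supported  # subset check: all required types are covered
    )


if __name__ == "__main__":
    # A PDF form with checkboxes: ask for image support as well, as a rough
    # proxy for "handles more than text-only PDF" (an assumption of this sketch).
    print(candidate_models(["pdf", "image"]))
    # Reading Word documents.
    print(candidate_models(["docx"]))
```

Note that the table only records whether an input type is accepted. It does not capture how well a model handles it, so the shortlist is a starting point for testing rather than a final answer.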

Super useful, will definitely leverage this.

Good one Mike. Thanks for sharing!

Nice information about the use of different models. If anyone wants more in-depth information on prompting, there is a 2-day course for a few dollars at GrowthSchool – Become the Top 1%

Great list, Mike! I recently set up a connect rule to read client purchase orders and realized that I had messed with the model and changed it to GPT. I started getting very interesting results when mapping addresses to the appropriate fields. A model switch to Bedrock fixed it right up.

I need to correct something in my post. While the Gemini app supports Word documents, you cannot upload a Word document through the API. This means that in Pega, you should not use Gemini if you need to read DOCX files.

Informative! Thanks for sharing