As I work more with AI Designer in Pega Platform, I have been trying to choose the model more deliberately, based on the task I want the GenAI Connect to accomplish. In a recent demo, I needed the GenAI Connect to read a PDF form that I uploaded in the case flow. Many of the models accept PDFs, but I was not getting the results I expected: the GenAI Connect was struggling with the radio buttons. I worked on an improved prompt but ultimately realized that I needed a different model. For checkboxes, I found that Claude had strong document handling and accurately read the checkboxes and other visual elements in the PDF form. Below is a list to help narrow down which model to use.
Azure OpenAI – GPT‑4 / GPT‑4‑class
- Best at
  - Complex reasoning and decision support
  - High‑quality summarization and classification
  - Agent reasoning and multi‑step workflows
  - Strong document understanding
- Attachment / Input types
  - Text
  - PDF (via text extraction / document ingestion)
  - Word (DOCX)
  - Image (vision‑capable GPT‑4 variants)
  - Audio (via speech‑to‑text + GenAI orchestration)
Azure OpenAI – GPT‑3.5‑class
- Best at
  - Cost‑efficient summarization
  - Classification and tagging
  - Structured data extraction
  - Conversational flows
- Attachment / Input types
  - Text
  - PDF (text‑only)
  - Word (DOCX)
  - Image
  - Audio
AWS Bedrock – Amazon Titan
- Best at
  - Secure, enterprise‑controlled workloads
  - Summarization and text generation
  - Embeddings and semantic search
- Attachment / Input types
  - Text
  - PDF (text‑only)
  - Word (DOCX)
  - Image
  - Audio
AWS Bedrock – Anthropic Claude
- Best at
  - Long‑context reasoning
  - Large document and policy analysis
  - Safe, compliance‑oriented enterprise responses
- Attachment / Input types
  - Text
  - PDF (very strong document handling)
  - Word (DOCX)
  - Image (Claude vision models, where enabled)
  - Audio
Google Vertex AI – Gemini
- Best at
  - Multimodal understanding
  - Reasoning across text + images
  - Conversational and agent‑based use cases
- Attachment / Input types
  - Text
  - PDF
  - Word (DOCX)
  - Image
  - Audio
Other AWS Bedrock / Google Vertex Foundation Models
- Best at
  - Use‑case‑specific tasks (summarization, embeddings, classification)
  - Cost‑ or latency‑optimized workloads
- Attachment / Input types
  - Text
  - PDF*
  - Word (DOCX)*
  - Image*
  - Audio*

*Depends on the specific foundation model selected
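
To keep this list handy while building, one option is to capture it as data and filter it by task. The following is a minimal Python sketch and not anything Pega Platform provides: `ModelProfile`, `PROFILES`, and `shortlist` are hypothetical names made up for illustration, and the entries are simply transcribed from the "Best at" items and PDF notes above (the generic "Other" entry is left out because it depends on the specific model).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    """One entry from the list above (hypothetical structure, for illustration only)."""
    name: str
    best_at: tuple[str, ...]
    pdf_note: str  # PDF annotation copied from the list; "" where the list gives none


# Transcribed from the list above; not an official capability matrix.
PROFILES = (
    ModelProfile(
        "Azure OpenAI GPT-4-class",
        ("complex reasoning and decision support",
         "high-quality summarization and classification",
         "agent reasoning and multi-step workflows",
         "strong document understanding"),
        "via text extraction / document ingestion",
    ),
    ModelProfile(
        "Azure OpenAI GPT-3.5-class",
        ("cost-efficient summarization",
         "classification and tagging",
         "structured data extraction",
         "conversational flows"),
        "text-only",
    ),
    ModelProfile(
        "AWS Bedrock Amazon Titan",
        ("secure, enterprise-controlled workloads",
         "summarization and text generation",
         "embeddings and semantic search"),
        "text-only",
    ),
    ModelProfile(
        "AWS Bedrock Anthropic Claude",
        ("long-context reasoning",
         "large document and policy analysis",
         "safe, compliance-oriented enterprise responses"),
        "very strong document handling",
    ),
    ModelProfile(
        "Google Vertex AI Gemini",
        ("multimodal understanding",
         "reasoning across text + images",
         "conversational and agent-based use cases"),
        "",
    ),
)


def shortlist(keyword: str) -> list[ModelProfile]:
    """Return profiles whose 'best at' items contain the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [p for p in PROFILES if any(needle in item for item in p.best_at)]


if __name__ == "__main__":
    # Print each candidate together with its PDF note from the list.
    for profile in shortlist("document"):
        print(f"{profile.name}: PDF {profile.pdf_note or '(no note)'}")
```

With the data as transcribed, `shortlist("document")` returns the GPT-4-class and Claude entries. Printing the PDF note next to each name is deliberate: as the demo above showed, a model accepting a PDF does not mean it reads form controls such as radio buttons and checkboxes well, so the note is worth checking before settling on a model.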