Would it be a viable and scalable approach to design a user interface in Pega that allows users to upload Excel files containing over 100,000 records, perform validations, and display the results in a grid format?
Approach #1: Case Type–Driven Upload and Processing
This approach leverages Pega’s case management capabilities to track and orchestrate the upload lifecycle.
Steps:

- Create a dedicated Case Type for Excel uploads.
- Screen 1: Upload the Excel file as a case attachment.
- On submit, route the case to a Wait shape or Processing assignment.
- Trigger a Queue Processor (QP) or Job Scheduler (JS) to:
  - Read the attachment stream.
  - Parse and validate the Excel data.
  - Save validated rows into a Data Type, using the Case ID as a key.
- Once processing is complete, move the case to a Review stage.
- Display results using a Data Page reference in the case details or footer.
Benefits:

- Full traceability via case history.
- Easy to extend with audit, retry, or escalation logic.
- Ideal for business workflows that require approval or review.
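Outside of Pega, the work the QP performs in the steps above (stream the file, validate row by row, save in batches keyed by the Case ID) can be sketched in plain Python. This is a minimal illustration, not Pega code: it assumes the spreadsheet has already been converted to CSV, and `save_batch` and the one-column validation rule are placeholders for the real Data Type save and validation edits.

```python
import csv

def save_batch(case_id, rows):
    """Stand-in for persisting a batch of rows into the Data Type, keyed by Case ID."""
    print(f"{case_id}: saved {len(rows)} rows")

def process_upload(path, case_id, batch_size=500):
    """Stream the file, validate each row, and save valid rows in fixed-size batches."""
    errors, batch, total = [], [], 0
    with open(path, newline="") as f:
        for line_no, row in enumerate(csv.reader(f), start=1):
            if line_no == 1:
                continue  # skip the header row
            if not row or not row[0].strip():  # toy rule: first column is required
                errors.append(f"row {line_no}: first column is empty")
                continue
            batch.append({"caseId": case_id, "line": line_no, "data": row})
            if len(batch) >= batch_size:
                save_batch(case_id, batch)  # commit per batch, never all 100k rows at once
                total += len(batch)
                batch = []
    if batch:  # flush the final partial batch
        save_batch(case_id, batch)
        total += len(batch)
    return total, errors
```

Batching the saves is what keeps memory flat at 100k+ rows; the same shape applies whether the sink is a Data Type, a staging table, or a data set.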
Approach #2: Landing Page with Data Type Tracking
This approach is UI-centric and better suited for operational users who need quick access to upload history and results.
Steps:

- Create a Landing Page with:
  - An Excel upload control.
  - A grid listing previously uploaded files.
- On upload:
  - Save the file as a Data-WorkAttach-File record.
  - Create a record in a Track Excel Uploads Data Type.
  - Link the attachment to the tracking record.
- Trigger a QP or JS to:
  - Read and parse the attachment.
  - Validate and store results in a separate Data Type.
  - Update a flag in the tracking record once complete.
- Enable a Preview button once processing is done.
- On click, show a modal dialog with the validated data.
Benefits:

- Clean separation of UI and processing logic.
- Easy to manage multiple uploads and preview results.
- No need for case management overhead.
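The tracking lifecycle above (create a Queued record on upload, flip a flag when the QP finishes, gate the Preview button on that flag) can be sketched as follows. All names here are illustrative; in Pega the records would live in the Track Excel Uploads Data Type and the flag update would come from the Queue Processor activity.

```python
from dataclasses import dataclass, field

@dataclass
class UploadTracker:
    """Stand-in for the Track Excel Uploads Data Type."""
    records: dict = field(default_factory=dict)

    def register(self, upload_id, attachment_ref):
        # On upload: create the tracking record and link the attachment to it
        self.records[upload_id] = {"attachment": attachment_ref, "status": "Queued"}

    def mark_complete(self, upload_id):
        # Called by the QP/JS once parsing and validation have finished
        self.records[upload_id]["status"] = "Complete"

    def preview_enabled(self, upload_id):
        # The Preview button is rendered only when the completion flag is set
        return self.records[upload_id]["status"] == "Complete"
```

Driving the button visibility off a persisted status flag (rather than session state) is what lets the grid on the landing page stay correct across refreshes and across users.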
Short answer: no, not directly in one go. For files over 100k rows, upload to a repository first and process in the background with a queue processor or data flow instead of parsing on the requestor. Convert to CSV if you can, since Excel parsing is heavier and memory-hungry. Stage rows into a temporary table or data set, run validations in chunks with checkpoints, and write only failures and key stats.

Do not try to render 100k rows in a grid; show a paginated summary with counts and the first few hundred errors, plus a download link to a full error file. Add server-side limits on file size and row count, and reject files that exceed safe thresholds with a friendly message. Use optimistic concurrency and retries around inserts to avoid lock issues during bulk loads. Log per batch and surface progress to the user with a status page that auto-refreshes. When the data is clean, trigger the final import as a separate job.
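The shape recommended here (enforce limits up front, validate in a streaming pass, persist only failures and counts) can be sketched in plain Python. The threshold numbers, the `validate` rule, and the error-file layout are all placeholders; in Pega the equivalents would be validation rules plus a Queue Processor writing to an error Data Type.

```python
import csv
import os

MAX_BYTES = 20 * 1024 * 1024   # placeholder size cap; tune to your environment
MAX_ROWS = 250_000             # placeholder row cap
ERROR_PREVIEW = 300            # show only the first few hundred errors in the UI

def validate(row):
    """Toy rule: every cell must be non-empty."""
    return all(cell.strip() for cell in row)

def bulk_validate(path, error_path):
    """Reject oversized files, stream-validate rows, and write only failures + stats."""
    if os.path.getsize(path) > MAX_BYTES:
        return {"accepted": False, "reason": "file too large"}
    total = failed = 0
    with open(path, newline="") as src, open(error_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(["line", "error"])
        for line_no, row in enumerate(csv.reader(src), start=1):
            total += 1
            if total > MAX_ROWS:
                return {"accepted": False, "reason": "too many rows"}
            if not validate(row):
                failed += 1
                writer.writerow([line_no, "empty cell"])  # full detail goes to the error file
    # The UI gets counts and a capped error preview, never the full row set
    return {"accepted": True, "total": total, "failed": failed,
            "preview_errors": min(failed, ERROR_PREVIEW)}
```

The key design point is that the grid only ever binds to the summary and the capped error preview; the complete error detail lives in the downloadable file, so the UI cost stays constant no matter how large the upload is.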