Processing a Large Data Set in Pega

In our organization, Pega will be receiving one large data set (around 2 lakh, i.e. 200,000, records) from a non-Pega team. It needs to be processed and stored in Pega as data type objects, and this operation will happen biweekly.

Currently, due to an operational limitation, this non-Pega team cannot connect to the Pega repository or consume any Pega API to push the data to Pega. Their only option at the moment is SharePoint.

Given this situation, we have thought through the approaches below:

  1. The non-Pega team pushes the file to SharePoint. A job scheduler is set up that uses the SharePoint component (D_SPOnlineGetFileContent) to read the file content and push it to the Pega repository. A file listener then picks up the file from the repository, processes it, and stores the records in the Pega data type.

  2. The non-Pega team pushes the file to SharePoint. A job scheduler is set up that uses the SharePoint component (D_SPOnlineGetFileContent) to read the file content and push it to the Pega repository. A data set is created to read the file from the repository, and a data flow reads the data and writes it to the Pega data type.

  3. The non-Pega team pushes the file to SharePoint. A job scheduler is set up that uses the SharePoint component (D_SPOnlineGetFileContent) to read the file content, parses it (using a binary file template and the pxParseExcelFile activity), and writes it directly to the Pega data type, bypassing the Pega repository.

Can someone suggest which of these is the ideal option, or whether a better approach exists, specifically from a performance perspective?

We are on Pega 8.7 Cloud.

@AVIKCEMK this question would be best directed to our Pega Consulting services.

Based on the provided context, the third approach seems to be the most efficient. It involves reading the file content from SharePoint, parsing it, and directly writing it to a Pega data type. This approach bypasses the need to store the file in the Pega repository, which could save time and resources. However, it’s important to consider the size of the data and the performance of the parsing operation. If the data is too large, it might be more efficient to temporarily store it in the Pega repository and process it in smaller chunks. Ultimately, the best approach would depend on the specific requirements and constraints of your project.
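To illustrate the chunked-processing idea mentioned above: in Pega this batching is configured on the data flow or activity rather than hand-coded, but the principle is the same regardless of platform. The sketch below is a minimal, hypothetical illustration (the batch size, column names, and `write_batch` callback are assumptions, standing in for a bulk write to the data type's table), showing how a large file can be parsed and handed off in fixed-size batches so memory stays bounded even at 2 lakh rows.

```python
import csv
import io

def process_in_batches(csv_text, batch_size, write_batch):
    """Parse a CSV and hand records to the writer in fixed-size batches,
    so only one batch is held in memory at a time."""
    reader = csv.DictReader(io.StringIO(csv_text))
    batch, written = [], 0
    for row in reader:
        batch.append(row)
        if len(batch) >= batch_size:
            write_batch(batch)   # e.g. a bulk insert into the data type table
            written += len(batch)
            batch = []
    if batch:                    # flush the final partial batch
        write_batch(batch)
        written += len(batch)
    return written

# Simulated input: a header plus 12 rows standing in for the 200k records.
sample = "id,name\n" + "\n".join(f"{i},item{i}" for i in range(12))
batches = []
total = process_in_batches(sample, batch_size=5, write_batch=batches.append)
# 12 rows with a batch size of 5 -> batches of 5, 5, and 2
```

The same trade-off the answer describes applies here: smaller batches keep memory flat but add more write round trips, while larger batches do the reverse.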

:warning: This is a GenAI-powered tool. All generated answers require validation against the provided references.

How to read data from Microsoft-Sharepoint


Big volume of data processing Real-time in Pega

Upload CSV File on UI from Data-Portal and then parse the CSV Records

How to Connect Pega with One Drive / Share Point

@MarijeSchillern Thanks for the suggestion.

Considering the parsing time, we are going ahead with the second approach.