How to replace Connect File implementation with Repository API approach

Hi,

To write an output file to the filesystem, we used to create a Connect File rule. However, as of Pega 8.7, Connect File is deprecated and the Repository API should be used going forward. Existing Connect File rules will continue to function, and you can still use and modify them, but you can no longer create new ones. For reference, I will share how our customer implemented their requirements using Connect File, and how we replaced it with the Repository API.

1. PoC Requirements

The end user views a list of work objects in a table. When a button is clicked, the system generates the work list as a CSV file and writes it to the application server's filesystem. Normally the output file would go to a remote file server, but in this PoC I will save it to the local C drive, as below.

2. Connect File implementation (old approach)

2-1. Create a UI that displays a list of work objects as below. Place a button to trigger an activity.

2-2. Configure an activity. Construct the CSV strings and execute Connect-File, passing the file contents as a parameter.
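The CSV-building part of the activity's Java step can be sketched in plain Java. The property names and row values below are illustrative placeholders; in the real activity they come from the results page of the work list.

```java
// Sketch: build the CSV contents with StringBuffer, as the activity's
// Java step does. Rows are hard-coded here for illustration only; in the
// activity they are read from the work-list results page.
StringBuffer csv = new StringBuffer();
csv.append("pyID,pyLabel,pyStatusWork\r\n");              // header row
csv.append("W-1,Order request,Open\r\n");                 // sample row
csv.append("W-2,Change of address,Resolved-Completed\r\n");
String csvStrings = csv.toString();
```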

Pages & Classes

2-3. Create Connect File rule.

2-4. Now you can test. Click the button and make sure the CSV file is written to the directory successfully.

3. Replace with Repository API (new approach)

3-1. Create a Repository rule. You may have to add the PegaRULES:RepositoryAdministrator access role to your access group if you don't have access to the Data-Repository class. After creation, click the "Test connectivity" button and make sure the connection status is SUCCESS.

3-2. Modify the activity we created earlier. Step 1 and Step 2 are the same. I have added Steps 3, 4, and 5 for the Repository API execution. Note that the CSV strings have to be Base64-encoded before being passed to the D_pxNewFile data page. Then save the savable data page to write the file.
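The Base64 encoding can be done in a Java step with the standard java.util.Base64 class; a minimal plain-Java sketch (the variable names are illustrative):

```java
// Sketch: Base64-encode the CSV contents before mapping them to the
// pyContents parameter of D_pxNewFile (variable names are illustrative).
String csvStrings = "pyID,pyLabel\r\nW-1,Order request\r\n";
String encoded = java.util.Base64.getEncoder()
        .encodeToString(csvStrings.getBytes(java.nio.charset.StandardCharsets.UTF_8));
// 'encoded' is the value you pass to D_pxNewFile as the file contents.
```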

Pages & Classes

3-3. Now you can test. The output file should look exactly the same as with the previous Connect File approach.

  • Notes

In this particular example, I have used the String approach (pyContents) for the D_pxNewFile data page. You can also use the Stream approach (pyStream) if required; for large files, consider the Stream approach. The maximum file size limit is 5 GB. Please refer to https://support.pega.com/discussion/how-use-repository-api for how to use the Repository API.

Hope this helps.

Thanks,

@KenshoTsuchihashi Thanks for the post! I had a couple of questions. Why did you use the StringBuffer method? Couldn't you have directly concatenated the property values? Also, can you please tell me how to use the Stream method?

@SurajK47

For both the Connect-File and Repository API activities above, you can replace the StringBuffer-Append part with a Property-Set approach, using the + operator to concatenate the Strings, as shown below. This generates exactly the same output file.

Either approach is fine. In general, however, StringBuffer performs better than the + operator or the concat method because StringBuffer is mutable, so appending does not create new String objects. Strings in Java are immutable; modifying a String creates a new String object on the heap with the latest content, and the original String is never changed.

FYI, I ran a quick experiment to benchmark the performance of both approaches, appending the String "Hello World" 100,000 times. The result: the StringBuffer-Append approach was about 200 times faster (20.554 s vs. 0.102 s).

No  Approach             Performance (100,000 appends)
1   Property-Set         20.554 seconds
2   StringBuffer-Append  0.102 seconds
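A rough plain-Java version of that micro-benchmark, reduced to 10,000 appends so it finishes quickly (the post used 100,000). Absolute timings vary by JVM and hardware; only the relative difference is meaningful.

```java
// Sketch of the micro-benchmark: append "Hello World" n times with each
// approach and compare wall-clock time. Iteration count reduced from the
// post's 100,000 for a quick run.
int n = 10000;

long t0 = System.nanoTime();
String viaConcat = "";
for (int i = 0; i < n; i++) {
    viaConcat = viaConcat + "Hello World";   // allocates a new String each pass
}
long concatNanos = System.nanoTime() - t0;

long t1 = System.nanoTime();
StringBuffer buf = new StringBuffer();
for (int i = 0; i < n; i++) {
    buf.append("Hello World");               // appends in place, no new String
}
String viaBuffer = buf.toString();
long bufferNanos = System.nanoTime() - t1;
// Both strings are identical; only the time to build them differs.
```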

Thanks,

@KenshoTsuchihashi

Thanks for the post; it is very helpful. I have a question. You used the String approach in the Repository API implementation. Is that because you are handling an ASCII text file (CSV) in this example? In other words, if the file content is ASCII text we should use the String approach, and if the content is binary we should use the Stream approach?

Regards,

@KenshoTsuchihashi Can we append empty spaces as well?

For instance

1234Apple98765<2 empty-spaces>5612<15 empty-spaces>

1234Apple98765 5612

Can you please provide steps to implement something like the above?
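For reference, fixed-width fields with trailing spaces like that can be produced in a Java step with String.format, where "%-Ns" left-justifies and pads with spaces to width N. A plain-Java sketch; the field widths below are chosen to match the example and are illustrative only:

```java
// Sketch: pad each field to a fixed width with trailing spaces.
// Widths: 4 for "1234", 5 for "Apple", 7 for "98765" (adds 2 spaces),
// 19 for "5612" (adds 15 spaces). Widths are illustrative assumptions.
String record = String.format("%-4s%-5s%-7s%-19s", "1234", "Apple", "98765", "5612");
// record is "1234Apple98765", two spaces, "5612", then 15 trailing spaces.
```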

@KenshoTsuchihashi Really Helpful.

@KenshoTsuchihashi When we try to use D_pxGetFile[repositoryName:param.SourceRepo, filePath:param.StartingPath, responseType:"stream"] with stream as the parameter,

the pyStream property does not hold any data in the Clipboard or Tracer. Can you please let me know how we can view the file contents when we invoke the repository as a Stream, and how the same contents can be written to another repository as a new file?

@CloeW938

No, that is not correct. You can use the Stream approach for this example, too. Below is the code. The only difference is Step 4, where I inserted a Java step. (I also changed the StringBuffer-Append method in Step 2 to Property-Set, but if you want to keep StringBuffer-Append, convert the StringBuffer to a String in the Java step.)

// Convert the CSV String to an InputStream and set it on pyStream
java.io.InputStream inputStream = new java.io.ByteArrayInputStream(CSVStrings.getBytes(java.nio.charset.StandardCharsets.UTF_8));
tools.getStepPage().getProperty("pyStream").setValue(inputStream);

To answer your question: you can use either the String approach or the Stream approach for any file, regardless of its contents. String or Stream is just the means of passing the file to the Repository API; it is irrelevant whether the file is ASCII text or binary. I would recommend the Stream approach for large files. Hope this clarifies your question.
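The conversion in the Java step above can be checked in isolation with a plain-Java round trip (String to InputStream and back), leaving out the Pega-specific write to the pyStream property:

```java
// Round-trip sketch: the same String-to-InputStream conversion as the
// Java step above, then read back to confirm the bytes survive intact.
String csvStrings = "pyID,pyLabel\r\nW-1,Order request\r\n";
java.io.InputStream inputStream = new java.io.ByteArrayInputStream(
        csvStrings.getBytes(java.nio.charset.StandardCharsets.UTF_8));

String roundTripped;
try {
    byte[] bytes = inputStream.readAllBytes();   // Java 9+
    roundTripped = new String(bytes, java.nio.charset.StandardCharsets.UTF_8);
} catch (java.io.IOException e) {
    throw new RuntimeException(e);               // cannot occur for an in-memory stream
}
```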

Thanks,

@KenshoTsuchihashi We are planning to write more than 10k rows to a CSV file and push it to a repository. Will this approach work for high-volume data, or are there other alternatives for high-volume data?