How to set the number of Partitions in Batch Outbound Scheduler Dataflow of Customer Decision Hub?

Hi,

We need to adjust the number of partitions of the batch outbound scheduler dataflow on Customer Decision Hub from the default 99.

We have 6 batch nodes, each with 7 threads. During our daily scheduled run, processing slows down considerably once some nodes finish their partitions: only a few slow nodes are left working through the remaining partitions, which could be processed much faster if the idle nodes were able to pick them up.

I have changed the MKTSegPartitionCount and MKTEmailPartitionCount dynamic system settings and fully restarted the server, but it is still not working.

Can anyone tell me which configuration I should change for this?

Thank you for your help.

To adjust the number of partitions for the batch outbound scheduler dataflow in Pega Customer Decision Hub, create and configure a dynamic system setting:

1. In the header, click Create > SysAdmin > Dynamic System Settings.
2. On the New tab, in the Owning Ruleset field, enter: Pega-Engine.
3. In the Setting Purpose field, enter: prconfig/dsm/services/stream/pyTopicPartitionsCount/default.
4. In the Value field, enter the new global default partition count per topic.
5. Save the changes and restart the server.

Please note that the new setting is applied only to newly created partitions.

:warning: This is a GenAI-powered answer. All generated answers require validation against the provided references.

Changing the default number of partitions per topic

Managing Concurrent Campaign Data Flow Runs

==> @malaa1 @nairv1 please could you provide your input here?

@ClarissaL16661030

To follow up on this question: I've got the solution from GCS.

It can be addressed by updating the partition count using the following formula:

Number of partitions = 3 × number of threads (Infrastructure > Services) × number of batch nodes
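As a quick sanity check, the formula can be sketched in a few lines of Python (the function name and multiplier default are illustrative, not part of any Pega API); for the setup described in this thread, 6 batch nodes with 7 threads each:

```python
def recommended_partitions(threads_per_node: int, batch_nodes: int, multiplier: int = 3) -> int:
    """Partition count per the GCS formula: multiplier * threads * nodes."""
    return multiplier * threads_per_node * batch_nodes

# Setup from this thread: 6 batch nodes, 7 threads each
print(recommended_partitions(7, 6))  # 3 * 7 * 6 = 126
```

This 126 would be the value to enter in the MKTSegPartitionCount dynamic system setting for that cluster size.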

The SysAdmin > Dynamic System Settings > MKTSegPartitionCount setting has to be updated as per the above formula.

If the updated value is not taking effect, it could be because the Refresh audience before each campaign run option is not configured. You can also manually refresh the segment configured on the campaign and, during the campaign run, verify that the partitions on the dataflow were updated.
Try the above before increasing the number of threads or batch nodes.

Clarissa

@MarijeSchillern Hi Marije, thank you so much for your response. But there is another way: changing the MKTSegPartitionCount DSS and refreshing the segment, which is easier to implement in this case.

@ClarissaL16661030 I understand that the factor of 3 is to distribute the load evenly, but do we know why it is "3"? Shouldn't the number of partitions be a multiple of (number of nodes × number of threads)? In your case, a multiple of 42, like 84 or 126?

@DeepakRaghulR16785688 Hi Deepak, honestly I don't know the details of why it should be 3, but I believe 3 is just the recommended baseline to make sure we don't over-partition our segment.

I think we can configure the multiplier even higher than 3, depending on the amount of data processed, because in my case the speed of my batch nodes varies. Sometimes one node takes a long time to finish while all of the other nodes have already completed their work. In that case, increasing the number of partitions helps ensure that the load is evenly distributed and no nodes sit idle while the others are 'struggling'.
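The intuition above (finer partitions shorten the tail when partition work is uneven) can be illustrated with a toy scheduling simulation. This is not Pega's actual partition assignment, just a generic greedy model with made-up workload numbers:

```python
import heapq

def makespan(partition_times, num_nodes):
    """Greedy longest-processing-time assignment of partitions to nodes;
    returns the finish time of the slowest node (the run's wall-clock time)."""
    loads = [0] * num_nodes
    heapq.heapify(loads)
    for t in sorted(partition_times, reverse=True):
        least_loaded = heapq.heappop(loads)
        heapq.heappush(loads, least_loaded + t)
    return max(loads)

nodes = 6
# Same total work (320 units), one "heavy" region of the segment:
coarse = [100] + [20] * 11            # 12 partitions, one very large
fine = [25] * 4 + [5] * 44            # same work split into 48 smaller partitions

print(makespan(coarse, nodes))        # 100 -> run waits on the node with the big partition
print(makespan(fine, nodes))          # 55  -> load spreads out, run finishes much sooner
```

With coarse partitions the whole run waits on the one node holding the 100-unit partition; with finer partitions the same work finishes in roughly half the time, which matches the "struggling node" behavior described above.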

Just try configuring different numbers of partitions and compare the performance.