I am running a dataflow sourced by a report definition, with a defined partition key.
Let’s assume the partition key can have x possible distinct values, and the dataflow will run on y nodes.
My questions are the following:
In the options configuration panel, before launching the dataflow execution, what is the most appropriate number of requestors to set in order to maximise throughput?
Is it correct to assume that (number of requestors per node) × (number of nodes) must be equal to or slightly greater than the number of possible distinct values of the partition key?
@Phil5873 To maximize throughput when running a dataflow sourced by a report definition with a partition key, set the number of requestors based on the number of nodes and the capacity of each node. The goal is to have all partitions processed in parallel without overloading the system.

Ideally, the total number of requestors across all nodes should be equal to or slightly greater than the number of distinct values of the partition key. For example, with 20 distinct values and 3 nodes that each have capacity for 4 requestors, setting 4 requestors per node (3 nodes × 4 requestors = 12 total) uses the available capacity efficiently; each requestor then processes one or two partitions in sequence.

If the number of distinct values exceeds the total number of requestors, aim for a higher requestor count (capacity permitting) to avoid bottlenecks. Conversely, if there are fewer partition values than requestors, adjust the count downward so that idle requestors do not consume resources unnecessarily. Balancing the total number of requestors against the available nodes and partition key values ensures optimal performance: full parallel processing without overwhelming system resources.
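The sizing arithmetic described above can be sketched in a few lines. This is a minimal illustration, not a Pega API; `requestors_per_node` is a hypothetical helper, and the node capacity cap is an assumption based on the example figures in the answer:

```python
import math

def requestors_per_node(distinct_partition_values: int, nodes: int,
                        max_requestors_per_node: int) -> int:
    """Suggest a per-node requestor count so that the total number of
    requestors covers the distinct partition values, capped by each
    node's capacity. Hypothetical helper for illustration only."""
    # Requestors needed per node if every partition got its own requestor.
    needed = math.ceil(distinct_partition_values / nodes)
    # Never exceed what a single node can actually run.
    return min(needed, max_requestors_per_node)

# Example from the answer: 20 distinct values, 3 nodes, capacity 4/node.
# ceil(20 / 3) = 7, capped at 4 -> 4 requestors per node (12 total),
# so each requestor processes one or two partitions.
print(requestors_per_node(20, 3, 4))
```

With fewer partitions than total capacity (say 6 distinct values on the same 3 nodes), the helper scales the count down to 2 per node, matching the advice to avoid idle requestors.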
So it’s correct to assume that the point is to balance, as closely as possible, the total number of available requestors against the number of distinct values identified by the partition key.