We are in the process of externalizing Kafka. Our internal Kafka team (using Confluent’s distribution) questions why we need a default of 6 partitions per topic/QP, versus their recommended 1.
The average message rates are in most cases rather low (<3 msg/s).
The argument for using fewer partitions is to reduce server-side resource usage.
We would like to try, at least for non-prod, using fewer than 6, let’s say 3.
Does anyone have experience to share on this “topic”?
@Henrik-Swedbank also see this question: “Does each queue processor have 20 threads, or do all queue processors share 20 threads?”
I see that you logged INC-B18366 which was closed with the following information:
To address the concerns raised by your internal Kafka team regarding the “message.max.bytes” setting and the default number of partitions per topic, it’s important to understand the rationale behind these configurations.
The “message.max.bytes” setting of 5000000 (~5 MB) in the context of Pega Platform’s stream functionality is designed to optimize performance and accommodate larger message sizes.
This setting allows for the efficient processing of messages within the Pega Platform, especially in scenarios where larger payloads are necessary.
It’s important to note that this setting aligns with the specific requirements and capabilities of the Pega Platform’s stream functionality, which may differ from the default settings in other Kafka environments.
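To put the 5000000-byte figure in perspective, here is a minimal Python sketch that converts the limit to MiB and checks whether a payload would fit. The helper name fits_limit and the sample payload sizes are made up for illustration, and the check is only approximate: it ignores record headers, batching, and compression, which also affect what a broker accepts.

```python
MESSAGE_MAX_BYTES = 5_000_000  # value quoted in the incident response

def fits_limit(payload: bytes, limit: int = MESSAGE_MAX_BYTES) -> bool:
    """Approximate check only: ignores record headers, batching and compression."""
    return len(payload) <= limit

# The limit expressed in MiB (1 MiB = 1024 * 1024 bytes)
print(round(MESSAGE_MAX_BYTES / (1024 * 1024), 2))

# Hypothetical payloads of 1 MB and 6 MB
print(fits_limit(b"x" * 1_000_000))
print(fits_limit(b"x" * 6_000_000))
```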
Regarding the default number of partitions per topic, the decision to set it to six in Pega Platform is based on achieving a balance between resource utilization and performance.
While a greater number of partitions can enable more clients to consume messages, it’s essential to consider the trade-offs in terms of resource utilization and latency.
The default setting of six partitions per topic in Pega Platform is optimized to ensure efficient data consumption throughput and concurrency while managing resource usage effectively.
In the context of your specific use case, where the average message rates are low and there is a focus on reducing server-side resource usage, it’s understandable that the Kafka COE is exploring the possibility of using fewer partitions, such as 3, for non-production environments.
This approach aligns with the goal of optimizing resource utilization based on the specific messaging patterns and requirements in those environments.
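The trade-off described above can be made concrete with some rough arithmetic. The sketch below is not taken from Pega or Confluent; it relies only on two general Kafka properties: each partition is stored once per replica, and within a consumer group a partition is read by at most one consumer at a time. The topic count, replication factor, and consumer count are assumed values chosen for illustration, and partition_footprint is a hypothetical helper.

```python
def partition_footprint(topics, partitions, replication_factor, consumers):
    """Rough numbers for a given partition count per topic."""
    # Each partition is stored once per replica across the brokers
    replicas = topics * partitions * replication_factor
    # In one consumer group, a partition is read by at most one consumer
    active = min(partitions, consumers)
    return {
        "replicas": replicas,
        "active_consumers": active,
        "idle_consumers": consumers - active,
    }

# Assumed figures: 50 topics, replication factor 3, 4 consumers per topic
for p in (1, 3, 6):
    print(p, partition_footprint(topics=50, partitions=p,
                                 replication_factor=3, consumers=4))
```

With these assumed inputs, going from 6 to 3 partitions halves the number of partition replicas the brokers host, while capping parallel consumption of a single topic at 3 consumers. At message rates below 3 msg/s that cap is unlikely to matter for throughput, but that is a judgment to verify in non-prod rather than a guarantee.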
For further details and insights, you can refer to the Pega Platform documentation: