Case ID generation mechanism

Hi,

In this post, I explain how case IDs are generated and how the mechanism can be customized.

First of all, a case ID consists of three parts, [prefix]-[integer]-[suffix], as shown below.

No  Example      Prefix  Sequence #  Suffix
1   C-100        C-      100         (none)
2   15378-DR     (none)  15378       -DR
3   MORT-763-K4  MORT-   763         -K4

You can use a prefix, a suffix, or both along with the sequence #. The suffix is blank by default, and in my experience most customers don’t use it, but you can if you prefer. The prefix is added to your application rule automatically when you define a case type (by default, it is the initial letter of the case type).
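To make the three-part structure concrete, here is a minimal Python sketch that rebuilds the three example IDs from the table above. The function name build_case_id is made up for illustration; it is not a Pega API.

```python
def build_case_id(sequence: int, prefix: str = "", suffix: str = "") -> str:
    """Concatenate prefix, sequence number, and suffix into one case ID.

    Hypothetical helper for illustration only (not part of Pega).
    Prefix and suffix carry their own hyphen, as in the examples.
    """
    return f"{prefix}{sequence}{suffix}"


print(build_case_id(100, prefix="C-"))                   # C-100
print(build_case_id(15378, suffix="-DR"))                # 15378-DR
print(build_case_id(763, prefix="MORT-", suffix="-K4"))  # MORT-763-K4
```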

1. How to customize it

There are three approaches to changing it. Try them in the order 1, 2, 3: if what you want is not possible with one approach, move on to the next. FYI, if you configure all three, the precedence is 3 > 1 > 2.

No  Approach                   Comments
1   Application rule           You can update the prefix in the application rule. Only the prefix can be changed; the suffix can’t be configured. You can only enter a static string value (expressions are not allowed).
2   pyDefault data transform   You can set .pyWorkIDPrefix (prefix) and .pyWorkIDSuffix (suffix) in the pyDefault data transform. You can use an expression to set a dynamic value.
3   Work-.GenerateID activity  You can override Work-.GenerateID for greater flexibility. The prefix, suffix, or sequence # (*) can be customized.
  • (*) In my experience, very few customers want to customize the sequence # generation logic. Although it is technically possible, I would not recommend it because it is risky and may cause unexpected issues. For example, one of my customers couldn’t use the Package Work wizard, which migrates work objects from one environment to another, because they had customized the sequence # and the wizard does not support that. If you really need to do this, make sure the ID is always unique and that there is no performance problem in a multi-node environment.
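The precedence rule (3 > 1 > 2) can be pictured as "first value that is set wins". The sketch below is only an illustration of that ordering in Python; the function and parameter names are hypothetical, and it does not describe how Pega resolves the value internally.

```python
from typing import Optional


def resolve_prefix(generate_id_override: Optional[str],
                   application_rule: Optional[str],
                   py_default: Optional[str]) -> Optional[str]:
    """Return the first value that is set, in the order 3 > 1 > 2.

    Hypothetical model: 3 = Work-.GenerateID override,
    1 = application rule, 2 = pyDefault data transform.
    """
    for value in (generate_id_override, application_rule, py_default):
        if value is not None:
            return value
    return None


# All three configured: the GenerateID override (approach 3) wins.
print(resolve_prefix("Z-", "C-", "X-"))   # Z-
# Only approaches 1 and 2 configured: the application rule wins.
print(resolve_prefix(None, "C-", "X-"))   # C-
```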

2. Sequence # generation mechanism

The mechanism changed in 8.3.1. Let me explain the old and new mechanisms and why the change was made. Note that the new mechanism was introduced only for PostgreSQL and is not available for other databases for now. However, we are planning to extend it to Oracle and Microsoft SQL Server from 8.7. For DB2, we don’t have any plans at the moment.

(1) Old mechanism (up to Pega 8.2)

The latest ID is maintained in a database table (PC_DATA_UNIQUEID) per case type. Every time a case is created, the system calls Work-.GenerateID, which queries and updates the value in the table. The ID is incremented by one and returned to the app node.

(2) New mechanism (Pega 8.3.1 and later)
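As a rough mental model of the old mechanism, here is a small Python simulation. It is not Pega code: the class and method names are made up, the table is an in-memory dict keyed by prefix (my assumption for "per case type"), and a lock stands in for the shared database row. The point is simply that every case costs one round trip.

```python
import threading


class UniqueIdTable:
    """Stand-in for PC_DATA_UNIQUEID: holds the latest ID per prefix."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()  # models the row shared by all nodes
        self.round_trips = 0           # how many times the "database" was hit

    def next_id(self, prefix):
        """Read, increment, and write back the latest ID in one step."""
        with self._lock:
            self.round_trips += 1
            latest = self._rows.get(prefix, 0) + 1
            self._rows[prefix] = latest
            return latest


table = UniqueIdTable()
ids = [table.next_id("C-") for _ in range(5)]
print(ids)                # [1, 2, 3, 4, 5]
print(table.round_trips)  # 5 (one database hit per case)
```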

The latest ID is maintained in the app node. The database table (PC_DATA_UNIQUEID) is still used, but it only tracks which chunk of IDs has been handed out (the chunk size is called the “batch size”). For example, when the very first case is created, the app node queries the table. Since there is no entry yet, the system updates it to 1000 (the default batch size), and the app node is assigned the range 1-1000. Hence the first case’s sequence ID becomes 1. When the next case is created on the same app node, it doesn’t hit the database at all: 2 is assigned immediately, because the latest ID is maintained in the app node (not in the database). This continues until the app node either exhausts its assigned range (1-1000) or restarts.

Note that this happens per app node. For example, if app node #2 comes up, it is assigned the next chunk (1001-2000). So even if the latest ID on app node #1 is still in the middle of its range, app node #2 starts with 1001 regardless.
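The batch behavior described above can be sketched in Python as follows. Again, this is only a simulation under my own assumptions, not Pega’s implementation: UniqueIdTable, AppNode, reserve_batch, and restart are hypothetical names, and the table is an in-memory dict keyed by prefix.

```python
class UniqueIdTable:
    """Stand-in for PC_DATA_UNIQUEID: stores the last reserved ID per prefix."""

    def __init__(self, batch_size=1000):
        self.batch_size = batch_size
        self.last_reserved = {}
        self.round_trips = 0  # how many times the "database" was hit

    def reserve_batch(self, prefix):
        """Reserve the next chunk of IDs and return it as (first, last)."""
        self.round_trips += 1
        first = self.last_reserved.get(prefix, 0) + 1
        last = first + self.batch_size - 1
        self.last_reserved[prefix] = last
        return first, last


class AppNode:
    """Hands out IDs from its in-memory range; asks the table only when needed."""

    def __init__(self, table):
        self.table = table
        self.ranges = {}  # prefix -> [next_id, last_id]; lost on restart

    def next_id(self, prefix):
        current = self.ranges.get(prefix)
        if current is None or current[0] > current[1]:
            # No range yet, or range exhausted: reserve a new chunk.
            current = list(self.table.reserve_batch(prefix))
            self.ranges[prefix] = current
        value = current[0]
        current[0] += 1
        return value

    def restart(self):
        """A restart forgets the in-memory range; unused IDs are skipped."""
        self.ranges.clear()


table = UniqueIdTable()
node1, node2 = AppNode(table), AppNode(table)
print(node1.next_id("C-"))  # 1
print(node1.next_id("C-"))  # 2
print(node2.next_id("C-"))  # 1001
node1.restart()
print(node1.next_id("C-"))  # 2001
print(table.round_trips)    # 3
```

In this model, four cases cost only three database hits, and all three were batch reservations. The price is the gaps: 3-1000 on node 1 are never used after the restart.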

* Why it was changed

The old mechanism relied heavily on the database, and performance was poor because communication between the app node and the database is costly. In fact, ID generation accounted for about half of the case creation time, so bottlenecks were sometimes reported in high-load environments. More importantly, in a multi-node environment, cases can be created on several nodes at the same time, and this could cause contention because the row is shared by all nodes. The new mechanism reduces the number of round trips between the app node and the database and improves performance. ID generation now takes less than 5% of the case creation time.

* Business impact

As a side effect of the new mechanism, the sequence ID now jumps around between nodes and every time you restart the system. Prior to 8.3, case IDs were pretty much sequential: 1, 2, 3, 4, 5, etc. Now they go like 1, 1001, 2, 2001, 3001, 3, 4, 3002, etc. This is purely a consequence of the technical design. However, for some customers or types of business, the sequence # matters. So I would recommend discussing the trade-off with your business people: if the sequence # is more important than performance, you can change the batch size; if gaps in the sequence # don’t bother you, you can keep the default.

* Batch size update

The default value is 1000, but you can change it with a Dynamic System Setting. For example, if you set it to 1, the system behaves like the old version. Note that performance gets slower in that case.

  • Dynamic System Settings: idGenerator/defaultBatchSize
  • Owning ruleset: Pega-RULES

  1. If you want the batch size to differ per case type, you can create another Dynamic System Setting (e.g. idGenerator/P-/batchSize). Note that this is case sensitive: if you type “BatchSize” instead of “batchSize”, it won’t take effect.
  2. If you plan to change the batch size, I would advise you to do so before you create the first case. If you update the Dynamic System Setting after a case has been created, pyLastReservedID has already been set to 1000, so when the app node is restarted or a user connects to another node, the system creates 1001 regardless of the value in the Dynamic System Setting. The change takes effect one step behind, and the customer may feel it is not reflected immediately. Usually this is fine, because the Dynamic System Setting is included in the R-A-P and imported into the production environment before the very first case is created. But if you care about the sequence ID even in Dev, I would suggest making the change before defining a case type to avoid unnecessary confusion.
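The two notes above can be illustrated with a small Python sketch. Please treat it as a model under assumptions, not as Pega’s actual lookup logic: the helper names are made up, the per-case-type key pattern is inferred from the single example idGenerator/P-/batchSize, and the fallback order (per-prefix setting, then default setting, then 1000) is my assumption.

```python
DEFAULT_KEY = "idGenerator/defaultBatchSize"


def batch_size_for(settings, prefix):
    """Per-prefix setting if present, else the default setting, else 1000.

    The key comparison is exact, so "BatchSize" does not match "batchSize".
    """
    per_prefix_key = f"idGenerator/{prefix}/batchSize"  # assumed key pattern
    if per_prefix_key in settings:
        return settings[per_prefix_key]
    return settings.get(DEFAULT_KEY, 1000)


def reserve_batch(last_reserved, settings, prefix):
    """Reserve the next chunk; return its first ID and record its last ID."""
    first = last_reserved.get(prefix, 0) + 1
    last_reserved[prefix] = first + batch_size_for(settings, prefix) - 1
    return first


settings = {}       # no Dynamic System Settings defined yet
last_reserved = {}  # models pyLastReservedID per prefix

# The first case is created under the default batch size.
first_case = reserve_batch(last_reserved, settings, "P-")
print(first_case, last_reserved["P-"])     # 1 1000

# The batch size is changed to 1 afterwards, then the node restarts.
settings["idGenerator/P-/batchSize"] = 1
after_restart = reserve_batch(last_reserved, settings, "P-")
print(after_restart, last_reserved["P-"])  # 1001 1001

# A wrongly cased key is ignored, so the default still applies.
settings_typo = {"idGenerator/P-/BatchSize": 1}
print(batch_size_for(settings_typo, "P-"))  # 1000
```

The second reservation shows the "one step behind" effect: the new batch size is honored, but only starting from 1001, because 1-1000 had already been reserved.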

* Utilizing PC_DATA_UNIQUEID table for non-work

I have also written an article about how to use the PC_DATA_UNIQUEID table to get the next auto-increment ID for non-work objects (e.g. a Data Type). Please see https://collaborate.pega.com/discussion/how-get-next-auto-increment-id-pega.

Thanks,

Awesome post and good insights on what’s changed on the engine side. Please keep posting similar posts. I doubt even Pega help provides the insights you provide.

Thanks a lot.

Great insights on Case ID generation @BraamCLSA.

Excellent Post!!

Fantastic explanation of Case ID generation in 8.3 and later. I now understand the bottleneck and the process improvement in the product. Nice post.

Thanks,

Ravi Kumar Pisupati.

Excellent Post and thank you!

@KenshoTsuchihashi How can I generate case IDs starting from 1000000? For example, C-1000000, then C-1000001…

@KenshoTsuchihashi Very well explained, Kensho-san.

Awesome explanation!!!

It’s a very useful article. I have one question.

Say app node 1 is assigned 1-1000 and app node 2 is assigned 1001-2000. Each node processes 500 cases, and then I restart my app server nodes.

Will the nodes move to the next batches, i.e. 2001-3000 on node 1 and 3001-4000 on node 2,

or will they keep using the old batches since they are not exhausted?

Please explain.

@KenshoTsuchihashi Excellent!! I had been looking for this answer for the last few months and finally found it here. Thanks a lot.

@KenshoTsuchihashi Thank you for this post

Very useful post and great explanation.

Hi @KenshoTsuchihashi ,

This was really a great explanation…!! Thank you..!!

Looking forward to an update that addresses the sequence issue we have with this new mechanism.

Thanks.

Maybe it’s time to move to the not so user-friendly, but blazing fast and reliable, UUID?

@KenshoTsuchihashi

The contention issue is resolved, which is a great achievement, I would say. But the sequence is also very important for some customers.

I think we should think of a solution for the sequence as well.

@KenshoTsuchihashi ,

We are on the Pega SCE framework with V8.5.1, and it uses the old approach to generate unique IDs.

The issue we are currently facing is that when multiple queue processors try to create a work object, PC_DATA_UNIQUEID gets stuck in a deadlock.

This application is in production. How can we move the unique ID generation from the old approach to the new one without impacting the current production data?

Database: AWS Aurora PostgreSQL

@KenshoTsuchihashi Thank you for the detailed explanation.

@KenshoTsuchihashi Critical information. Thanks for sharing. But I have serious doubts about this design change and the motivation behind it. I’m not buying the new ID generation mechanism, especially in a cloud architecture where pods are created and deleted very often.