Generating a Random Sample in PEGA

As per our business requirement, we need to generate a random sample of, say, X% of the total data (present in a Data Type) and use it for a process in a Case flow. We are on vanilla PEGA (non-CDH). I know we can generate a random number and use that for sampling, but is there any OOTB rule or feature in PEGA that supports this? I have heard that CDH has this kind of feature, but I have not been able to find it. We do have a CDH license for the client (for some other applications), so if a CDH feature already does this, we should be able to implement it as well.

Looking forward to advice from this expert circle at the earliest.

@AVIKCEMK2

In a non-CDH Pega application, there may not be a dedicated OOTB ‘random sampling’ feature for selecting X% of records from a Data Type. However, you can still achieve the requirement by using a Strategy rule in combination with a Data Flow.

Since Strategy rules are available under Records > Decision > Strategy, you can invoke them from a Data Flow even in a non-CDH setup. The Data Flow can read records from the Data Set, pass them through the Strategy, and then apply the selection logic needed for sampling. In that sense, the solution is not a built-in sampling feature, but rather a valid Pega pattern to implement the requirement.

If the business expects a true random subset, the Strategy could also include the logic needed to rank, score or filter records so that the desired percentage is selected consistently during execution. So while Pega may not provide a direct sampling rule for this use case, the combination of Data Flow and Strategy gives you a workable and extensible approach.
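To make the rank-and-select idea concrete, here is a minimal sketch in plain Python. It is not Pega Strategy configuration, and the function name `sample_by_rank` and the record list are hypothetical, used only for illustration. Each record gets a random score, the records are sorted by that score, and the top X% are kept, so the result has a fixed size:

```python
import math
import random

def sample_by_rank(records, percent, seed=None):
    """Return ceil(len(records) * percent / 100) records chosen at random."""
    rng = random.Random(seed)
    # Pair each record with an independent random score; the index breaks
    # ties so the records themselves are never compared during sorting.
    scored = [(rng.random(), i, rec) for i, rec in enumerate(records)]
    scored.sort()
    keep = math.ceil(len(records) * percent / 100)
    return [rec for _, _, rec in scored[:keep]]

if __name__ == "__main__":
    customers = [f"CUST-{n}" for n in range(1000)]  # hypothetical records
    sample = sample_by_rank(customers, 10, seed=42)
    print(len(sample))
```

The trade-off is that ranking needs the whole data set to be scored before any record can be selected, whereas a per-record filter can decide on each record as it streams through.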

Hi @AVIKCEMK2, the feature in CDH that does something like that is called Data Migration. Specifically, inbound and outbound sampling in Data Migration is what you may be after. You can read more about it here. But it also does nothing more than random sampling based on a generated random number.

@kamag @RaviChandra Thanks for responding; I get the intent.

However, from the steps mentioned in (Pegasystems Documentation), I am not able to understand how using Random() in a Data Flow filter condition will let me achieve random sampling here. Can someone please explain that?

@AVIKCEMK2
You can find the Java class documentation for the Random class below.

The Random() condition in the data flow is evaluated row by row. So if the filter uses a random threshold, each record gets an independent chance of passing the filter, which gives you statistical random sampling.

This means the output is usually close to X%, but not always an exact fixed count.
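A minimal sketch of that per-record behaviour, written in plain Python rather than as a Data Flow filter expression (the function name `bernoulli_sample` and the record list are hypothetical): each record draws its own random number in [0, 1) and passes only if the number falls below the X% threshold.

```python
import random

def bernoulli_sample(records, percent, seed=None):
    """Keep each record independently with probability percent / 100."""
    rng = random.Random(seed)
    threshold = percent / 100.0
    # One independent draw per record, mirroring a row-by-row filter.
    return [rec for rec in records if rng.random() < threshold]

if __name__ == "__main__":
    rows = list(range(10_000))  # hypothetical records
    for run in range(3):
        # No seed, so the sample size differs slightly from run to run.
        print(len(bernoulli_sample(rows, 10)))
```

With 10,000 records and a 10% threshold, the sample size lands around 1,000 and moves by a few dozen between runs. If the business needs an exact count, a per-record filter alone will not guarantee it.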

Makes sense. I will try to leverage this idea.

Be careful with this. The features you are referring to are in the decisioning rulesets, which are covered by licensing restrictions. This is not the forum for those types of discussions; you should speak with your account team. Here are some documentation references.

What you’re looking for in these are clauses like:

"Decision Management is part of Pega Platform, but requires additional licenses."

Platform documentation: Pegasystems Documentation

Decision Management documentation: Pegasystems Documentation

Process AI documentation: Pegasystems Documentation

There are also statements about this in Pega Academy courses like this one: Decision Management overview | Pega Academy