Offline Pega simulations

Hello,

We are conducting an A/B test to determine whether adding web data to our existing Pega models, which generate propensity scores, increases the likelihood that a user will click on a banner. The A/B test consists of:

• Test A (Baseline Model): Based on Pega propensity data alone.
• Test B (Enhanced Model): Based on Pega propensity data combined with web data.

The objective is to test whether the inclusion of web data positively impacts the propensity scores generated by the models.

We run these simulations in our own information factory (Databricks) using a gradient boosting algorithm, as we considered this the best approach.

Specific Questions

Feasibility of Using Propensity Scores Alone:
• Is it feasible and valid to reduce the training data to just the propensity scores and use them along with the web data, instead of using the full training dataset (predictors as features)?
• Would this approach provide reliable results, or is it better to use the complete training data (predictors) along with the web data?

Simulation Execution:
• What is the best practice for running these simulations within or outside the Pega environment?
• Are there recommended tools or methods within Pega or outside of Pega to efficiently test and compare these models?
• How can we effectively monitor and measure the impact of adding web data on the propensity scores and overall model performance?

Best Practices and Recommendations:
• Based on your experience, what are the best practices for integrating additional data sources (like web data) into Pega models?
• Are there specific configuration settings or advanced features in Pega that we should leverage to optimize our models for this scenario?

I hope that the above questions are clear.

Thanks in advance!

@jonathang16893510

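Feasibility of Using Propensity Scores Alone: It is feasible to use just the propensity scores along with the web data, since the score then acts as a single summary feature of the original predictors. However, a model trained this way cannot learn interactions between the web data and the individual predictors, and the reliability of the results also depends on the quality and relevance of the web data. Using the complete training data along with the web data is likely to give a more comprehensive model.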

Simulation Execution: Pega Customer Decision Hub provides a robust environment for running simulations and testing models. It includes features for monitoring and measuring the impact of adding new data on the propensity scores and overall model performance.
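For an offline comparison outside Pega (for example in a Databricks notebook), one common way to measure the impact is to score the same holdout set with both models and compare a ranking metric such as AUC. The following is only a minimal pure-Python sketch under stated assumptions: the function name `roc_auc`, the click labels, and both score lists are hypothetical illustration data, not output from Pega or from the models described in the question.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via the rank-sum (Mann-Whitney U) formulation.

    labels: 1 = banner clicked, 0 = not clicked.
    scores: model propensities for the same rows.
    Tied scores receive the average of their ranks.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one click and one non-click")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the end of the current group of tied scores.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


# Hypothetical holdout data: the same eight impressions scored by both models.
clicks = [0, 0, 1, 0, 1, 1, 0, 1]
baseline_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.30, 0.50, 0.70]  # model A
enhanced_scores = [0.10, 0.30, 0.35, 0.20, 0.85, 0.60, 0.40, 0.75]  # model B

auc_a = roc_auc(clicks, baseline_scores)
auc_b = roc_auc(clicks, enhanced_scores)
print(f"baseline AUC={auc_a:.4f} enhanced AUC={auc_b:.4f} uplift={auc_b - auc_a:.4f}")
```

Both models must be evaluated on identical holdout rows that neither saw during training; otherwise the difference in AUC cannot be attributed to the web data. On a real dataset the difference should also be checked for statistical significance (for example with a bootstrap over the holdout rows) before drawing conclusions.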

Best Practices and Recommendations: When integrating additional data sources into Pega models, it’s important to ensure the data is relevant and of high quality. Pega Customer Decision Hub provides features for adding new predictor fields and simulating customer interactions, which can be used to optimize models for this scenario.

:warning: This is a GenAI-powered tool. All generated answers require validation against the provided references.

Using behavioral data as predictors > Scenario

Adding predictors to an adaptive model > 1 Inspect the prediction configuration

Monitoring adaptive models > 1 Inspect the adaptive models

Exploring Prediction Studio > 1 Inspect the out-of-the-box Customer Decision Hub predictions