Historical Adaptive Model Data Logging

CheriGaudet · November 20, 2020, 5:59pm

I have a couple questions regarding logging historical adaptive model data.

When I first learned of this feature, I assumed that it was intended to help the business understand how its models are performing over time. As I dig deeper into the documentation, however, I see a feature that makes me question my assumption. The documentation suggests that customers can specify the percentage of positive and negative responses that they want to record. What is the purpose of only logging a percentage of responses? If a customer wanted to capture 100% of positive and negative responses, what time-to-live would you recommend for the JSON file?

Second, regarding the JSON file repository, can this be provided by Pega Cloud (for a Cloud implementation) or should the customer provide it?

Otto_Perdeck · November 20, 2020, 6:45pm

Hi Cheri

To see how models perform over time, you don’t need this feature. That view is available in adaptive models and in predictions already, and is based on storing snapshots of model data in the datamart tables at regular intervals.

The historical dataset feature is mainly meant to give customers (and their data science teams) insight into the actual data that drives the models (at the moment the feature is only available for adaptive models). Actual data often includes contextual and session data that is not always available in the data lakes outside of Pega, so it can help people better understand what makes the models tick. With the data they could even create challenger models, or build group/issue level models that could be deployed along with the regular adaptive models.

As for your time-to-live question - I can’t give a generic answer. It depends on the goal I guess. But the amount of storage will quickly become significant: for every response to every model, the data of all the predictors is stored. That is also why we provide a sampling mechanism - so you can span a larger time frame. The separate percentages for positives and negatives make sense because typically the “success rates” are small - depending on the channel perhaps only a few percent, for email often even way lower. In the analysis you will probably be more interested in the data from that 1% that did click than in the 99% that didn’t. Of course, if you use the data to build external models you’ll need to compensate for this sampling bias.
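
To make the sampling-bias point concrete, here is a minimal sketch of one common way to compensate: weight each logged record by the inverse of its sampling rate. This is not Pega functionality; the function names, the 100%/10% settings and the record counts are all hypothetical, chosen only for illustration.

```python
def sampling_weights(positive_pct, negative_pct):
    """Per-outcome weights that undo outcome-dependent sampling.

    positive_pct / negative_pct are the (assumed) percentages of positive
    and negative responses that were written to the historical dataset.
    """
    if not (0 < positive_pct <= 100 and 0 < negative_pct <= 100):
        raise ValueError("percentages must be in the range (0, 100]")
    return {True: 100.0 / positive_pct, False: 100.0 / negative_pct}


def corrected_success_rate(n_pos_logged, n_neg_logged, positive_pct, negative_pct):
    """Estimate the true success rate from sampled response counts."""
    w = sampling_weights(positive_pct, negative_pct)
    est_pos = n_pos_logged * w[True]   # estimated positives before sampling
    est_neg = n_neg_logged * w[False]  # estimated negatives before sampling
    return est_pos / (est_pos + est_neg)


# Hypothetical numbers: all positives logged, only 10% of negatives logged.
logged_pos, logged_neg = 1_000, 9_900

# Rate as it appears in the sampled file (inflated by the sampling).
naive_rate = logged_pos / (logged_pos + logged_neg)

# Rate after weighting each record by 1 / sampling rate.
corrected_rate = corrected_success_rate(logged_pos, logged_neg, 100, 10)

print(f"naive: {naive_rate:.4f}, corrected: {corrected_rate:.4f}")
```

The same weights can be passed as per-record sample weights when training an external model, so that the model's predicted propensities are on the original scale rather than the sampled one.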

Regards

-Otto
