RUTA Script for UETR

I am trying to extract the UETR from email text (NLP). Below is the script, which is not working:

DECLARE EntityType;

CW{REGEXP("[a-zA-Z0-9-]{8}[a-zA-Z0-9-]{4}[a-zA-Z0-9-]{4}[a-zA-Z0-9-]{4}[a-zA-Z0-9]{12}") -> MARK(EntityType,1,36)};

The script below, which should match the alphanumeric part, is not working either:

CW{REGEXP("[a-zA-Z0-9-]{8}") -> MARK(EntityType,1,9)};

@Arun Sankar J To extract the UETR from an email using RUTA, your regular expression has to match the UETR format exactly, and the rule syntax has to follow RUTA's requirements. The UETR typically follows the UUID format, which is 36 characters long including hyphens. Your first script almost captures this, but it allows a hyphen anywhere inside each group instead of requiring hyphens between the groups. The UETR pattern is usually [a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}. Update your script to include the hyphens explicitly, and decide whether uppercase letters should be accepted. For the alphanumeric script, drop the hyphen from the character class so that {8} matches exactly 8 alphanumeric characters, unless hyphens are part of the expected pattern. Here is an updated version:

DECLARE EntityType;

CW{REGEXP("[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}") -> MARK(EntityType, 1, 36)};

CW{REGEXP("[a-zA-Z0-9]{8}") -> MARK(EntityType, 1, 8)};

Make sure your email text is properly tokenized and that the scripts are applied in the correct order. Test the regex patterns separately to verify they match the intended UETR and alphanumeric strings. This should help your script correctly identify and extract the UETR from the email content.
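One way to test the pattern on its own, outside RUTA, is a few lines of Python. This is only a sketch: the sample values and the helper name is_uetr are hypothetical, and it assumes the pattern behaves the same in Python's re module as in the Java regex engine that RUTA runs on, which should hold for simple character classes and counts like these.

```python
import re

# Hex-only UUID-style pattern from the answer above.
UETR_PATTERN = re.compile(
    r"[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}"
)


def is_uetr(candidate: str) -> bool:
    """Return True if the whole string has the 8-4-4-4-12 hex layout."""
    return UETR_PATTERN.fullmatch(candidate) is not None


# Hypothetical sample values, not taken from real payments.
samples = [
    "123e4567-e89b-42d3-a456-426614174000",  # well-formed
    "123e4567e89b42d3a456426614174000",      # hyphens missing
    "123e4567-e89b-42d3-a456",               # truncated
]

for sample in samples:
    print(sample, is_uetr(sample))
```

A pattern that passes this check can still fail inside a RUTA rule if the rule element it is attached to does not cover the whole 36-character string, so the check only validates the regex itself.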

@Sairohith, when I tested the RUTA script above, it was not able to identify:

UETR de2da6c9-18be-48d4-8053-867ed90a316a

But the script does recognize: YYYYZZZZ-AAAA-BBBB-CCCC-DDDDEEEEFFFF

Below is the script:

DECLARE EntityType;
CW{REGEXP("[a-zA-Z0-9-]{8}[a-zA-Z0-9-]{4}[a-zA-Z0-9-]{4}[a-zA-Z0-9]{12}") -> MARK(EntityType)};

Just got the solution from my team:

"[a-zA-Z0-9]{8}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{12}" -> EntityType;

It is working fine.
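As far as I understand it, the difference is that this rule form applies the regular expression to the document text directly, while CW{REGEXP(...)} only checks the text covered by a single CW token, which is never a full 36-character UETR. The Python sketch below imitates the text-level scan with the same pattern. The email sentence and the helper name find_uetrs are made up for illustration, and it assumes Python's re module treats this pattern the same way as the Java regex engine behind RUTA.

```python
import re

# Same pattern as the working RUTA rule above.
UETR_IN_TEXT = re.compile(
    r"[a-zA-Z0-9]{8}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{12}"
)


def find_uetrs(text: str) -> list:
    """Scan the whole text and return every substring matching the pattern."""
    return [match.group(0) for match in UETR_IN_TEXT.finditer(text)]


# Made-up email sentence around the UETR value quoted earlier in the thread.
email_body = (
    "Please trace payment UETR de2da6c9-18be-48d4-8053-867ed90a316a sent yesterday."
)

print(find_uetrs(email_body))
```

Because [a-zA-Z0-9] is wider than hexadecimal, the pattern also accepts placeholder strings such as YYYYZZZZ-AAAA-BBBB-CCCC-DDDDEEEEFFFF. Narrowing the classes to [a-fA-F0-9] would reject those.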