RUTA script for extracting a 37-digit string

Hi All,

I need help extracting a string from an incoming email. The string has one of the following forms:

Test Ref(s) : ABxxx

where xxx is a 35-digit number

OR

Test Ref(ABCD) : 37-digit alphanumeric string

Can anybody please help with the RUTA script?

Thanks in advance.

@Himalaya Sudan To extract the patterns you mentioned, you can use the following RUTA scripts.
For "Test Ref(s) : ABxxx", where xxx is a 35-digit number:

"Test Ref(s) : AB" NUM{REGEXP(".{35}") -> MARK(EntityType)};

For "Test Ref(ABCD) : " followed by a 37-digit alphanumeric string:

"Test Ref(ABCD) : " (W|NUM){REGEXP(".{37}") -> MARK(EntityType)};

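Replace EntityType with the actual entity type you are using. As written, the MARK action is attached to the last rule element only, so it should annotate just the number or alphanumeric string and not the prefix. If you want the entity to include the prefix as well, adjust the MARK action accordingly, for example by passing the indexes of the rule elements to cover.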
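Before encoding the two formats in RUTA, it can help to check them against sample email text outside the annotator. The following is a minimal sketch in Python using the standard `re` module, not RUTA. It assumes that "ABxxx" means the literal letters AB followed by exactly 35 digits, and that the second form is exactly 37 letters or digits. The names `REF_S`, `REF_ABCD`, and `extract_refs` are hypothetical and only used for illustration.

```python
import re

# Assumption: the literal letters "AB" followed by exactly 35 digits.
REF_S = re.compile(r"Test Ref\(s\)\s*:\s*(AB\d{35})(?!\d)")

# Assumption: exactly 37 letters or digits after "Test Ref(ABCD) :".
REF_ABCD = re.compile(r"Test Ref\(ABCD\)\s*:\s*([A-Za-z0-9]{37})(?![A-Za-z0-9])")


def extract_refs(text):
    """Return the reference IDs found in text, without the 'Test Ref' prefix."""
    found = []
    for pattern in (REF_S, REF_ABCD):
        # group(1) is only the ID; the prefix is matched but not captured.
        found.extend(m.group(1) for m in pattern.finditer(text))
    return found


# Synthetic sample text, not taken from a real email.
sample = "Test Ref(s) : AB" + "1" * 35 + "\nTest Ref(ABCD) : " + "A1" * 18 + "Z"
print(extract_refs(sample))
```

The negative lookaheads reject IDs that are longer than expected, which is the same length check the `REGEXP` conditions in the RUTA rules are meant to enforce.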

Warning: This is a GenAI-powered tool. All generated answers require validation against the provided references.

NLP With Ruta Script

Detecting transaction details with Ruta > Detecting alphanumeric strings with a fixed number of characters

Creating entity extraction model using Ruta script > Scenario

Best practices for pattern extraction in text analytics

Can you confirm you’ve checked the available resources?

Modifying Apache Ruta scripts to extract custom structured entities

Detecting transaction details with Ruta

Creating entity extraction model using Ruta script