RUTA does not seem to support special characters or Chinese characters

We're trying to extract fields from an email subject like the one below:

Tom Cruise (汤姆克鲁斯)'s Overtime Application [OTA2026030001] is waiting for your approval

Expected fields:

applicant name: Tom Cruise (汤姆克鲁斯)

application name: Overtime Application

document number: OTA2026030001
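
To make the expected result concrete, here is a minimal sketch in plain Java (not RUTA) that pulls the three fields out of the sample subject with `java.util.regex`. The class name `SubjectFields`, the method `extract`, and the pattern itself are only illustrative assumptions based on this one subject line; other subjects may need a looser pattern. The Chinese name is written with `\u` escapes so the source file stays ASCII.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SubjectFields {
    // Sample subject from this post; the escapes spell the Chinese name in parentheses.
    static final String SAMPLE =
        "Tom Cruise (\u6c64\u59c6\u514b\u9c81\u65af)'s Overtime Application "
        + "[OTA2026030001] is waiting for your approval";

    // Assumed layout: <applicant>'s <application> [<document number>] is waiting for your approval
    private static final Pattern SUBJECT = Pattern.compile(
        "^(.+?)'s (.+?) \\[([A-Z0-9]+)\\] is waiting for your approval$");

    /** Returns {applicant, application, documentNumber}, or null if the subject does not match. */
    static String[] extract(String subject) {
        Matcher m = SUBJECT.matcher(subject);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] fields = extract(SAMPLE);
        System.out.println("applicant name: " + fields[0]);
        System.out.println("application name: " + fields[1]);
        System.out.println("document number: " + fields[2]);
    }
}
```

This is the behaviour we would like to reproduce inside the RUTA script.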

The subject contains both a special character (a single quote) and Chinese characters.

So we tried to handle these characters in the RUTA script, but it does not seem to work.

We got an error saying “” when we tried the script below:
W{ REGEXP("[a-zA-Z0-9\\']+") -> MARK(EntityType) };

We also tried using the Unicode escape instead of the literal character, as below. There was no error, but it did not work either:
W{ REGEXP("[a-zA-Z0-9'']+") -> MARK(EntityType) };

For the Chinese characters, we also tried Unicode escapes. Again there was no error, but it did not work either.
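
As a sanity check outside RUTA, the sketch below runs similar character classes through plain `java.util.regex`. It assumes that RUTA's REGEXP condition follows Java regex syntax, which we have not confirmed; the class name `CharClassCheck` and the `\p{IsHan}` pattern are our own illustrative choices, not something taken from the RUTA script.

```java
import java.util.regex.Pattern;

class CharClassCheck {
    // Letters, digits and the apostrophe, with the apostrophe given as a
    // regex-level Unicode escape (the doubled backslash keeps it out of javac's hands).
    static final Pattern WORD_WITH_APOSTROPHE = Pattern.compile("[a-zA-Z0-9\\u0027]+");

    // One or more characters belonging to the Han script.
    static final Pattern HAN = Pattern.compile("\\p{IsHan}+");

    // True only when the whole text is covered by the pattern.
    static boolean fullyMatches(Pattern p, String text) {
        return p.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(fullyMatches(WORD_WITH_APOSTROPHE, "Cruise's")); // true
        System.out.println(fullyMatches(HAN, "\u6c64\u59c6\u514b\u9c81\u65af")); // true
        System.out.println(fullyMatches(HAN, "Tom")); // false
    }
}
```

In plain Java both classes match as expected, so the patterns themselves look valid there; what we cannot tell is how the RUTA script applies them.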

@VikasRaidhan Any thoughts?