How to set a custom Elasticsearch analyzer

Hi all,

We’d like to adjust the search results to also include terms that contain special characters, so that when users search for ‘Zoe’ they also find ‘Zoë’.

I know that Elasticsearch supports this through the asciifolding token filter (ASCII folding token filter | Reference), but I’m unclear how to get this working in combination with Pega’s search functionality.

When creating a Custom Search Property, I can set an analyzer name for that property, but how do I add a new analyzer to the index? Is there a way to get this working with OOTB features?

Kind regards
Manuel

@mrits1 is this a question you might be able to answer?

@ManuelZ8 To set a custom Elasticsearch analyzer in Pega for handling special characters like ‘Zoë’ when searching for ‘Zoe’, you can create a Custom Search Property in Pega and specify an analyzer name for the property. However, just naming the analyzer in Pega does not modify the actual Elasticsearch index settings.

If you have access to the Elasticsearch index (mostly for on-prem setups), you can create a custom analyzer using the asciifolding filter by sending a request to Elasticsearch with the analyzer and filter definitions before the index is created. For Pega Cloud, direct index modifications are restricted, so you may need to raise a support request with Pega.

Alternatively, you can use Data Transforms or Declare Expressions in Pega to preprocess the data by normalizing special characters before indexing. After making these changes, you need to perform a full re-index from Admin Studio to apply the updates to your search results.
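
To illustrate the preprocessing alternative, here is a minimal sketch in Python of what "normalizing special characters" could mean. It is not Pega code; a Data Transform or Declare Expression would have to implement equivalent logic with Pega's own functions. The function name `fold_to_ascii` is hypothetical. Note that this approach only strips combining marks, so it is a rough approximation of the asciifolding filter, which also maps characters such as ‘ß’ or ‘ø’ that have no Unicode decomposition.

```python
import unicodedata


def fold_to_ascii(text: str) -> str:
    """Strip diacritics by decomposing characters and dropping combining marks.

    Hypothetical helper for illustration only; it approximates, but does not
    replicate, Elasticsearch's asciifolding token filter.
    """
    # NFKD splits 'ë' into 'e' followed by a combining diaeresis (U+0308).
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep every character that is not a combining mark.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold_to_ascii("Zoë"))  # prints "Zoe"
```

The folded value would have to be stored in a separate property that is indexed and searched, since the original value usually still needs to be displayed unchanged.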

@Sairohith Thanks for the response! Fortunately, I do have access to Elasticsearch on-prem. There are some uncertainties on my side, though.

You mention that the analyzer and filter definitions are sent before the index is created. If I read the Elasticsearch documentation correctly, these settings are applied during index creation as part of the PUT call: Create a custom analyzer | Elastic Docs
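
For reference, a sketch of the kind of request body the create-index PUT call would carry. The analyzer name `folding_analyzer` and the helper `build_index_body` are assumptions made for this example, not names Pega uses; the `standard` tokenizer and the `lowercase` and `asciifolding` filters are built into Elasticsearch.

```python
import json


def build_index_body(analyzer_name: str = "folding_analyzer") -> dict:
    """Build the body for `PUT /<index-name>` that defines a custom analyzer.

    The analyzer name is hypothetical; it would have to match the analyzer
    name configured on the Custom Search Property.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    analyzer_name: {
                        "type": "custom",
                        "tokenizer": "standard",
                        # asciifolding maps 'ë' to 'e' at index and search time
                        "filter": ["lowercase", "asciifolding"],
                    }
                }
            }
        }
    }


# Print the JSON that would be sent as the request body.
print(json.dumps(build_index_body(), indent=2))
```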

This would imply I have to adjust how Pega creates the index, which I’m not sure is possible. Based on your description, that is also not what I think you mean.

Alternatively, I tried to add the analyzer to the index after creation using the update settings call: Update index settings | Elasticsearch API documentation
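
A sketch of that update sequence, for anyone trying the same thing. Elasticsearch only accepts new analyzers on a closed index, so the settings call has to be wrapped in a close and an open. The base URL, the index name `my-index`, the analyzer name, and the helper `analyzer_update_requests` are all assumptions for illustration; the code only builds the requests and does not send them.

```python
import json
import urllib.request


def analyzer_update_requests(base_url, index, analyzer_name="folding_analyzer"):
    """Build the close / update-settings / open requests for an existing index."""
    analysis = {
        "analyzer": {
            analyzer_name: {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase", "asciifolding"],
            }
        }
    }
    steps = [
        ("POST", f"/{index}/_close", None),  # analyzers need a closed index
        ("PUT", f"/{index}/_settings", {"analysis": analysis}),
        ("POST", f"/{index}/_open", None),
    ]
    requests = []
    for method, path, body in steps:
        data = json.dumps(body).encode("utf-8") if body is not None else None
        req = urllib.request.Request(base_url + path, data=data, method=method)
        if data is not None:
            req.add_header("Content-Type", "application/json")
        requests.append(req)
    return requests


for req in analyzer_update_requests("http://localhost:9200", "my-index"):
    print(req.get_method(), req.full_url)
    # To actually send a request: urllib.request.urlopen(req)
```

Even when this succeeds, the field mapping still has to reference the analyzer, and documents that are already indexed are not re-analyzed until they are indexed again.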

While this seemed promising, how would I reindex from Admin Studio? I did try it from the Search landing page in Dev Studio, but Pega seems to create a new index with a timestamp in its name, so my changes have no effect.

Any other suggestions?

I reached out to Pega support and unfortunately they weren’t able to help. SRS does not support custom analyzers. They created a feedback item (FDBK-119951) and expect support for the asciifolding token filter in upcoming SRS releases.