Admin Studio displays 0 connected nodes. PDC shows all nodes are active. How can I troubleshoot this?

Hello,

First and foremost, thank you for reviewing this challenge we are facing. I have searched for similar issues and found a few posts similar to mine. Most are abandoned by the original submitter while others do not provide a satisfactory reply. I hope I can break that cycle.

Summary of issue:

When we restart our servers for any reason, all nodes initially display in Admin Studio. After some time, this changes and Admin Studio shows 0 nodes running. I’ve confirmed in PDC that the nodes are indeed running (I can also navigate to the services section in Dev Studio and see stream nodes running, etc.). Furthermore, once this has occurred, we can no longer stop or start listeners or see any requestors. I’ve witnessed this in multiple environments ranging from DEV to PROD. The only fix so far is to restart the servers, and that is only temporary.

For reference, we are using Pega Platform 8.6.3 (on-premises). We have 4 web nodes and 2 background processing/stream nodes. This issue has been occurring for a few months.

What we’ve tried so far (per Pega GCS):

  • Set the prconfig setting ‘cluster/hazelcast/v4/enabled=true’ explicitly to point to Hazelcast 4.x.
  • Added to the prconfig
  • On cluster restart, checked for any stale entries in the pr_sys_statusnodes table
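
For anyone wanting to double-check the first item on their own nodes, here is a rough Python sketch that looks for the setting in a prconfig file. It assumes the usual prconfig.xml layout of `<env name="..." value="..." />` entries; the function name and the sample content are made up for illustration. Note that it only confirms what is in the file, not which Hazelcast version the node actually loads.

```python
import xml.etree.ElementTree as ET

# Setting name taken from the list above.
SETTING = "cluster/hazelcast/v4/enabled"


def hazelcast_v4_enabled(prconfig_xml: str) -> bool:
    """Return True if the prconfig text sets the Hazelcast 4.x flag to "true".

    Hypothetical helper: assumes settings are stored as
    <env name="..." value="..." /> elements.
    """
    root = ET.fromstring(prconfig_xml)
    for env in root.iter("env"):
        if env.get("name") == SETTING:
            return (env.get("value") or "").strip().lower() == "true"
    # Setting not present at all.
    return False


# Illustrative sample only, not a real configuration file.
sample = """<pegarules>
  <env name="cluster/hazelcast/v4/enabled" value="true" />
</pegarules>"""

print(hazelcast_v4_enabled(sample))
```

To check a real file, read it from disk with `ET.parse(path).getroot()` instead of `ET.fromstring`, and run the check on every node, since each node may carry its own copy of prconfig.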

Has anyone else faced this and solved this pervasive issue? Any ideas would be great!

Hi @WilliamM9222

Could you provide us with the Support Case ID from GCS so that we can connect this Question to it?

Thank you!

@MarissaRogers Here is the support case ID: INC-250322 Sev3/35 - BSC - Timeoutexceptions in logs -Hazelcast Issues

@WilliamM9222 thank you for providing the support ticket ID. I can see that the ticket was closed in January with the following observations:

"Solution description:

This is not a default version of Hazelcast for Pega Platform on-premises deployments. During installation and upgrade, Hazelcast 4.x EE requires an additional prconfig setting:

Prconfig setting “cluster/hazelcast/v4/enabled” = “true”

Upgrading to Hazelcast 4.x requires downtime. Therefore, consider which option to use:

Keep the current Hazelcast version enabled during your installation of or upgrade to Pega 8.6.

Install or upgrade to Pega 8.6 first and then take downtime to upgrade to Hazelcast 4.x.
Reference: https://docs.pega.com/pega-services-troubleshooting/updates-hazelcast-support

"

@MarijeSchillern Thank you for the reply. I know we added the explicit entry mentioned in the INC to prconfig. I’ll follow up with our support team to validate the current version of Hazelcast. I will try to reply to this thread today.

EDIT

I’ve come to understand that the explicit entry was indeed added, but when I checked what was actually being reported, Pega stated:
thread-34] ( internal.guice.ClusterModule) INFO - Infinity is using Hazelcast Enterprise 3.12.10 for clustering. This version of Hazelcast is not supported anymore.
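
In case it helps others compare the configured setting against what a node actually reports, here is a small Python sketch that pulls the Hazelcast version out of a log line like the one above. The pattern is based only on that single message, so the assumption that 4.x nodes log the same wording is mine, and the function name is made up.

```python
import re

# Pattern built from the one log message quoted above; the wording for
# other Hazelcast versions is assumed to match it.
PATTERN = re.compile(
    r"Infinity is using Hazelcast (?:Enterprise )?(\d+)\.(\d+)\.(\d+) for clustering"
)


def hazelcast_version(log_text: str):
    """Return (major, minor, patch) from the first matching line, or None."""
    match = PATTERN.search(log_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


line = (
    "( internal.guice.ClusterModule) INFO - Infinity is using "
    "Hazelcast Enterprise 3.12.10 for clustering. "
    "This version of Hazelcast is not supported anymore."
)

version = hazelcast_version(line)
print(version)
if version is not None and version[0] < 4:
    print("Node is still clustering on Hazelcast 3.x despite the prconfig entry")
```

Running this against the log of each node would show whether any of them are still on 3.x.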

I’ll follow up with my support teams to have the proper version of 4.x installed and report back to this thread.