Stream Node stuck at status JOINING

Hi,

Greetings, hope you are doing well! I have an issue with one of the Stream nodes being stuck in JOINING status. There are 4 EC2 instances, and all of them have been classified as Stream nodes. Whichever EC2 instance's application server is started last gets stuck in the JOINING state forever. I am not sure what is wrong here.

At first I thought this was an issue with one particular EC2 instance, but it happens on every instance (whichever one is started last).

Any suggestions on what is wrong?

Regards,

Bharat

@KOMARINA When you check the kafka-data folder, do you see stale entries? Was the image created with a stale kafka-data folder?

Did you already perform the following to see if it helps the Stream nodes start properly?

Remove /opt/tomcat8/kafka-data
Truncate the below tables:
pr_data_stream_nodes;
pr_data_stream_node_updates;
pr_data_stream_sessions;
pr_sys_statusnodes
Perform a Rolling restart of the Stream nodes
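
For the truncate step, here is a minimal Python sketch that only prints the statements so they can be reviewed before being run in a SQL client. The table names are the ones listed above; the schema name "pegadata" is a placeholder and an assumption, so substitute the data schema your environment actually uses.

```python
# Tables named in the steps above.
STREAM_TABLES = [
    "pr_data_stream_nodes",
    "pr_data_stream_node_updates",
    "pr_data_stream_sessions",
    "pr_sys_statusnodes",
]


def truncate_statements(schema):
    """Build one TRUNCATE statement per table for the given schema name."""
    return [f"TRUNCATE TABLE {schema}.{table};" for table in STREAM_TABLES]


# "pegadata" is a placeholder schema name, not taken from this thread.
for statement in truncate_statements("pegadata"):
    print(statement)
```

Nothing is executed against the database here; the script only generates text.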

Do you have old failed nodes in the system that were not removed before the latest system restart?

If this requires a more in-depth look, I would suggest that you log a support incident so that our support team can help you go through your Pega and cluster logs.

@MarijeSchillern Hi, greetings. I have already truncated the tables and removed kafka-data, but I still have the issue. There is an incident open with GCS: INC-212154.

Regards,

Bharat

Thanks @KOMARINA!

I have updated your Question with the Incident ID and linked this to your Incident in Pega Support for our Support Engineers to review.

I checked the status of support ticket INC-212154.

It was closed on Mar 24, 2022.

The final outcome was that you informed us that you had resolved this issue.

The issue was that 1 and 2 had ports missing from their firewall config. This was resolved by adding 5701-5800/tcp, 9300-9399/tcp and 9092/tcp to their Drop Zones in firewall-cmd.

This aligned the firewall-cmd config with the other servers in all environments.
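
For anyone hitting the same symptom, a quick way to spot this kind of gap is to test TCP reachability between the nodes before digging into Kafka or the database. Below is a minimal Python sketch under a few assumptions: the port specs are the ones quoted above, the function names are made up for illustration, and the host name in the usage comment is a placeholder. It probes only the first port of each range. A failed probe does not prove a firewall problem on its own, since the service may simply not be listening yet.

```python
import socket

# Port specs quoted from the resolution in this thread.
PORT_SPECS = ["5701-5800/tcp", "9300-9399/tcp", "9092/tcp"]


def parse_port_spec(spec):
    """Turn a firewall-cmd style spec such as '5701-5800/tcp' into a list of ports."""
    ports, _, proto = spec.partition("/")
    if proto != "tcp":
        raise ValueError(f"only tcp specs are handled here: {spec!r}")
    first, _, last = ports.partition("-")
    # A single port such as '9092' has no '-', so last is empty.
    return list(range(int(first), int(last or first) + 1))


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def first_port_report(host, specs=PORT_SPECS):
    """Probe the first port of each spec and map the spec to the result."""
    return {spec: is_reachable(host, parse_port_spec(spec)[0]) for spec in specs}


# Example usage, run from one Stream node against another.
# "stream-node-2.example.internal" is a placeholder host name:
#
#   for spec, ok in first_port_report("stream-node-2.example.internal").items():
#       print(spec, "reachable" if ok else "NOT reachable")
```

Running this from each node against every other node gives a small matrix. If the same ports fail only toward specific hosts, comparing the firewall-cmd config on those hosts with a working one is a reasonable next step.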