Hi - after we upgraded our application to v8.6.0 we get a "Kafka data belongs to a different node" error.
I decommissioned the existing stream node, stopped the JVM, updated the settings below in the prconfig.xml file, truncated the tables, and then started the JVM, but we still get the error.
Kindly find the logs attached.
prconfig/dsm/services/stream/pyport/default 10089
prconfig/dsm/services/stream/pybrokerport/default 9092
prconfig/dsm/services/stream/pykeeperport/default 2181
prconfig/dsm/services/stream/pyjmxport/default 9999
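For reference, the DSS purposes above normally map to prconfig.xml entries. A hedged sketch of how these four port settings could appear in the file, assuming the usual convention that the `prconfig/` prefix and `/default` suffix are dropped inside prconfig.xml:

```xml
<!-- Hedged example: stream port settings from the post expressed as
     prconfig.xml entries; the surrounding <pegarules> file is omitted -->
<env name="dsm/services/stream/pyport" value="10089" />
<env name="dsm/services/stream/pybrokerport" value="9092" />
<env name="dsm/services/stream/pykeeperport" value="2181" />
<env name="dsm/services/stream/pyjmxport" value="9999" />
```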
truncate table pr_data_stream_nodes
truncate table pr_data_stream_sessions
truncate table pr_data_stream_node_updates
truncate table PR_SYS_STATUSNODES
stream error.docx (41.9 KB)
uat.txt (7.95 KB)
@SubinJ76 Are you following Configuring the Stream service and Operating Stream Service?
Could you also check the following forum questions:
Stream nodes Joining Failed
Stream Service Exception- Node has associated Broker id
See Split-Brain Syndrome and cluster fracturing FAQs and Troubleshooting Hazelcast cluster management
The exception shown during Stream startup in your screenshot ("Kafka data belongs to a node") is basically saying that the Kafka data belongs to a different broker ID and Kafka cluster ID, which might be from the old instance (or from instance IDs created in between the restarts/troubleshooting attempts).
You can first try to bring down all stream nodes, remove the kafka-data folder from all stream hosts, and then restart the stream nodes.
If that does not work, then please try the steps below:
- Bring down all nodes.
- Remove the kafka-data folders from all stream hosts.
- If this is a cloned environment, please truncate the tables below, which contain information related to node IDs, stream partitions, and stream broker/cluster IDs.
PLEASE NOTE: this will remove any batch/realtime dataflow runs and their run metrics history, so new dataflow runs will need to be created.
truncate table pegadata.pr_data_stream_node_updates;
truncate table pegadata.pr_data_stream_nodes;
truncate table pegadata.pr_data_stream_sessions;
truncate table pegadata.pr_sys_statusnodes;
truncate table pegadata.pr_sys_statusdetails;
truncate table pegadata.pr_data_qp_run_partition;
truncate table pegadata.pc_work_dsm_batch;
truncate table pegadata.pr_data_decision_ddf_run_opts;
truncate table pegadata.pr_data_decision_df_part;
truncate table pegadata.pr_data_decision_ddf_error;
truncate table pegadata.pr_data_decision_df_met;
truncate table pegadata.pr_log_dataflow_events;
truncate table pegadata.pr_data_decision_ddf_runtime;
truncate table pegadata.pr_sys_serviceregistry;
truncate table pegadata.pr_sys_serviceregistry_kvs;
- Bring up one stream node at a time: wait for the first node to list StreamServer.Default in NORMAL state, then bring up the next stream node. Follow this approach up to the last stream node.
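If it helps, the reset sequence above can be sketched as a dry-run shell script that only prints the commands it would run. Nothing in it is an actual Pega command: the service name, the kafka-data path, and the psql invocation are placeholders to adapt to your environment.

```shell
#!/bin/sh
# Hypothetical dry-run sketch of the reset steps above. The service
# name, kafka-data path, and psql invocation are placeholders, not
# real Pega commands -- adapt them to your environment.

reset_stream_dry_run() {
  # 1. Bring down all nodes (placeholder service name).
  echo "systemctl stop pega-node"
  # 2. Remove the kafka-data folder on every stream host.
  echo "rm -rf /appserver/kafka-data"
  # 3. Cloned environments only: truncate the DSM/dataflow tables
  #    (subset shown; see the full list in the post above).
  for t in pr_data_stream_node_updates pr_data_stream_nodes \
           pr_data_stream_sessions pr_sys_statusnodes \
           pr_sys_statusdetails pr_data_qp_run_partition; do
    echo "psql -c \"truncate table pegadata.$t;\""
  done
  # 4. Start nodes one host at a time, waiting for the Stream service
  #    to reach NORMAL state before starting the next one.
  echo "systemctl start pega-node   # per host, one at a time"
}

reset_stream_dry_run
```

Running it with no arguments just lists the intended commands, so the sequence can be reviewed before anything destructive is executed.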
- If this approach for some reason does not work, please log a support incident via the MSP and provide in the ticket the startup logs and the Kafka server logs (if any, located inside /appserver/Kafka-1.1.0.X/logs).
If you choose to log a support incident, please provide the INC reference here to help us track it.
Hi @MarijeSchillern Thanks for your response!
I’ve tried the above workaround and it didn’t work.
This issue was resolved by following the steps below:
1. Delete the kafka-data folder.
2. Add the DSS values below:
setting purpose: prconfig/dsm/services/stream/pyBaseLogPath/default
setting purpose: prconfig/dsm/services/stream/pyUnpackBasePath/default
owning ruleset: Pega-Engine
value: the directory path of the kafka-data folder
3. Bring down the application.
4. Truncate the tables below:
truncate table pr_data_stream_nodes
truncate table pr_data_stream_sessions
truncate table pr_data_stream_node_updates
truncate table pr_sys_statusnodes
5. Start the application.
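For reference, a hedged sketch of how those two path settings could alternatively be expressed directly in prconfig.xml rather than as DSS entries, assuming the usual convention that the `prconfig/` prefix and `/default` suffix are dropped inside the file. The `/opt/kafka-data` path is a placeholder, not from the post:

```xml
<!-- Hedged sketch: the same two settings as prconfig.xml entries;
     /opt/kafka-data is a placeholder directory path -->
<env name="dsm/services/stream/pyBaseLogPath" value="/opt/kafka-data" />
<env name="dsm/services/stream/pyUnpackBasePath" value="/opt/kafka-data" />
```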