Decision Data Store and Real-Time data grid nodes status showing as Joining_Failed in CDH 8.8

Hi Everyone,

We installed Pega CDH 8.8 and created a sample application. While running campaigns,
we observed that the Decision Data Store and Real-Time data grid nodes are failing with the status Joining_Failed. The stack trace is below. Could someone please assist?

com.pega.dsm.dnode.api.DNodeException: Cannot bootstrap cassandra
at com.pega.dsm.dnode.impl.cassandra.Cassandra.bootstrap(Cassandra.java:176)
at com.pega.dsm.dnode.api.DdsService.startCassandra(DdsService.java:859)
at com.pega.dsm.dnode.api.DdsService.access$1600(DdsService.java:99)
at com.pega.dsm.dnode.api.DdsService$DdsStartOperation$2$1.execute(DdsService.java:320)
at com.pega.dsm.dnode.util.OperationWithLock$1.execute(OperationWithLock.java:91)
at com.pega.dsm.dnode.util.OperationWithLock$LockingOperation.couldAcquireLock(OperationWithLock.java:190)
at com.pega.dsm.dnode.util.OperationWithLock$LockingOperation.performLockOperation(OperationWithLock.java:157)
at com.pega.dsm.dnode.util.OperationWithLock$LockingOperation.access$200(OperationWithLock.java:102)
at com.pega.dsm.dnode.util.OperationWithLock.doWithLock(OperationWithLock.java:99)
at com.pega.dsm.dnode.util.OperationWithLock.doWithLock(OperationWithLock.java:95)
at com.pega.dsm.dnode.util.OperationWithLock.doWithLock(OperationWithLock.java:76)
at com.pega.dsm.dnode.impl.prpc.service.ServiceHelper.executeWithLockInternal(ServiceHelper.java:265)
at com.pega.dsm.dnode.impl.prpc.service.ServiceHelper.executeWithLock(ServiceHelper.java:213)
at com.pega.dsm.dnode.api.prpc.service.AbstractDsmService.executeWithLock(AbstractDsmService.java:367)
at com.pega.dsm.dnode.api.DdsService.access$2100(DdsService.java:99)
at com.pega.dsm.dnode.api.DdsService$DdsStartOperation$2.emit(DdsService.java:311)
at com.pega.dsm.dnode.impl.stream.DataObservableImpl$SafeDataSubscriber.subscribe(DataObservableImpl.java:353)
at com.pega.dsm.dnode.impl.stream.DataObservableImpl.subscribe(DataObservableImpl.java:55)
at com.pega.dsm.dnode.impl.stream.DataObservableImpl.await(DataObservableImpl.java:117)
at com.pega.dsm.dnode.impl.stream.DataObservableImpl.await(DataObservableImpl.java:106)
at com.pega.dsm.dnode.api.prpc.service.operation.StartOperation$1.execute(StartOperation.java:167)
at com.pega.dsm.dnode.util.OperationWithLock$LockingOperation.couldAcquireLock(OperationWithLock.java:190)
at com.pega.dsm.dnode.util.OperationWithLock$LockingOperation.performLockOperation(OperationWithLock.java:157)
at com.pega.dsm.dnode.util.OperationWithLock$LockingOperation.access$200(OperationWithLock.java:102)
at com.pega.dsm.dnode.util.OperationWithLock.doWithLock(OperationWithLock.java:99)
at com.pega.dsm.dnode.util.OperationWithLock.doWithLock(OperationWithLock.java:95)
at com.pega.dsm.dnode.impl.prpc.service.ServiceHelper.executeWithLockInternal(ServiceHelper.java:273)
at com.pega.dsm.dnode.impl.prpc.service.ServiceHelper.executeWithLock(ServiceHelper.java:221)
at com.pega.dsm.dnode.api.prpc.service.operation.StartOperation.doActualServerStart(StartOperation.java:164)
at com.pega.dsm.dnode.api.prpc.service.operation.StartOperation.performStartupWithRetries(StartOperation.java:137)
at com.pega.dsm.dnode.api.prpc.service.operation.StartOperation.initializeServerMode(StartOperation.java:117)
at com.pega.dsm.dnode.api.prpc.service.operation.StartOperation.lambda$bootstrap$0(StartOperation.java:85)
at com.pega.dsm.dnode.impl.stream.DataObservableImpl$SafeDataSubscriber.subscribe(DataObservableImpl.java:353)
at com.pega.dsm.dnode.impl.stream.DataObservableImpl.subscribe(DataObservableImpl.java:55)
at com.pega.dsm.dnode.api.DdsService$DdsStartOperation$1.emit(DdsService.java:300)
at com.pega.dsm.dnode.impl.stream.DataObservableImpl$SafeDataSubscriber.subscribe(DataObservableImpl.java:353)
at com.pega.dsm.dnode.impl.stream.DataObservableImpl.subscribe(DataObservableImpl.java:55)
at com.pega.dsm.dnode.impl.stream.DataObservableImpl.await(DataObservableImpl.java:117)
at com.pega.dsm.dnode.impl.stream.DataObservableImpl.await(DataObservableImpl.java:106)
at com.pega.dsm.dnode.impl.prpc.service.ServiceDefinition.startService(ServiceDefinition.java:81)
at com.pega.dsm.dnode.impl.prpc.service.ServiceDefinition.start(ServiceDefinition.java:66)
at com.pega.dsm.dnode.api.prpc.service.ServiceManager$4.run(ServiceManager.java:429)
at com.pega.dsm.dnode.util.PrpcRunnable.execute(PrpcRunnable.java:77)
at com.pega.dsm.dnode.impl.prpc.service.ServiceHelper.executeInPrpcContextInternal(ServiceHelper.java:305)
at com.pega.dsm.dnode.impl.prpc.service.ServiceHelper.executeInPrpcContext(ServiceHelper.java:150)
at com.pega.dsm.dnode.api.prpc.service.ServiceManager.startServiceDefinition(ServiceManager.java:426)
at com.pega.dsm.dnode.api.prpc.service.ServiceManager.lambda$bootstrap$3(ServiceManager.java:388)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at com.pega.dsm.dnode.util.PrpcRunnable$1.run(PrpcRunnable.java:69)
at com.pega.dsm.dnode.util.PrpcRunnable$1.run(PrpcRunnable.java:66)
at com.pega.dsm.dnode.util.PrpcRunnable.execute(PrpcRunnable.java:77)
at com.pega.dsm.dnode.impl.prpc.PrpcThreadFactory$PrpcThread.run(PrpcThreadFactory.java:164)
Caused by: com.pega.dsm.dnode.api.DNodeException: Unable to start DDS. Process has exited with code: 1
NOTE: Picked up JDK_JAVA_OPTIONS: --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.rmi/sun.rmi.transport=ALL-UNNAMED
intx ThreadPriorityPolicy=42 is outside the allowed range [ 0 ... 1 ]
Improperly specified VM option 'ThreadPriorityPolicy=42'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

at com.pega.dsm.dnode.impl.cassandra.Cassandra.awaitCassandra(Cassandra.java:476)
at com.pega.dsm.dnode.api.prpc.service.managedprocess.ManagedProcess.start(ManagedProcess.java:185)
at com.pega.dsm.dnode.impl.cassandra.Cassandra.bootstrap(Cassandra.java:167)
... 56 more

DDS_LogFile.txt (6.2 KB)

@YaswanthGummadivelli Can you try the steps described in the posts below? Restarting may help bring the nodes back to normal.

https://support.pega.com/question/decision-data-store-nodes-joiningfailed

https://support.pega.com/question/joiningfailed-kafka-node

@YaswanthGummadivelli I am guessing you are running Cassandra in embedded mode and not connecting to an external Cassandra cluster.

What version of Java are you using? The embedded Cassandra works with Java 8, and I think your instance is running a later version that is not compatible (the "java.base/" frames in your stack trace suggest Java 9 or later). Please use Java 8 and try again.
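
To confirm which Java version a node is really running, a small check along these lines may help. This is only a sketch: the class name JavaVersionCheck is made up for illustration, and the result is only meaningful if you run it with the same java binary that starts the Pega node.

```java
// Sketch: report the major Java version of the JVM that runs this class.
final class JavaVersionCheck {

    // Converts a "java.specification.version" value to a major version number.
    // Java 8 and earlier report "1.8", "1.7", ...; Java 9 and later report "9", "11", "17", ...
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.trim().split("\\.");
        int first = Integer.parseInt(parts[0]);
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        int major = majorVersion(spec);
        System.out.println("Java specification version: " + spec + " (major " + major + ")");
        if (major != 8) {
            // Per the advice above, embedded Cassandra is expected to run on Java 8.
            System.out.println("This is not Java 8; embedded Cassandra may fail to start on this JVM.");
        }
    }
}
```

If it reports anything other than major version 8, that would be consistent with the ThreadPriorityPolicy error in the stack trace.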

@YaswanthGummadivelli

I can see that you have an issue with the Decision Data Store and Real-Time data grid nodes in your Pega CDH 8.8 environment. Based on the stack trace you provided, it appears that there is a problem with bootstrapping Cassandra, which is causing the nodes to fail.

The error message “Unable to start DDS. Process has exited with code: 1” indicates that the process for starting the Decision Data Store has failed with an exit code of 1. Additionally, the message “Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit.” suggests that there may be a problem with the Java Virtual Machine (JVM) on your system.

The error “intx ThreadPriorityPolicy=42 is outside the allowed range [0 … 1 ] Improperly specified VM option ‘ThreadPriorityPolicy=42’” indicates that the JVM is being started with an invalid option for the ThreadPriorityPolicy. This could be causing the JVM to fail, which in turn causes Cassandra to fail to bootstrap and the Decision Data Store and Real-Time data grid nodes to fail as well.

To resolve this issue, I recommend checking your JVM options and removing any invalid options that may be causing the problem. Additionally, you may want to ensure that your system meets the minimum hardware and software requirements for Pega CDH 8.8, as outlined in the Pega Platform Support Guide.
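
As a rough illustration of what "checking your JVM options" could look like, the sketch below scans a list of option lines for a ThreadPriorityPolicy value outside the 0 to 1 range reported in the error. The class name ThreadPriorityPolicyCheck and the sample option lines are hypothetical; where the options for the embedded Cassandra process are actually defined in your installation is not something this thread establishes, so you would need to feed in the lines from wherever they live on your system.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: find -XX:ThreadPriorityPolicy values that the JVM in the stack trace would reject.
final class ThreadPriorityPolicyCheck {

    private static final Pattern FLAG = Pattern.compile("-XX:ThreadPriorityPolicy=(-?\\d+)");

    // Returns every ThreadPriorityPolicy value outside the range 0..1
    // (the allowed range quoted in the error message). Lines starting with '#' are skipped.
    static List<Integer> outOfRangeValues(List<String> optionLines) {
        List<Integer> bad = new ArrayList<>();
        for (String line : optionLines) {
            String trimmed = line.trim();
            if (trimmed.startsWith("#")) {
                continue; // treat as a comment line
            }
            Matcher m = FLAG.matcher(trimmed);
            while (m.find()) {
                int value = Integer.parseInt(m.group(1));
                if (value < 0 || value > 1) {
                    bad.add(value);
                }
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Hypothetical sample input; replace with the option lines from your own environment.
        List<String> sample = Arrays.asList(
                "-Xss256k",
                "-XX:ThreadPriorityPolicy=42");
        System.out.println("Out-of-range ThreadPriorityPolicy values: " + outOfRangeValues(sample));
    }
}
```

An empty result would mean the flag is either absent or already within the allowed range, in which case the Java version itself is the next thing to look at.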

If the issue persists after checking your JVM options and system requirements, I recommend opening a support ticket with Pega Support for further assistance.

I hope this helps.