node-problem-detector aims to make various node problems visible to the upstream layers in the cluster management stack.
Please follow the instructions in the installation section to set up Node Problem Detector on Kubernetes. The following steps set it up on OpenShift:
- Create the openshift-node-problem-detector namespace from ns.yaml with
oc create -f ns.yaml
- Add cluster role with
oc adm policy add-cluster-role-to-user system:node-problem-detector -z default -n openshift-node-problem-detector
- Add security context constraints with
oc adm policy add-scc-to-user privileged system:serviceaccount:openshift-node-problem-detector:default
- Edit node-problem-detector.yaml to fit your environment.
- Edit node-problem-detector-config.yaml to configure node-problem-detector.
- Create the ConfigMap with
oc create -f node-problem-detector-config.yaml
- Create the DaemonSet with
oc create -f node-problem-detector.yaml
Once installed, you will see node-problem-detector pods in the openshift-node-problem-detector namespace.
Next, enable openshift-node-problem-detector in the Cerberus config.yaml.
Cerberus monitors only the KernelDeadlock condition reported by the node problem detector, because it is system critical and can hinder node performance.
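
To illustrate what checking the KernelDeadlock condition can look like, here is a minimal sketch in Python. It is not how Cerberus implements the check; it only assumes that node-problem-detector reports KernelDeadlock as an entry in each node's status.conditions list, in the same shape as the other Kubernetes node conditions. The function name, the SAMPLE_NODE_LIST data, and the node names in it are hypothetical.

```python
from typing import List

# Hypothetical sample shaped like the JSON node list returned by the API.
# Node names and condition values are made up for illustration.
SAMPLE_NODE_LIST = {
    "items": [
        {
            "metadata": {"name": "worker-0"},
            "status": {"conditions": [
                {"type": "KernelDeadlock", "status": "False"},
                {"type": "Ready", "status": "True"},
            ]},
        },
        {
            "metadata": {"name": "worker-1"},
            "status": {"conditions": [
                {"type": "KernelDeadlock", "status": "True"},
                {"type": "Ready", "status": "True"},
            ]},
        },
    ]
}


def nodes_with_kernel_deadlock(node_list: dict) -> List[str]:
    """Return the names of nodes whose KernelDeadlock condition is "True"."""
    affected = []
    for node in node_list.get("items", []):
        name = node.get("metadata", {}).get("name", "<unknown>")
        for condition in node.get("status", {}).get("conditions", []):
            # Condition status values are the strings "True", "False", or "Unknown".
            if condition.get("type") == "KernelDeadlock" and condition.get("status") == "True":
                affected.append(name)
                break  # one match per node is enough
    return affected


if __name__ == "__main__":
    print(nodes_with_kernel_deadlock(SAMPLE_NODE_LIST))
```

On a live cluster, the same function could be given the parsed output of oc get nodes -o json in place of the sample data.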