Add doc index and improve installation with helm support #173

Open
wants to merge 4 commits into
base: main
18 changes: 2 additions & 16 deletions containers/README.md
@@ -3,22 +3,8 @@
Container image gets automatically built by quay.io at [Cerberus image](https://quay.io/repository/chaos-kubox/cerberus). The builds will be triggered by any commit pushed to this repository.

### Run containerized version
Refer to the [instructions](https://github.com/cloud-bulldozer/cerberus/tree/master/containers/build_own_image-README.md) for information on how to build and run the containerized version of cerberus.
Refer to the [instructions](https://github.com/chaos-kubox/cerberus/tree/master/containers/build_own_image-README.md) for information on how to build and run the containerized version of cerberus.

### Cerberus as a Kubernetes/OpenShift application
To run containerized Cerberus as a Kubernetes/OpenShift Deployment, follow these steps:
1. Configure the [config.yaml](https://github.com/openshift-scale/cerberus/tree/master/config) file according to your requirements.
2. Create a namespace under which you want to run the cerberus pod using `kubectl create ns <namespace>`.
3. Switch to `<namespace>` namespace:
- In Kubernetes, use `kubectl config set-context --current --namespace=<namespace>`
- In OpenShift, use `oc project <namespace>`
4. Create a ConfigMap named kube-config using `kubectl create configmap kube-config --from-file=<path_to_kubeconfig>`
5. Create a ConfigMap named cerberus-config using `kubectl create configmap cerberus-config --from-file=<path_to_cerberus_config>`
6. Create a serviceaccount to run the cerberus pod with privileges using `kubectl create serviceaccount useroot`.
- In Openshift, execute `oc adm policy add-scc-to-user privileged -z useroot`.
7. Create a Deployment and a NodePort Service using `kubectl apply -f cerberus.yml`
8. Accessing the go/no-go signal:
- In Kubernetes, execute `kubectl port-forward --address 0.0.0.0 pod/<cerberus_pod_name> 8080:8080` and access the signal at `http://localhost:8080` and `http://<hostname>:8080`.
- In Openshift, create a route based on service cerberus-service using `oc expose service cerberus-service`. List all the routes using `oc get routes`. Use HOST/PORT associated with cerberus-service to access the signal.

NOTE: It is not recommended to run Cerberus internal to the cluster as the pod which is running Cerberus might get disrupted.
Refer to the [instructions](https://github.com/chaos-kubox/cerberus/blob/master/docs/installation.md#run-in-kubernetesopenshift) for information on how to run cerberus as a Kubernetes or OpenShift application.
Collaborator

Let's use a link to the other doc like this: ../docs/installation.md#run-in-kubernetesopenshift

117 changes: 117 additions & 0 deletions docs/consume-signal.md
@@ -0,0 +1,117 @@
# Consume cerberus signal

Various examples of how to consume the Cerberus signal:

- Using a simple http client (cURL)
- As an init container for a Pod
- As a tekton task (part of a pipeline)

## Simple check with curl

If you just want to check the Cerberus signal, you can use this small script.

```bash
export CERBERUS_URL=http://cerberus.chaos-cerberus.svc.cluster.local:8080
if curl -s "$CERBERUS_URL" | grep True &> /dev/null; then
  echo "Cerberus check is OK at ${CERBERUS_URL}"
else
  echo "Cerberus check is NOT OK at ${CERBERUS_URL}"
fi
```
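The check above treats any response body containing `True` as a go signal and does not distinguish a failed request from a no-go. A slightly stricter variant could look like the sketch below. It assumes the endpoint returns a bare `True` or `False` body; the function names `signal_is_go` and `check_cerberus` are hypothetical and not part of Cerberus.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of Cerberus.

# Succeed only if the response body is exactly "True" once whitespace is removed.
signal_is_go() {
  local body
  body="$(printf '%s' "$1" | tr -d '[:space:]')"
  [ "$body" = "True" ]
}

# Fetch the signal from the given URL.
# Exit status: 0 = go, 1 = no-go, 2 = the request itself failed.
check_cerberus() {
  local url="$1" body
  body="$(curl --silent --fail --max-time 5 "$url")" || return 2
  signal_is_go "$body"
}
```

A possible use is `check_cerberus "$CERBERUS_URL" && echo "go"`, which keeps the go decision separate from network errors.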

## As an init container

This example allows you to start a container once Cerberus returns an OK signal.

```yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: cerberus-config-job
data:
  cerberus_url: "cerberus.default.svc.cluster.local"
---
apiVersion: batch/v1
kind: Job
metadata:
  name: "check-node-if-cerberus-ok"
spec:
  template:
    spec:
      initContainers:
        - name: check-cerberus
          image: quay.io/startx/runner-oc:fc35
          command: ['bash', '-c', "until curl -s $CERBERUS_URL | grep True &> /dev/null; do echo Wait for OK from cerberus at $CERBERUS_URL; sleep 2; done"]
          env:
            - name: CERBERUS_URL
              valueFrom:
                configMapKeyRef:
                  name: cerberus-config-job
                  key: cerberus_url
      containers:
        - name: job-task
          image: "quay.io/startx/runner-oc:fc35"
          command:
            - "/bin/bash"
            - "-c"
            - |-
              echo "executed after a POSITIVE cerberus check against $CERBERUS_URL"
              echo "Replace this container definition with your own description"
              exit
          env:
            - name: CERBERUS_URL
              valueFrom:
                configMapKeyRef:
                  name: cerberus-config-job
                  key: cerberus_url
          resources:
            requests:
              cpu: "10m"
              memory: "64Mi"
            limits:
              cpu: "50m"
              memory: "128Mi"
      restartPolicy: Never
  backoffLimit: 2
```
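The init container above polls forever until Cerberus answers `True`. If you would rather have the Job fail after a bounded number of attempts, the loop could be sketched as follows. The function name `wait_for_go` and its arguments are hypothetical; the probe command is whatever check you already use.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Cerberus.
# Usage: wait_for_go <max_attempts> <delay_seconds> <probe command...>
# Runs the probe until it succeeds or the attempts are exhausted.
wait_for_go() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # The probe's exit status decides whether the signal is a go.
    if "$@"; then
      return 0
    fi
    attempt=$((attempt + 1))
    # Sleep only if another attempt will follow.
    if [ "$attempt" -le "$max_attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

For instance, `wait_for_go 30 2 bash -c 'curl -s "$CERBERUS_URL" | grep -q True'` would give up after 30 attempts, letting `backoffLimit` decide whether the Job is retried.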

## Using tekton pipeline
Collaborator

@startx-lab thoughts on creating a new section called use cases/Cerberus in practice and moving the content over there similar to how you did for Kraken: krkn-chaos/krkn#333?


The [cerberus-check](https://artifacthub.io/packages/tekton-task/startx-tekton-catalog/cerberus-check) `tekton-task`,
available on [artifacthub.io](https://artifacthub.io/packages/search?kind=7&ts_query_web=cerberus),
can be used to check the Cerberus signal (and therefore the global health of a cluster) as part of a chaos pipeline.
To get familiar with Tekton, read the [tekton concepts](https://tekton.dev/docs/concepts/overview/), [tekton pipeline entities](https://github.com/tektoncd/pipeline/blob/main/docs/README.md#tekton-pipelines-entities), [tekton getting started](https://tekton.dev/docs/getting-started/tasks/) and the
[openshift pipeline documentation](https://docs.openshift.com/container-platform/4.10/cicd/pipelines/understanding-openshift-pipelines.html).

### Installing tekton

#### OpenShift cluster

To use this task, Tekton must be enabled in your cluster. On an OpenShift cluster, the **OpenShift Pipelines** operator enables Tekton. You can find the `OpenShift Pipelines` operator in the OperatorHub and deploy it in your OpenShift cluster. You can also use the [startx helm-chart pipeline](https://helm-repository.readthedocs.io/en/latest/charts/cluster-pipeline/) for an easy, automated install.

#### Kubernetes cluster

```bash
kubectl apply --filename https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml
kubectl get pods --namespace tekton-pipelines --watch
```

### Running cerberus tekton task

#### Start as a single taskrun

```bash
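# kubectl has no "project" subcommand (that is "oc project"); set the namespace on the current context instead
kubectl config set-context --current --namespace=default
kubectl apply -f https://github.com/startxfr/tekton-catalog/raw/stable/task/cerberus-check/0.1/samples/taskrun.yaml
# Multiple resource types must be comma-separated
kubectl get taskrun,pod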
```

#### Start as a pipelinerun

```bash
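# kubectl has no "project" subcommand (that is "oc project"); set the namespace on the current context instead
kubectl config set-context --current --namespace=default
kubectl apply -f https://github.com/startxfr/tekton-catalog/raw/stable/task/cerberus-check/0.1/samples/pipelinerun.yaml
# Multiple resource types must be comma-separated
kubectl get pipelinerun,taskrun,pod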
```
31 changes: 31 additions & 0 deletions docs/index.md
@@ -0,0 +1,31 @@
# Cerberus watchdog Guide

## Table of Contents
- [Cerberus watchdog Guide](#cerberus-watchdog-guide)
- [Table of Contents](#table-of-contents)
- [Introduction](#introduction)
- [Tooling](#tooling)
- [Workflow](#workflow)

## Introduction

A key point of chaos infrastructure testing is obtaining a reliable health status for the targeted cluster.
Cerberus is the central component that regularly observes various core components of the targeted cluster and returns an up-to-date
signal of the global health of your cluster.

For more detail about chaos challenges, read the [cerberus introduction to chaos testing](https://github.com/chaos-kubox/krkn/blob/main/docs/index.md#introduction)

## Tooling

This section explains how [cerberus](https://github.com/chaos-kubox/cerberus), a cluster watchdog, helps test the global health state of OpenShift by tracking state changes and returning an updated global health signal.

## Workflow

Let us start with the Cerberus workflow. The user runs Cerberus pointed at a specific OpenShift cluster, using a kubeconfig so that it can talk to the platform on top of which the cluster is hosted. This can be done through either the oc/kubectl API or the cloud API. Based on its configuration, Cerberus will [watch for nodes](https://github.com/startxfr/cerberus/blob/main/docs/config.md#watch-nodes),
[watch for cluster operators](https://github.com/startxfr/cerberus/blob/main/docs/config.md#watch-cluster-operators),
Collaborator

There is a list here of components that cerberus can watch. Could you update that checkbox list with links to the docs/config section for each? Please be sure to link to the main chaos-kubox repo, not your own. Thanks.

[watch for master schedulable status](https://github.com/startxfr/cerberus/blob/main/docs/config.md#watch-master-schedulable-status),
[watch for defined namespaces](https://github.com/startxfr/cerberus/blob/main/docs/config.md#watch-namespaces) and
[watch for defined routes](https://github.com/startxfr/cerberus/blob/main/docs/config.md#watch-routes).
According to the results of these checks, Cerberus returns a go/no-go signal representing the overall health of the cluster.

![Cerberus workflow](../media/cerberus-workflow.png)
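
To make the go/no-go idea concrete, here is a minimal sketch of how several watch results could be folded into one signal. It is an illustration only and not how Cerberus is implemented; the function name `overall_signal` and the `ok` result label are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical illustration, not Cerberus code.
# Print "True" only if every check result passed as an argument is "ok";
# a single failing check turns the whole signal into "False".
overall_signal() {
  local result
  for result in "$@"; do
    if [ "$result" != "ok" ]; then
      echo "False"
      return 0
    fi
  done
  echo "True"
}
```

Calling `overall_signal ok ok failed` prints `False`, mirroring the idea that one unhealthy component is enough for a no-go.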
63 changes: 61 additions & 2 deletions docs/installation.md
@@ -76,5 +76,64 @@ The go/no-go signal ( True or False ) gets published at http://`<hostname>`:8080
If you want to build your own Cerberus image, see [here](https://github.com/cloud-bulldozer/cerberus/tree/master/containers/build_own_image-README.md).
To run Cerberus on Power (ppc64le) architecture, build and run a containerized version by following the instructions given [here](https://github.com/cloud-bulldozer/cerberus/tree/master/containers/build_own_image-README.md).

## Run containerized Cerberus as a Kubernetes/OpenShift deployment
Refer to the [instructions](https://github.com/openshift-scale/cerberus/blob/master/containers/README.md#cerberus-as-a-kubernetesopenshift-application) for information on how to run cerberus as a Kubernetes or OpenShift application.
## Run in Kubernetes/OpenShift

### Using a deployment

To run containerized Cerberus as a Kubernetes/OpenShift Deployment, follow these steps:
1. Configure the [config.yaml](https://github.com/openshift-scale/cerberus/tree/master/config) file according to your requirements.
2. Create a namespace under which you want to run the cerberus pod using `kubectl create ns <namespace>`.
3. Switch to `<namespace>` namespace:
    - In Kubernetes, use `kubectl config set-context --current --namespace=<namespace>`
    - In OpenShift, use `oc project <namespace>`
4. Create a ConfigMap named kube-config using `kubectl create configmap kube-config --from-file=<path_to_kubeconfig>`
5. Create a ConfigMap named cerberus-config using `kubectl create configmap cerberus-config --from-file=<path_to_cerberus_config>`
6. Create a serviceaccount to run the cerberus pod with privileges using `kubectl create serviceaccount useroot`.
    - In OpenShift, execute `oc adm policy add-scc-to-user privileged -z useroot`.
7. Create a Deployment and a NodePort Service using `kubectl apply -f cerberus.yml`
8. Accessing the go/no-go signal:
    - In Kubernetes, execute `kubectl port-forward --address 0.0.0.0 pod/<cerberus_pod_name> 8080:8080` and access the signal at `http://localhost:8080` and `http://<hostname>:8080`.
    - In OpenShift, create a route based on service cerberus-service using `oc expose service cerberus-service`. List all the routes using `oc get routes`. Use HOST/PORT associated with cerberus-service to access the signal.

NOTE: It is not recommended to run Cerberus internal to the cluster as the pod which is running Cerberus might get disrupted.

### Using a helm-chart

The [chaos-cerberus](https://artifacthub.io/packages/helm/startx/chaos-cerberus) `helm-chart`,
available on [artifacthub.io](https://artifacthub.io/packages/search?kind=0&ts_query_web=cerberus),
can be used to deploy a cerberus server.

The default configuration creates the following resources:

- 1 project named **chaos-cerberus**
- 1 scc with privileged context for **cerberus** deployment
- 1 configmap named **cerberus-config** with cerberus configuration
- 1 configmap named **cerberus-kubeconfig** with kubeconfig of the targeted cluster
- 2 networkpolicy to allow kraken and route to consume the signal
- 1 deployment named **cerberus**
- 1 service to the cerberus pods
- 1 route to the cerberus service

```bash
# Install the startx helm repository
helm repo add startx https://startxfr.github.io/helm-repository/packages/
# Install the cerberus project
helm install --set project.enabled=true chaos-cerberus-project startx/chaos-cerberus
# Deploy the cerberus instance
helm install \
  --set cerberus.enabled=true \
  --set cerberus.kraken_allowed=true \
  --set cerberus.kraken_ns="chaos-kraken" \
  --set cerberus.kubeconfig.token.server="https://api.mycluster:6443" \
  --set cerberus.kubeconfig.token.token="sha256~XXXXXXXXXX_PUT_YOUR_TOKEN_HERE_XXXXXXXXXXXX" \
  -n chaos-cerberus \
  chaos-cerberus-instance startx/chaos-cerberus
```

Refer to the [chaos-cerberus chart manpage](https://artifacthub.io/packages/helm/startx/chaos-cerberus)
and especially the [cerberus configuration values](https://artifacthub.io/packages/helm/startx/chaos-cerberus#chaos-cerberus-values-dictionary)
for details on how to configure this chart.

## Consuming the cerberus signal

You can find various examples on the [consume signal page](./consume-signal.md).