Containerize kraken

This commit adds support for running the tool as a container on the
host with access to the kubeconfig, for better portability. The plan
is to trigger regular image builds on quay.io so that the image always
has the latest code.
Naga Ravi Chaitanya Elluri
2020-04-20 18:00:52 -04:00
parent f1c145e942
commit eec52cf613
5 changed files with 49 additions and 2 deletions

@@ -33,11 +33,26 @@ tunings:
$ python3 run_kraken.py --config <config_file_location>
```
#### Run containerized version
Assuming a recent Docker (17.05 or later, with multi-stage build support) is installed on the host, run:
```
$ docker pull quay.io/openshift-scale/kraken:latest
$ docker run --name=kraken --net=host -v <path_to_kubeconfig>:/root/.kube/config -v <path_to_kraken_config>:/root/kraken/config/config.yaml -d quay.io/openshift-scale/kraken:latest
$ docker logs -f kraken
```
Similarly, podman can be used to achieve the same:
```
$ podman pull quay.io/openshift-scale/kraken
$ podman run --name=kraken --net=host -v <path_to_kubeconfig>:/root/.kube/config:Z -v <path_to_kraken_config>:/root/kraken/config/config.yaml:Z -d quay.io/openshift-scale/kraken:latest
$ podman logs -f kraken
```
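The two invocations above differ only in the engine name and the `:Z` suffix on the volume mounts (used in the podman example for SELinux relabeling). A minimal sketch of a helper that assembles either command is shown below; the function name `build_run_command` and the example host paths are hypothetical and not part of kraken.

```python
import shlex


def build_run_command(engine, kubeconfig, kraken_config,
                      image="quay.io/openshift-scale/kraken:latest",
                      selinux_relabel=False):
    """Return the argv list that starts the kraken container detached.

    engine          -- "docker" or "podman"
    kubeconfig      -- host path to the kubeconfig file
    kraken_config   -- host path to the kraken config.yaml
    selinux_relabel -- append ":Z" to each mount, as in the podman example
    """
    suffix = ":Z" if selinux_relabel else ""
    return [
        engine, "run", "--name=kraken", "--net=host",
        "-v", f"{kubeconfig}:/root/.kube/config{suffix}",
        "-v", f"{kraken_config}:/root/kraken/config/config.yaml{suffix}",
        "-d", image,
    ]


if __name__ == "__main__":
    # Hypothetical host paths, for illustration only.
    cmd = build_run_command("podman", "/home/user/.kube/config",
                            "/home/user/kraken/config.yaml",
                            selinux_relabel=True)
    # Print a shell-safe version of the command instead of running it.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list form can be passed directly to `subprocess.run`, which avoids shell quoting problems when a path contains spaces.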
#### Report
The report is generated in the run directory and contains information about each chaos scenario injection, along with timestamps.
#### Checking if the cluster is sane after failures injection
[Cerberus](https://github.com/openshift-scale/cerberus) can be used to monitor the cluster under test, and the aggregated go/no-go signal it generates can be consumed by Kraken to determine pass/fail, i.e., to make sure the Kubernetes/OpenShift cluster recovered after the failure injection. It is highly recommended to turn on the Cerberus health check feature available in Kraken after installing and setting up Cerberus. To do that, set cerberus_enabled to True and cerberus_url to the URL where Cerberus publishes the go/no-go signal in the config file.
#### Cerberus to help with cluster health checks
[Cerberus](https://github.com/openshift-scale/cerberus) can be used to monitor the cluster under test, and the aggregated go/no-go signal it generates can be consumed by Kraken to determine pass/fail. This makes sure the Kubernetes/OpenShift environment is healthy at the cluster level rather than only at the level of the targeted components. It is highly recommended to turn on the Cerberus health check feature available in Kraken after installing and setting up Cerberus. To do that, set cerberus_enabled to True and cerberus_url to the URL where Cerberus publishes the go/no-go signal in the config file.
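As a rough illustration of how a go/no-go signal could be consumed, here is a minimal sketch using only the standard library. It assumes, without confirmation from this document, that the URL returns the literal text `True` or `False`, and that `cerberus_enabled` and `cerberus_url` are available as top-level keys of a dictionary. The function names are hypothetical and this is not kraken's actual implementation.

```python
import urllib.request


def parse_signal(body):
    """Interpret the published text; anything other than 'True' is a no-go.

    Assumption: Cerberus publishes the literal string 'True' or 'False'.
    """
    return body.strip() == "True"


def cerberus_go(cerberus_url, timeout=10):
    """Fetch the signal from cerberus_url and return True for go."""
    with urllib.request.urlopen(cerberus_url, timeout=timeout) as resp:
        return parse_signal(resp.read().decode("utf-8"))


def cluster_is_healthy(config):
    """Return True when the check is disabled or Cerberus reports go.

    Assumption: 'config' is a flat dict holding the two settings named
    in the text above; the real config file layout may differ.
    """
    if not config.get("cerberus_enabled", False):
        return True
    return cerberus_go(config["cerberus_url"])
```

Treating every unexpected response as a no-go is a deliberately conservative choice: a misconfigured URL then fails the run instead of silently passing it.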
### Kubernetes/OpenShift chaos scenarios supported
The following are the Kubernetes/OpenShift components for which a basic chaos scenario config exists today. Kraken currently supports only pod-based scenarios; more will be added soon. Adding a new pod-based scenario is as simple as adding a new config under the scenarios directory and defining it in the config.

containers/Dockerfile Normal file

@@ -0,0 +1,23 @@
# Dockerfile for kraken
FROM quay.io/openshift/origin-tests:latest as origintests
FROM centos:7
MAINTAINER Red Hat OpenShift Performance and Scale
ENV KUBECONFIG /root/.kube/config
# Copy OpenShift CLI, Kubernetes CLI from origin-tests image
COPY --from=origintests /usr/bin/oc /usr/bin/oc
COPY --from=origintests /usr/bin/kubectl /usr/bin/kubectl
# Install dependencies
RUN yum install -y git python36 python3-pip && \
    git clone https://github.com/openshift-scale/kraken /root/kraken && \
    mkdir -p /root/.kube && cd /root/kraken && \
    pip3 install -r requirements.txt
WORKDIR /root/kraken
ENTRYPOINT python3 run_kraken.py --config=config/config.yaml

containers/README.md Normal file

@@ -0,0 +1,6 @@
### Kraken image
The container image is built automatically by quay.io and is available at [Kraken image](https://quay.io/repository/openshift-scale/kraken).
### Run containerized version
Refer to the [instructions](https://github.com/openshift-scale/kraken#Run-containerized-version) for information on how to run the containerized version of kraken.

@@ -1,3 +1,4 @@
datetime
pyfiglet
powerfulseal
requests

@@ -58,6 +58,8 @@ def main(cfg):
if not cerberus_status:
logging.error("Received a no-go signal from Cerberus, looks like the cluster is unhealthy. Please check the Cerberus report for more details. Test failed.")
sys.exit(1)
else:
logging.info("Received a go signal from Cerberus, the cluster is healthy. Test passed.")
except Exception as e:
logging.error("Failed to run scenario: %s. Encountered the following exception: %s" %(scenario, e))
else: