mirror of
https://github.com/krkn-chaos/krkn.git
synced 2026-02-14 18:10:00 +00:00
Adding server set up for kraken
committed by
Naga Ravi Chaitanya Elluri
parent
d3f8e2dd35
commit
6b865fc573
@@ -65,6 +65,13 @@ It's important to make sure to check if the targeted component recovered from th
- Leveraging [Cerberus](https://github.com/openshift-scale/cerberus) to monitor the cluster under test and consuming the aggregated go/no-go signal to determine pass/fail post chaos. It is highly recommended to turn on the Cerberus health check feature available in Kraken. Instructions on installing and setting up Cerberus can be found [here](https://github.com/openshift-scale/cerberus#installation), or it can be installed from Kraken using these [instructions](https://github.com/cloud-bulldozer/kraken#setting-up-infrastructure-dependencies). Once Cerberus is up and running, set `cerberus_enabled` to True and `cerberus_url` to the URL where Cerberus publishes the go/no-go signal in the Kraken config file. Cerberus can also monitor [application routes](https://github.com/cloud-bulldozer/cerberus/blob/master/docs/config.md#watch-routes) during the chaos and fail the run if it encounters downtime, since the same downtime would likely be seen in customer and user environments. This is especially important during control plane chaos scenarios, including the API server, Etcd, Ingress, etc. It can be enabled by setting `check_applicaton_routes: True` in the [Kraken config](https://github.com/cloud-bulldozer/kraken/blob/master/config/config.yaml), provided application routes are being monitored in the [cerberus config](https://github.com/cloud-bulldozer/kraken/blob/master/config/cerberus.yaml).
- Leveraging the [kube-burner](docs/alerts.md) alerting feature to fail runs in case of critical alerts.
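
As a rough illustration of consuming the go/no-go signal from an external script, the sketch below polls the URL configured as `cerberus_url` and maps the response to pass/fail. It assumes the endpoint returns the plain text `True` or `False`. The function names `parse_go_no_go` and `cluster_is_healthy` are hypothetical and not part of Kraken or Cerberus.

```python
from urllib.request import urlopen


def parse_go_no_go(body: str) -> bool:
    """Map the signal body to a boolean.

    Assumption: the body is the text "True" or "False".
    Anything else is treated as no-go to stay on the safe side.
    """
    return body.strip().lower() == "true"


def cluster_is_healthy(cerberus_url: str, timeout: float = 10.0) -> bool:
    """Fetch the signal from cerberus_url; an unreachable endpoint counts as no-go."""
    try:
        with urlopen(cerberus_url, timeout=timeout) as response:
            return parse_go_no_go(response.read().decode("utf-8"))
    except OSError:
        # URLError and HTTPError are subclasses of OSError.
        return False
```

A CI job could call `cluster_is_healthy(...)` after the chaos run and exit non-zero when it returns `False`. Treating an unreachable endpoint as no-go is a design choice of this sketch, not documented Kraken behavior.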
### Signaling
In CI runs or any external job, it is useful to stop Kraken once a certain test or state is reached. We created a way to signal Kraken to pause the chaos or stop it completely using a signal posted to a port of your choice.

For example, if a test run is loading the cluster while Kraken runs separately, the Kraken run can be started or stopped based on when the test run completes or reaches a certain load state.

More detailed information on enabling and leveraging this feature can be found [here](docs/signal.md).
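
To make the idea concrete, here is a minimal sketch of an external job posting a signal to Kraken. It assumes Kraken listens on a configurable host and port and accepts a POST whose path names the desired state. The state names `RUN`, `PAUSE`, and `STOP`, the URL layout, and port `8081` are assumptions to verify against [docs/signal.md](docs/signal.md). The helpers `build_signal_request` and `send_signal` are hypothetical.

```python
from urllib.request import Request, urlopen

# Assumed state names; confirm against docs/signal.md before relying on them.
VALID_STATES = {"RUN", "PAUSE", "STOP"}


def build_signal_request(host: str, port: int, state: str) -> Request:
    """Build (but do not send) a POST request naming the desired state in the path."""
    state = state.upper()
    if state not in VALID_STATES:
        raise ValueError(f"unsupported state: {state}")
    return Request(f"http://{host}:{port}/{state}", data=b"", method="POST")


def send_signal(host: str, port: int, state: str, timeout: float = 10.0) -> int:
    """Send the signal and return the HTTP status code."""
    with urlopen(build_signal_request(host, port, state), timeout=timeout) as response:
        return response.status
```

For instance, a load test could call `send_signal("127.0.0.1", 8081, "STOP")` once it finishes, so the chaos run ends together with the test.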
### Performance monitoring
Monitoring the Kubernetes/OpenShift cluster to observe the impact of Kraken chaos scenarios on various components is key to finding bottlenecks, because the cluster needs to stay healthy in terms of both recovery and performance during and after the failure is injected. Instructions on enabling it can be found [here](docs/performance_dashboards.md).