Add support to deploy performance dashboards

This commit enables performance monitoring on the cluster while
Kraken is running, making it possible to observe how the cluster
reacts to failures. It is important to verify that the cluster is
healthy in terms of both recovery and performance.
Naga Ravi Chaitanya Elluri
2021-02-07 17:50:05 -05:00
parent a42adf89e8
commit a7e28ca490
9 changed files with 53 additions and 2 deletions

@@ -22,8 +22,12 @@ cerberus:
    cerberus_enabled: False # Enable it when cerberus is previously installed
    cerberus_url: # When cerberus_enabled is set to True, provide the url where cerberus publishes go/no-go signal
performance_monitoring:
    deploy_dashboards: False # Install a mutable grafana and load the performance dashboards. Enable this only when running on OpenShift
    repo: "https://github.com/cloud-bulldozer/performance-dashboards.git"
tunings:
    wait_duration: 60 # Duration to wait between each chaos scenario
    iterations: 1 # Number of times to execute the scenarios
    daemon_mode: False # Iterations are set to infinity which means that the kraken will cause chaos forever
```
```

@@ -21,6 +21,8 @@ $ cd kraken
$ pip3 install -r requirements.txt
```
**NOTE**: Make sure python3-devel is installed on the system.
#### Run
```
$ python3 run_kraken.py --config <config_file_location>

@@ -0,0 +1,12 @@
## Performance dashboards
Kraken supports installing a mutable Grafana instance on the cluster with dashboards preloaded to help monitor the cluster: resource usage (to find outliers), API stats, etcd health, critical alerts, and so on. To deploy it, enable the following in the config:
```
performance_monitoring:
    deploy_dashboards: True
```
The route and credentials for accessing the dashboards are printed to stdout before Kraken starts creating chaos. The dashboards can be edited to include your own queries of interest.
**NOTE**: The dashboards rely on Prometheus to scrape metrics from the cluster. Only OpenShift is currently supported, because Prometheus is set up on the cluster by default there and the deployment uses the route object to expose the Grafana dashboards externally.
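
As a rough illustration of how the `performance_monitoring` section might be read once the config has been parsed, here is a minimal sketch. The helper name `dashboard_settings`, the `DEFAULT_REPO` constant, and the fallback behaviour are assumptions made for this example; they are not taken from Kraken's source.
```python
# Hypothetical helper, not Kraken's actual code: it only shows how the
# performance_monitoring keys could be interpreted after YAML parsing.

# Assumption: fall back to the repo URL shown in the sample config.
DEFAULT_REPO = "https://github.com/cloud-bulldozer/performance-dashboards.git"


def dashboard_settings(config):
    """Return (deploy, repo) from a parsed config dictionary."""
    # A missing or empty section is treated as "dashboards disabled".
    section = config.get("performance_monitoring") or {}
    deploy = bool(section.get("deploy_dashboards", False))
    repo = section.get("repo") or DEFAULT_REPO
    return deploy, repo


# This dictionary mirrors what a YAML parser would produce for the
# config snippet shown earlier in this document.
example_config = {"performance_monitoring": {"deploy_dashboards": True}}

deploy, repo = dashboard_settings(example_config)
print(deploy, repo)
```
Treating a missing section as disabled keeps older config files working without changes, which matches the `deploy_dashboards: False` default in the sample config.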