Adding getting started docs

This commit is contained in:
prubenda
2021-02-25 10:06:10 -05:00
committed by Naga Ravi Chaitanya Elluri
parent 41bf815f98
commit 5456fce924
5 changed files with 96 additions and 17 deletions


@@ -7,14 +7,17 @@ Kraken injects deliberate failures into Kubernetes/OpenShift clusters to check i
![Kraken workflow](media/kraken-workflow.png)
### Installation and usage
### How to Get Started
Instructions on how to setup, configure and run Kraken can be found at [Installation](docs/installation.md).
See the [getting started doc](docs/getting_started.md) for guidance on getting started with your own custom scenario or on editing current scenarios for your specific usage
After installation, refer back to the sections below for the supported scenarios and how to tweak the Kraken config to load them on your cluster
### Config
Instructions on how to setup the config and the options supported can be found at [Config](docs/config.md).
### Kubernetes/OpenShift chaos scenarios supported
Kraken supports pod, node, time/date and [litmus](https://github.com/litmuschaos/litmus) based scenarios.


@@ -47,6 +47,7 @@ $ git rebase -i <commit_id_of_first_change_commit>
In the interactive rebase screen, set the first commit to `pick` and all others to `squash` (or whatever else you may need to do).
Push your rebased commits (you may need to force), then issue your PR.
```

docs/getting_started.md Normal file

@@ -0,0 +1,81 @@
## Getting Started Running Chaos Scenarios
#### Adding New Scenarios
Adding a new scenario is as simple as adding a new config file under [scenarios directory](https://github.com/cloud-bulldozer/kraken/tree/master/scenarios) and defining it in the main kraken [config](https://github.com/cloud-bulldozer/kraken/blob/master/config/config.yaml#L8).
You can either copy an existing yaml file and make it your own, or fill in one of the templates below to suit your needs.
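As a rough sketch of that second step (the exact layout should be checked against the linked config; the scenario file name here is a hypothetical placeholder), a new pod scenario file is registered under the `chaos_scenarios` section of the main config:

```
kraken:
    chaos_scenarios:
        # Each scenario type lists the yaml files to run
        - pod_scenarios:
            - - scenarios/my_app_pod.yaml    # hypothetical new scenario file
```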
### Templates
#### Pod Scenario Yaml Template
For example, to add a pod-level scenario for a new application, refer to the sample scenario below to see which fields are necessary and what to add in each location:
```
config:
runStrategy:
runs: <number of times to execute the scenario>
    # Kraken waits a random number of seconds between min and max before each run
maxSecondsBetweenRuns: 30
minSecondsBetweenRuns: 1
scenarios:
- name: "delete pods example"
steps:
- podAction:
matches:
- labels:
namespace: "<namespace>"
selector: "<pod label>" #this can be left blank
filters:
- randomSample:
size: <number of pods to kill>
actions:
- kill:
probability: 1
force: true
- podAction:
matches:
- labels:
namespace: "<namespace>"
selector: "<pod label>" #this can be left blank
retries:
retriesTimeout:
        # How long to keep retrying before failing if the pod count doesn't match the expected value
timeout: 180
actions:
- checkPodCount:
          count: <expected number of pods that match namespace and label>
```
More information on specific items that you can add to the pod-killing scenarios can be found in the [powerfulseal policies](https://powerfulseal.github.io/powerfulseal/policies) documentation.
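The `runStrategy` section above makes the runner sleep a random number of seconds between runs. A minimal sketch of that wait logic (a hypothetical helper for illustration, not Kraken's actual implementation):

```
import random
import time

def wait_between_runs(min_seconds, max_seconds, sleep=time.sleep):
    """Sleep a random duration between the configured min and max,
    mirroring minSecondsBetweenRuns/maxSecondsBetweenRuns."""
    duration = random.uniform(min_seconds, max_seconds)
    sleep(duration)
    return duration

# With the template values: a wait between 1 and 30 seconds
delay = wait_between_runs(1, 30, sleep=lambda _: None)  # no-op sleep for illustration
```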
#### Node Scenario Yaml Template
```
node_scenarios:
- actions: # node chaos scenarios to be injected
- <chaos scenario>
- <chaos scenario>
node_name: <node name> # can be left blank
label_selector: <node label>
  instance_kill_count: <number of nodes to perform action on>
timeout: <duration to wait for completion>
cloud_type: <cloud provider>
```
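For instance, a filled-in version of the template might look like the following (all values are illustrative, and the action name must match one that Kraken actually supports):

```
node_scenarios:
  - actions:
      - node_stop_start_scenario
    node_name:                                     # left blank; label_selector is used instead
    label_selector: node-role.kubernetes.io/worker
    instance_kill_count: 1
    timeout: 120
    cloud_type: aws
```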
#### Time Chaos Scenario Template
```
time_scenarios:
- action: 'skew_time' or 'skew_date'
object_type: 'pod' or 'node'
label_selector: <label of pod or node>
```
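For example, a scenario that skews the time on worker nodes (the label value is illustrative) would be:

```
time_scenarios:
  - action: 'skew_time'
    object_type: 'node'
    label_selector: node-role.kubernetes.io/worker
```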
### Common Scenario Edits
If you just want to make small changes to pre-existing scenarios, feel free to edit the scenario file itself.
#### Example of Quick Pod Scenario Edit:
If you want to kill 2 pods instead of 1 in any of the pre-existing scenarios, you can either edit the number located at `filters -> randomSample -> size` or the `runs` value under the `config -> runStrategy` section.
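Using the pod template above, that edit would look like this (surrounding keys reproduced only for context):

```
      filters:
        - randomSample:
            size: 2    # was 1
```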
#### Example of a Quick Node Scenario Edit:
If your cluster is built on GCP instead of AWS, just change the cloud type in the node_scenarios_example.yml file.
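Assuming the scenario file follows the node template above, that change is a single line:

```
  cloud_type: gcp    # changed from aws
```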


@@ -1,17 +1,14 @@
### Pod Scenarios
Kraken consumes [Powerfulseal](https://github.com/powerfulseal/powerfulseal) under the hood to run the pod scenarios.
These scenarios are in a simple yaml format that you can manipulate to run your specific tests, or you can use the pre-existing scenarios to see how it works.
#### Pod chaos scenarios
Following are the components of Kubernetes/OpenShift for which a basic chaos scenario config exists today. Adding a new pod based scenario is as simple as adding a new config under scenarios directory and defining it in the config.
For example, for adding a pod level scenario for a custom application, refer to the sample scenario provided in the scenarios directory (scenarios/customapp_pod.yaml).
The following are the components of Kubernetes/OpenShift for which a basic chaos scenario config exists today.
Component | Description | Working
------------------------ | ---------------------------------------------------------------------------------------------------| ------------------------- |
Etcd | Kills a single/multiple etcd replicas for the specified number of times in a loop | :heavy_check_mark: |
Kube ApiServer | Kills a single/multiple kube-apiserver replicas for the specified number of times in a loop | :heavy_check_mark: |
ApiServer | Kills a single/multiple apiserver replicas for the specified number of times in a loop | :heavy_check_mark: |
Prometheus | Kills a single/multiple prometheus replicas for the specified number of times in a loop | :heavy_check_mark: |
OpenShift System Pods | kills random pods running in the OpenShift system namespaces | :heavy_check_mark: |
**NOTE**: Refer to [Writing policies](https://powerfulseal.github.io/powerfulseal/policies) for more information on how to write new scenarios.
[Etcd](https://github.com/cloud-bulldozer/kraken/blob/master/scenarios/etcd.yml) | Kills a single/multiple etcd replicas for the specified number of times in a loop | :heavy_check_mark: |
[Kube ApiServer](https://github.com/cloud-bulldozer/kraken/blob/master/scenarios/openshift-kube-apiserver.yml) | Kills a single/multiple kube-apiserver replicas for the specified number of times in a loop | :heavy_check_mark: |
[ApiServer](https://github.com/cloud-bulldozer/kraken/blob/master/scenarios/openshift-apiserver.yml) | Kills a single/multiple apiserver replicas for the specified number of times in a loop | :heavy_check_mark: |
[Prometheus](https://github.com/cloud-bulldozer/kraken/blob/master/scenarios/prometheus.yml) | Kills a single/multiple prometheus replicas for the specified number of times in a loop | :heavy_check_mark: |
[OpenShift System Pods](https://github.com/cloud-bulldozer/kraken/blob/master/scenarios/regex_openshift_pod_kill.yml) | Kills random pods running in the OpenShift system namespaces | :heavy_check_mark: |


@@ -115,10 +115,7 @@ def main(cfg):
# Inject pod chaos scenarios specified in the config
if scenario_type == "pod_scenarios":
             failed_post_scenarios = pod_scenarios.run(
-                kubeconfig_path, scenarios_list, config, failed_post_scenarios, wait_duration,
+                kubeconfig_path, scenarios_list, config, failed_post_scenarios, wait_duration
             )
# Inject node chaos scenarios specified in the config
@@ -134,7 +131,7 @@ def main(cfg):
common_litmus.deploy_all_experiments(litmus_version)
litmus_installed = True
litmus_namespaces = common_litmus.run(
-                scenarios_list, config, litmus_namespaces, litmus_uninstall, wait_duration,
+                scenarios_list, config, litmus_namespaces, litmus_uninstall, wait_duration
)
elif scenario_type == "cluster_shut_down_scenarios":
shut_down.run(scenarios_list, config, wait_duration)