Modify documentation to improve readability

This commit:
- Converts various sections in the readme into individual documents.
- Adds pointers to the public blogs.
- Updates workflow/architecture diagram.
- Adds community info and contributing guidelines.
This commit is contained in:
Naga Ravi Chaitanya Elluri
2020-10-10 15:40:15 -04:00
parent d0f7b95814
commit 82743230fe
7 changed files with 153 additions and 96 deletions

21
docs/config.md Normal file
View File

@@ -0,0 +1,21 @@
### Config
Set the scenarios to inject and the tunings like duration to wait between each scenario in the config file located at config/config.yaml. A sample config looks like:
```
kraken:
kubeconfig_path: /root/.kube/config # Path to kubeconfig
scenarios: # List of policies/chaos scenarios to load
- scenarios/etcd.yml
- scenarios/openshift-kube-apiserver.yml
- scenarios/openshift-apiserver.yml
node_scenarios: # List of chaos node scenarios to load
- scenarios/node_scenarios_example.yml
cerberus:
cerberus_enabled: False # Enable it when cerberus is previously installed
cerberus_url: # When cerberus_enabled is set to True, provide the url where cerberus publishes go/no-go signal
tunings:
wait_duration: 60 # Duration to wait between each chaos scenario
iterations: 1 # Number of times to execute the scenarios
daemon_mode: False # Iterations are set to infinity which means that the kraken will cause chaos forever

48
docs/installation.md Normal file
View File

@@ -0,0 +1,48 @@
## Installation
Following ways are supported to run Kraken:
- Standalone python program through Git
- Containerized version using either Podman or Docker as the runtime
- Kubernetes or OpenShift deployment
**NOTE**: It is recommended to run Kraken external to the cluster ( Standalone or Containerized ) hitting the Kubernetes/OpenShift API as running it internal to the cluster might be disruptive to itself and also might not report back the results if the chaos leads to cluster's API server instability.
### Git
#### Clone the repository
```
$ git clone https://github.com/openshift-scale/kraken.git
$ cd kraken
```
#### Install the dependencies
```
$ pip3 install -r requirements.txt
```
#### Run
```
$ python3 run_kraken.py --config <config_file_location>
```
### Run containerized version
Assuming that the latest docker ( 17.05 or greater with multi-build support ) is intalled on the host, run:
```
$ docker pull quay.io/openshift-scale/kraken:latest
$ docker run --name=kraken --net=host -v <path_to_kubeconfig>:/root/.kube/config -v <path_to_kraken_config>:/root/kraken/config/config.yaml -d quay.io/openshift-scale/kraken:latest
$ docker logs -f kraken
```
Similarly, podman can be used to achieve the same:
```
$ podman pull quay.io/openshift-scale/kraken
$ podman run --name=kraken --net=host -v <path_to_kubeconfig>:/root/.kube/config:Z -v <path_to_kraken_config>:/root/kraken/config/config.yaml:Z -d quay.io/openshift-scale/kraken:latest
$ podman logs -f kraken
```
If you want to build your own kraken image see [here](https://github.com/openshift-scale/kraken/tree/master/containers/build_own_image-README.md)
### Run Kraken as a Kubernetes deployment
Refer [Instructions](https://github.com/openshift-scale/kraken/blob/master/containers/README.md) on how to deploy and run Kraken as a Kubernetes/OpenShift deployment.

43
docs/node_scenarios.md Normal file
View File

@@ -0,0 +1,43 @@
### Node Scenarios
Following node chaos scenarios are supported:
1. **node_start_scenario**: scenario to stop the node instance.
2. **node_stop_scenario**: scenario to stop the node instance.
3. **node_stop_start_scenario**: scenario to stop and then start the node instance.
4. **node_termination_scenario**: scenario to terminate the node instance.
5. **node_reboot_scenario**: scenario to reboot the node instance.
6. **stop_kubelet_scenario**: scenario to stop the kubelet of the node instance.
7. **stop_start_kubelet_scenario**: scenario to stop and start the kubelet of the node instance.
8. **node_crash_scenario**: scenario to crash the node instance.
**NOTE**: If the node doesn't recover from the node_crash_scenario injection, reboot the node to get it back to Ready state.
**NOTE**: node_start_scenario, node_stop_scenario, node_stop_start_scenario, node_termination_scenario, node_reboot_scenario and stop_start_kubelet_scenario are supported only on AWS as of now.
**NOTE**: AWS is the only cloud platform supported as of today but we are looking into adding more. Make sure [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) is installed.
**NOTE**: The `stop_start_kubelet_scenario` and `node_crash_scenario` scenarios are supported as they are independent of the cloud platform.
Node scenarios can be injected by placing the node scenarios config files under node_scenarios option in the kraken config. Refer to [node_scenarios_example](https://github.com/openshift-scale/kraken/blob/master/scenarios/node_scenarios_example.yml) config file.
```
node_scenarios:
- actions: # node chaos scenarios to be injected
- node_stop_start_scenario
- stop_start_kubelet_scenario
- node_crash_scenario
node_name: # node on which scenario has to be injected
label_selector: node-role.kubernetes.io/worker # when node_name is not specified, a node with matching label_selector is selected for node chaos scenario injection
instance_kill_count: 1 # number of times to inject each scenario under actions
timeout: 120 # duration to wait for completion of node scenario injection
cloud_type: aws # cloud type on which Kubernetes/OpenShift runs
- actions:
- node_reboot_scenario
node_name:
label_selector: node-role.kubernetes.io/infra
instance_kill_count: 1
timeout: 120
cloud_type: aws
```

16
docs/pod_scenarios.md Normal file
View File

@@ -0,0 +1,16 @@
### Pod Scenarios
Kraken consumes [Powerfulseal](https://github.com/powerfulseal/powerfulseal) under the hood to run the pod scenarios.
#### Pod chaos scenarios
Following are the components of Kubernetes/OpenShift for which a basic chaos scenario config exists today. Adding a new pod based scenario is as simple as adding a new config under scenarios directory and defining it in the config.
Component | Description | Working
------------------------ | ---------------------------------------------------------------------------------------------------| ------------------------- |
Etcd | Kills a single/multiple etcd replicas for the specified number of times in a loop | :heavy_check_mark: |
Kube ApiServer | Kills a single/multiple kube-apiserver replicas for the specified number of times in a loop | :heavy_check_mark: |
ApiServer | Kills a single/multiple apiserver replicas for the specified number of times in a loop | :heavy_check_mark: |
Prometheus | Kills a single/multiple prometheus replicas for the specified number of times in a loop | :heavy_check_mark: |
OpenShift System Pods | kills random pods running in the OpenShift system namespaces | :heavy_check_mark: |
**NOTE**: [Writing policies](https://powerfulseal.github.io/powerfulseal/policies) can be referred for more information on how to write new scenarios.