mirror of
https://github.com/krkn-chaos/krkn.git
synced 2026-04-15 06:57:28 +00:00
Modify documentation to improve readability
This commit:
- Converts various sections in the readme into individual documents.
- Adds pointers to the public blogs.
- Updates workflow/architecture diagram.
- Adds community info and contributing guidelines.
21
docs/config.md
Normal file
@@ -0,0 +1,21 @@
### Config

Set the scenarios to inject and the tunings, such as the duration to wait between scenarios, in the config file located at config/config.yaml. A sample config looks like:

```
kraken:
    kubeconfig_path: /root/.kube/config  # Path to kubeconfig
    scenarios:  # List of policies/chaos scenarios to load
        - scenarios/etcd.yml
        - scenarios/openshift-kube-apiserver.yml
        - scenarios/openshift-apiserver.yml
    node_scenarios:  # List of chaos node scenarios to load
        - scenarios/node_scenarios_example.yml

cerberus:
    cerberus_enabled: False  # Enable it when cerberus is previously installed
    cerberus_url:  # When cerberus_enabled is set to True, provide the url where cerberus publishes go/no-go signal

tunings:
    wait_duration: 60  # Duration to wait between each chaos scenario
    iterations: 1  # Number of times to execute the scenarios
    daemon_mode: False  # When True, iterations are set to infinity, which means that kraken will cause chaos forever
```
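
To make the interplay of the `tunings` and `cerberus` options concrete, here is a minimal Python sketch. It is not Kraken's implementation: the `config` dict stands in for whatever a YAML loader would return for the sample above, and the helper names `iteration_counter` and `validate` are hypothetical.

```python
import itertools

# Hypothetical parsed form of config/config.yaml (assumption: a YAML loader
# would return nested dicts and lists shaped like the sample config).
config = {
    "kraken": {
        "kubeconfig_path": "/root/.kube/config",
        "scenarios": ["scenarios/etcd.yml"],
        "node_scenarios": ["scenarios/node_scenarios_example.yml"],
    },
    "cerberus": {"cerberus_enabled": False, "cerberus_url": None},
    "tunings": {"wait_duration": 60, "iterations": 1, "daemon_mode": False},
}


def iteration_counter(tunings):
    """Yield run numbers: endless in daemon mode, otherwise `iterations` long."""
    if tunings.get("daemon_mode", False):
        return itertools.count(1)
    return iter(range(1, int(tunings.get("iterations", 1)) + 1))


def validate(config):
    """Collect readable problems instead of failing on the first one."""
    problems = []
    cerberus = config.get("cerberus") or {}
    if cerberus.get("cerberus_enabled") and not cerberus.get("cerberus_url"):
        problems.append("cerberus_enabled is True but cerberus_url is empty")
    tunings = config.get("tunings") or {}
    if (tunings.get("wait_duration") or 0) < 0:
        problems.append("wait_duration must not be negative")
    return problems
```

With the sample values, `validate(config)` returns an empty list and `iteration_counter(config["tunings"])` yields a single run; setting `daemon_mode` to `True` makes the counter endless regardless of `iterations`.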
48
docs/installation.md
Normal file
@@ -0,0 +1,48 @@
## Installation

Kraken can be run in the following ways:

- Standalone Python program through Git
- Containerized version using either Podman or Docker as the runtime
- Kubernetes or OpenShift deployment

**NOTE**: It is recommended to run Kraken external to the cluster (standalone or containerized), hitting the Kubernetes/OpenShift API. Running it inside the cluster might disrupt Kraken itself, and it might not report back the results if the chaos destabilizes the cluster's API server.

### Git

#### Clone the repository
```
$ git clone https://github.com/openshift-scale/kraken.git
$ cd kraken
```

#### Install the dependencies
```
$ pip3 install -r requirements.txt
```

#### Run
```
$ python3 run_kraken.py --config <config_file_location>
```

### Run containerized version
Assuming that a recent Docker (17.05 or greater, with multi-stage build support) is installed on the host, run:
```
$ docker pull quay.io/openshift-scale/kraken:latest
$ docker run --name=kraken --net=host -v <path_to_kubeconfig>:/root/.kube/config -v <path_to_kraken_config>:/root/kraken/config/config.yaml -d quay.io/openshift-scale/kraken:latest
$ docker logs -f kraken
```

Similarly, Podman can be used to achieve the same:
```
$ podman pull quay.io/openshift-scale/kraken
$ podman run --name=kraken --net=host -v <path_to_kubeconfig>:/root/.kube/config:Z -v <path_to_kraken_config>:/root/kraken/config/config.yaml:Z -d quay.io/openshift-scale/kraken:latest
$ podman logs -f kraken
```

If you want to build your own Kraken image, see [here](https://github.com/openshift-scale/kraken/tree/master/containers/build_own_image-README.md).

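
The Docker and Podman invocations above differ only in the runtime name and in the `:Z` suffix on the Podman volume mounts. As an illustration, the following Python sketch assembles either command from the same inputs. The function name `build_run_command` and the example host paths are hypothetical; the image name, flags, and mount targets are taken from the commands above.

```python
import shlex

IMAGE = "quay.io/openshift-scale/kraken:latest"


def build_run_command(runtime, kubeconfig, kraken_config, image=IMAGE):
    """Return the container run command as an argument list.

    Mirrors the documented commands: Podman mounts carry a ':Z' suffix,
    Docker mounts do not.
    """
    if runtime not in ("docker", "podman"):
        raise ValueError(f"unsupported runtime: {runtime}")
    suffix = ":Z" if runtime == "podman" else ""
    return [
        runtime, "run", "--name=kraken", "--net=host",
        "-v", f"{kubeconfig}:/root/.kube/config{suffix}",
        "-v", f"{kraken_config}:/root/kraken/config/config.yaml{suffix}",
        "-d", image,
    ]


# Example with hypothetical host paths; prints a shell-quoted command line.
print(shlex.join(build_run_command("podman", "/home/me/.kube/config", "/home/me/config.yaml")))
```

Returning a list rather than a string keeps the command safe to pass to `subprocess.run` without shell quoting concerns.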
### Run Kraken as a Kubernetes deployment
Refer to the [instructions](https://github.com/openshift-scale/kraken/blob/master/containers/README.md) on how to deploy and run Kraken as a Kubernetes/OpenShift deployment.
43
docs/node_scenarios.md
Normal file
@@ -0,0 +1,43 @@
### Node Scenarios

The following node chaos scenarios are supported:

1. **node_start_scenario**: scenario to start the node instance.
2. **node_stop_scenario**: scenario to stop the node instance.
3. **node_stop_start_scenario**: scenario to stop and then start the node instance.
4. **node_termination_scenario**: scenario to terminate the node instance.
5. **node_reboot_scenario**: scenario to reboot the node instance.
6. **stop_kubelet_scenario**: scenario to stop the kubelet of the node instance.
7. **stop_start_kubelet_scenario**: scenario to stop and start the kubelet of the node instance.
8. **node_crash_scenario**: scenario to crash the node instance.

**NOTE**: If the node doesn't recover from the node_crash_scenario injection, reboot the node to get it back to the Ready state.

**NOTE**: node_start_scenario, node_stop_scenario, node_stop_start_scenario, node_termination_scenario, node_reboot_scenario and stop_start_kubelet_scenario are supported only on AWS as of now.

**NOTE**: AWS is the only cloud platform supported as of today, but we are looking into adding more. Make sure the [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) is installed.

**NOTE**: The `stop_kubelet_scenario` and `node_crash_scenario` scenarios are supported on any platform, as they are independent of the cloud platform.

Node scenarios can be injected by placing the node scenario config files under the node_scenarios option in the Kraken config. Refer to the [node_scenarios_example](https://github.com/openshift-scale/kraken/blob/master/scenarios/node_scenarios_example.yml) config file for an example.

```
node_scenarios:
  - actions:  # node chaos scenarios to be injected
    - node_stop_start_scenario
    - stop_start_kubelet_scenario
    - node_crash_scenario
    node_name:  # node on which scenario has to be injected
    label_selector: node-role.kubernetes.io/worker  # when node_name is not specified, a node with matching label_selector is selected for node chaos scenario injection
    instance_kill_count: 1  # number of times to inject each scenario under actions
    timeout: 120  # duration to wait for completion of node scenario injection
    cloud_type: aws  # cloud type on which Kubernetes/OpenShift runs
  - actions:
    - node_reboot_scenario
    node_name:
    label_selector: node-role.kubernetes.io/infra
    instance_kill_count: 1
    timeout: 120
    cloud_type: aws
```
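
The `node_name` / `label_selector` fallback described in the config comments can be sketched in Python as follows. This is an illustration under stated assumptions, not Kraken's code: nodes are modelled as plain dicts rather than fetched from the Kubernetes API, the helper names `matches` and `candidate_nodes` are hypothetical, and only the simple `key` and `key=value` selector forms are handled.

```python
def matches(labels, selector):
    """Check a node's labels against a 'key' or 'key=value' selector."""
    key, sep, value = selector.partition("=")
    if sep:
        return labels.get(key) == value
    return key in labels


def candidate_nodes(scenario, nodes):
    """Return names of nodes a scenario entry may target.

    An explicit node_name wins; otherwise fall back to label_selector.
    """
    node_name = scenario.get("node_name")
    if node_name:
        return [n["name"] for n in nodes if n["name"] == node_name]
    selector = scenario.get("label_selector")
    if not selector:
        return []
    return [n["name"] for n in nodes if matches(n["labels"], selector)]


# Hypothetical cluster inventory used only for this example.
nodes = [
    {"name": "worker-0", "labels": {"node-role.kubernetes.io/worker": ""}},
    {"name": "worker-1", "labels": {"node-role.kubernetes.io/worker": ""}},
    {"name": "infra-0", "labels": {"node-role.kubernetes.io/infra": ""}},
]
scenario = {"node_name": None, "label_selector": "node-role.kubernetes.io/worker"}
print(candidate_nodes(scenario, nodes))
```

How a single node is then picked from the candidates (first match, random choice, or otherwise) is left open here, since the text above only states that a matching node is selected.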
16
docs/pod_scenarios.md
Normal file
@@ -0,0 +1,16 @@
### Pod Scenarios
Kraken uses [PowerfulSeal](https://github.com/powerfulseal/powerfulseal) under the hood to run the pod scenarios.


#### Pod chaos scenarios
The following are the Kubernetes/OpenShift components for which a basic chaos scenario config exists today. Adding a new pod-based scenario is as simple as adding a new config under the scenarios directory and listing it in the Kraken config.

Component                | Description                                                                                  | Working                   |
------------------------ | -------------------------------------------------------------------------------------------- | ------------------------- |
Etcd                     | Kills one or more etcd replicas for the specified number of times in a loop                  | :heavy_check_mark:        |
Kube ApiServer           | Kills one or more kube-apiserver replicas for the specified number of times in a loop        | :heavy_check_mark:        |
ApiServer                | Kills one or more apiserver replicas for the specified number of times in a loop             | :heavy_check_mark:        |
Prometheus               | Kills one or more prometheus replicas for the specified number of times in a loop            | :heavy_check_mark:        |
OpenShift System Pods    | Kills random pods running in the OpenShift system namespaces                                 | :heavy_check_mark:        |

**NOTE**: Refer to [Writing policies](https://powerfulseal.github.io/powerfulseal/policies) for more information on how to write new scenarios.
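
As a small illustration of the second half of adding a pod scenario (listing the new file in the Kraken config), the Python sketch below appends a scenario path to the `scenarios` list of an already parsed config. The helper name `register_scenario` and the path `scenarios/my_new_scenario.yml` are hypothetical, and writing the scenario policy file itself is not covered.

```python
def register_scenario(config, scenario_path):
    """Add scenario_path to config['kraken']['scenarios'] if not present."""
    kraken = config.setdefault("kraken", {})
    # An empty 'scenarios:' key in YAML typically parses to None, so normalise it.
    scenarios = kraken.get("scenarios") or []
    if scenario_path not in scenarios:
        scenarios.append(scenario_path)
    kraken["scenarios"] = scenarios
    return config


# Hypothetical parsed config and scenario file name.
config = {"kraken": {"scenarios": ["scenarios/etcd.yml"]}}
register_scenario(config, "scenarios/my_new_scenario.yml")
print(config["kraken"]["scenarios"])
```

The function is idempotent, so registering the same path twice leaves the list unchanged.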