mirror of
https://github.com/krkn-chaos/krkn.git
synced 2026-02-14 18:10:00 +00:00
Also renames retry_wait to expected_recovery_time to make it clear that the Kraken will exit 1 if the container doesn't recover within the expected time. Fixes https://github.com/redhat-chaos/krkn/issues/414
Container Scenarios
Kraken uses the oc exec command to kill specific containers in a pod.
The containers can be selected by the pod's namespace or labels. If you know the exact object you want to kill, you can also specify the container name or pod name in the scenario YAML file.
These scenarios use a simple YAML format. You can edit them to run your own tests, or run the pre-existing scenarios to see how they work.
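As a rough illustration of the idea, the sketch below assembles an `oc exec` command line for one container. It is not Kraken's actual implementation: the function name `build_kill_command`, the pod, namespace and container names, and the assumption that the scenario's action string is passed to the container verbatim are all hypothetical.

```python
import shlex


def build_kill_command(pod, namespace, container, action="kill 1"):
    """Return an argv list that runs `action` inside one container via `oc exec`.

    Assumption: the action string from the scenario file (for example
    "kill 1" or "kill 9") is executed inside the container unchanged.
    """
    return [
        "oc", "exec", pod,
        "-n", namespace,   # namespace of the target pod
        "-c", container,   # container inside that pod
        "--",              # everything after this runs in the container
        *shlex.split(action),
    ]


# Hypothetical pod, namespace and container names, for illustration only.
cmd = build_kill_command("etcd-master-0", "openshift-etcd", "etcd", action="kill 9")
print(" ".join(cmd))
```

Returning an argument list rather than a single string means the command could be handed to `subprocess.run` without going through a shell.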
Example Config
The following are the components of Kubernetes/OpenShift for which a basic chaos scenario config exists today.
scenarios:
- name: "<Name of scenario>"
  namespace: "<specific namespace>"  # can specify "*" if you want to find in all namespaces
  label_selector: "<label of pod(s)>"
  container_name: "<specific container name>"  # Optional; if omitted, kills all containers in all pods found under the namespace and label
  pod_names:  # Optional; if omitted, selects all pods with the given namespace and label
  - <pod_name>
  count: <number of containers to disrupt, default=1>
  action: <Action to run. For example kill 1 ( hang up ) or kill 9. Default is set to kill 1>
  expected_recovery_time: <number of seconds to wait for container to be running again>  # defaults to 120 seconds
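The optional fields and their defaults can be summarized with a minimal sketch. It uses a plain dictionary in place of a parsed YAML file so that it needs no third-party parser. The helper `with_defaults` and the example scenario values are hypothetical; only the default values themselves (count 1, action `kill 1`, 120 seconds) come from the config above.

```python
# Defaults as stated in the example config above.
DEFAULTS = {"count": 1, "action": "kill 1", "expected_recovery_time": 120}


def with_defaults(scenario):
    """Return a copy of one scenario entry with missing optional keys filled in."""
    merged = dict(DEFAULTS)
    merged.update(scenario)
    # container_name and pod_names are optional; None / [] stands for
    # "no filter", i.e. every container or pod matching namespace and label.
    merged.setdefault("container_name", None)
    merged.setdefault("pod_names", [])
    return merged


# Hypothetical scenario values, for illustration only.
scenario = with_defaults({
    "name": "kill-etcd-container",
    "namespace": "openshift-etcd",
    "label_selector": "k8s-app=etcd",
})
print(scenario["count"], scenario["action"], scenario["expected_recovery_time"])
```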
Post Action
In all scenarios, Kraken runs a post-chaos check to wait for and verify the affected component.
There are two options:
- Pass a custom script in the main config scenario list. The script runs before the chaos, and its output is checked again after the chaos scenario to verify that it matches.
  See scenarios/post_action_etcd_container.py for an example.
    - container_scenarios:  # List of chaos pod scenarios to load.
      - - scenarios/container_etcd.yml
        - scenarios/post_action_etcd_container.py
- Allow Kraken to wait and check the killed containers until they become ready again. Kraken keeps a list of the specific containers that were killed, along with their namespaces and pods, to verify that all affected containers recover properly.
    expected_recovery_time: <seconds to wait for container to recover>
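The wait-and-check option can be pictured as a polling loop with a time budget. The following is a sketch under stated assumptions, not Kraken's code: `wait_for_recovery`, its `is_ready` callback, and `poll_interval` are hypothetical names, and the readiness check itself is left to the caller.

```python
import time


def wait_for_recovery(is_ready, expected_recovery_time=120, poll_interval=1.0):
    """Poll is_ready() until it returns True or the time budget runs out.

    Returns True if the containers recovered in time, False otherwise.
    """
    deadline = time.monotonic() + expected_recovery_time
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


# Stand-in readiness check that succeeds on the third poll.
polls = iter([False, False, True])
recovered = wait_for_recovery(lambda: next(polls), expected_recovery_time=5, poll_interval=0)
print("recovered" if recovered else "did not recover in time")
```

A caller that wants a run to fail when recovery does not happen within expected_recovery_time could treat a False return as a failure, for example by calling `sys.exit(1)`.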