diff --git a/README.md b/README.md
index 9fe92dff..20745c6e 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # Krkn aka Kraken
-[![Docker Repository on Quay](https://quay.io/repository/redhat-chaos/krkn/status "Docker Repository on Quay")](https://quay.io/repository/redhat-chaos/krkn?tab=tags&tag=latest)
-![Workflow-Status](https://github.com/redhat-chaos/krkn/actions/workflows/docker-image.yml/badge.svg)
+[![Docker Repository on Quay](https://quay.io/repository/krkn-chaos/krkn/status "Docker Repository on Quay")](https://quay.io/repository/krkn-chaos/krkn?tab=tags&tag=latest)
+![Workflow-Status](https://github.com/krkn-chaos/krkn/actions/workflows/docker-image.yml/badge.svg)
 ![Krkn logo](media/logo.png)
@@ -79,7 +79,7 @@ Scenario type | Kubernetes
 ### Kraken scenario pass/fail criteria and report
 It is important to make sure to check if the targeted component recovered from the chaos injection and also if the Kubernetes cluster is healthy as failures in one component can have an adverse impact on other components. Kraken does this by:
 - Having built in checks for pod and node based scenarios to ensure the expected number of replicas and nodes are up. It also supports running custom scripts with the checks.
-- Leveraging [Cerberus](https://github.com/redhat-chaos/cerberus) to monitor the cluster under test and consuming the aggregated go/no-go signal to determine pass/fail post chaos. It is highly recommended to turn on the Cerberus health check feature available in Kraken. Instructions on installing and setting up Cerberus can be found [here](https://github.com/openshift-scale/cerberus#installation) or can be installed from Kraken using the [instructions](https://github.com/redhat-chaos/krkn#setting-up-infrastructure-dependencies). Once Cerberus is up and running, set cerberus_enabled to True and cerberus_url to the url where Cerberus publishes go/no-go signal in the Kraken config file. Cerberus can monitor [application routes](https://github.com/redhat-chaos/cerberus/blob/main/docs/config.md#watch-routes) during the chaos and fails the run if it encounters downtime as it is a potential downtime in a customers, or users environment as well. It is especially important during the control plane chaos scenarios including the API server, Etcd, Ingress etc. It can be enabled by setting `check_applicaton_routes: True` in the [Kraken config](https://github.com/redhat-chaos/krkn/blob/main/config/config.yaml) provided application routes are being monitored in the [cerberus config](https://github.com/redhat-chaos/krkn/blob/main/config/cerberus.yaml).
+- Leveraging [Cerberus](https://github.com/krkn-chaos/cerberus) to monitor the cluster under test and consuming the aggregated go/no-go signal to determine pass/fail post chaos. It is highly recommended to turn on the Cerberus health check feature available in Kraken. Instructions on installing and setting up Cerberus can be found [here](https://github.com/openshift-scale/cerberus#installation) or can be installed from Kraken using the [instructions](https://github.com/krkn-chaos/krkn#setting-up-infrastructure-dependencies). Once Cerberus is up and running, set cerberus_enabled to True and cerberus_url to the url where Cerberus publishes go/no-go signal in the Kraken config file. Cerberus can monitor [application routes](https://github.com/redhat-chaos/cerberus/blob/main/docs/config.md#watch-routes) during the chaos and fails the run if it encounters downtime as it is a potential downtime in a customers, or users environment as well. It is especially important during the control plane chaos scenarios including the API server, Etcd, Ingress etc. It can be enabled by setting `check_applicaton_routes: True` in the [Kraken config](https://github.com/redhat-chaos/krkn/blob/main/config/config.yaml) provided application routes are being monitored in the [cerberus config](https://github.com/redhat-chaos/krkn/blob/main/config/cerberus.yaml).
 - Leveraging built-in alert collection feature to fail the runs in case of critical alerts.
 ### Signaling
@@ -103,7 +103,7 @@ Information on enabling and leveraging this feature can be found [here](docs/SLO
 ### OCM / ACM integration
-Kraken supports injecting faults into [Open Cluster Management (OCM)](https://open-cluster-management.io/) and [Red Hat Advanced Cluster Management for Kubernetes (ACM)](https://www.redhat.com/en/technologies/management/advanced-cluster-management) managed clusters through [ManagedCluster Scenarios](docs/managedcluster_scenarios.md).
+Kraken supports injecting faults into [Open Cluster Management (OCM)](https://open-cluster-management.io/) and [Red Hat Advanced Cluster Management for Kubernetes (ACM)](https://www.krkn.com/en/technologies/management/advanced-cluster-management) managed clusters through [ManagedCluster Scenarios](docs/managedcluster_scenarios.md).
 ### Blogs and other useful resources
diff --git a/ROADMAP.md b/ROADMAP.md
index 24751600..e60d471c 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -2,14 +2,14 @@
 Following are a list of enhancements that we are planning to work on adding support in Krkn. Of course any help/contributions are greatly appreciated.
-- [ ] [Ability to run multiple chaos scenarios in parallel under load to mimic real world outages](https://github.com/redhat-chaos/krkn/issues/424)
-- [x] [Centralized storage for chaos experiments artifacts](https://github.com/redhat-chaos/krkn/issues/423)
-- [ ] [Support for causing DNS outages](https://github.com/redhat-chaos/krkn/issues/394)
-- [x] [Chaos recommender](https://github.com/redhat-chaos/krkn/tree/main/utils/chaos-recommender) to suggest scenarios having probability of impacting the service under test using profiling results
+- [ ] [Ability to run multiple chaos scenarios in parallel under load to mimic real world outages](https://github.com/krkn-chaos/krkn/issues/424)
+- [x] [Centralized storage for chaos experiments artifacts](https://github.com/krkn-chaos/krkn/issues/423)
+- [ ] [Support for causing DNS outages](https://github.com/krkn-chaos/krkn/issues/394)
+- [x] [Chaos recommender](https://github.com/krkn-chaos/krkn/tree/main/utils/chaos-recommender) to suggest scenarios having probability of impacting the service under test using profiling results
 - [ ] Chaos AI integration to improve and automate test coverage
-- [x] [Support for pod level network traffic shaping](https://github.com/redhat-chaos/krkn/issues/393)
-- [ ] [Ability to visualize the metrics that are being captured by Kraken and stored in Elasticsearch](https://github.com/redhat-chaos/krkn/issues/124)
-- [ ] Support for running all the scenarios of Kraken on Kubernetes distribution - see https://github.com/redhat-chaos/krkn/issues/185, https://github.com/redhat-chaos/krkn/issues/186
-- [ ] Continue to improve [Chaos Testing Guide](https://redhat-chaos.github.io/krkn) in terms of adding best practices, test environment recommendations and scenarios to make sure the OpenShift platform, as well the applications running on top it, are resilient and performant under chaotic conditions.
-- [ ] [Switch documentation references to Kubernetes](https://github.com/redhat-chaos/krkn/issues/495)
-- [ ] [OCP and Kubernetes functionalities segregation](https://github.com/redhat-chaos/krkn/issues/497)
+- [x] [Support for pod level network traffic shaping](https://github.com/krkn-chaos/krkn/issues/393)
+- [ ] [Ability to visualize the metrics that are being captured by Kraken and stored in Elasticsearch](https://github.com/krkn-chaos/krkn/issues/124)
+- [ ] Support for running all the scenarios of Kraken on Kubernetes distribution - see https://github.com/krkn-chaos/krkn/issues/185, https://github.com/redhat-chaos/krkn/issues/186
+- [ ] Continue to improve [Chaos Testing Guide](https://krkn-chaos.github.io/krkn) in terms of adding best practices, test environment recommendations and scenarios to make sure the OpenShift platform, as well the applications running on top it, are resilient and performant under chaotic conditions.
+- [ ] [Switch documentation references to Kubernetes](https://github.com/krkn-chaos/krkn/issues/495)
+- [ ] [OCP and Kubernetes functionalities segregation](https://github.com/krkn-chaos/krkn/issues/497)
diff --git a/docs/cluster_shut_down_scenarios.md b/docs/cluster_shut_down_scenarios.md
index f610c94b..bb45f0b5 100644
--- a/docs/cluster_shut_down_scenarios.md
+++ b/docs/cluster_shut_down_scenarios.md
@@ -1,5 +1,5 @@
-#### Kubernetes/OpenShift cluster shut down scenario
-Scenario to shut down all the nodes including the masters and restart them after specified duration. Cluster shut down scenario can be injected by placing the shut_down config file under cluster_shut_down_scenario option in the kraken config. Refer to [cluster_shut_down_scenario](https://github.com/redhat-chaos/krkn/blob/main/scenarios/cluster_shut_down_scenario.yml) config file.
+#### Kubernetes cluster shut down scenario
+Scenario to shut down all the nodes including the masters and restart them after specified duration. Cluster shut down scenario can be injected by placing the shut_down config file under cluster_shut_down_scenario option in the kraken config. Refer to [cluster_shut_down_scenario](https://github.com/krkn-chaos/krkn/blob/main/scenarios/cluster_shut_down_scenario.yml) config file.
 Refer to [cloud setup](cloud_setup.md) to configure your cli properly for the cloud provider of the cluster you want to shut down.
diff --git a/docs/container_scenarios.md b/docs/container_scenarios.md
index b93842f4..af888920 100644
--- a/docs/container_scenarios.md
+++ b/docs/container_scenarios.md
@@ -4,7 +4,7 @@ This can be based on the pods namespace or labels. If you know the exact object
 These scenarios are in a simple yaml format that you can manipulate to run your specific tests or use the pre-existing scenarios to see how it works.
 #### Example Config
-The following are the components of Kubernetes/OpenShift for which a basic chaos scenario config exists today.
+The following are the components of Kubernetes for which a basic chaos scenario config exists today.
 ```
 scenarios:
@@ -25,7 +25,7 @@ In all scenarios we do a post chaos check to wait and verify the specific compon
 Here there are two options:
 1. Pass a custom script in the main config scenario list that will run before the chaos and verify the output matches post chaos scenario.
-See [scenarios/post_action_etcd_container.py](https://github.com/redhat-chaos/krkn/blob/main/scenarios/post_action_etcd_container.py) for an example.
+See [scenarios/post_action_etcd_container.py](https://github.com/krkn-chaos/krkn/blob/main/scenarios/post_action_etcd_container.py) for an example.
 ```
 - container_scenarios: # List of chaos pod scenarios to load.
 - - scenarios/container_etcd.yml
diff --git a/docs/contribute.md b/docs/contribute.md
index ebbd765d..5a17c85f 100644
--- a/docs/contribute.md
+++ b/docs/contribute.md
@@ -62,7 +62,7 @@ If changes go into the main repository while you're working on your code it is b
 If not already configured, set the upstream url for kraken.
 ```
- git remote add upstream https://github.com/redhat-chaos/krkn.git
+ git remote add upstream https://github.com/krkn-chaos/krkn.git
 ```
 Rebase to upstream master branch.
diff --git a/docs/installation.md b/docs/installation.md
index 9a434c58..f5b484b6 100644
--- a/docs/installation.md
+++ b/docs/installation.md
@@ -3,13 +3,13 @@
 The following ways are supported to run Kraken:
 - Standalone python program through Git.
-- Containerized version using either Podman or Docker as the runtime via [Krkn-hub](https://github.com/redhat-chaos/krkn-hub)
+- Containerized version using either Podman or Docker as the runtime via [Krkn-hub](https://github.com/krkn-chaos/krkn-hub)
 - Kubernetes or OpenShift deployment ( unsupported )
 **NOTE**: It is recommended to run Kraken external to the cluster ( Standalone or Containerized ) hitting the Kubernetes/OpenShift API as running it internal to the cluster might be disruptive to itself and also might not report back the results if the chaos leads to cluster's API server instability.
 **NOTE**: To run Kraken on Power (ppc64le) architecture, build and run a containerized version by following the
- instructions given [here](https://github.com/redhat-chaos/krkn/blob/main/containers/build_own_image-README.md).
+ instructions given [here](https://github.com/krkn-chaos/krkn/blob/main/containers/build_own_image-README.md).
 **NOTE**: Helper functions for interactions in Krkn are part of [krkn-lib](https://github.com/redhat-chaos/krkn-lib). Please feel free to reuse and expand them as you see fit when adding a new scenario or expanding
@@ -19,9 +19,9 @@ the capabilities of the current supported scenarios.
 ### Git
 #### Clone the repository
-Pick the latest stable release to install [here](https://github.com/redhat-chaos/krkn/releases).
+Pick the latest stable release to install [here](https://github.com/krkn-chaos/krkn/releases).
 ```
-$ git clone https://github.com/redhat-chaos/krkn.git --branch
+$ git clone https://github.com/krkn-chaos/krkn.git --branch
 $ cd kraken
 ```
@@ -40,13 +40,13 @@ $ python3.9 run_kraken.py --config
 ```
 ### Run containerized version
-[Krkn-hub](https://github.com/redhat-chaos/krkn-hub) is a wrapper that allows running Krkn chaos scenarios via podman or docker runtime with scenario parameters/configuration defined as environment variables.
+[Krkn-hub](https://github.com/krkn-chaos/krkn-hub) is a wrapper that allows running Krkn chaos scenarios via podman or docker runtime with scenario parameters/configuration defined as environment variables.
-Refer [instructions](https://github.com/redhat-chaos/krkn-hub#supported-chaos-scenarios) to get started.
+Refer [instructions](https://github.com/krkn-chaos/krkn-hub#supported-chaos-scenarios) to get started.
 ### Run Kraken as a Kubernetes deployment ( unsupported option - standalone or containerized deployers are recommended )
-Refer [Instructions](https://github.com/redhat-chaos/krkn/blob/main/containers/README.md) on how to deploy and run Kraken as a Kubernetes/OpenShift deployment.
+Refer [Instructions](https://github.com/krkn-chaos/krkn/blob/main/containers/README.md) on how to deploy and run Kraken as a Kubernetes/OpenShift deployment.
 Refer to the [chaos-kraken chart manpage](https://artifacthub.io/packages/helm/startx/chaos-kraken)
diff --git a/docs/service_disruption_scenarios.md b/docs/service_disruption_scenarios.md
index 7aa5ea68..43a060fe 100644
--- a/docs/service_disruption_scenarios.md
+++ b/docs/service_disruption_scenarios.md
@@ -16,7 +16,7 @@ Set to '^.*$' and label_selector to "" to randomly select any namespace in your
 **sleep:** Number of seconds to wait between each iteration/count of killing namespaces. Defaults to 10 seconds if not set
-Refer to [namespace_scenarios_example](https://github.com/redhat-chaos/krkn/blob/main/scenarios/regex_namespace.yaml) config file.
+Refer to [namespace_scenarios_example](https://github.com/krkn-chaos/krkn/blob/main/scenarios/regex_namespace.yaml) config file.
 ```
 scenarios:
diff --git a/docs/time_scenarios.md b/docs/time_scenarios.md
index 51a9b94c..22a77dd9 100644
--- a/docs/time_scenarios.md
+++ b/docs/time_scenarios.md
@@ -16,7 +16,7 @@ Configuration Options:
 **object_name:** List of the names of pods or nodes you want to skew.
-Refer to [time_scenarios_example](https://github.com/redhat-chaos/krkn/blob/main/scenarios/time_scenarios_example.yml) config file.
+Refer to [time_scenarios_example](https://github.com/krkn-chaos/krkn/blob/main/scenarios/time_scenarios_example.yml) config file.
 ```
 time_scenarios:
diff --git a/requirements.txt b/requirements.txt
index 39c3cbee..53895d8f 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -11,7 +11,7 @@ coverage
 datetime
 docker
 docker-compose
-git+https://github.com/redhat-chaos/arcaflow-plugin-kill-pod.git
+git+https://github.com/krkn-chaos/arcaflow-plugin-kill-pod.git
 git+https://github.com/vmware/vsphere-automation-sdk-python.git@v8.0.0.0
 gitpython
 google-api-python-client
diff --git a/utils/chaos_recommender/README.md b/utils/chaos_recommender/README.md
index 6f0e04b4..e9aefc2a 100644
--- a/utils/chaos_recommender/README.md
+++ b/utils/chaos_recommender/README.md
@@ -17,7 +17,7 @@ This tool profiles an application and gathers telemetry data such as CPU, Memory
 ```
 $ python3.9 -m venv chaos
 $ source chaos/bin/activate
- $ git clone https://github.com/redhat-chaos/krkn.git
+ $ git clone https://github.com/krkn-chaos/krkn.git
 $ cd krkn
 $ pip3 install -r requirements.txt
 $ python3.9 utils/chaos_recommender/chaos_recommender.py
@@ -89,7 +89,7 @@ If you provide the input values through command-line arguments, the correspondin
 ## Podman & Docker image
-To run the recommender image please visit the [krkn-hub](https://github.com/redhat-chaos/krkn-hub for further infos.
+To run the recommender image please visit the [krkn-hub](https://github.com/krkn-chaos/krkn-hub for further infos.
 ## How it works