mirror of https://github.com/krkn-chaos/krkn.git synced 2026-02-14 18:10:00 +00:00

Author	SHA1	Message	Date
Tullio Sebastiani	83b811bee4	Arcaflow stress-ng hogs with parallelism support (#418) * kubeconfig management for arcaflow + hogs scenario refactoring * kubeconfig authentication parsing refactored to support arcaflow kubernetes deployer * reimplemented all the hog scenarios to allow multiple parallel containers of the same scenarios (e.g. to stress two or more nodes in the same run simultaneously) * updated documentation * removed sysbench scenarios * recovered cpu hogs * updated requirements.txt * updated config.yaml * added gitleaks file for test fixtures * imported sys and logging * removed config_arcaflow.yaml * updated readme * refactored arcaflow documentation entrypoint	2023-05-15 09:45:16 -04:00
Paige Rubendall	16ea18c718	Ibm plugin node scenario (#417) * Node scenarios for ibmcloud * adding openshift check info	2023-05-09 12:07:38 -04:00
Naga Ravi Chaitanya Elluri	bc863fa01f	Add support to check for critical alerts This commit enables users to opt in to check for critical alerts firing in the cluster post chaos at the end of each scenario. Chaos scenario is considered as failed if the cluster is unhealthy in which case user can start debugging to fix and harden respective areas. Fixes https://github.com/redhat-chaos/krkn/issues/410	2023-05-03 16:14:13 -04:00
Naga Ravi Chaitanya Elluri	6b17dbdbb3	Allow users to set the listening address This commit provides an option for the user to set the listening address for the signal. This also fixes a security vulnerability. Fixes https://github.com/redhat-chaos/krkn/issues/307	2022-11-08 15:59:57 -05:00
Sandro Bonazzola	0c36903fff	config: really default to ~ instead of /root Documentation says we default to ~ for looking up the kubernetes config but then we set everywhere /root. Fixed the config to really look for ~. Should solve #327. Signed-off-by: Sandro Bonazzola <sbonazzo@redhat.com>	2022-09-13 12:01:16 +02:00
Shreyas Anantha Ramaprasad	9421a0c2c2	Added support for ingress traffic shaping (#299) * Added plugin for ingress network traffic shaping * Documentation changes * Minor changes * Documentation and formatting fixes * Added trap to sleep infinity command running in containers * Removed shell injection threat for modprobe commands * Added docstrings to cerberus functions * Added checks to prevent shell injection * Bug fix	2022-09-02 07:54:11 +02:00
Naga Ravi Chaitanya Elluri	6c75d3dddb	Add option to skip litmus installation This commit adds an option for the user to pick whether to install litmus or not depending on their use case. One use case is disconnected environments where litmus is pre-installed instead of reaching out to the internet.	2022-08-23 14:09:10 -04:00
Shreyas Anantha Ramaprasad	08deae63dd	Added VMware Node Scenarios (#285) * Added VMware node scenarios * Made vmware plugin independent of Krkn * Revert changes made to node status watch * Fixed minor documentation changes	2022-08-15 23:35:16 +02:00
Janos Bonic	ccd902565e	Fixes #265: Replace Powerfulseal and introduce Wolkenwalze SDK for plugin system	2022-08-02 16:25:03 +01:00
Naga Ravi Chaitanya Elluri	9208f39e06	Add support to run on Kubernetes This commit: - Leverages distribution flag in the config set by the user to skip things not supported on OpenShift to be able to run scenarios on Kubernetes. - Adds sample config and scenario files that work on Kubernetes.	2022-06-01 07:27:06 -05:00
Adolfo Aguirrezabal	3adf5847b2	Add option to avoid litmus uninstall before chaos run (#242) * Adds option to avoid litmus uninstall before chaos run * Add new option to the config files	2022-05-05 09:02:25 -04:00
yogananth-subramanian	50dd9873c1	Node egress traffic shaping Patch adds a scenario to create variations in egress traffic of a Node's interface using the tc and Netem.	2021-12-16 12:54:53 -05:00
Alejandro Gullón	baa812b7f0	Added new scenario to fill up a given volume (#182) * Added new scenario to fill up a given volume * fixing small issues and style * adding PVC as input param instead of pod name * small fix * get container name and volume name replace oc with kubectl commands * adding yaml file to create a pv, pvc and pod to run pvc_scenario * adding support to match both string for describe command when looking for pod_name * added support to find the pvc from a given pod * small fix * small fix	2021-11-24 12:18:49 -05:00
Naga Ravi Chaitanya Elluri	674eb74a75	Expose setting the signal in the config This commit enables users to start Kraken to act as listener by setting the signal to PAUSE in the config to get the cluster to a desired test or run any setup before injecting chaos by setting the signal to RUN. This helps in cases where we have test cases that need to coordinate the chaos at a desired time depending on the state of the cluster/test run.	2021-10-26 09:05:25 -04:00
Paige Rubendall	6b865fc573	Adding server set up for kraken	2021-10-25 08:58:46 -04:00
Naga Ravi Chaitanya Elluri	cdf3bc03d2	Add support to block traffic to an application This commit enables users to simulate a downtime of an application by blocking the traffic for the specified duration to see how it/other components communicating with it behave in case of downtime.	2021-10-01 10:13:40 -04:00
Paige Rubendall	22df024312	adding validation that namespace becomes active	2021-09-28 09:58:55 -04:00
Naga Ravi Chaitanya Elluri	036e51a6b1	Delete litmus crd's during the cleanup This commit will ensure that the litmus resources installed on the cluster get cleaned up and also creates the chaosengine in the specified namespace.	2021-09-16 16:30:21 -04:00
Paige Rubendall	a9056ddf43	adding litmus logging	2021-09-08 17:11:49 -04:00
Naga Ravi Chaitanya Elluri	5da0b259c5	Run all the litmus resources in a single namespace - This eases the usage and debuggability by running the fault injection pods in the same namespace as other resources of litmus. This will also ease the deletion process and ensure that there are no leftover objects on the cluster. - This commit also enables users to use the same rbac template for all the litmus scenarios without having to pull in a specific one for each of the scenarios.	2021-09-08 16:37:07 -04:00
Naga Ravi Chaitanya Elluri	68a32666cd	Update litmus docs with supported scenarios	2021-09-01 16:41:22 -04:00
prubenda	9b0bcdbf0e	Adding node memory hog scenario	2021-08-20 14:02:00 -04:00
Naga Ravi Chaitanya Elluri	6456eec76a	Add zone outage scenarios This commit adds support to create zone outage in AWS by denying both ingress and egress traffic to the instances belonging to a particular subnet belonging to the zone by tweaking the network acl. This creates an outage of all the nodes in the zone - both master and workers.	2021-08-17 11:43:13 -04:00
Naga Ravi Chaitanya Elluri	c56a8a5356	Add more tunables for cpu hog scenario This commit exposes the flags to tweak the number of cores and node count to hog during the node-cpu-hog scenario.	2021-07-28 17:07:40 -04:00
Naga Ravi Chaitanya Elluri	716057eab6	Monitor user application availability during chaos Current Kraken integration with Cerberus monitors the cluster as well as the application health post chaos and pass/fails if they are not healthy after chaos. This commit adds the ability to monitor the user application health during the chaos and fails the run in case of downtime, as it is potentially a downtime in a customer's environment as well. It is especially useful in case of control plane failure scenarios including API server, Etcd, Ingress etc.	2021-07-27 13:15:57 -04:00
Paige Rubendall	f051c1c30f	Merge pull request #120 from paigerube14/container_kill Container kill	2021-07-15 15:07:58 -04:00
prubenda	76efac8f9b	Adding delete of namespaces	2021-07-13 13:31:45 -04:00
prubenda	46a1823291	Adding killing of specific containers in pods	2021-07-08 17:10:48 -04:00
prubenda	41bf815f98	Adding shut down scenario for gcp, az, aws, openstack	2021-06-23 09:00:58 -04:00
Naga Ravi Chaitanya Elluri	e30a4243f6	Add support to alerting on metrics evaluation This commit enables alerting in Kraken based on the Prometheus queries defined by the user and modifies the return code of the run to determine pass/fail for the run.	2021-06-22 15:22:37 -04:00
Naga Ravi Chaitanya Elluri	7e8f0450d6	Add support to scrape and index metrics This commit: - Enables Kraken to leverage kube-burner to scrape metrics from Prometheus and index them into Elasticsearch. This way we can take a look at the metrics in Grafana long term even after the cluster is terminated. - Enables separation of operations based on distribution with OpenShift as the default option. One of the use cases is to capture Prometheus instance details as it's installed by default while it's optional for Kubernetes.	2021-06-21 14:55:50 -04:00
Naga Ravi Chaitanya Elluri	a7e28ca490	Add support to deploy performance dashboards This commit enables performance monitoring on the cluster when running Kraken to be able to observe how cluster reacts to failures as it's important to make sure the cluster is healthy in terms of both recovery as well as performance.	2021-02-10 16:06:55 -05:00
prubenda	1fc9683c8c	Adding litmus scenario options	2020-12-03 12:45:35 -05:00
Yashashree1997	47847d86cd	Adds the ability to run a specific type of scenario multiple times With the current implementation, all the scenarios of a specific type (for example, pod scenario) have to be executed together. All pod_scenarios are followed by node_scenarios and so on. (pod_scenarios -> node_scenarios -> pod_scenarios is not possible) This commit enables the user to run a specific type of scenario multiple times. For example, few pod_scenarios followed by node_scenarios followed by few_scenarios.	2020-10-30 10:40:42 -04:00
prubenda	6f31519e5f	adding time scenario	2020-10-27 08:37:54 -04:00
Naga Ravi Chaitanya Elluri	82743230fe	Modify documentation to improve readability This commit: - Converts various sections in the readme into individual documents. - Adds pointers to the public blogs. - Updates workflow/architecture diagram. - Adds community info and contributing guidelines.	2020-10-21 15:01:54 -04:00
Mike Fiedler	2e5eac4550	Fix comment in config.yml	2020-10-09 13:20:26 -04:00
prubenda	8f5b688fba	working on powerfulseal retry logic	2020-09-11 17:08:31 -04:00
Yashashree Suresh	31f06b861a	Added node scenarios to stop and terminate instance This commit: - Adds a node scenario to stop and start an instance - Adds a node scenario to terminate an instance - Adds a node scenario to reboot an instance - Adds a node scenario to stop the kubelet - Adds a node scenario to crash the node	2020-08-27 16:50:42 -04:00
prubenda	0fc82090f2	Adding watch to see if components recovered	2020-08-18 16:26:04 -04:00
prubenda	44e753867f	Adding random regex pod kill	2020-07-06 22:00:12 -04:00
prubenda	52e232d0e7	Adding iterations or infinite run of kraken	2020-06-09 10:55:24 -04:00
Yashashree Suresh	f1c145e942	Integrated cerberus for checking cluster health	2020-04-22 23:30:21 -04:00
Naga Ravi Chaitanya Elluri	649134e492	Add initial version of kraken This commit: - Adds support to run pod chaos scenarios including killing an etcd, ApiServer and kube-apiserver using powerfulseal tool. - Adds support to create a report with the details about each chaos injection along with timestamps. The report is generated in the run directory. - Adds kubernetes package with a bunch of functions which can be used later to talk to the kubernetes API to be able to know the status of the targeted components/nodes.	2020-04-20 08:57:00 -04:00

44 Commits