The scenario introduces network latency, packet loss, and bandwidth restriction on the pod's network interface.
The purpose of this scenario is to observe how the application behaves under degraded and unreliable network conditions.
The example config below applies egress traffic shaping to the OpenShift console pod.
````
- id: pod_egress_shaping
  config:
    namespace: openshift-console   # Required - namespace of the pod to which the filter needs to be applied
    label_selector: 'component=ui' # Label selector of the target pod (openshift-console UI)
    network_params:
      latency: 500ms               # Add 500ms latency to egress traffic from the pod
````
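Since the scenario covers latency, packet loss, and bandwidth restriction, multiple impairments can be sketched in one config. The following is a hedged sketch only; the `loss` and `bandwidth` keys under `network_params` are assumed by analogy with `latency` and are not confirmed by this changelog:
````
- id: pod_egress_shaping
  config:
    namespace: openshift-console
    label_selector: 'component=ui'
    network_params:
      latency: 500ms    # delay egress packets by 500ms
      loss: '25%'       # assumed key: drop 25% of egress packets
      bandwidth: 10mbit # assumed key: cap egress bandwidth at 10mbit
````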
This makes sure the latest clients are installed and used:
- Avoids compatibility issues with the server
- Fixes security vulnerabilities and CVEs
This commit:
- Also sets appropriate severities to avoid false failures for the
test cases, especially given that these are monitored during the chaos
vs post chaos. Critical alerts are all monitored post chaos, with a few
monitored during the chaos that represent the overall health and performance
of the service.
- Renames Alerts to SLOs validation
Metrics reference: f09a492b13/cmd/kube-burner/ocp-config/alerts.yml
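For context, kube-burner alert-profile entries attach a severity to each PromQL expression. A hedged sketch of the tiering described above; the expressions, thresholds, and label values are illustrative and are not taken from the referenced profile:
````
# Checked during chaos: overall service health, non-fatal severity
- expr: up{job="apiserver"} == 0
  description: kube-apiserver instance down
  severity: warning
# Checked post chaos: critical alerts fail the run
- expr: histogram_quantile(0.99, rate(apiserver_request_duration_seconds_bucket{verb!="WATCH"}[2m])) > 1
  description: 99th percentile API request latency above 1s
  severity: critical
````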
* Include check for inside k8s scenario
* Include check for inside k8s scenario (2)
* Include check for inside k8s scenario (3)
* Include check for inside k8s scenario (4)
This is the first step towards the goal of only having metrics that track
the overall health and performance of the component/cluster. For instance,
leader elections are expected during etcd disruption scenarios, so we should
instead track etcd leader availability and fsync latency under the critical
category rather than leader elections.
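A hedged sketch of what that could look like in the kube-burner alert format, using the standard etcd metrics `etcd_server_has_leader` and `etcd_disk_wal_fsync_duration_seconds_bucket`; the thresholds are illustrative, not taken from the actual profile:
````
# Assumed alert-profile entries, not from the referenced alerts.yml
- expr: etcd_server_has_leader == 0
  description: etcd member reports no leader
  severity: critical
- expr: histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket[2m])) > 0.01
  description: 99th percentile etcd WAL fsync latency above 10ms
  severity: critical
````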
The pod network outage chaos scenario blocks traffic at the pod level irrespective of the network policy in use.
With the current network policies, it is not possible to explicitly block ports that are enabled
by an allow rule in the network policy. This chaos scenario addresses that by using OVS flow rules
to block ports related to the pod. It supports OpenShiftSDN and OVNKubernetes based networks.
The example config below blocks ingress access to the OpenShift console.
````
- id: pod_network_outage
  config:
    namespace: openshift-console   # Namespace of the pod whose traffic is blocked
    direction:                     # Direction of the traffic to block
      - ingress
    ingress_ports:                 # Ports on which ingress traffic is blocked
      - 8443
    label_selector: 'component=ui' # Targets the openshift-console UI pod
````
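Since `direction` is a list, blocking outbound traffic should be symmetric. A hedged sketch follows; the `egress_ports` key is assumed by analogy with `ingress_ports` and is not confirmed by this changelog:
````
- id: pod_network_outage
  config:
    namespace: openshift-console
    direction:
      - egress
    egress_ports:   # assumed key, mirroring ingress_ports
      - 443
    label_selector: 'component=ui'
````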
* kubeconfig management for arcaflow + hogs scenario refactoring
* kubeconfig authentication parsing refactored to support arcaflow kubernetes deployer
* reimplemented all the hog scenarios to allow multiple parallel containers of the same scenario
(e.g., to stress two or more nodes simultaneously in the same run)
* updated documentation
* removed sysbench scenarios
* recovered cpu hogs
* updated requirements.txt
* updated config.yaml
* added gitleaks file for test fixtures
* imported sys and logging
* removed config_arcaflow.yaml
* updated readme
* refactored arcaflow documentation entrypoint
Also renames retry_wait to expected_recovery_time to make it clear that
Kraken will exit with code 1 if the container doesn't recover within the
expected time.
Fixes https://github.com/redhat-chaos/krkn/issues/414
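A hedged sketch of how the renamed field might appear in a container scenario config; the surrounding keys (name, namespace, label_selector, container_name, action, count) are assumed from typical Kraken container-scenario configs, not from this commit:
````
scenarios:
- name: "kill etcd container"
  namespace: "openshift-etcd"   # assumed example target
  label_selector: "app=etcd"
  container_name: "etcd"
  action: 1                     # signal sent to the container process
  count: 1
  expected_recovery_time: 60    # exit 1 if the container has not recovered within 60s
````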
This commit enables users to opt in to a check for critical alerts firing
in the cluster post chaos, at the end of each scenario. A chaos scenario is
considered failed if the cluster is unhealthy, in which case the user can
start debugging to fix and harden the respective areas.
Fixes https://github.com/redhat-chaos/krkn/issues/410
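A hedged sketch of how the opt-in could look in config.yaml; the performance_monitoring section layout and the check_critical_alerts flag name are assumptions, not confirmed by this changelog:
````
performance_monitoring:
  prometheus_url: ''            # left empty to auto-discover on OpenShift (assumed behavior)
  prometheus_bearer_token: ''
  check_critical_alerts: True   # assumed flag: fail the scenario if critical alerts fire post chaos
````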
Moving the content around installing Kraken using Helm to the
Chaos in Practice section of the guide to showcase how startx-lab
is deploying and leveraging Kraken.