Commit Graph

33 Commits

Author SHA1 Message Date
Naga Ravi Chaitanya Elluri
94bec8dc9b Add missing import to get values from yaml (#526)
* Add missing import to get values from yaml

* Update Dockerfile

* Update Dockerfile-ppc64le

---------

Co-authored-by: Tullio Sebastiani <tsebastiani@users.noreply.github.com>
2023-11-07 11:07:17 +01:00
Tullio Sebastiani
7a966a71d0 krkn integration of telemetry events collection (#523)
* function package refactoring in krkn-lib

* cluster events collection flag

* krkn-lib version bump

requirements

* dockerfile bump
2023-10-31 14:31:33 -04:00
Tullio Sebastiani
27fabfd4af OCP/K8S functionalities and packages splitting in krkn-lib (#507)
* krkn-lib ocp/k8s split adaptation

* library reference updated

* requirements update

* rebase with main + fix
2023-10-30 17:31:48 +01:00
jtydlack
ff469579e9 Use function get_yaml_item_value
Enables using default even though the value was loaded as None.
2023-10-24 14:55:49 -04:00
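
The behaviour described in ff469579e9 can be illustrated with a minimal sketch. The function below is a stand-in written for this note, not the krkn-lib implementation; its signature and the `config` keys are assumptions. It shows why a plain `dict.get` default is not enough when a YAML key is present but empty, since the loader returns `None` for it.

```python
from typing import Any


def get_yaml_item_value(cont: dict, item: str, default: Any) -> Any:
    """Return cont[item], falling back to default when the key is missing or None.

    Hypothetical re-implementation for illustration only.
    """
    value = cont.get(item)
    return default if value is None else value


# A key written as `wait_duration:` with no value is loaded from YAML as None.
# The keys below are made up for this example.
config = {"wait_duration": None, "iterations": 0}

# dict.get only applies its default when the key is absent, so None leaks through.
plain_get = config.get("wait_duration", 60)

# The helper treats an explicit None like a missing key and returns the default.
with_helper = get_yaml_item_value(config, "wait_duration", 60)

# Falsy but valid values such as 0 are kept, because only None triggers the default.
kept_zero = get_yaml_item_value(config, "iterations", 5)
```

Comparing against `None` rather than testing truthiness is the design choice that matters here: values such as `0`, `False`, or an empty string remain usable settings.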
Tullio Sebastiani
f868000ebd Switched from krkn_lib_kubernetes to krkn_lib v1.0.0 (#469)
* changed all the references to krkn_lib_kubernetes to the new krkn_lib


changed all the references

* added krkn-lib pointer in documentation
2023-08-22 12:41:40 -04:00
Tullio Sebastiani
39c0152b7b Krkn telemetry integration (#435)
* adapted config.yaml to the new feature

* temporarily pointing requirement.txt to the lib feature branch

* run_kraken.py + arcaflow scenarios refactoring


typo

* plugin scenario

* node scenarios


return failed scenarios

* container scenarios


fix

* time scenarios

* cluster shutdown scenarios

* namespace scenarios

* zone outage scenarios

* app outage scenarios

* pvc scenarios

* network chaos scenarios

* run_kraken.py adaptation to telemetry

* prometheus telemetry upload + config.yaml


some fixes


typos and logs


max retries in config


telemetry id with run_uuid


safe_logger

* catch send_telemetry exception

* scenario collection bug fixes

* telemetry enabled check

* telemetry run tag

* requirements pointing to main + archive_size

* requirements.txt and config.yaml update

* added telemetry config to common config

* fixed scenario array elements for telemetry
2023-08-10 14:42:53 -04:00
Tullio Sebastiani
68dc17bc44 krkn-lib-kubernetes refactoring proposal (#400)
* run_kraken.py updated + renamed kubernetes library folder


unstaged files


kubecli marker

* container scenarios updated

* node scenarios updated


typo


injected kubecli

* managed cluster scenarios updated

* time scenarios updated

* litmus scenarios updated

* cluster scenarios updated

* namespace scenarios updated

* pvc scenarios updated

* network chaos scenarios updated

* common_managed_cluster functions updated

* switched draft library to official one

* regression on rebase
2023-06-13 10:02:35 -04:00
José Castillo Lema
493a8a245f Docker provider for node actions (#369)
* Docker provider for node actions

* Adjusted dependencies and imports

* Update config_kind.yaml

Signed-off-by: José Castillo Lema <josecastillolema@gmail.com>
2023-01-10 14:36:18 -05:00
Naga Ravi Chaitanya Elluri
1c207538b6 Use run dir instead of tmp
This commit also logs a message to handle the exception during the
node checks.

Fixes https://github.com/redhat-chaos/krkn/issues/356, https://github.com/redhat-chaos/krkn/issues/357
2022-11-08 15:46:08 -05:00
Naga Ravi Chaitanya Elluri
b9d5a7af4d Use safe loader for Yaml
This fixes security vulnerabilities: for example, it raises an
exception when opening a YAML file that contains code.

Fixes https://github.com/redhat-chaos/krkn/issues/352
2022-11-08 13:35:06 -05:00
Robert O'Brien
69fc8e8d1b Add resource version to list node call
2022-08-03 16:51:49 +02:00
Robert O'Brien
77f53b3a23 Rework node status to use watches
2022-08-03 16:51:49 +02:00
Paige Rubendall
94909fca94 Updating unknown status for when cluster becomes disconnected
2022-05-09 10:00:22 -04:00
Robert O'Brien
849ea7851b rework node ready wait status
2022-05-04 21:15:27 +02:00
Paige Rubendall
7f60701444 adding alibaba node scenario start
2022-04-01 16:46:29 -04:00
Pravin Dsilva
38302e7d95 Add timeout for Openstack node scenarios
Signed-off-by: Pravin Dsilva <pdsilva@redhat.com>
2021-11-25 20:56:59 -05:00
Paige Rubendall
87aa9eef4d Adding multiple node names and instance count for label selectors
2021-10-26 13:44:28 -04:00
Paige Rubendall
10e9b09819 Adding fix for openstack node name issue
2021-10-14 14:56:46 -04:00
Naga Ravi Chaitanya Elluri
adb465cab0 Add support for multi-zone disruption
This will enable users to disrupt multiple zones in the cluster simultaneously
to be able to understand the behaviour of various components.
2021-08-26 08:23:24 -04:00
Naga Ravi Chaitanya Elluri
6456eec76a Add zone outage scenarios
This commit adds support for creating a zone outage in AWS by denying both
ingress and egress traffic to the instances in a particular subnet of the
zone by tweaking the network ACL. This creates an outage of all the nodes
in the zone, both master and worker.
2021-08-17 11:43:13 -04:00
Naga Ravi Chaitanya Elluri
716057eab6 Monitor user application availability during chaos
The current Kraken integration with Cerberus monitors cluster and application
health after chaos and fails the run if either is unhealthy at that point.
This commit adds the ability to monitor user application health during the chaos
and fails the run on downtime, since that would likely mean downtime in a
customer's environment as well. It is especially useful for control plane
failure scenarios, including the API server, etcd, and Ingress.
2021-07-27 13:15:57 -04:00
Jared O'Connell
9b83dbcf04 Baremetal Node Support (#74)
* Support for baremetal node scenarios

* Finished baremetal support

* Added documentation for baremetal

* Clarify limitations of implementation in documentation

* Add baremetal support to new run.py file

* Allow use on newer machines

Some older machines require lanplus instead of lan

* Setup to allow per-device user, pass, and bmc address

Also set min version for a dependency

* Fix linting issues

* More linting issue fixes

* More linter issues

* Account for linter standard non-conformity

* Added baremetal warning

Co-authored-by: jaredoconnell <jocnnel@redhat.com>
2021-07-02 17:31:40 -04:00
prubenda
41bf815f98 Adding shut down scenario for gcp, az, aws, openstack
2021-06-23 09:00:58 -04:00
Naga Ravi Chaitanya Elluri
871eb3d74e Avoid circular dependencies
This commit deletes unneeded imports and fixes the circular dependency
issues.
2021-06-17 11:18:34 -04:00
Naga Ravi Chaitanya Elluri
5c2453b07e Refactor code base
This commit:
- Refactors the code base to be more modular by moving functions
  into respective modules to make it lean and reusable.
- Uses black to reformat the code to follow PEP 8 practices.
2021-06-14 17:41:10 -04:00
Paige Rubendall
190cf5d462 Blank node name error message (#97)
* adding contribute doc

* Fixing blank node name param printing off incorrect data
2021-05-06 10:13:17 -04:00
Amit Sagtani
d00d6ec69e Install pre-commit and use GitHub Actions (#94)
* added pre-commit and code-cleaning

* removed tox and TravisCI
2021-05-05 09:53:45 -04:00
prubenda
c7bb32f633 Adding azure to node scenarios
2021-03-17 17:41:07 -04:00
Pravin Dsilva
918b5fb6d3 Add node level chaos scenarios for bastion node
Signed-off-by: Pravin Dsilva <pravin.d-silva@ibm.com>
2021-02-16 09:04:55 -08:00
arcprabh
8dd18af161 Enable support for Openstack cloud.
Signed-off-by: arcprabh <arcprabh@in.ibm.com>

Incorporated first round of review comments

Signed-off-by: arcprabh <arcprabh@in.ibm.com>

Resolve multiple node name issue for single ip

Signed-off-by: arcprabh <arcprabh@in.ibm.com>
2021-02-02 20:47:30 +05:30
prubenda
d3e01db574 adding start to fix for all other cloud types
2020-11-24 16:32:43 -05:00
prubenda
72fe662e05 Adding GCP node scenarios support
2020-11-17 09:57:39 -05:00
Yashashree Suresh
31f06b861a Added node scenarios to stop and terminate instance
This commit:
- Adds a node scenario to stop and start an instance
- Adds a node scenario to terminate an instance
- Adds a node scenario to reboot an instance
- Adds a node scenario to stop the kubelet
- Adds a node scenario to crash the node
2020-08-27 16:50:42 -04:00