120 Commits

Author SHA1 Message Date
Darshan Jain
625e1e90cf feat: add color-coded console logging (#1122) (#1146)
Signed-off-by: ddjain <darjain@redhat.com>
2026-02-05 14:27:52 +05:30
Paige Patton
c3f6b1a7ff updating return code (#1001)
Signed-off-by: Paige Patton <prubenda@redhat.com>
2025-12-16 10:27:24 -05:00
Paige Patton
4ebfc5dde5 adding thread lock (#974)
Signed-off-by: Paige Patton <prubenda@redhat.com>
2025-11-26 09:37:19 -05:00
Paige Patton
35609484d4 fixing batch size limit (#964)
Signed-off-by: Paige Patton <prubenda@redhat.com>
2025-11-21 09:47:41 -05:00
Paige Patton
eb86885bcd adding kube virt check failure (#952)
Signed-off-by: Paige Patton <prubenda@redhat.com>
2025-11-13 10:37:42 -05:00
Paige Patton
9ee76ce337 post chaos (#939)
Signed-off-by: Paige Patton <prubenda@redhat.com>
2025-11-11 14:11:04 -05:00
Paige Patton
166204e3c5 adding debug command line option
Signed-off-by: Paige Patton <prubenda@redhat.com>
2025-10-31 11:12:46 -04:00
Paige Patton
a5459792ef adding critical alerts to post to elastic search
Signed-off-by: Paige Patton <prubenda@redhat.com>
2025-10-08 15:38:20 -04:00
Paige Patton
84169e2d4e adding no scenario type (#869)
Signed-off-by: Paige Patton <prubenda@redhat.com>
2025-08-29 08:55:06 -04:00
LIU ZHE YOU
816363d151 [Rollback Scenarios] Perform rollback (#879)
* Add rollback config

* Inject rollback handler to scenario plugin

* Add Serializer

* Add decorator

* Add test with SimpleRollbackScenarioPlugin

* Add logger for verbose debug flow

* Resolve review comment

- remove additional rollback config in config.yaml
- set KUBECONFIG to ~/.kube/config in test_rollback

* Simplify set_rollback_context_decorator

* Fix integration of rollback_handler in __load_plugins

* Refactor rollback.config module

  - make it singleton class with register method to construct
  - RollbackContext ( <timestamp>-<run_uuid> )
  - add get_rollback_versions_directory for moduling the directory
    format

* Adapt new rollback.config

* Refactor serialization

- respect rollback_callable_name
- refactor _parse_rollback_callable_code
- refine VERSION_FILE_TEMPLATE

* Add get_scenario_rollback_versions_directory in RollbackConfig

* Add rollback in ApplicationOutageScenarioPlugin

* Add RollbackCallable and RollbackContent for type annotation

* Refactor rollback_handler with limited arguments

* Refactor the serialization for rollback

- limited arguments: callback and rollback_content just these two!
- always construct lib_openshift and lib_telemetry in version file
- add _parse_rollback_content_definition for retrieving scenario specific
  rollback_content
- remove utils for formatting variadic function

* Refactor application outage scenario

* Fix test_rollback

* Make RollbackContent with static fields

* simplify serialization

  - Remove all unused format dynamic arguments utils
  - Add jinja template for version file
  - Replace set_context for serialization with passing version to serialize_callable

* Add rollback for hogs scenario

* Fix version file full path based on feedback

- {versions_directory}/<timestamp(ns)>-<run_uuid>/{scenario_type}-<timestamp(ns)>-<random_hash>.py

* Fix scenario plugins after rebase

* Add execute rollback

* Add CLI for list and execute rollback

* Replace subprocess with importlib

* Fix error after rebase

* fixup! Fix docstring

- Add telemetry_ocp in execute_rollback docstring
- Remove rollback_config in create_plugin docstring
- Remove scenario_types in set_rollback_callable docstring

* fixup! Replace os.urandom with krkn_lib.utils.get_random_string

* fixup! Add missing telemetry_ocp for execute_rollback_version_files

* fixup! Remove redundant import

- Remove duplicate TYPE_CHECKING in handler module
- Remove cast in signal module
- Remove RollbackConfig in scenario_plugin_factory

* fixup! Replace sys.exit(1) with return

* fixup! Remove duplicate rollback_network_policy

* fixup! Decouple Serializer initialization

* fixup! Rename callback to rollback_callable

* fixup! Refine comment for constructing AbstractScenarioPlugin with
placeholder value

* fixup! Add version in docstring

* fixup! Remove uv.lock
2025-08-20 16:50:52 +02:00
Paige Patton
5002f210ae removing dashboard installation
Signed-off-by: Paige Patton <prubenda@redhat.com>
2025-08-05 11:27:41 -04:00
Paige Patton
87c2b3c8fd adding recovery times to metrics
Signed-off-by: Paige Patton <prubenda@redhat.com>
2025-05-22 13:49:30 -04:00
Paige Patton
cad6b68f43 adding collecting metrics (#752)
Signed-off-by: Paige Patton <prubenda@redhat.com>
2025-03-19 17:08:44 +01:00
kattameghana
dd4d0d0389 Health checks implementation for application endpoints (#761)
* Hog scenario porting from arcaflow to native (#748)

* added new native hog scenario

* removed arcaflow dependency + legacy hog scenarios

* config update

* changed hog configuration structure + added average samples

* fix on cpu count

* removes tripledes warning

* changed selector format

* changed selector syntax

* number of nodes option

* documentation

* functional tests

* exception handling on hog deployment thread

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* adding vsphere updates to non native

Signed-off-by: Paige Patton <prubenda@redhat.com>
Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* adding node id to affected node

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* Fixed the spelling mistake

Signed-off-by: Meghana Katta <mkatta@mkatta-thinkpadt14gen4.bengluru.csb>
Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* adding v4.0.8 version (#756)

Signed-off-by: Paige Patton <prubenda@redhat.com>
Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* Add autodetecting distribution (#753)

Used is_openshift function from krkn lib

Remove distribution from config

Remove distribution from documentation

Signed-off-by: jtydlack <139967002+jtydlack@users.noreply.github.com>
Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* initial version of health checks

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* Changes for appending success response and health check config format

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* Changes include health check doc and exit_on_failure config

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* Update config.yaml

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* Added the health check config in functional test config

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* Modified the health checks documentation

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* Changes for debugging the functional test failing

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* changed the code for debugging in run_test.sh

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* Debugging

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* Removed the functional test running line

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* Removing the health check config in common_test_config for debugging

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* Fixing functional test failure

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* Removing the changes that are added for debugging

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* few modifications

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* Renamed timestamp

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* Changed the start timestamp and end timestamp data type to the datetime

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* passing the health check response as HealthCheck object

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* Updated the krkn-lib version in requirements.txt

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

* Changed the coverage

Signed-off-by: kattameghana <meghanakatta8@gmail.com>

---------

Signed-off-by: kattameghana <meghanakatta8@gmail.com>
Signed-off-by: Paige Patton <prubenda@redhat.com>
Signed-off-by: Meghana Katta <mkatta@mkatta-thinkpadt14gen4.bengluru.csb>
Signed-off-by: jtydlack <139967002+jtydlack@users.noreply.github.com>
Co-authored-by: Tullio Sebastiani <tsebastiani@users.noreply.github.com>
Co-authored-by: Paige Patton <prubenda@redhat.com>
Co-authored-by: Meghana Katta <mkatta@mkatta-thinkpadt14gen4.bengluru.csb>
Co-authored-by: Paige Patton <64206430+paigerube14@users.noreply.github.com>
Co-authored-by: jtydlack <139967002+jtydlack@users.noreply.github.com>
2025-03-18 12:08:30 +00:00
jtydlack
a25736ad08 Add autodetecting distribution (#753)
Used is_openshift function from krkn lib

Remove distribution from config

Remove distribution from documentation

Signed-off-by: jtydlack <139967002+jtydlack@users.noreply.github.com>
2025-02-13 15:45:08 -05:00
Tullio Sebastiani
c7e068a562 Hog scenario porting from arcaflow to native (#748)
* added new native hog scenario

* removed arcaflow dependency + legacy hog scenarios

* config update

* changed hog configuration structure + added average samples

* fix on cpu count

* removes tripledes warning

* changed selector format

* changed selector syntax

* number of nodes option

* documentation

* functional tests

* exception handling on hog deployment thread
2025-01-31 17:01:26 +01:00
Naga Ravi Chaitanya Elluri
462c9ac67e Rename test suite name to chaos-krkn
This is needed for the TRT/component readiness integration to improve
dashboard readability and tie results back to chaos.

Signed-off-by: Naga Ravi Chaitanya Elluri <nelluri@redhat.com>
2024-10-21 14:38:37 -04:00
Tullio Sebastiani
d91172d9b2 Core Refactoring, Krkn Scenario Plugin API (#694)
* relocated shared libraries from `kraken` to `krkn` folder

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* AbstractScenarioPlugin and ScenarioPluginFactory

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* application_outage porting

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* arcaflow_scenarios porting

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* managedcluster_scenarios porting

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* network_chaos porting

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* node_actions porting

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* plugin_scenarios porting

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* pvc_scenarios porting

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* service_disruption porting

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* service_hijacking porting

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* cluster_shut_down_scenarios porting

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* syn_flood porting

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* time_scenarios porting

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* zone_outages porting

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* ScenarioPluginFactory tests

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* unit tests update

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* pod_scenarios and post actions deprecated

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

scenarios post_actions

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* funtests and config update

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* run_krkn.py update

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* utils porting

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* API Documentation

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* container_scenarios porting

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

fix

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* funtest fix

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* document gif update

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* Documentation + tests update

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* removed example plugin

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* global renaming

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

test fix

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

test fix

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* config.yaml typos

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

typos

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* removed `plugin_scenarios` from NativScenarioPlugin class

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* pod_network_scenarios type added

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* documentation update

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* krkn-lib update

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

typo

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

---------

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>
2024-10-03 20:48:04 +02:00
Tullio Sebastiani
736c90e937 Namespaced cluster events and logs integration (#690)
* namespaced events integration

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* namespaced logs implementation

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

namespaced logs plugin scenario

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

namespaced logs integration

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* logs collection fix

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* krkn-lib 3.1.0 update

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

---------

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>
2024-09-12 11:54:57 +02:00
Paige Patton
26460a0dce Adding elastic set to none (#691)
* adding elastic set to none

rh-pre-commit.version: 2.2.0
rh-pre-commit.check-secrets: ENABLED

Signed-off-by: Auto User <auto@users.noreply.github.com>

* too many ls

rh-pre-commit.version: 2.2.0
rh-pre-commit.check-secrets: ENABLED

---------

Signed-off-by: Auto User <auto@users.noreply.github.com>
Co-authored-by: Auto User <auto@users.noreply.github.com>
2024-09-05 16:05:19 -04:00
Tullio Sebastiani
6186555c15 Elastic search krkn-lib integration (#658)
* Elastic search krkn-lib integration

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

removed default urls

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* Fix alerts bug on prometheus

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* fixed prometheus object initialization bug

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* updated requirements to krkn-lib 2.1.8

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* disabled alerts and metrics by default

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* reverted requirement to elastic branch on krkn-lib

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* numpy downgrade

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* maximum retries added to hijacking funtest

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* added elastic settings to funtest config

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* krkn-lib 3.0.0 update

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

---------

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>
2024-08-28 10:46:42 -04:00
Tullio Sebastiani
9cd086f59c Adds the startup option to produce prow junit XML output for sippy integration (#684)
* removed legacy kubernetes module

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* added sippy junit XML file production options

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* krkn-lib update

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

krkn-lib update

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

---------

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>
2024-08-13 12:40:34 +02:00
Tullio Sebastiani
e02c6d1287 SYN flood scenario (#668)
* scenario config file

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* syn flood plugin

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* run_krkn.py updated

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* requirements.txt + documentation + config.yaml

* set node selector defaults to worker

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

---------

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>
2024-07-29 15:31:37 -04:00
Tullio Sebastiani
5f836f294b Kill pod arca plugin update adaptation (#656)
* new kill-pod interface adaptation

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* unit test fix

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* requirements update

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* fixed duplicate requirement

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* added conditional dockerfile build

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

fix

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

fix

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

fix

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

removed useless print

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

---------

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>
2024-07-03 15:50:43 +02:00
Paige Rubendall
42fc8eea40 adding wait in pvc scenarios and service hijack
rh-pre-commit.version: 2.2.0
rh-pre-commit.check-secrets: ENABLED

Signed-off-by: Paige Rubendall <prubenda@redhat.com>
2024-05-29 16:34:33 -04:00
Tullio Sebastiani
a142f6e7a4 Service hijacking scenario (#617)
* WIP: service hijacking scenario

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* wip

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* error handling

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

adapted run_kraken.py

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* restored config.yaml

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* added funtest

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

test fix

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

fix

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

fixed test

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

fix

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

fix test

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

fixed funtest

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

funtest fix

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

minor nit

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

added explicit curl method

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

push

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

fix

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

restored all funtests

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

added mime type test

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

fixed pipeline

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

commented unit

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

utf-8

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

test restored

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

fix test pipeline

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* documentation

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* krkn-lib 2.1.3

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* added other funtests to main merge to collect coverage

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

---------

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>
2024-05-13 10:04:06 +02:00
Tullio Sebastiani
ab98e416a6 Integration of the new pod recovery monitoring strategy implemented in krkn-lib (#609)
* pod monitoring integration in plugin scenario

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* pod monitoring integration in container scenario

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* removed wait-for-pod step from plugin scenario config files

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* introduced global pod recovery time

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

nit

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* introduced krkn_pod_recovery_time in plugin scenario and removed all the references to wait-for-pods

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

fix

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* functional test fix

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* main branch functional test fix

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* increased recovery times

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

---------

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>
2024-04-23 10:49:01 +02:00
Paige Rubendall
b79e526cfd adding app outage not creating file (#605)
Signed-off-by: Paige Rubendall <prubenda@redhat.com>
2024-03-29 14:35:14 -04:00
Tullio Sebastiani
606fb60811 changed exit codes on post chaos alerts and post_scenario failure (#592)
Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>
2024-03-07 16:31:55 +01:00
Tullio Sebastiani
c71ce31779 integrated new telemetry library for WS 2.0
Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

updated krkn-lib version

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>
2024-02-28 22:58:54 -05:00
Tullio Sebastiani
1298f220a6 Critical alerts collection and upload (#577)
* added prometheus client method for critical alerts

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* adapted run_kraken to the new plugin method for critical_alerts collection + telemetry upload

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* requirements.txt pointing temporarily to git

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* fixed severity level

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* added functional tests

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* exit on post chaos critical alerts

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

log moved

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* removed noisy log

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

fixed log

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* updated requirements.txt to krkn-lib 1.4.13

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* krkn lib

* added check on variable that makes kraken return 1 if post critical alerts are > 0

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

---------

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>
2024-02-28 09:48:29 -05:00
Naga Ravi Chaitanya Elluri
5a8d5b0fe1 Allow critical alerts check when enable_alerts is disabled
This covers use case where user wants to just check for critical alerts
post chaos without having to enable the alerts evaluation feature which
evaluates prom queries specified in an alerts file.

Signed-off-by: Naga Ravi Chaitanya Elluri <nelluri@redhat.com>
2024-02-19 23:15:47 -05:00
Paige Rubendall
c440dc4b51 Taking out start and end time for critical alerts (#572)
* taking out start and end time

Signed-off-by: Paige Rubendall <prubenda@redhat.com>

* adding only break when alert fires

Signed-off-by: Paige Rubendall <prubenda@redhat.com>

* fail at end if alert had fired

Signed-off-by: Paige Rubendall <prubenda@redhat.com>

* adding new krkn-lib function with no range

Signed-off-by: Paige Rubendall <prubenda@redhat.com>

* updating requirements to new krkn-lib

Signed-off-by: Paige Rubendall <prubenda@redhat.com>

---------

Signed-off-by: Paige Rubendall <prubenda@redhat.com>
2024-02-19 09:28:13 -05:00
Paige Rubendall
b174c51ee0 adding check if connection was properly set
Signed-off-by: Paige Rubendall <prubenda@redhat.com>
2024-02-15 17:28:20 -05:00
Paige Rubendall
fec0434ce1 adding upload to elastic search
Signed-off-by: Paige Rubendall <prubenda@redhat.com>
2024-02-13 12:01:40 -05:00
Tullio Sebastiani
2e38b8b033 Kubernetes prometheus telemetry + functional tests (#566)
added comment on the node selector input.yaml

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>
2024-02-09 16:38:12 +01:00
Paige Rubendall
f154bcb692 adding krkn report location
Signed-off-by: Paige Rubendall <prubenda@redhat.com>
2024-01-25 10:45:01 -05:00
Tullio Sebastiani
a7e5ae6c80 Replaced oc debug command execution on node with a native version (#547)
* native time skew feature

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* fixed podname conflict issue

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* updated krkn-lib to v1.4.6

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

* fixed pod conflict issue

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>

---------

Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>
2024-01-15 12:15:38 -05:00
Tullio Sebastiani
aa030a21d3 Fixes the critical alerts exception with the start_time > end_time
Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>
2024-01-15 11:11:45 -05:00
Tullio Sebastiani
d9e137e85a fixes prometheus url check on Kubernetes
Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>
2024-01-10 11:23:02 -05:00
Tullio Sebastiani
f2d7f88cb8 Krkn lib prometheus client + kube_burner references removed
Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>
2024-01-09 10:43:32 -05:00
Sahil Shah
82db2fca75 Removing Litmus Scenario
2023-11-16 09:50:04 -05:00
Naga Ravi Chaitanya Elluri
afe8d817a9 Print telemetry data location to stdout
This commit also deprecates litmus integration.
2023-11-13 10:01:17 -05:00
Tullio Sebastiani
7a966a71d0 krkn integration of telemetry events collection (#523)
* function package refactoring in krkn-lib

* cluster events collection flag

* krkn-lib version bump

requirements

* dockerfile bump
2023-10-31 14:31:33 -04:00
Tullio Sebastiani
27fabfd4af OCP/K8S functionalities and packages splitting in krkn-lib (#507)
* krkn-lib ocp/k8s split adaptation

* library reference updated

* requirements update

* rebase with main + fix
2023-10-30 17:31:48 +01:00
jtydlack
ff469579e9 Use function get_yaml_item_value
Enables using default even though the value was loaded as None.
2023-10-24 14:55:49 -04:00
Paige Rubendall
f7f1b2dfb0 Service disruption (#494)
* adding service disruption

* fixing kill services

* service log changes

* removing extra logging

* adding daemon set

* adding service disruption name changes

* cerberus config back

* bad string
2023-10-06 12:51:10 -04:00
Tullio Sebastiani
61356fd70b Added log telemetry piece to Krkn (#500)
* config

* log collection and upload

dictionary key fix

* escape regex in config.yaml

* bump krkn-lib version

* updated funtest github cli command

* update krkn-lib version to 1.3.2

* fixed requirements.txt
2023-10-06 10:08:46 -04:00
Tullio Sebastiani
782d04c1b1 Prints the telemetry json after sending it to the webservice (#479)
* prints telemetry json after sending it to the service


deserialized base64 parameters

* json output even if telemetry collection is disabled.
2023-09-25 12:00:08 -04:00
Tullio Sebastiani
f868000ebd Switched from krkn_lib_kubernetes to krkn_lib v1.0.0 (#469)
* changed all the references to krkn_lib_kubernetes to the new krkn_lib


changed all the references

* added krkn-lib pointer in documentation
2023-08-22 12:41:40 -04:00