Allow critical alerts check when enable_alerts is disabled

This covers the use case where a user wants to check only for critical alerts after chaos, without having to enable the alerts evaluation feature, which evaluates the Prometheus queries specified in an alerts file.

Signed-off-by: Naga Ravi Chaitanya Elluri <nelluri@redhat.com>
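
As a rough illustration of the behaviour described above, the sketch below treats the critical alerts check and the alerts evaluation feature as two independently gated steps. It is not the project's actual code: the function name `plan_post_chaos_checks`, the step labels, and the `check_critical_alerts` key are assumptions made for this example; only `enable_alerts` is named in this change.

```python
from typing import Dict, List


def plan_post_chaos_checks(performance_monitoring: Dict[str, object]) -> List[str]:
    """Return the alert-related steps to run after chaos, in order.

    Hypothetical helper: the step labels are illustrative names only.
    """
    checks: List[str] = []
    # Alerts evaluation: runs the Prometheus queries listed in the alerts file.
    if performance_monitoring.get("enable_alerts", False):
        checks.append("evaluate_alert_profile")
    # Critical alerts check: gated on its own flag (assumed key name),
    # so it can still run when enable_alerts is disabled.
    if performance_monitoring.get("check_critical_alerts", False):
        checks.append("check_critical_alerts")
    return checks


# Only the critical alerts check is requested; alerts evaluation stays off.
only_critical = plan_post_chaos_checks(
    {"enable_alerts": False, "check_critical_alerts": True}
)
print(only_critical)  # ['check_critical_alerts']
```

Keeping the two flags independent means a run can report firing critical alerts without needing an alerts file at all.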
Taking out start and end time for critical alerts (#572)
2026-02-23 06:13:50 +00:00 · 2024-02-19 23:15:47 -05:00 · 2024-02-19 09:28:13 -05:00 · 2024-02-15 17:28:20 -05:00 · 2024-02-13 12:01:40 -05:00 · 2024-02-13 17:06:20 +01:00
62 changed files with 301 additions and 851 deletions
--- a/.github/workflows/docker-image.yml
+++ b/.github/workflows/docker-image.yml
@@ -37,5 +37,5 @@ jobs:
      if: github.ref == 'refs/heads/main' && github.event_name == 'push'
      uses: redhat-chaos/actions/krkn-hub@main
      with:
-        QUAY_USER: ${{ secrets.QUAY_USER_1 }}
-        QUAY_TOKEN: ${{ secrets.QUAY_TOKEN_1 }}
+        QUAY_USER: ${{ secrets.QUAY_USERNAME }}
+        QUAY_TOKEN: ${{ secrets.QUAY_PASSWORD }}
--- a/.github/workflows/tests.yml
+++ b/.github/workflows/tests.yml
@@ -1,8 +1,12 @@
 name: Functional & Unit Tests
 on:
  pull_request:
+  push:
+    branches:
+      - main
 jobs:
  tests:
+    # Common steps
    name: Functional & Unit Tests
    runs-on: ubuntu-latest
    steps:
@@ -47,8 +51,7 @@ jobs:
          sudo apt-get install build-essential python3-dev
          pip install --upgrade pip
          pip install -r requirements.txt
-#      - name: Run unit tests
-#        run: python -m coverage run -a -m unittest discover -s tests -v
+
      - name: Deploy test workloads
        run: |
          kubectl apply -f CI/templates/outage_pod.yaml
@@ -61,28 +64,52 @@ jobs:
      - name: Get Kind nodes
        run: |
          kubectl get nodes --show-labels=true
+      # Pull request only steps
+      - name: Run unit tests
+        if: github.event_name == 'pull_request'
+        run: python -m coverage run -a -m unittest discover -s tests -v

-      - name: Setup Functional Tests
+      - name: Setup Pull Request Functional Tests
+        if: github.event_name == 'pull_request'
        run: |
-            yq -i '.kraken.distribution="kubernetes"' CI/config/common_test_config.yaml
            yq -i '.kraken.port="8081"' CI/config/common_test_config.yaml
            yq -i '.kraken.signal_address="0.0.0.0"' CI/config/common_test_config.yaml
            yq -i '.kraken.performance_monitoring="localhost:9090"' CI/config/common_test_config.yaml
-            echo "test_app_outages" > ./CI/tests/my_tests
-            echo "test_container"      >> ./CI/tests/my_tests
-            echo "test_namespace"      >> ./CI/tests/my_tests
-            echo "test_net_chaos"      >> ./CI/tests/my_tests
-            echo "test_time"           >> ./CI/tests/my_tests
-            echo "test_arca_cpu_hog" >> ./CI/tests/my_tests
-            echo "test_arca_memory_hog" >> ./CI/tests/my_tests
-            echo "test_arca_io_hog" >> ./CI/tests/my_tests
+            echo "test_app_outages" > ./CI/tests/functional_tests
+            echo "test_container"      >> ./CI/tests/functional_tests
+            echo "test_namespace"      >> ./CI/tests/functional_tests
+            echo "test_net_chaos"      >> ./CI/tests/functional_tests
+            echo "test_time"           >> ./CI/tests/functional_tests
+            echo "test_arca_cpu_hog" >> ./CI/tests/functional_tests
+            echo "test_arca_memory_hog" >> ./CI/tests/functional_tests
+            echo "test_arca_io_hog" >> ./CI/tests/functional_tests
+
+      # Push on main only steps
+      - name: Configure AWS Credentials
+        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
+          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
+          aws-region : ${{ secrets.AWS_REGION }}
+      - name: Setup Post Merge Request Functional Tests
+        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
+        run: |
+          yq -i '.kraken.port="8081"' CI/config/common_test_config.yaml
+          yq -i '.kraken.signal_address="0.0.0.0"' CI/config/common_test_config.yaml
+          yq -i '.kraken.performance_monitoring="localhost:9090"' CI/config/common_test_config.yaml
+          yq -i '.telemetry.username="${{secrets.TELEMETRY_USERNAME}}"' CI/config/common_test_config.yaml
+          yq -i '.telemetry.password="${{secrets.TELEMETRY_PASSWORD}}"' CI/config/common_test_config.yaml
+          echo "test_telemetry" > ./CI/tests/functional_tests
+
+      # Final common steps
      - name: Run Functional tests
+        env:
+          AWS_BUCKET: ${{ secrets.AWS_BUCKET }}
        run: |
          ./CI/run.sh
          cat ./CI/results.markdown >> $GITHUB_STEP_SUMMARY
          echo >> $GITHUB_STEP_SUMMARY
-      - name: Run Unit tests
-        run: python -m coverage run -a -m unittest discover -s tests -v
      - name: Upload CI logs
        uses: actions/upload-artifact@v3
        with:
--- a/.gitignore
+++ b/.gitignore
@@ -61,7 +61,7 @@ inspect.local.*
 !CI/config/common_test_config.yaml
 CI/out/*
 CI/ci_results
-CI/scenarios/*node.yaml
+CI/legacy/*node.yaml
 CI/results.markdown

 #env
--- a/CI/README.md
+++ b/CI/README.md
@@ -1,7 +1,7 @@
 ## CI Tests

 ### First steps
-Edit [my_tests](tests/my_tests) with tests you want to run
+Edit [functional_tests](tests/functional_tests) with tests you want to run

 ### How to run
 ```./CI/run.sh```
@@ -11,7 +11,7 @@ This will run kraken using python, make sure python3 is set up and configured pr

 ### Adding a test case

-1. Add in simple scenario yaml file to execute under [../CI/scenarios/](scenarios)
+1. Add in simple scenario yaml file to execute under [../CI/scenarios/](legacy)

 2. Copy [test_application_outages.sh](tests/test_app_outages.sh) for example on how to get started

@@ -27,7 +27,7 @@ This will run kraken using python, make sure python3 is set up and configured pr

    e. 15: Make sure name of config in line 14 matches what you pass on this line

-4. Add test name to [my_tests](../CI/tests/my_tests) file
+4. Add test name to [functional_tests](../CI/tests/functional_tests) file

    a. This will be the name of the file without ".sh"

--- a/CI/config/common_test_config.yaml
+++ b/CI/config/common_test_config.yaml
@@ -1,5 +1,5 @@
 kraken:
-    distribution: openshift                                # Distribution can be kubernetes or openshift.
+    distribution: kubernetes                                # Distribution can be kubernetes or openshift.
    kubeconfig_path: ~/.kube/config                        # Path to kubeconfig.
    exit_on_failure: False                                 # Exit when a post action scenario fails.
    litmus_version: v1.13.6                                # Litmus version to install.
@@ -29,9 +29,12 @@ tunings:
    daemon_mode: False                                     # Iterations are set to infinity which means that the kraken will cause chaos forever.
 telemetry:
    enabled: False                                           # enable/disables the telemetry collection feature
-    api_url: https://ulnmf9xv7j.execute-api.us-west-2.amazonaws.com/production #telemetry service endpoint
-    username: username                                      # telemetry service username
-    password: password                                      # telemetry service password
+    api_url: https://yvnn4rfoi7.execute-api.us-west-2.amazonaws.com/test #telemetry service endpoint
+    username: $TELEMETRY_USERNAME                                      # telemetry service username
+    password: $TELEMETRY_PASSWORD                                      # telemetry service password
+    prometheus_namespace: 'prometheus-k8s'                                # prometheus namespace
+    prometheus_pod_name: 'prometheus-kind-prometheus-kube-prome-prometheus-0'                                 # prometheus pod_name
+    prometheus_container_name: 'prometheus'
    prometheus_backup: True                                 # enables/disables prometheus data collection
    full_prometheus_backup: False                           # if is set to False only the /prometheus/wal folder will be downloaded.
    backup_threads: 5                                       # number of telemetry download/upload threads
@@ -39,3 +42,10 @@ telemetry:
    max_retries: 0                                          # maximum number of upload retries (if 0 will retry forever)
    run_tag: ''                                             # if set, this will be appended to the run folder in the bucket (useful to group the runs)
    archive_size: 10000                                     # the size of the prometheus data archive size in KB. The lower the size of archive is
+    logs_backup: True
+    logs_filter_patterns:
+        - "(\\w{3}\\s\\d{1,2}\\s\\d{2}:\\d{2}:\\d{2}\\.\\d+).+"         # Sep 9 11:20:36.123425532
+        - "kinit (\\d+/\\d+/\\d+\\s\\d{2}:\\d{2}:\\d{2})\\s+"          # kinit 2023/09/15 11:20:36 log
+        - "(\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d+Z).+"      # 2023-09-15T11:20:36.123425532Z log
+    oc_cli_path: /usr/bin/oc                                # optional, if not specified will be search in $PATH
+    events_backup: True                                     # enables/disables cluster events collection
--- a/CI/legacy/scenarios/cluster_shut_down_scenario.yml
+++ b/CI/legacy/scenarios/cluster_shut_down_scenario.yml
--- a/CI/legacy/scenarios/node_scenario.yml
+++ b/CI/legacy/scenarios/node_scenario.yml
--- a/CI/legacy/scenarios/volume_scenario.yaml
+++ b/CI/legacy/scenarios/volume_scenario.yaml
--- a/CI/legacy/scenarios/zone_outage.yaml
+++ b/CI/legacy/scenarios/zone_outage.yaml
--- a/CI/legacy/scenarios/zone_outage_env.yaml
+++ b/CI/legacy/scenarios/zone_outage_env.yaml
--- a/CI/legacy/tests/test_nodes.sh
+++ b/CI/legacy/tests/test_nodes.sh
--- a/CI/legacy/tests/test_shut_down.sh
+++ b/CI/legacy/tests/test_shut_down.sh
--- a/CI/legacy/tests/test_zone.sh
+++ b/CI/legacy/tests/test_zone.sh
--- a/CI/run.sh
+++ b/CI/run.sh
@@ -17,7 +17,7 @@ wait_cluster_become_ready() {



-ci_tests_loc="CI/tests/my_tests"
+ci_tests_loc="CI/tests/functional_tests"

 echo -e "********* Running Functional Tests Suite *********\n\n"

@@ -37,9 +37,9 @@ echo '-----------------------|--------|---------' >> $results

 # Run each test
 failed_tests=()
-for test_name in `cat CI/tests/my_tests`
+for test_name in `cat CI/tests/functional_tests`
 do
-  wait_cluster_become_ready
+  #wait_cluster_become_ready
  return_value=`./CI/run_test.sh $test_name $results`
  if [[ $return_value == 1 ]]
  then
@@ -49,6 +49,7 @@ do
  wait_cluster_become_ready
 done

+
 if (( ${#failed_tests[@]}>0 ))
 then
  echo -e "\n\n======================================================================"
--- a/CI/scenarios/app_outage.yaml
+++ b/CI/scenarios/app_outage.yaml
@@ -1,5 +0,0 @@
-application_outage:                                  # Scenario to create an outage of an application by blocking traffic
-  duration: 10                                      # Duration in seconds after which the routes will be accessible
-  namespace: openshift-monitoring            # Namespace to target - all application routes will go inaccessible if pod selector is empty
-  pod_selector: {}                            # Pods to target
-  block: [Ingress, Egress]                           # It can be Ingress or Egress or Ingress, Egress
--- a/CI/scenarios/arcaflow/cpu-hog/config.yaml
+++ b/CI/scenarios/arcaflow/cpu-hog/config.yaml
@@ -1,12 +0,0 @@
----
-deployers:
-  image:
-    connection: {}
-    deployer_name: kubernetes
-log:
-  level: debug
-logged_outputs:
-  error:
-    level: error
-  success:
-    level: debug
--- a/CI/scenarios/arcaflow/cpu-hog/input.yaml
+++ b/CI/scenarios/arcaflow/cpu-hog/input.yaml
@@ -1,9 +0,0 @@
-input_list:
-- cpu_count: 1
-  cpu_load_percentage: 80
-  cpu_method: all
-  duration: 1s
-  kubeconfig: ''
-  namespace: default
-  node_selector:
-    kubernetes.io/hostname: kind-worker2
--- a/CI/scenarios/arcaflow/cpu-hog/sub-workflow.yaml
+++ b/CI/scenarios/arcaflow/cpu-hog/sub-workflow.yaml
@@ -1,98 +0,0 @@
-version: v0.2.0
-input:
-  root: RootObject
-  objects:
-    RootObject:
-      id: input_item
-      properties:
-        kubeconfig:
-          display:
-            description: The complete kubeconfig file as a string
-            name: Kubeconfig file contents
-          type:
-            type_id: string
-          required: true
-        namespace:
-          display:
-            description: The namespace where the container will be deployed
-            name: Namespace
-          type:
-            type_id: string
-          required: true
-        node_selector:
-            display:
-              description: kubernetes node name where the plugin must be deployed
-            type:
-              type_id: map
-              values:
-                type_id: string
-              keys:
-                type_id: string
-            required: true
-        duration:
-          display:
-            name: duration the scenario expressed in seconds
-            description: stop stress test after T seconds. One can also specify the units of time in
-              seconds, minutes, hours, days or years with the suffix s, m, h, d or y
-          type:
-            type_id: string
-          required: true
-        cpu_count:
-          display:
-            description: Number of CPU cores to be used (0 means all)
-            name: number of CPUs
-          type:
-            type_id: integer
-          required: true
-        cpu_method:
-          display:
-            description: CPU stress method
-            name: fine grained control of which cpu stressors to use (ackermann, cfloat etc.)
-          type:
-            type_id: string
-          required: true
-        cpu_load_percentage:
-          display:
-            description: load CPU by percentage
-            name: CPU load
-          type:
-            type_id: integer
-          required: true
-
-steps:
-  kubeconfig:
-    plugin: 
-      src: quay.io/arcalot/arcaflow-plugin-kubeconfig:0.2.0
-      deployment_type: image
-    input:
-      kubeconfig: !expr $.input.kubeconfig
-  stressng:
-    plugin: 
-      src: quay.io/arcalot/arcaflow-plugin-stressng:0.5.0
-      deployment_type: image
-    step: workload
-    input:
-      cleanup: "true"
-      StressNGParams:
-        timeout: !expr $.input.duration
-        stressors:
-          - stressor: cpu
-            cpu_count: !expr $.input.cpu_count
-            cpu_method: !expr $.input.cpu_method
-            cpu_load: !expr $.input.cpu_load_percentage
-    deploy:
-      deployer_name: kubernetes
-      connection: !expr $.steps.kubeconfig.outputs.success.connection
-      pod:
-        metadata:
-          namespace: !expr $.input.namespace
-          labels:
-            arcaflow: stressng
-        spec:
-          nodeSelector: !expr $.input.node_selector
-          pluginContainer:
-            imagePullPolicy: Always
-outputs:
-  success:
-    stressng: !expr $.steps.stressng.outputs.success
-
--- a/CI/scenarios/arcaflow/cpu-hog/workflow.yaml
+++ b/CI/scenarios/arcaflow/cpu-hog/workflow.yaml
@@ -1,77 +0,0 @@
-version: v0.2.0
-input:
-  root: RootObject
-  objects:
-    RootObject:
-      id: RootObject
-      properties:
-        input_list:
-          type:
-            type_id: list
-            items:
-              id: input_item
-              type_id: object
-              properties:
-                kubeconfig:
-                  display:
-                    description: The complete kubeconfig file as a string
-                    name: Kubeconfig file contents
-                  type:
-                    type_id: string
-                  required: true
-                namespace:
-                    display:
-                      description: The namespace where the container will be deployed
-                      name: Namespace
-                    type:
-                      type_id: string
-                    required: true
-                node_selector:
-                    display:
-                      description: kubernetes node name where the plugin must be deployed
-                    type:
-                      type_id: map
-                      values:
-                        type_id: string
-                      keys:
-                        type_id: string
-                    required: true
-                duration:
-                  display:
-                    name: duration the scenario expressed in seconds
-                    description: stop stress test after T seconds. One can also specify the units of time in
-                      seconds, minutes, hours, days or years with the suffix s, m, h, d or y
-                  type:
-                    type_id: string
-                  required: true
-                cpu_count:
-                  display:
-                    description: Number of CPU cores to be used (0 means all)
-                    name: number of CPUs
-                  type:
-                    type_id: integer
-                  required: true
-                cpu_method:
-                  display:
-                    description: CPU stress method
-                    name: fine grained control of which cpu stressors to use (ackermann, cfloat etc.)
-                  type:
-                    type_id: string
-                  required: true
-                cpu_load_percentage:
-                  display:
-                    description: load CPU by percentage
-                    name: CPU load
-                  type:
-                    type_id: integer
-                  required: true
-steps:
-  workload_loop:
-    kind: foreach
-    items: !expr $.input.input_list
-    workflow: sub-workflow.yaml
-    parallelism: 1000
-outputs:
-  success:
-    workloads: !expr $.steps.workload_loop.outputs.success.data
-
--- a/CI/scenarios/arcaflow/io-hog/config.yaml
+++ b/CI/scenarios/arcaflow/io-hog/config.yaml
@@ -1,11 +0,0 @@
-deployers:
-  image:
-    connection: {}
-    deployer_name: kubernetes
-log:
-  level: debug
-logged_outputs:
-  error:
-    level: error
-  success:
-    level: debug
--- a/CI/scenarios/arcaflow/io-hog/input.yaml
+++ b/CI/scenarios/arcaflow/io-hog/input.yaml
@@ -1,14 +0,0 @@
-input_list:
-- duration: 30s
-  io_block_size: 1m
-  io_workers: 1
-  io_write_bytes: 10m
-  kubeconfig: ''
-  namespace: default
-  node_selector:
-    kubernetes.io/hostname: kind-worker2
-  target_pod_folder: /hog-data
-  target_pod_volume:
-    hostPath:
-      path: /tmp
-    name: node-volume
--- a/CI/scenarios/arcaflow/io-hog/sub-workflow.yaml
+++ b/CI/scenarios/arcaflow/io-hog/sub-workflow.yaml
@@ -1,142 +0,0 @@
-version: v0.2.0
-input:
-  root: RootObject
-  objects:
-    hostPath:
-      id: HostPathVolumeSource
-      properties:
-        path:
-          type:
-            type_id: string
-    Volume:
-      id: Volume
-      properties:
-        name:
-          type:
-            type_id: string
-        hostPath:
-          type:
-            id: hostPath
-            type_id: ref
-    RootObject:
-      id: input_item
-      properties:
-        kubeconfig:
-          display:
-            description: The complete kubeconfig file as a string
-            name: Kubeconfig file contents
-          type:
-            type_id: string
-          required: true
-        namespace:
-          display:
-            description: The namespace where the container will be deployed
-            name: Namespace
-          type:
-            type_id: string
-          required: true
-        node_selector:
-            display:
-              description: kubernetes node name where the plugin must be deployed
-            type:
-              type_id: map
-              values:
-                type_id: string
-              keys:
-                type_id: string
-            required: true
-        duration:
-          display:
-            name: duration the scenario expressed in seconds
-            description: stop  stress  test  after  T  seconds.  One  can  also specify the units of time in
-              seconds, minutes, hours, days or years with the suffix s, m, h, d or  y
-          type:
-            type_id: string
-          required: true
-        io_workers:
-          display:
-            description: number of workers
-            name: start N workers continually writing, reading  and  removing  temporary  files
-          type:
-            type_id: integer
-          required: true
-        io_block_size:
-            display:
-              description: single write size
-              name: specify size of each write in bytes. Size can be from 1 byte to 4MB.
-            type:
-              type_id: string
-            required: true
-        io_write_bytes:
-          display:
-            description: Total number of bytes written
-            name: write  N  bytes for each hdd process, the default is 1 GB. One can specify the size
-              as % of free space on the file system or in units  of  Bytes,  KBytes,  MBytes  and
-              GBytes using the suffix b, k, m or g
-          type:
-            type_id: string
-          required: true
-        target_pod_folder:
-          display:
-            description: Target Folder
-            name: Folder in the pod where the test will be executed and the test files will be written
-          type:
-            type_id: string
-          required: true
-        target_pod_volume:
-          display:
-            name: kubernetes volume definition
-            description: the volume that will be attached to the pod. In order to stress
-                         the node storage only hosPath mode is currently supported
-          type:
-            type_id: ref
-            id: Volume
-          required: true
-
-steps:
-  kubeconfig:
-    plugin: 
-      src: quay.io/arcalot/arcaflow-plugin-kubeconfig:0.2.0
-      deployment_type: image
-    input:
-      kubeconfig: !expr $.input.kubeconfig
-  stressng:
-    plugin: 
-      src: quay.io/arcalot/arcaflow-plugin-stressng:0.5.0
-      deployment_type: image
-    step: workload
-    input:
-      cleanup: "true"
-      StressNGParams:
-        timeout: !expr $.input.duration
-        workdir: !expr $.input.target_pod_folder
-        stressors:
-          - stressor: hdd
-            hdd: !expr $.input.io_workers
-            hdd_bytes: !expr $.input.io_write_bytes
-            hdd_write_size: !expr $.input.io_block_size
-
-    deploy:
-      deployer_name: kubernetes
-      connection: !expr $.steps.kubeconfig.outputs.success.connection
-      pod:
-        metadata:
-          namespace: !expr $.input.namespace
-          labels:
-            arcaflow: stressng
-        spec:
-          nodeSelector: !expr $.input.node_selector
-          pluginContainer:
-            imagePullPolicy: Always
-            securityContext:
-              privileged: true
-            volumeMounts:
-              - mountPath: /hog-data
-                name: node-volume
-          volumes:
-            - !expr $.input.target_pod_volume
-
-outputs:
-  success:
-    stressng: !expr $.steps.stressng.outputs.success
-
--- a/CI/scenarios/arcaflow/io-hog/workflow.yaml
+++ b/CI/scenarios/arcaflow/io-hog/workflow.yaml
@@ -1,113 +0,0 @@
-version: v0.2.0
-input:
-  root: RootObject
-  objects:
-    hostPath:
-      id: HostPathVolumeSource
-      properties:
-        path:
-          type:
-            type_id: string
-    Volume:
-      id: Volume
-      properties:
-        name:
-          type:
-            type_id: string
-        hostPath:
-          type:
-            id: hostPath
-            type_id: ref
-    RootObject:
-      id: RootObject
-      properties:
-        input_list:
-          type:
-            type_id: list
-            items:
-              id: input_item
-              type_id: object
-              properties:
-                kubeconfig:
-                  display:
-                    description: The complete kubeconfig file as a string
-                    name: Kubeconfig file contents
-                  type:
-                    type_id: string
-                  required: true
-                namespace:
-                  display:
-                    description: The namespace where the container will be deployed
-                    name: Namespace
-                  type:
-                    type_id: string
-                  required: true
-                node_selector:
-                  display:
-                    description: kubernetes node name where the plugin must be deployed
-                  type:
-                    type_id: map
-                    values:
-                      type_id: string
-                    keys:
-                      type_id: string
-                  required: true
-                duration:
-                  display:
-                    name: duration the scenario expressed in seconds
-                    description: stop  stress  test  after  T  seconds.  One  can  also specify the units of time in
-                      seconds, minutes, hours, days or years with the suffix s, m, h, d or  y
-                  type:
-                    type_id: string
-                  required: true
-                io_workers:
-                  display:
-                    description: number of workers
-                    name: start N workers continually writing, reading  and  removing  temporary  files
-                  type:
-                    type_id: integer
-                  required: true
-                io_block_size:
-                  display:
-                    description: single write size
-                    name: specify size of each write in bytes. Size can be from 1 byte to 4MB.
-                  type:
-                    type_id: string
-                  required: true
-                io_write_bytes:
-                  display:
-                    description: Total number of bytes written
-                    name: write  N  bytes for each hdd process, the default is 1 GB. One can specify the size
-                      as % of free space on the file system or in units  of  Bytes,  KBytes,  MBytes  and
-                      GBytes using the suffix b, k, m or g
-                  type:
-                    type_id: string
-                  required: true
-                target_pod_folder:
-                  display:
-                    description: Target Folder
-                    name: Folder in the pod where the test will be executed and the test files will be written
-                  type:
-                    type_id: string
-                  required: true
-                target_pod_volume:
-                  display:
-                    name: kubernetes volume definition
-                    description: the volume that will be attached to the pod. In order to stress
-                      the node storage only hosPath mode is currently supported
-                  type:
-                    type_id: ref
-                    id: Volume
-                  required: true
-steps:
-  workload_loop:
-    kind: foreach
-    items: !expr $.input.input_list
-    workflow: sub-workflow.yaml
-    parallelism: 1000
-outputs:
-  success:
-    workloads: !expr $.steps.workload_loop.outputs.success.data
-
-
-
--- a/CI/scenarios/arcaflow/memory-hog/config.yaml
+++ b/CI/scenarios/arcaflow/memory-hog/config.yaml
@@ -1,12 +0,0 @@
----
-deployers:
-  image:
-    connection: {}
-    deployer_name: kubernetes
-log:
-  level: debug
-logged_outputs:
-  error:
-    level: error
-  success:
-    level: debug
--- a/CI/scenarios/arcaflow/memory-hog/input.yaml
+++ b/CI/scenarios/arcaflow/memory-hog/input.yaml
@@ -1,14 +0,0 @@
-input_list:
-- duration: 30s
-  vm_bytes: 10%
-  vm_workers: 2
-  node_selector:
-    kubernetes.io/hostname: kind-worker2
-  # node selector example
-  # node_selector:
-  #   kubernetes.io/hostname: master
-  kubeconfig: ""
-  namespace: default
-
-# duplicate this section to run simultaneous stressors in the same run
-
--- a/CI/scenarios/arcaflow/memory-hog/sub-workflow.yaml
+++ b/CI/scenarios/arcaflow/memory-hog/sub-workflow.yaml
@@ -1,90 +0,0 @@
-version: v0.2.0
-input:
-  root: RootObject
-  objects:
-    RootObject:
-      id: input_item
-      properties:
-        kubeconfig:
-          display:
-            description: The complete kubeconfig file as a string
-            name: Kubeconfig file contents
-          type:
-            type_id: string
-          required: true
-        namespace:
-          display:
-            description: The namespace where the container will be deployed
-            name: Namespace
-          type:
-            type_id: string
-          required: true
-        node_selector:
-            display:
-              description: kubernetes node name where the plugin must be deployed
-            type:
-              type_id: map
-              values:
-                type_id: string
-              keys:
-                type_id: string
-            required: true
-        duration:
-          display:
-            name: duration the scenario expressed in seconds
-            description: stop stress test after T seconds. One can also specify the units of time in seconds, minutes, hours, days or years with the suffix s, m, h, d or  y
-          type:
-            type_id: string
-          required: true
-        vm_workers:
-          display:
-            description: Number of VM stressors to be run (0 means 1 stressor per CPU)
-            name: Number of VM stressors
-          type:
-            type_id: integer
-          required: true
-        vm_bytes:
-          display:
-            description: N bytes per vm process, the default is 256MB. The size can be expressed in units of Bytes, KBytes, MBytes and GBytes using the suffix b, k, m or g.
-            name: Kubeconfig file contents
-          type:
-            type_id: string
-          required: true
-
-steps:
-  kubeconfig:
-    plugin: 
-      src: quay.io/arcalot/arcaflow-plugin-kubeconfig:0.2.0
-      deployment_type: image
-    input:
-      kubeconfig: !expr $.input.kubeconfig
-  stressng:
-    plugin: 
-      src: quay.io/arcalot/arcaflow-plugin-stressng:0.5.0
-      deployment_type: image
-    step: workload
-    input:
-      cleanup: "true"
-      StressNGParams:
-        timeout: !expr $.input.duration
-        stressors:
-          - stressor: vm
-            vm: !expr $.input.vm_workers
-            vm_bytes: !expr $.input.vm_bytes
-    deploy:
-      deployer_name: kubernetes
-      connection: !expr $.steps.kubeconfig.outputs.success.connection
-      pod:
-        metadata:
-          namespace: !expr $.input.namespace
-          labels:
-            arcaflow: stressng
-        spec:
-          nodeSelector: !expr $.input.node_selector
-          pluginContainer:
-            imagePullPolicy: Always
-
-outputs:
-  success:
-    stressng: !expr $.steps.stressng.outputs.success
-
--- a/CI/scenarios/arcaflow/memory-hog/workflow.yaml
+++ b/CI/scenarios/arcaflow/memory-hog/workflow.yaml
@@ -1,73 +0,0 @@
-version: v0.2.0
-input:
-  root: RootObject
-  objects:
-    RootObject:
-      id: RootObject
-      properties:
-        input_list:
-          type:
-            type_id: list
-            items:
-              id: input_item
-              type_id: object
-              properties:
-                kubeconfig:
-                  display:
-                    description: The complete kubeconfig file as a string
-                    name: Kubeconfig file contents
-                  type:
-                    type_id: string
-                  required: true
-                namespace:
-                    display:
-                      description: The namespace where the container will be deployed
-                      name: Namespace
-                    type:
-                      type_id: string
-                    required: true
-                node_selector:
-                  display:
-                    description: kubernetes node name where the plugin must be deployed
-                  type:
-                    type_id: map
-                    values:
-                      type_id: string
-                    keys:
-                      type_id: string
-                  required: true
-                duration:
-                  display:
-                    name: duration the scenario expressed in seconds
-                    description: stop stress test after T seconds. One can also specify the units of time in seconds, minutes, hours, days or years with the suffix s, m, h, d or  y
-                  type:
-                    type_id: string
-                  required: true
-                vm_workers:
-                  display:
-                    description: Number of VM stressors to be run (0 means 1 stressor per CPU)
-                    name: Number of VM stressors
-                  type:
-                    type_id: integer
-                  required: true
-                vm_bytes:
-                  display:
-                    description: N bytes per vm process, the default is 256MB. The size can be expressed in units of Bytes, KBytes, MBytes and GBytes using the suffix b, k, m or g.
-                    name: Kubeconfig file contents
-                  type:
-                    type_id: string
-                  required: true
-steps:
-  workload_loop:
-    kind: foreach
-    items: !expr $.input.input_list
-    workflow: sub-workflow.yaml
-    parallelism: 1000
-outputs:
-  success:
-    workloads: !expr $.steps.workload_loop.outputs.success.data
-
-
-
-
-
--- a/CI/scenarios/container_scenario.yml
+++ b/CI/scenarios/container_scenario.yml
@@ -1,8 +0,0 @@
-scenarios:
- name: "kill test container"
-  namespace: "default"
-  label_selector: "scenario=container"
-  container_name: "fedtools"
-  action: 1
-  count: 1
-  retry_wait: 60
--- a/CI/scenarios/network_chaos.yaml
+++ b/CI/scenarios/network_chaos.yaml
@@ -1,8 +0,0 @@
-network_chaos:                   # Scenario to create an outage by simulating random variations in the network.
-  duration: 10                   # seconds
-  instance_count: 1
-  node_name: kind-worker2
-  execution: serial
-  egress:
-    bandwidth: 100mbit
-
--- a/CI/scenarios/network_diagnostics_namespace.yaml
+++ b/CI/scenarios/network_diagnostics_namespace.yaml
@@ -1,7 +0,0 @@
-scenarios:
-  - action: delete
-    namespace: "^namespace-scenario$"
-    label_selector:
-    runs: 1
-    sleep: 15
-    wait_time: 30
--- a/CI/scenarios/time_scenarios.yml
+++ b/CI/scenarios/time_scenarios.yml
@@ -1,5 +0,0 @@
-time_scenarios:
- action: skew_time
-  object_type: pod
-  label_selector: scenario=time-skew
-  container_name: ""
--- a/CI/tests/functional_tests
+++ b/CI/tests/functional_tests
--- a/CI/tests/test_app_outages.sh
+++ b/CI/tests/test_app_outages.sh
@@ -7,10 +7,11 @@ trap finish EXIT


 function functional_test_app_outage {
-  yq -i '.application_outage.pod_selector={"scenario":"outage"}' CI/scenarios/app_outage.yaml
-  yq -i '.application_outage.namespace="default"' CI/scenarios/app_outage.yaml
+  yq -i '.application_outage.duration=10' scenarios/openshift/app_outage.yaml
+  yq -i '.application_outage.pod_selector={"scenario":"outage"}' scenarios/openshift/app_outage.yaml
+  yq -i '.application_outage.namespace="default"' scenarios/openshift/app_outage.yaml
  export scenario_type="application_outages"
-  export scenario_file="CI/scenarios/app_outage.yaml"
+  export scenario_file="scenarios/openshift/app_outage.yaml"
  export post_config=""
  envsubst < CI/config/common_test_config.yaml > CI/config/app_outage.yaml
  python3 -m coverage run -a run_kraken.py -c CI/config/app_outage.yaml
--- a/CI/tests/test_arca_cpu_hog.sh
+++ b/CI/tests/test_arca_cpu_hog.sh
@@ -7,8 +7,9 @@ trap finish EXIT


 function functional_test_arca_cpu_hog {
+  yq -i '.input_list[0].node_selector={"kubernetes.io/hostname":"kind-worker2"}' scenarios/arcaflow/cpu-hog/input.yaml
  export scenario_type="arcaflow_scenarios"
-  export scenario_file="CI/scenarios/arcaflow/cpu-hog/input.yaml"
+  export scenario_file="scenarios/arcaflow/cpu-hog/input.yaml"
  export post_config=""
  envsubst < CI/config/common_test_config.yaml > CI/config/arca_cpu_hog.yaml
  python3 -m coverage run -a run_kraken.py -c CI/config/arca_cpu_hog.yaml
--- a/CI/tests/test_arca_io_hog.sh
+++ b/CI/tests/test_arca_io_hog.sh
@@ -7,8 +7,9 @@ trap finish EXIT


 function functional_test_arca_io_hog {
+  yq -i '.input_list[0].node_selector={"kubernetes.io/hostname":"kind-worker2"}' scenarios/arcaflow/io-hog/input.yaml
  export scenario_type="arcaflow_scenarios"
-  export scenario_file="CI/scenarios/arcaflow/io-hog/input.yaml"
+  export scenario_file="scenarios/arcaflow/io-hog/input.yaml"
  export post_config=""
  envsubst < CI/config/common_test_config.yaml > CI/config/arca_io_hog.yaml
  python3 -m coverage run -a run_kraken.py -c CI/config/arca_io_hog.yaml
--- a/CI/tests/test_arca_memory_hog.sh
+++ b/CI/tests/test_arca_memory_hog.sh
@@ -7,8 +7,9 @@ trap finish EXIT


 function functional_test_arca_memory_hog {
+  yq -i '.input_list[0].node_selector={"kubernetes.io/hostname":"kind-worker2"}' scenarios/arcaflow/memory-hog/input.yaml
  export scenario_type="arcaflow_scenarios"
-  export scenario_file="CI/scenarios/arcaflow/memory-hog/input.yaml"
+  export scenario_file="scenarios/arcaflow/memory-hog/input.yaml"
  export post_config=""
  envsubst < CI/config/common_test_config.yaml > CI/config/arca_memory_hog.yaml
  python3 -m coverage run -a run_kraken.py -c CI/config/arca_memory_hog.yaml
--- a/CI/tests/test_container.sh
+++ b/CI/tests/test_container.sh
@@ -8,9 +8,11 @@ trap finish EXIT
 pod_file="CI/scenarios/hello_pod.yaml"

 function functional_test_container_crash {
-
+  yq -i '.scenarios[0].namespace="default"' scenarios/openshift/app_outage.yaml
+  yq -i '.scenarios[0].label_selector="scenario=container"' scenarios/openshift/app_outage.yaml
+  yq -i '.scenarios[0].container_name="fedtools"' scenarios/openshift/app_outage.yaml
  export scenario_type="container_scenarios"
-  export scenario_file="- CI/scenarios/container_scenario.yml"
+  export scenario_file="- scenarios/openshift/app_outage.yaml"
  export post_config=""
  envsubst < CI/config/common_test_config.yaml > CI/config/container_config.yaml

--- a/CI/tests/test_namespace.sh
+++ b/CI/tests/test_namespace.sh
@@ -7,12 +7,13 @@ trap finish EXIT

 function funtional_test_namespace_deletion {
  export scenario_type="namespace_scenarios"
-  export scenario_file="-  CI/scenarios/network_diagnostics_namespace.yaml"
+  export scenario_file="-  scenarios/openshift/ingress_namespace.yaml"
  export post_config=""
-  yq '.scenarios.[0].namespace="^openshift-network-diagnostics$"' -i CI/scenarios/network_diagnostics_namespace.yaml
+  yq '.scenarios[0].namespace="^namespace-scenario$"' -i scenarios/openshift/ingress_namespace.yaml
+  yq '.scenarios[0].wait_time=30' -i scenarios/openshift/ingress_namespace.yaml
+  yq '.scenarios[0].action="delete"' -i scenarios/openshift/ingress_namespace.yaml
  envsubst < CI/config/common_test_config.yaml > CI/config/namespace_config.yaml
  python3 -m coverage run -a run_kraken.py -c CI/config/namespace_config.yaml
-  echo $?
  echo "Namespace scenario test: Success"
 }

--- a/CI/tests/test_net_chaos.sh
+++ b/CI/tests/test_net_chaos.sh
@@ -7,9 +7,16 @@ trap finish EXIT


 function functional_test_network_chaos {
+  yq -i '.network_chaos.duration=10' scenarios/openshift/network_chaos.yaml
+  yq -i '.network_chaos.node_name="kind-worker2"' scenarios/openshift/network_chaos.yaml
+  yq -i '.network_chaos.egress.bandwidth="100mbit"' scenarios/openshift/network_chaos.yaml
+  yq -i 'del(.network_chaos.interfaces)' scenarios/openshift/network_chaos.yaml
+  yq -i 'del(.network_chaos.label_selector)' scenarios/openshift/network_chaos.yaml
+  yq -i 'del(.network_chaos.egress.latency)' scenarios/openshift/network_chaos.yaml
+  yq -i 'del(.network_chaos.egress.loss)' scenarios/openshift/network_chaos.yaml

  export scenario_type="network_chaos"
-  export scenario_file="CI/scenarios/network_chaos.yaml"
+  export scenario_file="scenarios/openshift/network_chaos.yaml"
  export post_config=""
  envsubst < CI/config/common_test_config.yaml > CI/config/network_chaos.yaml
  python3 -m coverage run -a run_kraken.py -c CI/config/network_chaos.yaml
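
Reviewer note: the `yq -i` calls above rewrite the shared `scenarios/openshift/network_chaos.yaml` in place so it matches what the deleted CI copy contained. A minimal Python sketch of the same set/delete-by-path logic follows. The helper names and the starting values in `scenario` are hypothetical and chosen only for illustration; this is not how the test itself is implemented.

```python
def set_path(doc, dotted, value):
    """Set doc[a][b][c] = value for dotted == "a.b.c", creating dicts as needed."""
    keys = dotted.split(".")
    node = doc
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


def del_path(doc, dotted):
    """Delete the key at the dotted path; a missing path is a no-op."""
    keys = dotted.split(".")
    node = doc
    for key in keys[:-1]:
        node = node.get(key)
        if not isinstance(node, dict):
            return
    node.pop(keys[-1], None)


# Hypothetical starting content; the real file in the repository may differ.
scenario = {
    "network_chaos": {
        "duration": 300,
        "interfaces": ["ens5"],
        "label_selector": "some-label",
        "egress": {"latency": "50ms", "loss": "0.02", "bandwidth": "10mbit"},
    }
}

set_path(scenario, "network_chaos.duration", 10)
set_path(scenario, "network_chaos.node_name", "kind-worker2")
set_path(scenario, "network_chaos.egress.bandwidth", "100mbit")
for path in ("interfaces", "label_selector", "egress.latency", "egress.loss"):
    del_path(scenario, "network_chaos." + path)

print(scenario)
```

Editing the shared scenario file keeps a single source of truth, at the cost of the test mutating a tracked file in the working tree.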
--- a/CI/tests/test_telemetry.sh
+++ b/CI/tests/test_telemetry.sh
@@ -0,0 +1,33 @@
+set -xeEo pipefail
+
+source CI/tests/common.sh
+
+trap error ERR
+trap finish EXIT
+
+
+function functional_test_telemetry {
+  AWS_CLI=$(command -v aws || true)  # "|| true" keeps "set -e" from aborting before the check below
+  [ -z "$AWS_CLI" ] && echo "AWS cli not found in path" && exit 1
+  [ -z "$AWS_BUCKET" ] && echo "AWS bucket not set in environment" && exit 1
+
+  export RUN_TAG="funtest-telemetry"
+  yq -i '.telemetry.enabled=True' CI/config/common_test_config.yaml
+  yq -i '.telemetry.full_prometheus_backup=True' CI/config/common_test_config.yaml
+  yq -i '.telemetry.run_tag=env(RUN_TAG)' CI/config/common_test_config.yaml
+  export scenario_type="arcaflow_scenarios"
+  export scenario_file="scenarios/arcaflow/cpu-hog/input.yaml"
+  export post_config=""
+  envsubst < CI/config/common_test_config.yaml > CI/config/telemetry.yaml
+  python3 -m coverage run -a run_kraken.py -c CI/config/telemetry.yaml
+  RUN_FOLDER=`cat CI/out/test_telemetry.out | grep amazonaws.com | sed -rn "s#.*https:\/\/.*\/download/(.*)#\1#p"`
+  $AWS_CLI s3 ls "s3://$AWS_BUCKET/$RUN_FOLDER/" | awk '{ print $4 }' > s3_remote_files
+  echo "checking if telemetry files are uploaded on s3"
+  cat s3_remote_files | grep events-00.json || ( echo "FAILED: events-00.json not uploaded" && exit 1 )
+  cat s3_remote_files | grep prometheus-00.tar || ( echo "FAILED: prometheus backup not uploaded" && exit 1 )
+  cat s3_remote_files | grep telemetry.json || ( echo "FAILED: telemetry.json not uploaded" && exit 1 )
+  echo "all files uploaded!"
+  echo "Telemetry Collection: Success"
+}
+
+functional_test_telemetry
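
Reviewer note: the `grep | sed` pipeline in the test above pulls the run folder out of the log by keeping whatever follows `/download/` on a line that mentions `amazonaws.com`. The sketch below shows the same extraction in Python. The sample log line and URL shape are assumptions made for illustration, and unlike the shell pipeline, which prints every matching line, this returns only the first match.

```python
import re

# Same pattern the sed expression uses: greedy match up to the last
# "/download/", then capture the rest of the line.
RUN_FOLDER_RE = re.compile(r"https://.*/download/(.*)")

# Hypothetical log output; the real URL depends on the telemetry service.
SAMPLE_LOG = (
    "2024-02-19 09:28:13 INFO collecting cluster events\n"
    "2024-02-19 09:28:14 INFO https://example-bucket.s3.amazonaws.com"
    "/download/funtest-telemetry-1708352893\n"
)


def extract_run_folder(log_text):
    """Return the text after '/download/' on the first amazonaws.com line, or None."""
    for line in log_text.splitlines():
        if "amazonaws.com" not in line:
            continue
        match = RUN_FOLDER_RE.search(line)
        if match:
            return match.group(1)
    return None


print(extract_run_folder(SAMPLE_LOG))
```

Returning `None` when no URL is found makes the "nothing was uploaded" case explicit instead of producing an empty `RUN_FOLDER` that the later `s3 ls` would silently accept.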
--- a/CI/tests/test_time.sh
+++ b/CI/tests/test_time.sh
@@ -7,8 +7,12 @@ trap finish EXIT


 function functional_test_time_scenario {
+  yq -i '.time_scenarios[0].label_selector="scenario=time-skew"' scenarios/openshift/time_scenarios_example.yml
+  yq -i '.time_scenarios[0].container_name=""' scenarios/openshift/time_scenarios_example.yml
+  yq -i '.time_scenarios[0].namespace="default"' scenarios/openshift/time_scenarios_example.yml
+  yq -i '.time_scenarios[1].label_selector="kubernetes.io/hostname=kind-worker2"' scenarios/openshift/time_scenarios_example.yml
  export scenario_type="time_scenarios"
-  export scenario_file="CI/scenarios/time_scenarios.yml"
+  export scenario_file="scenarios/openshift/time_scenarios_example.yml"
  export post_config=""
  envsubst < CI/config/common_test_config.yaml > CI/config/time_config.yaml

--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # Krkn aka Kraken
-[![Docker Repository on Quay](https://quay.io/repository/redhat-chaos/krkn/status "Docker Repository on Quay")](https://quay.io/repository/redhat-chaos/krkn?tab=tags&tag=latest)
-![Workflow-Status](https://github.com/redhat-chaos/krkn/actions/workflows/docker-image.yml/badge.svg)
+[![Docker Repository on Quay](https://quay.io/repository/krkn-chaos/krkn/status "Docker Repository on Quay")](https://quay.io/repository/krkn-chaos/krkn?tab=tags&tag=latest)
+![Workflow-Status](https://github.com/krkn-chaos/krkn/actions/workflows/docker-image.yml/badge.svg)

 ![Krkn logo](media/logo.png)

@@ -79,7 +79,7 @@ Scenario type               | Kubernetes
 ### Kraken scenario pass/fail criteria and report
 It is important to make sure to check if the targeted component recovered from the chaos injection and also if the Kubernetes cluster is healthy as failures in one component can have an adverse impact on other components. Kraken does this by:
 - Having built in checks for pod and node based scenarios to ensure the expected number of replicas and nodes are up. It also supports running custom scripts with the checks.
- Leveraging [Cerberus](https://github.com/redhat-chaos/cerberus) to monitor the cluster under test and consuming the aggregated go/no-go signal to determine pass/fail post chaos. It is highly recommended to turn on the Cerberus health check feature available in Kraken. Instructions on installing and setting up Cerberus can be found [here](https://github.com/openshift-scale/cerberus#installation) or can be installed from Kraken using the [instructions](https://github.com/redhat-chaos/krkn#setting-up-infrastructure-dependencies). Once Cerberus is up and running, set cerberus_enabled to True and cerberus_url to the url where Cerberus publishes go/no-go signal in the Kraken config file. Cerberus can monitor [application routes](https://github.com/redhat-chaos/cerberus/blob/main/docs/config.md#watch-routes) during the chaos and fails the run if it encounters downtime as it is a potential downtime in a customers, or users environment as well. It is especially important during the control plane chaos scenarios including the API server, Etcd, Ingress etc. It can be enabled by setting `check_applicaton_routes: True` in the [Kraken config](https://github.com/redhat-chaos/krkn/blob/main/config/config.yaml) provided application routes are being monitored in the [cerberus config](https://github.com/redhat-chaos/krkn/blob/main/config/cerberus.yaml).
+- Leveraging [Cerberus](https://github.com/krkn-chaos/cerberus) to monitor the cluster under test and consuming the aggregated go/no-go signal to determine pass/fail post chaos. It is highly recommended to turn on the Cerberus health check feature available in Kraken. Instructions on installing and setting up Cerberus can be found [here](https://github.com/openshift-scale/cerberus#installation) or can be installed from Kraken using the [instructions](https://github.com/krkn-chaos/krkn#setting-up-infrastructure-dependencies). Once Cerberus is up and running, set cerberus_enabled to True and cerberus_url to the url where Cerberus publishes go/no-go signal in the Kraken config file. Cerberus can monitor [application routes](https://github.com/redhat-chaos/cerberus/blob/main/docs/config.md#watch-routes) during the chaos and fails the run if it encounters downtime as it is a potential downtime in a customers, or users environment as well. It is especially important during the control plane chaos scenarios including the API server, Etcd, Ingress etc. It can be enabled by setting `check_applicaton_routes: True` in the [Kraken config](https://github.com/redhat-chaos/krkn/blob/main/config/config.yaml) provided application routes are being monitored in the [cerberus config](https://github.com/redhat-chaos/krkn/blob/main/config/cerberus.yaml).
 - Leveraging built-in alert collection feature to fail the runs in case of critical alerts.

 ### Signaling
@@ -103,7 +103,7 @@ Information on enabling and leveraging this feature can be found [here](docs/SLO

 ### OCM / ACM integration

-Kraken supports injecting faults into [Open Cluster Management (OCM)](https://open-cluster-management.io/) and [Red Hat Advanced Cluster Management for Kubernetes (ACM)](https://www.redhat.com/en/technologies/management/advanced-cluster-management) managed clusters through [ManagedCluster Scenarios](docs/managedcluster_scenarios.md).
+Kraken supports injecting faults into [Open Cluster Management (OCM)](https://open-cluster-management.io/) and [Red Hat Advanced Cluster Management for Kubernetes (ACM)](https://www.krkn.com/en/technologies/management/advanced-cluster-management) managed clusters through [ManagedCluster Scenarios](docs/managedcluster_scenarios.md).


 ### Blogs and other useful resources
@@ -129,6 +129,7 @@ Please read [this file]((CI/README.md#adding-a-test-case)) for more information


 ### Community
-Key Members(slack_usernames/full name): paigerube14/Paige Rubendall, mffiedler/Mike Fiedler, ravielluri/Naga Ravi Chaitanya Elluri.
-* [**#krkn on Kubernetes Slack**](https://kubernetes.slack.com)
-* [**#forum-chaos on CoreOS Slack internal to Red Hat**](https://coreos.slack.com)
+Key Members(slack_usernames/full name): paigerube14/Paige Rubendall, mffiedler/Mike Fiedler, tsebasti/Tullio Sebastiani, yogi/Yogananth Subramanian, sahil/Sahil Shah, pradeep/Pradeep Surisetty and ravielluri/Naga Ravi Chaitanya Elluri.
+* [**#krkn on Kubernetes Slack**](https://kubernetes.slack.com/messages/C05SFMHRWK1)
+
+The Linux Foundation® (TLF) has registered trademarks and uses trademarks. For a list of TLF trademarks, see [Trademark Usage](https://www.linuxfoundation.org/legal/trademark-usage).
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -2,14 +2,14 @@

 Following are a list of enhancements that we are planning to work on adding support in Krkn. Of course any help/contributions are greatly appreciated.

- [ ] [Ability to run multiple chaos scenarios in parallel under load to mimic real world outages](https://github.com/redhat-chaos/krkn/issues/424)
- [x] [Centralized storage for chaos experiments artifacts](https://github.com/redhat-chaos/krkn/issues/423)
- [ ] [Support for causing DNS outages](https://github.com/redhat-chaos/krkn/issues/394)
- [x] [Chaos recommender](https://github.com/redhat-chaos/krkn/tree/main/utils/chaos-recommender) to suggest scenarios having probability of impacting the service under test using profiling results 
+- [ ] [Ability to run multiple chaos scenarios in parallel under load to mimic real world outages](https://github.com/krkn-chaos/krkn/issues/424)
+- [x] [Centralized storage for chaos experiments artifacts](https://github.com/krkn-chaos/krkn/issues/423)
+- [ ] [Support for causing DNS outages](https://github.com/krkn-chaos/krkn/issues/394)
+- [x] [Chaos recommender](https://github.com/krkn-chaos/krkn/tree/main/utils/chaos-recommender) to suggest scenarios having probability of impacting the service under test using profiling results 
 - [ ] Chaos AI integration to improve and automate test coverage
- [x] [Support for pod level network traffic shaping](https://github.com/redhat-chaos/krkn/issues/393)
- [ ] [Ability to visualize the metrics that are being captured by Kraken and stored in Elasticsearch](https://github.com/redhat-chaos/krkn/issues/124)
- [ ] Support for running all the scenarios of Kraken on Kubernetes distribution - see https://github.com/redhat-chaos/krkn/issues/185, https://github.com/redhat-chaos/krkn/issues/186
- [ ] Continue to improve [Chaos Testing Guide](https://redhat-chaos.github.io/krkn) in terms of adding best practices, test environment recommendations and scenarios to make sure the OpenShift platform, as well the applications running on top it, are resilient and performant under chaotic conditions.
- [ ] [Switch documentation references to Kubernetes](https://github.com/redhat-chaos/krkn/issues/495)
- [ ] [OCP and Kubernetes functionalities segregation](https://github.com/redhat-chaos/krkn/issues/497)
+- [x] [Support for pod level network traffic shaping](https://github.com/krkn-chaos/krkn/issues/393)
+- [ ] [Ability to visualize the metrics that are being captured by Kraken and stored in Elasticsearch](https://github.com/krkn-chaos/krkn/issues/124)
+- [ ] Support for running all the scenarios of Kraken on Kubernetes distribution - see https://github.com/krkn-chaos/krkn/issues/185, https://github.com/redhat-chaos/krkn/issues/186
+- [ ] Continue to improve [Chaos Testing Guide](https://krkn-chaos.github.io/krkn) in terms of adding best practices, test environment recommendations and scenarios to make sure the OpenShift platform, as well the applications running on top it, are resilient and performant under chaotic conditions.
+- [ ] [Switch documentation references to Kubernetes](https://github.com/krkn-chaos/krkn/issues/495)
+- [ ] [OCP and Kubernetes functionalities segregation](https://github.com/krkn-chaos/krkn/issues/497)
--- a/config/alerts.yaml
+++ b/config/alerts.yaml
@@ -8,7 +8,7 @@
  description: 10 minutes avg. 99th etcd fsync latency on {{$labels.pod}} higher than 1s. {{$value}}s
  severity: error

- expr: avg_over_time(histogram_quantile(0.99, rate(etcd_disk_backend_commit_duration_seconds_bucket[2m]))[10m:]) > 0.007
+- expr: avg_over_time(histogram_quantile(0.99, rate(etcd_disk_backend_commit_duration_seconds_bucket[2m]))[10m:]) > 0.03
  description: 10 minutes avg. 99th etcd commit latency on {{$labels.pod}} higher than 30ms. {{$value}}s
  severity: warning

--- a/config/config.yaml
+++ b/config/config.yaml
@@ -85,6 +85,9 @@ telemetry:
     - "(\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d+Z).+"      # 2023-09-15T11:20:36.123425532Z log
    oc_cli_path: /usr/bin/oc                                # optional, if not specified will be search in $PATH
    events_backup: True                                     # enables/disables cluster events collection
+elastic: 
+    elastic_url: ""                                         # To track results in elasticsearch, give url to server here; will post telemetry details when url and index not blank
+    elastic_index: ""                                       # Elastic search index pattern to post results to



--- a/config/config_performance.yaml
+++ b/config/config_performance.yaml
@@ -77,3 +77,8 @@ telemetry:
     - "kinit (\\d+/\\d+/\\d+\\s\\d{2}:\\d{2}:\\d{2})\\s+"          # kinit 2023/09/15 11:20:36 log
     - "(\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d+Z).+"      # 2023-09-15T11:20:36.123425532Z log
    oc_cli_path: /usr/bin/oc                                # optional, if not specified will be search in $PATH
+elastic: 
+    elastic_url: ""                                         # To track results in elasticsearch, give url to server here; will post telemetry details when url and index not blank
+    elastic_index: ""                                       # Elastic search index pattern to post results to
+
+
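
Reviewer note: per the inline comments, results are posted to Elasticsearch only when both `elastic_url` and `elastic_index` are non-blank. A minimal sketch of that gating, assuming the config has already been parsed into a dict, is shown below. `get_item` is a hypothetical stand-in for a defaulting lookup and is not the krkn-lib helper.

```python
def get_item(mapping, key, default):
    """Return mapping[key], or `default` when the key is missing or None.

    Hypothetical helper for illustration only.
    """
    value = mapping.get(key)
    return default if value is None else value


def elastic_settings(config):
    """Return (url, index, enabled) for the optional `elastic` section."""
    section = get_item(config, "elastic", {})
    url = get_item(section, "elastic_url", "")
    index = get_item(section, "elastic_index", "")
    # Posting is enabled only when both values are non-blank.
    return url, index, bool(url) and bool(index)


# Defaults as shipped in the config: both blank, so nothing is posted.
print(elastic_settings({"elastic": {"elastic_url": "", "elastic_index": ""}}))
# An older config without the section behaves the same way.
print(elastic_settings({}))
```

Treating a missing or empty `elastic` section the same as blank values keeps existing config files working without changes.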
--- a/containers/Dockerfile
+++ b/containers/Dockerfile
@@ -12,7 +12,7 @@ COPY --from=azure-cli /usr/local/bin/az /usr/bin/az
 # Install dependencies
 RUN yum install -y git python39 python3-pip jq gettext wget && \
    python3.9 -m pip install -U pip && \
-    git clone https://github.com/krkn-chaos/krkn.git --branch v1.5.5 /root/kraken && \
+    git clone https://github.com/krkn-chaos/krkn.git --branch v1.5.7 /root/kraken && \
    mkdir -p /root/.kube && cd /root/kraken && \
    pip3.9 install -r requirements.txt && \
    pip3.9 install virtualenv && \
--- a/containers/Dockerfile-ppc64le
+++ b/containers/Dockerfile-ppc64le
@@ -14,7 +14,7 @@ COPY --from=azure-cli /usr/local/bin/az /usr/bin/az
 # Install dependencies
 RUN yum install -y git python39 python3-pip jq gettext wget && \
    python3.9 -m pip install -U pip && \
-    git clone https://github.com/redhat-chaos/krkn.git --branch v1.5.5 /root/kraken && \
+    git clone https://github.com/redhat-chaos/krkn.git --branch v1.5.7 /root/kraken && \
    mkdir -p /root/.kube && cd /root/kraken && \
    pip3.9 install -r requirements.txt && \
    pip3.9 install virtualenv && \
--- a/docs/cluster_shut_down_scenarios.md
+++ b/docs/cluster_shut_down_scenarios.md
@@ -1,5 +1,5 @@
-#### Kubernetes/OpenShift cluster shut down scenario
-Scenario to shut down all the nodes including the masters and restart them after specified duration. Cluster shut down scenario can be injected by placing the shut_down config file under cluster_shut_down_scenario option in the kraken config. Refer to [cluster_shut_down_scenario](https://github.com/redhat-chaos/krkn/blob/main/scenarios/cluster_shut_down_scenario.yml) config file.
+#### Kubernetes cluster shut down scenario
+Scenario to shut down all the nodes including the masters and restart them after specified duration. Cluster shut down scenario can be injected by placing the shut_down config file under cluster_shut_down_scenario option in the kraken config. Refer to [cluster_shut_down_scenario](https://github.com/krkn-chaos/krkn/blob/main/scenarios/cluster_shut_down_scenario.yml) config file.

 Refer to [cloud setup](cloud_setup.md) to configure your cli properly for the cloud provider of the cluster you want to shut down.

--- a/docs/container_scenarios.md
+++ b/docs/container_scenarios.md
@@ -4,7 +4,7 @@ This can be based on the pods namespace or labels. If you know the exact object
 These scenarios are in a simple yaml format that you can manipulate to run your specific tests or use the pre-existing scenarios to see how it works.

 ####  Example Config
-The following are the components of Kubernetes/OpenShift for which a basic chaos scenario config exists today.
+The following are the components of Kubernetes for which a basic chaos scenario config exists today.

 ```
 scenarios:
@@ -25,7 +25,7 @@ In all scenarios we do a post chaos check to wait and verify the specific compon
 Here there are two options:
 1. Pass a custom script in the main config scenario list that will run before the chaos and verify the output matches post chaos scenario.

-See [scenarios/post_action_etcd_container.py](https://github.com/redhat-chaos/krkn/blob/main/scenarios/post_action_etcd_container.py) for an example.
+See [scenarios/post_action_etcd_container.py](https://github.com/krkn-chaos/krkn/blob/main/scenarios/post_action_etcd_container.py) for an example.
 ```
 -   container_scenarios:                                 # List of chaos pod scenarios to load.
            - -    scenarios/container_etcd.yml
--- a/docs/contribute.md
+++ b/docs/contribute.md
@@ -62,7 +62,7 @@ If changes go into the main repository while you're working on your code it is b

 If not already configured, set the upstream url for kraken.
 ```
- git remote add upstream https://github.com/redhat-chaos/krkn.git
+ git remote add upstream https://github.com/krkn-chaos/krkn.git
 ```

 Rebase to upstream master branch.
--- a/docs/installation.md
+++ b/docs/installation.md
@@ -3,13 +3,13 @@
 The following ways are supported to run Kraken:

 - Standalone python program through Git.
- Containerized version using either Podman or Docker as the runtime via [Krkn-hub](https://github.com/redhat-chaos/krkn-hub)
+- Containerized version using either Podman or Docker as the runtime via [Krkn-hub](https://github.com/krkn-chaos/krkn-hub)
 - Kubernetes or OpenShift deployment ( unsupported )

 **NOTE**: It is recommended to run Kraken external to the cluster ( Standalone or Containerized ) hitting the Kubernetes/OpenShift API as running it internal to the cluster might be disruptive to itself and also might not report back the results if the chaos leads to cluster's API server instability.

 **NOTE**: To run Kraken on Power (ppc64le) architecture, build and run a containerized version by following the
- instructions given [here](https://github.com/redhat-chaos/krkn/blob/main/containers/build_own_image-README.md).
+ instructions given [here](https://github.com/krkn-chaos/krkn/blob/main/containers/build_own_image-README.md).

 **NOTE**: Helper functions for interactions in Krkn are part of [krkn-lib](https://github.com/redhat-chaos/krkn-lib). 
 Please feel free to reuse and expand them as you see fit when adding a new scenario or expanding 
@@ -19,9 +19,9 @@ the capabilities of the current supported scenarios.
 ### Git

 #### Clone the repository
-Pick the latest stable release to install [here](https://github.com/redhat-chaos/krkn/releases).
+Pick the latest stable release to install [here](https://github.com/krkn-chaos/krkn/releases).
 ```
-$ git clone https://github.com/redhat-chaos/krkn.git --branch <release version>
+$ git clone https://github.com/krkn-chaos/krkn.git --branch <release version>
 $ cd kraken
 ```

@@ -40,13 +40,13 @@ $ python3.9 run_kraken.py --config <config_file_location>
 ```

 ### Run containerized version
-[Krkn-hub](https://github.com/redhat-chaos/krkn-hub) is a wrapper that allows running Krkn chaos scenarios via podman or docker runtime with scenario parameters/configuration defined as environment variables.
+[Krkn-hub](https://github.com/krkn-chaos/krkn-hub) is a wrapper that allows running Krkn chaos scenarios via podman or docker runtime with scenario parameters/configuration defined as environment variables.

-Refer [instructions](https://github.com/redhat-chaos/krkn-hub#supported-chaos-scenarios) to get started.
+Refer [instructions](https://github.com/krkn-chaos/krkn-hub#supported-chaos-scenarios) to get started.


 ### Run Kraken as a Kubernetes deployment ( unsupported option - standalone or containerized deployers are recommended )
-Refer [Instructions](https://github.com/redhat-chaos/krkn/blob/main/containers/README.md) on how to deploy and run Kraken as a Kubernetes/OpenShift deployment.
+Refer [Instructions](https://github.com/krkn-chaos/krkn/blob/main/containers/README.md) on how to deploy and run Kraken as a Kubernetes/OpenShift deployment.


 Refer to the [chaos-kraken chart manpage](https://artifacthub.io/packages/helm/startx/chaos-kraken)
--- a/docs/service_disruption_scenarios.md
+++ b/docs/service_disruption_scenarios.md
@@ -16,7 +16,7 @@ Set to '^.*$' and label_selector to "" to randomly select any namespace in your

 **sleep:** Number of seconds to wait between each iteration/count of killing namespaces. Defaults to 10 seconds if not set

-Refer to [namespace_scenarios_example](https://github.com/redhat-chaos/krkn/blob/main/scenarios/regex_namespace.yaml) config file.
+Refer to [namespace_scenarios_example](https://github.com/krkn-chaos/krkn/blob/main/scenarios/regex_namespace.yaml) config file.

 ```
 scenarios:
--- a/docs/time_scenarios.md
+++ b/docs/time_scenarios.md
@@ -16,7 +16,7 @@ Configuration Options:

 **object_name:** List of the names of pods or nodes you want to skew.

-Refer to [time_scenarios_example](https://github.com/redhat-chaos/krkn/blob/main/scenarios/time_scenarios_example.yml) config file.
+Refer to [time_scenarios_example](https://github.com/krkn-chaos/krkn/blob/main/scenarios/time_scenarios_example.yml) config file.

 ```
 time_scenarios:
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,40 +1,40 @@
-PyYAML>=5.1
 aliyun-python-sdk-core==2.13.36
 aliyun-python-sdk-ecs==4.24.25
 arcaflow==0.9.0
 arcaflow-plugin-sdk==0.10.0
-azure-identity
-azure-keyvault
-azure-mgmt-compute
 boto3==1.28.61
-coverage
-datetime
-docker
-docker-compose
-git+https://github.com/redhat-chaos/arcaflow-plugin-kill-pod.git
-git+https://github.com/vmware/vsphere-automation-sdk-python.git@v8.0.0.0
-gitpython
-google-api-python-client
-ibm_cloud_sdk_core
-ibm_vpc
+azure-identity==1.15.0
+azure-keyvault==4.2.0
+azure-mgmt-compute==30.5.0
 itsdangerous==2.0.1
+coverage==7.4.1
+datetime==5.4
+docker==7.0.0
+gitpython==3.1.41
+google-api-python-client==2.116.0
+ibm_cloud_sdk_core==3.18.0
+ibm_vpc==0.20.0
 jinja2==3.1.3
-krkn-lib >= 1.4.6
-kubernetes
-lxml >= 4.3.0
-oauth2client>=4.1.3
-openshift-client
-paramiko
-podman-compose
-pyVmomi >= 6.7
-pyfiglet
-pytest
-python-ipmi
-python-openstackclient
-requests
-service_identity
+krkn-lib==1.4.12
+lxml==5.1.0
+kubernetes==26.1.0
+oauth2client==4.1.3
+pandas==2.2.0
+openshift-client==1.0.21
+paramiko==3.4.0
+podman-compose==1.0.6
+pyVmomi==8.0.2.0.1
+pyfiglet==1.0.2
+pytest==8.0.0
+python-ipmi==0.5.4
+python-openstackclient==6.5.0
+requests==2.31.0
+service_identity==24.1.0
+PyYAML==6.0
 setuptools==65.5.1
 werkzeug==3.0.1
-wheel
+wheel==0.42.0
 zope.interface==5.4.0
-pandas<2.0.0
+
+git+https://github.com/krkn-chaos/arcaflow-plugin-kill-pod.git
+git+https://github.com/vmware/vsphere-automation-sdk-python.git@v8.0.0.0
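
Reviewer note: this hunk moves almost every dependency to an exact `==` pin. A small check like the sketch below could flag entries that are still floating. The function name and the rule for VCS URLs (pinned only when the last path segment carries an `@ref`) are assumptions for illustration, and extras such as `pkg[extra]==1.0` are not handled.

```python
import re

# name==version with nothing else on the line
PINNED_RE = re.compile(r"^[A-Za-z0-9_.\-]+==[^=\s]+$")


def unpinned(lines):
    """Return requirement lines that are not pinned to an exact version."""
    result = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if line.startswith("git+"):
            # Treat a VCS requirement as pinned only if it names a ref.
            if "@" not in line.rsplit("/", 1)[-1]:
                result.append(line)
            continue
        if not PINNED_RE.match(line):
            result.append(line)
    return result


sample = [
    "krkn-lib==1.4.12",
    "lxml >= 4.3.0",
    "wheel",
    "",
    "git+https://github.com/krkn-chaos/arcaflow-plugin-kill-pod.git",
    "git+https://github.com/vmware/vsphere-automation-sdk-python.git@v8.0.0.0",
]
print(unpinned(sample))
```

Under this rule the `arcaflow-plugin-kill-pod` requirement would still be reported, since it tracks the default branch rather than a tag or commit.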
--- a/run_kraken.py
+++ b/run_kraken.py
@@ -27,6 +27,7 @@ import server as server
 from kraken import plugins
 from krkn_lib.k8s import KrknKubernetes
 from krkn_lib.ocp import KrknOpenshift
+from krkn_lib.telemetry.elastic import KrknElastic
 from krkn_lib.telemetry.k8s import KrknTelemetryKubernetes
 from krkn_lib.telemetry.ocp import KrknTelemetryOpenshift
 from krkn_lib.models.telemetry import ChaosRunTelemetry
@@ -34,7 +35,7 @@ from krkn_lib.utils import SafeLogger
 from krkn_lib.utils.functions import get_yaml_item_value


-
+report_file = ""

 # Main function
 def main(cfg):
@@ -94,6 +95,9 @@ def main(cfg):
            config["performance_monitoring"], "check_critical_alerts", False
        )
        telemetry_api_url = config["telemetry"].get("api_url")
+        elastic_config = get_yaml_item_value(config,"elastic",{})
+        elastic_url = get_yaml_item_value(elastic_config,"elastic_url","")
+        elastic_index = get_yaml_item_value(elastic_config,"elastic_index","")
        
        # Initialize clients
        if (not os.path.isfile(kubeconfig_path) and
@@ -129,8 +133,6 @@ def main(cfg):
        except:
            kubecli.initialize_clients(None)

-
-
        # find node kraken might be running on
        kubecli.find_kraken_node()

@@ -156,12 +158,22 @@ def main(cfg):
        # Cluster info
        logging.info("Fetching cluster info")
        cv = ""
-        if config["kraken"]["distribution"] == "openshift":
+        if distribution == "openshift":
            cv = ocpcli.get_clusterversion_string()
            if prometheus_url is None:
-                connection_data = ocpcli.get_prometheus_api_connection_data()
-                prometheus_url = connection_data.endpoint
-                prometheus_bearer_token = connection_data.token
+                try:
+                    connection_data = ocpcli.get_prometheus_api_connection_data()
+                    if connection_data:
+                        prometheus_url = connection_data.endpoint
+                        prometheus_bearer_token = connection_data.token
+                    else: 
+                        # If can't make a connection, set alerts to false
+                        enable_alerts = False
+                        critical_alerts = False
+                except Exception:
+                    logging.error("invalid distribution selected, running openshift scenarios against kubernetes cluster."
+                                  "Please set 'kubernetes' in config.yaml krkn.platform and try again")
+                    sys.exit(1)
        if cv != "":
            logging.info(cv)
        else:
@@ -170,9 +182,9 @@ def main(cfg):
        # KrknTelemetry init
        telemetry_k8s = KrknTelemetryKubernetes(safe_logger, kubecli)
        telemetry_ocp = KrknTelemetryOpenshift(safe_logger, ocpcli)
+        telemetry_elastic = KrknElastic(safe_logger,elastic_url)

-
-        if enable_alerts:
+        if enable_alerts or check_critical_alerts:
            prometheus = KrknPrometheus(prometheus_url, prometheus_bearer_token)

        logging.info("Server URL: %s" % kubecli.get_host())
@@ -203,6 +215,7 @@ def main(cfg):

        # Capture the start time
        start_time = int(time.time())
+        critical_alerts_count = 0

        chaos_telemetry = ChaosRunTelemetry()
        chaos_telemetry.run_uuid = run_uuid
@@ -334,23 +347,20 @@ def main(cfg):
                            failed_post_scenarios, scenario_telemetries = network_chaos.run(scenarios_list, config, wait_duration, kubecli, telemetry_k8s)

                        # Check for critical alerts when enabled
-                        if enable_alerts and check_critical_alerts :
+                        if check_critical_alerts:
                            logging.info("Checking for critical alerts firing post choas")

                            ##PROM
                            query = r"""ALERTS{severity="critical"}"""
                            end_time = datetime.datetime.now()
-                            critical_alerts = prometheus.process_prom_query_in_range(
-                                query,
-                                start_time = datetime.datetime.fromtimestamp(start_time),
-                                end_time = end_time
-
+                            critical_alerts = prometheus.process_query(
+                                query
                            )
                            critical_alerts_count = len(critical_alerts)
                            if critical_alerts_count > 0:
                                logging.error("Critical alerts are firing: %s", critical_alerts)
                                logging.error("Please check, exiting")
-                                sys.exit(1)
+                                break
                            else:
                                logging.info("No critical alerts are firing!!")

@@ -366,14 +376,14 @@ def main(cfg):
        # if platform is openshift will be collected
        # Cloud platform and network plugins metadata
        # through OCP specific APIs
-        if config["kraken"]["distribution"] == "openshift":
+        if distribution == "openshift":
            telemetry_ocp.collect_cluster_metadata(chaos_telemetry)
        else:
            telemetry_k8s.collect_cluster_metadata(chaos_telemetry)

        decoded_chaos_run_telemetry = ChaosRunTelemetry(json.loads(chaos_telemetry.to_json()))
        logging.info(f"Telemetry data:\n{decoded_chaos_run_telemetry.to_json()}")
-
+        telemetry_elastic.upload_data_to_elasticsearch(decoded_chaos_run_telemetry.to_json(), elastic_index)
        if config["telemetry"]["enabled"]:
            logging.info(f"telemetry data will be stored on s3 bucket folder: {telemetry_api_url}/download/{telemetry_request_id}")
            logging.info(f"telemetry upload log: {safe_logger.log_file_name}")
@@ -381,12 +391,33 @@ def main(cfg):
                telemetry_k8s.send_telemetry(config["telemetry"], telemetry_request_id, chaos_telemetry)
                telemetry_k8s.put_cluster_events(telemetry_request_id, config["telemetry"], start_time, end_time)
                # prometheus data collection is available only on Openshift
-                if config["telemetry"]["prometheus_backup"] and config["kraken"]["distribution"] == "openshift":
-                    safe_logger.info("archives download started:")
-                    prometheus_archive_files = telemetry_ocp.get_ocp_prometheus_data(config["telemetry"], telemetry_request_id)
-                    safe_logger.info("archives upload started:")
-                    telemetry_k8s.put_prometheus_data(config["telemetry"], prometheus_archive_files, telemetry_request_id)
-                if config["telemetry"]["logs_backup"]:
+                if config["telemetry"]["prometheus_backup"]:
+                    prometheus_archive_files = ''
+                    if distribution == "openshift" :
+                        prometheus_archive_files = telemetry_ocp.get_ocp_prometheus_data(config["telemetry"], telemetry_request_id)
+                    else:
+                        if (config["telemetry"]["prometheus_namespace"] and
+                                config["telemetry"]["prometheus_pod_name"] and
+                                config["telemetry"]["prometheus_container_name"]):
+                            try:
+                                prometheus_archive_files = telemetry_k8s.get_prometheus_pod_data(
+                                    config["telemetry"],
+                                    telemetry_request_id,
+                                    config["telemetry"]["prometheus_pod_name"],
+                                    config["telemetry"]["prometheus_container_name"],
+                                    config["telemetry"]["prometheus_namespace"]
+                                )
+                            except Exception as e:
+                                logging.error(f"failed to get prometheus backup with exception {str(e)}")
+                        else:
+                            logging.warning("impossible to backup prometheus,"
+                                            "check if config contains telemetry.prometheus_namespace, "
+                                            "telemetry.prometheus_pod_name and "
+                                            "telemetry.prometheus_container_name")
+                    if prometheus_archive_files:
+                        safe_logger.info("starting prometheus archive upload:")
+                        telemetry_k8s.put_prometheus_data(config["telemetry"], prometheus_archive_files, telemetry_request_id)
+                if config["telemetry"]["logs_backup"] and distribution == "openshift":
                    telemetry_ocp.put_ocp_logs(telemetry_request_id, config["telemetry"], start_time, end_time)
            except Exception as e:
                logging.error(f"failed to send telemetry data: {str(e)}")
@@ -408,16 +439,19 @@ def main(cfg):
                logging.error("Alert profile is not defined")
                sys.exit(1)

+        if critical_alerts_count > 0:
+            logging.error("Critical alerts are firing, please check; exiting")
+            sys.exit(1)
+
        if failed_post_scenarios:
            logging.error(
                "Post scenarios are still failing at the end of all iterations"
            )
            sys.exit(1)

-        run_dir = os.getcwd() + "/kraken.report"
        logging.info(
            "Successfully finished running Kraken. UUID for the run: "
-            "%s. Report generated at %s. Exiting" % (run_uuid, run_dir)
+            "%s. Report generated at %s. Exiting" % (run_uuid, report_file)
        )
    else:
        logging.error("Cannot find a config at %s, please check" % (cfg))
@@ -434,12 +468,21 @@ if __name__ == "__main__":
        help="config location",
        default="config/config.yaml",
    )
+    parser.add_option(
+        "-o",
+        "--output",
+        dest="output",
+        help="output report location",
+        default="kraken.report",
+    )
+    
    (options, args) = parser.parse_args()
+    report_file = options.output
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s [%(levelname)s] %(message)s",
        handlers=[
-            logging.FileHandler("kraken.report", mode="w"),
+            logging.FileHandler(report_file, mode="w"),
            logging.StreamHandler(),
        ],
    )
--- a/scenarios/arcaflow/cpu-hog/input.yaml
+++ b/scenarios/arcaflow/cpu-hog/input.yaml
@@ -1,8 +1,13 @@
 input_list:
- cpu_count: 1
-  cpu_load_percentage: 80
-  cpu_method: all
-  duration: 1s
-  kubeconfig: ''
-  namespace: default
-  node_selector: {}
+  - cpu_count: 1
+    cpu_load_percentage: 80
+    cpu_method: all
+    duration: 1s
+    kubeconfig: ''
+    namespace: default
+    # set the node selector as a key-value pair eg.
+    # node_selector:
+    #  kubernetes.io/hostname: kind-worker2
+    node_selector: {}
+
+
--- a/scenarios/arcaflow/io-hog/input.yaml
+++ b/scenarios/arcaflow/io-hog/input.yaml
@@ -5,6 +5,9 @@ input_list:
  io_write_bytes: 10m
  kubeconfig: ''
  namespace: default
+  # set the node selector as a key-value pair eg.
+  # node_selector:
+  #  kubernetes.io/hostname: kind-worker2
  node_selector: {}
  target_pod_folder: /hog-data
  target_pod_volume:
--- a/scenarios/arcaflow/memory-hog/input.yaml
+++ b/scenarios/arcaflow/memory-hog/input.yaml
@@ -2,10 +2,10 @@ input_list:
 - duration: 30s
  vm_bytes: 10%
  vm_workers: 2
-  node_selector: { }
-  # node selector example
+  # set the node selector as a key-value pair eg.
  # node_selector:
-  #   kubernetes.io/hostname: master
+  #  kubernetes.io/hostname: kind-worker2
+  node_selector: { }
  kubeconfig: ""
  namespace: default

--- a/scenarios/kube/container_dns.yml
+++ b/scenarios/kube/container_dns.yml
@@ -3,6 +3,6 @@ scenarios:
  namespace: "kube-system"
  label_selector: "k8s-app=kube-dns"
  container_name: ""
-  action: "kill 1"
+  action: 1
  count: 1
  retry_wait: 60
--- a/scenarios/openshift/network_chaos.yaml
+++ b/scenarios/openshift/network_chaos.yaml
@@ -1,12 +1,11 @@
-network_chaos:                       # Scenario to create an outage by simulating random variations in the network.
-  duration: 300                      # seconds
-  node_name:                         # node on which scenario has to be injected;
-  label_selector: <label_selector>   # when node_name is not specified, a node with matching label_selector is selected for running the scenario.
+network_chaos: # Scenario to create an outage by simulating random variations in the network.
+  duration: 300 # seconds
+  node_name: # node on which scenario has to be injected;
+  label_selector: <label_selector> # when node_name is not specified, a node with matching label_selector is selected for running the scenario.
  instance_count: 1
-  interfaces:                        # Interface name would be the Kernel host network interface name.
-  - "<interface_name>"
+  interfaces: # Interface name would be the Kernel host network interface name.
+    - "<interface_name>"
  execution: serial
  egress:
-    latency: 50ms                    # 50ms
-    loss: 0.02                       # percentage
-    bandwidth: 100mbit
+    latency: 50ms # 50ms
+    loss: 0.02 # percentage
--- a/utils/chaos_recommender/README.md
+++ b/utils/chaos_recommender/README.md
@@ -17,7 +17,7 @@ This tool profiles an application and gathers telemetry data such as CPU, Memory
    ```
    $ python3.9 -m venv chaos
    $ source chaos/bin/activate
-    $ git clone https://github.com/redhat-chaos/krkn.git 
+    $ git clone https://github.com/krkn-chaos/krkn.git 
    $ cd krkn
    $ pip3 install -r requirements.txt
    $ python3.9 utils/chaos_recommender/chaos_recommender.py
@@ -89,7 +89,7 @@ If you provide the input values through command-line arguments, the correspondin

 ## Podman & Docker image

-To run the recommender image please visit the [krkn-hub](https://github.com/redhat-chaos/krkn-hub for further infos.
+To run the recommender image please visit the [krkn-hub](https://github.com/krkn-chaos/krkn-hub for further infos.

 ## How it works
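
Note on the `run_kraken.py` change above: the critical-alerts check now runs whenever `check_critical_alerts` is set, stops the scenario loop with `break` instead of calling `sys.exit(1)`, and the non-zero exit is deferred until after telemetry has been collected. The following is a minimal sketch of that control flow only, not the project's implementation. The function name `run_with_deferred_alert_failure` and the `query_critical_alerts` callable are hypothetical stand-ins for the scenario loop and the Prometheus query.

```python
import logging
from typing import Callable, Iterable, List


def run_with_deferred_alert_failure(
    scenarios: Iterable[str],
    query_critical_alerts: Callable[[], List[dict]],
) -> int:
    """Return the exit code the run would finish with (hypothetical helper)."""
    critical_alerts_count = 0
    completed = []

    for scenario in scenarios:
        # Placeholder for running one chaos scenario.
        completed.append(scenario)

        # Instant query for currently firing critical alerts (no time range).
        alerts = query_critical_alerts()
        critical_alerts_count = len(alerts)
        if critical_alerts_count > 0:
            logging.error("Critical alerts are firing: %s", alerts)
            # Stop running scenarios, but do not exit yet.
            break

    # Placeholder for telemetry collection, which still runs after a break.
    logging.info("Collecting telemetry for %d scenario(s)", len(completed))

    # Fail only at the end, once telemetry has been handled.
    return 1 if critical_alerts_count > 0 else 0
```

A caller would pass the return value to `sys.exit`. The point of deferring the exit is that a firing alert no longer prevents telemetry from being gathered for the scenarios that did run.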
Author	SHA1	Message	Date
Naga Ravi Chaitanya Elluri	5a8d5b0fe1	Allow critical alerts check when enable_alerts is disabled This covers use case where user wants to just check for critical alerts post chaos without having to enable the alerts evaluation feature which evaluates prom queries specified in an alerts file. Signed-off-by: Naga Ravi Chaitanya Elluri <nelluri@redhat.com>	2024-02-19 23:15:47 -05:00
Paige Rubendall	c440dc4b51	Taking out start and end time for critical alerts (#572 ) * taking out start and end time" Signed-off-by: Paige Rubendall <prubenda@redhat.com> * adding only break when alert fires Signed-off-by: Paige Rubendall <prubenda@redhat.com> * fail at end if alert had fired Signed-off-by: Paige Rubendall <prubenda@redhat.com> * adding new krkn-lib function with no range Signed-off-by: Paige Rubendall <prubenda@redhat.com> * updating requirements to new krkn-lib Signed-off-by: Paige Rubendall <prubenda@redhat.com> --------- Signed-off-by: Paige Rubendall <prubenda@redhat.com>	2024-02-19 09:28:13 -05:00
Paige Rubendall	b174c51ee0	adding check if connection was properly set Signed-off-by: Paige Rubendall <prubenda@redhat.com>	2024-02-15 17:28:20 -05:00
Paige Rubendall	fec0434ce1	adding upload to elastic search Signed-off-by: Paige Rubendall <prubenda@redhat.com>	2024-02-13 12:01:40 -05:00
Tullio Sebastiani	1067d5ec8d	changed telemetry endpoint for funtests (#571 ) Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>	2024-02-13 17:06:20 +01:00
Tullio Sebastiani	85ea1ef7e1	Dockerfiles update (#570 ) Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>	2024-02-09 17:20:06 +01:00
Tullio Sebastiani	2e38b8b033	Kubernetes prometheus telemetry + functional tests (#566 ) added comment on the node selector input.yaml Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>	2024-02-09 16:38:12 +01:00
Tullio Sebastiani	c7ea366756	frozen package versions (#569 ) Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>	2024-02-09 16:10:25 +01:00
Paige Rubendall	67d4ee9fa2	updating comment to match query (#568 ) Signed-off-by: Paige Rubendall <prubenda@redhat.com>	2024-02-08 22:09:37 -05:00
Paige Rubendall	fa59834bae	updating release versin (#565 ) Signed-off-by: Paige Rubendall <prubenda@redhat.com>	2024-01-25 11:12:00 -05:00
Paige Rubendall	f154bcb692	adding krkn report location Signed-off-by: Paige Rubendall <prubenda@redhat.com>	2024-01-25 10:45:01 -05:00
Naga Ravi Chaitanya Elluri	60ece4b1b8	Use 0.38.0 wheel version to fix security vulnerability Reported by https://snyk.io/ Signed-off-by: Naga Ravi Chaitanya Elluri <nelluri@redhat.com>	2024-01-25 09:51:19 -05:00
Naga Ravi Chaitanya Elluri	d660542a40	Add CNCF trademark guidelines and update community members (#560 ) Signed-off-by: Naga Ravi Chaitanya Elluri <nelluri@redhat.com>	2024-01-24 14:13:53 -05:00
Naga Ravi Chaitanya Elluri	2e651798fa	Update redhat-chaos references with krkn-chaos The tools are now hosted under https://github.com/krkn-chaos Signed-off-by: Naga Ravi Chaitanya Elluri <nelluri@redhat.com>	2024-01-24 13:40:39 -05:00
Tullio Sebastiani	f801dfce54	functional tests pointing to real scenario config files Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com> typo Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com> app_outage fix Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com> typo Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com> typo Signed-off-by: Tullio Sebastiani <tsebasti@redhat.com>	2024-01-18 12:54:39 -05:00