Mirror of https://github.com/krkn-chaos/krkn.git
Compare commits (16 commits)

- e02c6d1287
- 04425a8d8a
- f3933f0e62
- 56ff0a8c72
- 9378cd74cd
- 4d3491da0f
- d6ce66160b
- ef1a55438b
- d8f54b83a2
- 4870c86515
- 6ae17cf678
- ce9f8aa050
- 05148317c1
- 5f836f294b
- cfa1bb09a0
- 5ddfff5a85
`.github/workflows/docker-image.yml` (vendored): 32 changes
```diff
@@ -1,8 +1,7 @@
 name: Docker Image CI
 on:
   push:
-    branches:
-      - main
+    tags: ['v[0-9].[0-9]+.[0-9]+']
   pull_request:
 
 jobs:
@@ -12,30 +11,43 @@ jobs:
     - name: Check out code
       uses: actions/checkout@v3
     - name: Build the Docker images
+      if: startsWith(github.ref, 'refs/tags')
       run: |
-        docker build --no-cache -t quay.io/krkn-chaos/krkn containers/
+        docker build --no-cache -t quay.io/krkn-chaos/krkn containers/ --build-arg TAG=${GITHUB_REF#refs/tags/}
         docker tag quay.io/krkn-chaos/krkn quay.io/redhat-chaos/krkn
+        docker tag quay.io/krkn-chaos/krkn quay.io/krkn-chaos/krkn:${GITHUB_REF#refs/tags/}
+        docker tag quay.io/krkn-chaos/krkn quay.io/redhat-chaos/krkn:${GITHUB_REF#refs/tags/}
+
+    - name: Test Build the Docker images
+      if: ${{ github.event_name == 'pull_request' }}
+      run: |
+        docker build --no-cache -t quay.io/krkn-chaos/krkn containers/ --build-arg PR_NUMBER=${{ github.event.pull_request.number }}
     - name: Login in quay
-      if: github.ref == 'refs/heads/main' && github.event_name == 'push'
+      if: startsWith(github.ref, 'refs/tags')
      run: docker login quay.io -u ${QUAY_USER} -p ${QUAY_TOKEN}
       env:
         QUAY_USER: ${{ secrets.QUAY_USERNAME }}
         QUAY_TOKEN: ${{ secrets.QUAY_PASSWORD }}
     - name: Push the KrknChaos Docker images
-      if: github.ref == 'refs/heads/main' && github.event_name == 'push'
-      run: docker push quay.io/krkn-chaos/krkn
+      if: startsWith(github.ref, 'refs/tags')
+      run: |
+        docker push quay.io/krkn-chaos/krkn
+        docker push quay.io/krkn-chaos/krkn:${GITHUB_REF#refs/tags/}
     - name: Login in to redhat-chaos quay
-      if: github.ref == 'refs/heads/main' && github.event_name == 'push'
+      if: startsWith(github.ref, 'refs/tags/v')
       run: docker login quay.io -u ${QUAY_USER} -p ${QUAY_TOKEN}
       env:
         QUAY_USER: ${{ secrets.QUAY_USER_1 }}
         QUAY_TOKEN: ${{ secrets.QUAY_TOKEN_1 }}
     - name: Push the RedHat Chaos Docker images
-      if: github.ref == 'refs/heads/main' && github.event_name == 'push'
-      run: docker push quay.io/redhat-chaos/krkn
+      if: startsWith(github.ref, 'refs/tags')
+      run: |
+        docker push quay.io/redhat-chaos/krkn
+        docker push quay.io/redhat-chaos/krkn:${GITHUB_REF#refs/tags/}
     - name: Rebuild krkn-hub
-      if: github.ref == 'refs/heads/main' && github.event_name == 'push'
+      if: startsWith(github.ref, 'refs/tags')
       uses: redhat-chaos/actions/krkn-hub@main
       with:
         QUAY_USER: ${{ secrets.QUAY_USERNAME }}
         QUAY_TOKEN: ${{ secrets.QUAY_PASSWORD }}
         AUTOPUSH: ${{ secrets.AUTOPUSH }}
```
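The tag-driven steps above rely on shell parameter expansion to derive the image tag from the Git ref. A minimal sketch of that expansion, runnable in any POSIX shell:

```bash
# ${GITHUB_REF#refs/tags/} strips the shortest leading match of "refs/tags/",
# leaving only the tag name that becomes the image tag.
GITHUB_REF="refs/tags/v1.6.1"
echo "${GITHUB_REF#refs/tags/}"   # prints: v1.6.1
```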
`README.md`: 13 changes
```diff
@@ -41,18 +41,6 @@ After installation, refer back to the below sections for supported scenarios and
 #### Running Kraken with minimal configuration tweaks
 For cases where you want to run Kraken with minimal configuration changes, refer to [krkn-hub](https://github.com/krkn-chaos/krkn-hub). One use case is CI integration where you do not want to carry around different configuration files for the scenarios.
 
-### Setting up infrastructure dependencies
-Kraken indexes the metrics specified in the profile into Elasticsearch in addition to leveraging Cerberus for understanding the health of the Kubernetes cluster under test. More information on the features is documented below. The infrastructure pieces can be easily installed and uninstalled by running:
-
-```
-$ cd kraken
-$ podman-compose up or $ docker-compose up # Spins up the containers specified in the docker-compose.yml file present in the run directory.
-$ podman-compose down or $ docker-compose down # Delete the containers installed.
-```
-This will manage the Cerberus and Elasticsearch containers on the host on which you are running Kraken.
-
-**NOTE**: Make sure you have enough resources (memory and disk) on the machine on top of which the containers are running as Elasticsearch is resource intensive. Cerberus monitors the system components by default, the [config](config/cerberus.yaml) can be tweaked to add applications namespaces, routes and other components to monitor as well. The command will keep running until killed since detached mode is not supported as of now.
-
 
 ### Config
 Instructions on how to setup the config and the options supported can be found at [Config](docs/config.md).
@@ -76,6 +64,7 @@ Scenario type | Kubernetes
 [Network_Chaos](docs/network_chaos.md) | :heavy_check_mark: |
 [ManagedCluster Scenarios](docs/managedcluster_scenarios.md) | :heavy_check_mark: |
 [Service Hijacking Scenarios](docs/service_hijacking_scenarios.md) | :heavy_check_mark: |
+[SYN Flood Scenarios](docs/syn_flood_scenarios.md) | :heavy_check_mark: |
 
 
 ### Kraken scenario pass/fail criteria and report
```
```diff
@@ -88,3 +88,42 @@
 - expr: ALERTS{severity="critical", alertstate="firing"} > 0
   description: Critical prometheus alert. {{$labels.alertname}}
   severity: warning
+
+# etcd CPU and usage increase
+- expr: sum(rate(container_cpu_usage_seconds_total{image!='', namespace='openshift-etcd', container='etcd'}[1m])) * 100 / sum(machine_cpu_cores) > 5
+  description: Etcd CPU usage increased significantly
+  severity: warning
+
+# etcd memory usage increase
+- expr: sum(deriv(container_memory_usage_bytes{image!='', namespace='openshift-etcd', container='etcd'}[5m])) * 100 / sum(node_memory_MemTotal_bytes) > 5
+  description: Etcd memory usage increased significantly
+  severity: warning
+
+# Openshift API server CPU and memory usage increase
+- expr: sum(rate(container_cpu_usage_seconds_total{image!='', namespace='openshift-apiserver', container='openshift-apiserver'}[1m])) * 100 / sum(machine_cpu_cores) > 5
+  description: openshift apiserver cpu usage increased significantly
+  severity: warning
+
+- expr: (sum(deriv(container_memory_usage_bytes{namespace='openshift-apiserver', container='openshift-apiserver'}[5m]))) * 100 / sum(node_memory_MemTotal_bytes) > 5
+  description: openshift apiserver memory usage increased significantly
+  severity: warning
+
+# Openshift kube API server CPU and memory usage increase
+- expr: sum(rate(container_cpu_usage_seconds_total{image!='', namespace='openshift-kube-apiserver', container='kube-apiserver'}[1m])) * 100 / sum(machine_cpu_cores) > 5
+  description: openshift apiserver cpu usage increased significantly
+  severity: warning
+
+- expr: (sum(deriv(container_memory_usage_bytes{namespace='openshift-kube-apiserver', container='kube-apiserver'}[5m]))) * 100 / sum(node_memory_MemTotal_bytes) > 5
+  description: openshift apiserver memory usage increased significantly
+  severity: warning
+
+# Master node CPU usage increase
+- expr: (sum((sum(deriv(pod:container_cpu_usage:sum{container="",pod!=""}[5m])) BY (namespace, pod) * on(pod, namespace) group_left(node) (node_namespace_pod:kube_pod_info:) ) * on(node) group_left(role) (max by (node) (kube_node_role{role="master"})))) * 100 / sum(machine_cpu_cores) > 5
+  description: master nodes cpu usage increased significantly
+  severity: warning
+
+# Master nodes memory usage increase
+- expr: (sum((sum(deriv(container_memory_usage_bytes{container="",pod!=""}[5m])) BY (namespace, pod) * on(pod, namespace) group_left(node) (node_namespace_pod:kube_pod_info:) ) * on(node) group_left(role) (max by (node) (kube_node_role{role="master"})))) * 100 / sum(node_memory_MemTotal_bytes) > 5
+  description: master nodes memory usage increased significantly
+  severity: warning
```
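The expressions added above can be spot-checked outside krkn against any reachable Prometheus endpoint. A sketch with curl and jq, assuming Prometheus answers on localhost:9090 (substitute your cluster's Prometheus route):

```bash
# Evaluate the etcd CPU expression once via the Prometheus HTTP API:
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode "query=sum(rate(container_cpu_usage_seconds_total{image!='', namespace='openshift-etcd', container='etcd'}[1m])) * 100 / sum(machine_cpu_cores)" \
  | jq '.data.result'
```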
```diff
@@ -44,6 +44,8 @@ kraken:
             - scenarios/openshift/network_chaos.yaml
         - service_hijacking:
             - scenarios/kube/service_hijacking.yaml
+        - syn_flood:
+            - scenarios/kube/syn_flood.yaml
 
 cerberus:
     cerberus_enabled: False    # Enable it when cerberus is previously installed
```
```diff
@@ -1,9 +1,6 @@
-# azure-client
-FROM mcr.microsoft.com/azure-cli:latest as azure-cli
-
 # oc build
 FROM golang:1.22.4 AS oc-build
-RUN apt-get update && apt-get install -y libkrb5-dev
+RUN apt-get update && apt-get install -y --no-install-recommends libkrb5-dev
 WORKDIR /tmp
 RUN git clone --branch release-4.18 https://github.com/openshift/oc.git
 WORKDIR /tmp/oc
@@ -15,12 +12,11 @@ RUN go mod edit -go 1.22.3 &&\
 RUN make GO_REQUIRED_MIN_VERSION:= oc
 
 FROM fedora:40
 ARG PR_NUMBER
+ARG TAG
+RUN groupadd -g 1001 krkn && useradd -m -u 1001 -g krkn krkn
 RUN dnf update -y
 
-# krkn version that will be built
-ENV KRKN_VERSION v1.6.1
-
 ENV KUBECONFIG /home/krkn/.kube/config
 
 # install kubectl
@@ -29,22 +25,30 @@ RUN curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/s
     cp kubectl /usr/bin/kubectl && chmod +x /usr/bin/kubectl
 
-# This overwrites any existing configuration in /etc/yum.repos.d/kubernetes.repo
-RUN dnf update && dnf install -y git python39 jq yq gettext wget which
-# copy azure client binary from azure-cli image
-COPY --from=azure-cli /usr/local/bin/az /usr/bin/az
+RUN dnf update && dnf install -y --setopt=install_weak_deps=False \
+    git python39 jq yq gettext wget which &&\
+    dnf clean all
 
 # copy oc client binary from oc-build image
 COPY --from=oc-build /tmp/oc/oc /usr/bin/oc
 
 # krkn build
-RUN git clone https://github.com/krkn-chaos/krkn.git --branch $KRKN_VERSION /home/krkn/kraken && \
+RUN git clone https://github.com/krkn-chaos/krkn.git /home/krkn/kraken && \
     mkdir -p /home/krkn/.kube
 
 WORKDIR /home/krkn/kraken
 
+# default behaviour will be to build main
+# if it is a PR trigger the PR itself will be checked out
 RUN if [ -n "$PR_NUMBER" ]; then git fetch origin pull/${PR_NUMBER}/head:pr-${PR_NUMBER} && git checkout pr-${PR_NUMBER};fi
+# if it is a TAG trigger checkout the tag
+RUN if [ -n "$TAG" ]; then git checkout "$TAG";fi
 
 RUN python3.9 -m ensurepip
 RUN pip3.9 install -r requirements.txt
 RUN pip3.9 install jsonschema
 
-RUN chown -R krkn:krkn /home/krkn
+RUN chown -R krkn:krkn /home/krkn && chmod 755 /home/krkn
 USER krkn
 ENTRYPOINT ["python3.9", "run_kraken.py"]
-CMD ["--config=config/config.yaml"]
+CMD ["--config=config/config.yaml"]
```
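The two build arguments introduced here can also be exercised locally. A sketch of the equivalent manual builds; the tag value mirrors the version the removed ENV pinned, and 1234 is a placeholder PR number:

```bash
# Build from a release tag, as the tag-triggered CI job does:
docker build containers/ -t quay.io/krkn-chaos/krkn --build-arg TAG=v1.6.1

# Build a pull request head, as the PR-triggered CI job does:
docker build containers/ -t quay.io/krkn-chaos/krkn --build-arg PR_NUMBER=1234
```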
```diff
@@ -1,29 +0,0 @@
-# Dockerfile for kraken
-
-FROM ppc64le/centos:8
-
-FROM mcr.microsoft.com/azure-cli:latest as azure-cli
-
-LABEL org.opencontainers.image.authors="Red Hat OpenShift Chaos Engineering"
-
-ENV KUBECONFIG /root/.kube/config
-
-# Copy azure client binary from azure-cli image
-COPY --from=azure-cli /usr/local/bin/az /usr/bin/az
-
-# Install dependencies
-RUN yum install -y git python39 python3-pip jq gettext wget && \
-    python3.9 -m pip install -U pip && \
-    git clone https://github.com/redhat-chaos/krkn.git --branch v1.5.14 /root/kraken && \
-    mkdir -p /root/.kube && cd /root/kraken && \
-    pip3.9 install -r requirements.txt && \
-    pip3.9 install virtualenv && \
-    wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/bin/yq && chmod +x /usr/bin/yq
-
-# Get Kubernetes and OpenShift clients from stable releases
-WORKDIR /tmp
-RUN wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/stable/openshift-client-linux.tar.gz && tar -xvf openshift-client-linux.tar.gz && cp oc /usr/local/bin/oc && cp oc /usr/bin/oc && cp kubectl /usr/local/bin/kubectl && cp kubectl /usr/bin/kubectl
-
-WORKDIR /root/kraken
-
-ENTRYPOINT python3.9 run_kraken.py --config=config/config.yaml
```
```diff
@@ -1,31 +0,0 @@
-version: "3"
-services:
-  elastic:
-    image: docker.elastic.co/elasticsearch/elasticsearch:7.13.2
-    deploy:
-      replicas: 1
-      restart_policy:
-        condition: on-failure
-    network_mode: host
-    environment:
-      discovery.type: single-node
-  kibana:
-    image: docker.elastic.co/kibana/kibana:7.13.2
-    deploy:
-      replicas: 1
-      restart_policy:
-        condition: on-failure
-    network_mode: host
-    environment:
-      ELASTICSEARCH_HOSTS: "http://0.0.0.0:9200"
-  cerberus:
-    image: quay.io/openshift-scale/cerberus:latest
-    privileged: true
-    deploy:
-      replicas: 1
-      restart_policy:
-        condition: on-failure
-    network_mode: host
-    volumes:
-      - ./config/cerberus.yaml:/root/cerberus/config/config.yaml:Z   # Modify the config in case of the need to monitor additional components
-      - ${HOME}/.kube/config:/root/.kube/config:Z
```
```diff
@@ -27,14 +27,12 @@ After creating the service account you will need to enable the account using the
 
 ## Azure
 
-**NOTE**: For Azure node killing scenarios, make sure [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest) is installed.
-
-You will also need to create a service principal and give it the correct access, see [here](https://docs.openshift.com/container-platform/4.5/installing/installing_azure/installing-azure-account.html) for creating the service principal and setting the proper permissions.
+**NOTE**: You will need to create a service principal and give it the correct access, see [here](https://docs.openshift.com/container-platform/4.5/installing/installing_azure/installing-azure-account.html) for creating the service principal and setting the proper permissions.
 
 To properly run the service principal requires “Azure Active Directory Graph/Application.ReadWrite.OwnedBy” api permission granted and “User Access Administrator”.
 
 Before running you will need to set the following:
 1. Login using ```az login```
 1. ```export AZURE_SUBSCRIPTION_ID=<subscription_id>```
 
 2. ```export AZURE_TENANT_ID=<tenant_id>```
```
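A convenient way to populate those variables after `az login` is to query the active account; a sketch using stock Azure CLI output fields:

```bash
az login
export AZURE_SUBSCRIPTION_ID=$(az account show --query id -o tsv)
export AZURE_TENANT_ID=$(az account show --query tenantId -o tsv)
```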
`docs/syn_flood_scenarios.md` (new file): 33 lines
```diff
@@ -0,0 +1,33 @@
+### SYN Flood Scenarios
+
+This scenario generates a substantial amount of TCP traffic directed at one or more Kubernetes services within
+the cluster to test the server's resiliency under extreme traffic conditions.
+It can also target hosts outside the cluster by specifying a reachable IP address or hostname.
+This scenario leverages the distributed nature of Kubernetes clusters to instantiate multiple instances
+of the same pod against a single host, significantly increasing the effectiveness of the attack.
+The configuration also allows for the specification of multiple node selectors, enabling Kubernetes to schedule
+the attacker pods on a user-defined subset of nodes to make the test more realistic.
+
+```yaml
+packet-size: 120 # hping3 packet size
+window-size: 64 # hping3 TCP window size
+duration: 10 # chaos scenario duration
+namespace: default # namespace where the target service(s) are deployed
+target-service: target-svc # target service name (if set target-service-label must be empty)
+target-port: 80 # target service TCP port
+target-service-label: "" # target service label, can be used to target multiple services at the same time
+                         # if they have the same label set (if set target-service must be empty)
+number-of-pods: 2 # number of attacker pods instantiated per each target
+image: quay.io/krkn-chaos/krkn-syn-flood # syn flood attacker container image
+attacker-nodes: # this will set the node affinity to schedule the attacker pods. Per each node label selector
+                # multiple values can be specified; this way the kube scheduler will schedule the attacker pods
+                # in the best way possible based on the provided labels. Multiple labels can be specified
+  kubernetes.io/hostname:
+    - host_1
+    - host_2
+  kubernetes.io/os:
+    - linux
+
+```
+
+The attacker container source code is available [here](https://github.com/krkn-chaos/krkn-syn-flood).
```
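Under the hood each attacker pod presumably drives hping3 with the configured parameters. A hypothetical invocation equivalent to the example above (the real entrypoint lives in the krkn-syn-flood image):

```bash
# SYN flood (-S --flood) with a 120-byte payload (-d) and a 64-byte TCP
# window (-w) against port 80 of the target service's cluster DNS name:
hping3 -S --flood -d 120 -w 64 -p 80 target-svc.default.svc.cluster.local
```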
```diff
@@ -1,6 +1,6 @@
 
 import time
-import yaml
+import os
 import kraken.invoke.command as runcommand
 import logging
 import kraken.node_actions.common_node_functions as nodeaction
@@ -17,9 +17,9 @@ class Azure:
         # Acquire a credential object using CLI-based authentication.
         credentials = DefaultAzureCredential()
         logging.info("credential " + str(credentials))
-        az_account = runcommand.invoke("az account list -o yaml")
-        az_account_yaml = yaml.safe_load(az_account, Loader=yaml.FullLoader)
-        subscription_id = az_account_yaml[0]["id"]
+        # az_account = runcommand.invoke("az account list -o yaml")
+        # az_account_yaml = yaml.safe_load(az_account, Loader=yaml.FullLoader)
+        subscription_id = os.getenv("AZURE_SUBSCRIPTION_ID")
         self.compute_client = ComputeManagementClient(credentials, subscription_id)
 
     # Get the instance ID of the node
```
```diff
@@ -1,6 +1,8 @@
+import os
 import sys
 import time
 import logging
+import json
 import kraken.node_actions.common_node_functions as nodeaction
 from kraken.node_actions.abstract_node_scenarios import abstract_node_scenarios
 from googleapiclient import discovery
@@ -10,11 +12,19 @@ from krkn_lib.k8s import KrknKubernetes
 
 class GCP:
     def __init__(self):
-        self.project = runcommand.invoke("gcloud config get-value project").split("/n")[0].strip()
-        logging.info("project " + str(self.project) + "!")
-        credentials = GoogleCredentials.get_application_default()
-        self.client = discovery.build("compute", "v1", credentials=credentials, cache_discovery=False)
+        try:
+            gapp_creds = os.getenv("GOOGLE_APPLICATION_CREDENTIALS")
+            with open(gapp_creds, "r") as f:
+                f_str = f.read()
+                self.project = json.loads(f_str)['project_id']
+            #self.project = runcommand.invoke("gcloud config get-value project").split("/n")[0].strip()
+            logging.info("project " + str(self.project) + "!")
+            credentials = GoogleCredentials.get_application_default()
+            self.client = discovery.build("compute", "v1", credentials=credentials, cache_discovery=False)
+        except Exception as e:
+            logging.error("Error on setting up GCP connection: " + str(e))
+            sys.exit(1)
 
     # Get the instance ID of the node
     def get_instance_id(self, node):
```
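The rewritten constructor reads the project from the service account file instead of shelling out to gcloud. A sketch of the environment it expects (the file path is a placeholder):

```bash
# GOOGLE_APPLICATION_CREDENTIALS must point at a service account JSON
# carrying a project_id field; jq previews the value krkn will pick up:
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/gcp-sa.json"
jq -r '.project_id' "$GOOGLE_APPLICATION_CREDENTIALS"
```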
```diff
@@ -53,7 +53,7 @@ class Plugins:
     def unserialize_scenario(self, file: str) -> Any:
         return serialization.load_from_file(abspath(file))
 
-    def run(self, file: str, kubeconfig_path: str, kraken_config: str):
+    def run(self, file: str, kubeconfig_path: str, kraken_config: str, run_uuid: str):
         """
         Run executes a series of steps
         """
@@ -102,7 +102,8 @@ class Plugins:
                 unserialized_input.kubeconfig_path = kubeconfig_path
             if "kraken_config" in step.schema.input.properties:
                 unserialized_input.kraken_config = kraken_config
-            output_id, output_data = step.schema(unserialized_input)
+            output_id, output_data = step.schema(params=unserialized_input, run_id=run_uuid)
+
             logging.info(step.render_output(output_id, output_data) + "\n")
             if output_id in step.error_output_ids:
                 raise Exception(
@@ -253,7 +254,8 @@ def run(scenarios: List[str],
         failed_post_scenarios: List[str],
         wait_duration: int,
         telemetry: KrknTelemetryKubernetes,
-        kubecli: KrknKubernetes
+        kubecli: KrknKubernetes,
+        run_uuid: str
         ) -> (List[str], list[ScenarioTelemetry]):
 
     scenario_telemetries: list[ScenarioTelemetry] = []
@@ -268,7 +270,7 @@ def run(scenarios: List[str],
 
         try:
             start_monitoring(pool, kill_scenarios)
-            PLUGINS.run(scenario, kubeconfig_path, kraken_config)
+            PLUGINS.run(scenario, kubeconfig_path, kraken_config, run_uuid)
             result = pool.join()
             scenario_telemetry.affected_pods = result
             if result.error:
```
`kraken/syn_flood/__init__.py` (new file): 1 line

```diff
@@ -0,0 +1 @@
+from .syn_flood import *
```
`kraken/syn_flood/syn_flood.py` (new file): 132 lines

```diff
@@ -0,0 +1,132 @@
+import logging
+import os.path
+import time
+from typing import List
+
+import krkn_lib.utils
+import yaml
+from krkn_lib.k8s import KrknKubernetes
+from krkn_lib.models.telemetry import ScenarioTelemetry
+from krkn_lib.telemetry.k8s import KrknTelemetryKubernetes
+
+
+def run(scenarios_list: list[str], krkn_kubernetes: KrknKubernetes, telemetry: KrknTelemetryKubernetes) -> (list[str], list[ScenarioTelemetry]):
+    scenario_telemetries: list[ScenarioTelemetry] = []
+    failed_post_scenarios = []
+    for scenario in scenarios_list:
+        scenario_telemetry = ScenarioTelemetry()
+        scenario_telemetry.scenario = scenario
+        scenario_telemetry.start_timestamp = time.time()
+        telemetry.set_parameters_base64(scenario_telemetry, scenario)
+
+        try:
+            pod_names = []
+            config = parse_config(scenario)
+            if config["target-service-label"]:
+                target_services = krkn_kubernetes.select_service_by_label(config["namespace"], config["target-service-label"])
+            else:
+                target_services = [config["target-service"]]
+
+            for target in target_services:
+                if not krkn_kubernetes.service_exists(target, config["namespace"]):
+                    raise Exception(f"{target} service not found")
+                for i in range(config["number-of-pods"]):
+                    pod_name = "syn-flood-" + krkn_lib.utils.get_random_string(10)
+                    krkn_kubernetes.deploy_syn_flood(pod_name,
+                                                     config["namespace"],
+                                                     config["image"],
+                                                     target,
+                                                     config["target-port"],
+                                                     config["packet-size"],
+                                                     config["window-size"],
+                                                     config["duration"],
+                                                     config["attacker-nodes"]
+                                                     )
+                    pod_names.append(pod_name)
+
+            logging.info("waiting all the attackers to finish:")
+            did_finish = False
+            finished_pods = []
+            while not did_finish:
+                for pod_name in pod_names:
+                    if not krkn_kubernetes.is_pod_running(pod_name, config["namespace"]):
+                        finished_pods.append(pod_name)
+                if set(pod_names) == set(finished_pods):
+                    did_finish = True
+                time.sleep(1)
+
+        except Exception as e:
+            logging.error(f"Failed to run syn flood scenario {scenario}: {e}")
+            failed_post_scenarios.append(scenario)
+            scenario_telemetry.exit_status = 1
+        else:
+            scenario_telemetry.exit_status = 0
+        scenario_telemetry.end_timestamp = time.time()
+        scenario_telemetries.append(scenario_telemetry)
+    return failed_post_scenarios, scenario_telemetries
+
+
+def parse_config(scenario_file: str) -> dict[str, any]:
+    if not os.path.exists(scenario_file):
+        raise Exception(f"failed to load scenario file {scenario_file}")
+
+    try:
+        with open(scenario_file) as stream:
+            config = yaml.safe_load(stream)
+    except Exception:
+        raise Exception(f"{scenario_file} is not a valid yaml file")
+
+    missing = []
+    if not check_key_value(config, "packet-size"):
+        missing.append("packet-size")
+    if not check_key_value(config, "window-size"):
+        missing.append("window-size")
+    if not check_key_value(config, "duration"):
+        missing.append("duration")
+    if not check_key_value(config, "namespace"):
+        missing.append("namespace")
+    if not check_key_value(config, "number-of-pods"):
+        missing.append("number-of-pods")
+    if not check_key_value(config, "target-port"):
+        missing.append("target-port")
+    if not check_key_value(config, "image"):
+        missing.append("image")
+    if "target-service" not in config.keys():
+        missing.append("target-service")
+    if "target-service-label" not in config.keys():
+        missing.append("target-service-label")
+
+    if len(missing) > 0:
+        raise Exception(f"{(',').join(missing)} parameter(s) are missing")
+
+    if not config["target-service"] and not config["target-service-label"]:
+        raise Exception("you have either to set a target service or a label")
+    if config["target-service"] and config["target-service-label"]:
+        raise Exception("you cannot select both target-service and target-service-label")
+
+    if 'attacker-nodes' in config and not is_node_affinity_correct(config['attacker-nodes']):
+        raise Exception("attacker-nodes format is not correct")
+    return config
+
+
+def check_key_value(dictionary, key):
+    if key in dictionary:
+        value = dictionary[key]
+        if value is not None and value != '':
+            return True
+    return False
+
+
+def is_node_affinity_correct(obj) -> bool:
+    if not isinstance(obj, dict):
+        return False
+    for key in obj.keys():
+        if not isinstance(key, str):
+            return False
+        if not isinstance(obj[key], list):
+            return False
+    return True
```
```diff
@@ -1,7 +1,7 @@
 aliyun-python-sdk-core==2.13.36
 aliyun-python-sdk-ecs==4.24.25
-arcaflow==0.17.2
-arcaflow-plugin-sdk==0.10.0
+arcaflow-plugin-sdk==0.14.0
 boto3==1.28.61
 azure-identity==1.16.1
 azure-keyvault==4.2.0
@@ -15,27 +15,26 @@ google-api-python-client==2.116.0
 ibm_cloud_sdk_core==3.18.0
 ibm_vpc==0.20.0
 jinja2==3.1.4
-krkn-lib==2.1.3
+krkn-lib==2.1.7
 lxml==5.1.0
-kubernetes==26.1.0
+kubernetes==28.1.0
 oauth2client==4.1.3
 pandas==2.2.0
 openshift-client==1.0.21
 paramiko==3.4.0
-podman-compose==1.0.6
 pyVmomi==8.0.2.0.1
 pyfiglet==1.0.2
 pytest==8.0.0
 python-ipmi==0.5.4
 python-openstackclient==6.5.0
-requests==2.32.0
+requests==2.32.2
 service_identity==24.1.0
-PyYAML==6.0
-setuptools==65.5.1
+PyYAML==6.0.1
+setuptools==70.0.0
 werkzeug==3.0.3
 wheel==0.42.0
 zope.interface==5.4.0
 
-git+https://github.com/krkn-chaos/arcaflow-plugin-kill-pod.git
+git+https://github.com/krkn-chaos/arcaflow-plugin-kill-pod.git@v0.1.0
 git+https://github.com/vmware/vsphere-automation-sdk-python.git@v8.0.0.0
 cryptography>=42.0.4 # not directly required, pinned by Snyk to avoid a vulnerability
```
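After pulling the bumped pins, the environment can be rebuilt the same way the Dockerfile does; a minimal sketch assuming Python 3.9 is the interpreter in use:

```bash
python3.9 -m ensurepip
pip3.9 install -r requirements.txt
```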
```diff
@@ -27,7 +27,7 @@ import kraken.arcaflow_plugin as arcaflow_plugin
 import kraken.prometheus as prometheus_plugin
 import kraken.service_hijacking.service_hijacking as service_hijacking_plugin
 import server as server
-from kraken import plugins
+from kraken import plugins, syn_flood
 from krkn_lib.k8s import KrknKubernetes
 from krkn_lib.ocp import KrknOpenshift
 from krkn_lib.telemetry.elastic import KrknElastic
@@ -266,7 +266,8 @@ def main(cfg):
                     failed_post_scenarios,
                     wait_duration,
                     telemetry_k8s,
-                    kubecli
+                    kubecli,
+                    run_uuid
                 )
                 chaos_telemetry.scenarios.extend(scenario_telemetries)
             # krkn_lib
@@ -353,6 +354,10 @@ def main(cfg):
                 logging.info("Running Service Hijacking Chaos")
                 failed_post_scenarios, scenario_telemetries = service_hijacking_plugin.run(scenarios_list, wait_duration, kubecli, telemetry_k8s)
                 chaos_telemetry.scenarios.extend(scenario_telemetries)
+            elif scenario_type == "syn_flood":
+                logging.info("Running Syn Flood Chaos")
+                failed_post_scenarios, scenario_telemetries = syn_flood.run(scenarios_list, kubecli, telemetry_k8s)
+                chaos_telemetry.scenarios.extend(scenario_telemetries)
 
     # Check for critical alerts when enabled
     post_critical_alerts = 0
```
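With the scenario type wired into the dispatcher, a run picks it up through the normal entry point; a sketch assuming the config lists the syn_flood scenario as shown earlier:

```bash
# Same invocation the container's ENTRYPOINT/CMD performs:
python3.9 run_kraken.py --config=config/config.yaml
```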
`scenarios/kube/syn_flood.yaml` (new file): 16 lines
```diff
@@ -0,0 +1,16 @@
+packet-size: 120 # hping3 packet size
+window-size: 64 # hping3 TCP window size
+duration: 10 # chaos scenario duration
+namespace: default # namespace where the target service(s) are deployed
+target-service: elasticsearch # target service name (if set target-service-label must be empty)
+target-port: 9200 # target service TCP port
+target-service-label: "" # target service label, can be used to target multiple services at the same time
+                         # if they have the same label set (if set target-service must be empty)
+number-of-pods: 2 # number of attacker pods instantiated per each target
+image: quay.io/krkn-chaos/krkn-syn-flood:v1.0.0 # syn flood attacker container image
+attacker-nodes: # this will set the node affinity to schedule the attacker pods. Per each node label selector
+  node-role.kubernetes.io/worker: # multiple values can be specified; this way the kube scheduler will schedule the attacker pods
+    - "" # in the best way possible based on the provided labels. Multiple labels can be specified
+         # set empty value `attacker-nodes: {}` to let kubernetes schedule the pods
+
```
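Before pointing the config at the new file, it can be parsed once locally; a sketch using the yq that ships in the krkn image (any YAML parser works):

```bash
# Confirm the scenario file is valid YAML and review the keys:
yq . scenarios/kube/syn_flood.yaml
```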
```diff
@@ -39,7 +39,7 @@ class NetworkScenariosTest(unittest.TestCase):
 
     def test_network_chaos(self):
         output_id, output_data = ingress_shaping.network_chaos(
-            ingress_shaping.NetworkScenarioConfig(
+            params=ingress_shaping.NetworkScenarioConfig(
                 label_selector="node-role.kubernetes.io/control-plane",
                 instance_count=1,
                 network_params={
@@ -47,7 +47,8 @@ class NetworkScenariosTest(unittest.TestCase):
                     "loss": "0.02",
                     "bandwidth": "100mbit"
                 }
-            )
+            ),
+            run_id="network-shaping-test"
         )
         if output_id == "error":
             logging.error(output_data.error)
```
```diff
@@ -10,7 +10,7 @@ class RunPythonPluginTest(unittest.TestCase):
         tmp_file = tempfile.NamedTemporaryFile()
         tmp_file.write(bytes("print('Hello world!')", 'utf-8'))
         tmp_file.flush()
-        output_id, output_data = run_python_file(RunPythonFileInput(tmp_file.name))
+        output_id, output_data = run_python_file(params=RunPythonFileInput(tmp_file.name), run_id="test-python-plugin-success")
         self.assertEqual("success", output_id)
         self.assertEqual("Hello world!\n", output_data.stdout)
 
@@ -18,7 +18,7 @@ class RunPythonPluginTest(unittest.TestCase):
         tmp_file = tempfile.NamedTemporaryFile()
         tmp_file.write(bytes("import sys\nprint('Hello world!')\nsys.exit(42)\n", 'utf-8'))
         tmp_file.flush()
-        output_id, output_data = run_python_file(RunPythonFileInput(tmp_file.name))
+        output_id, output_data = run_python_file(params=RunPythonFileInput(tmp_file.name), run_id="test-python-plugin-error")
         self.assertEqual("error", output_id)
         self.assertEqual(42, output_data.exit_code)
         self.assertEqual("Hello world!\n", output_data.stdout)
```
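To exercise the updated tests locally, pytest is already pinned in requirements.txt; the path and keyword filter below are assumptions about the repo layout:

```bash
python3.9 -m pytest tests/ -k "python_plugin or network"
```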