Mirror of https://github.com/krkn-chaos/krkn.git (synced 2026-03-09 05:01:55 +00:00)

Compare commits (13 commits)
| Author | SHA1 | Date |
|---|---|---|
| | ce409ea6fb | |
| | 0eb8d38596 | |
| | 68dc17bc44 | |
| | 572eeefaf4 | |
| | 81376bad56 | |
| | 72b46f8393 | |
| | a7938e58d2 | |
| | 9858f96c78 | |
| | c91e8db928 | |
| | 54ea98be9c | |
| | 9748622e4f | |
| | 47f93b39c2 | |
| | aa715bf566 | |
CODE_OF_CONDUCT.md (new file, 127 lines)
@@ -0,0 +1,127 @@
# Contributor Covenant Code of Conduct

## Our Pledge

We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, or sexual identity
and orientation.

We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.

## Our Standards

Examples of behavior that contributes to a positive environment for our
community include:

* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
  and learning from the experience
* Focusing on what is best not just for us as individuals, but for the
  overall community

Examples of unacceptable behavior include:

* The use of sexualized language or imagery, and sexual attention or
  advances of any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email
  address, without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
  professional setting

## Enforcement Responsibilities

Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.

Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.

## Scope

This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement.
All complaints will be reviewed and investigated promptly and fairly.

All community leaders are obligated to respect the privacy and security of the
reporter of any incident.

## Enforcement Guidelines

Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:

### 1. Correction

**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.

**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.

### 2. Warning

**Community Impact**: A violation through a single incident or series
of actions.

**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or
permanent ban.

### 3. Temporary Ban

**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.

**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.

### 4. Permanent Ban

**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.

**Consequence**: A permanent ban from any sort of public interaction within
the community.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.0, available at
https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.

Community Impact Guidelines were inspired by [Mozilla's code of conduct
enforcement ladder](https://github.com/mozilla/diversity).

[homepage]: https://www.contributor-covenant.org

For answers to common questions about this code of conduct, see the FAQ at
https://www.contributor-covenant.org/faq. Translations are available at
https://www.contributor-covenant.org/translations.
MAINTAINERS.md (new file, 12 lines)
@@ -0,0 +1,12 @@
## Overview

This document contains a list of maintainers in this repo.

## Current Maintainers

| Maintainer | GitHub ID | Email |
|---|---|---|
| Ravi Elluri | [chaitanyaenr](https://github.com/chaitanyaenr) | nelluri@redhat.com |
| Pradeep Surisetty | [psuriset](https://github.com/psuriset) | psuriset@redhat.com |
| Paige Rubendall | [paigerube14](https://github.com/paigerube14) | prubenda@redhat.com |
| Tullio Sebastiani | [tsebastiani](https://github.com/tsebastiani) | tsebasti@redhat.com |
README.md (15 changed lines)
@@ -62,7 +62,7 @@ Scenario type | Kubernetes | OpenShift
[Container Scenarios](docs/container_scenarios.md) | :heavy_check_mark: | :heavy_check_mark: |
[Node Scenarios](docs/node_scenarios.md) | :heavy_check_mark: | :heavy_check_mark: |
[Time Scenarios](docs/time_scenarios.md) | :x: | :heavy_check_mark: |
[Hog Scenarios](docs/arcaflow_scenarios.md) | :heavy_check_mark: | :heavy_check_mark: |
[Hog Scenarios: CPU, Memory](docs/arcaflow_scenarios.md) | :heavy_check_mark: | :heavy_check_mark: |
[Cluster Shut Down Scenarios](docs/cluster_shut_down_scenarios.md) | :heavy_check_mark: | :heavy_check_mark: |
[Namespace Scenarios](docs/namespace_scenarios.md) | :heavy_check_mark: | :heavy_check_mark: |
[Zone Outage Scenarios](docs/zone_outage.md) | :heavy_check_mark: | :heavy_check_mark: |
@@ -94,8 +94,12 @@ Monitoring the Kubernetes/OpenShift cluster to observe the impact of Kraken chao
Kraken supports capturing metrics for the duration of the scenarios defined in the config and indexes them into Elasticsearch to be able to store and evaluate the state of the runs long term. The indexed metrics can be visualized with the help of Grafana. It uses [Kube-burner](https://github.com/cloud-bulldozer/kube-burner) under the hood. The metrics to capture need to be defined in a metrics profile which Kraken consumes to query Prometheus (installed by default in OpenShift) with the start and end timestamp of the run. Information on enabling and leveraging this feature can be found [here](docs/metrics.md).
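
For orientation, the knobs that drive this feature live under `performance_monitoring` in the main config (the same keys appear in the config.yaml hunk later on this page); the values below are illustrative:

```
performance_monitoring:
    capture_metrics: True                                 # capture metrics for the duration of the run
    config_path: config/kube_burner.yaml                  # defines the Elasticsearch url and index name
    metrics_profile_path: config/metrics-aggregated.yaml  # metrics profile consumed to query Prometheus
```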


### Alerts
In addition to checking the recovery and health of the cluster and components under test, Kraken takes in a profile with Prometheus expressions to validate, alerts, and exits with a non-zero return code depending on the severity set. This feature can be used to determine pass/fail or alert on abnormalities observed in the cluster based on the metrics. Information on enabling and leveraging this feature can be found [here](docs/alerts.md).
### SLOs validation during and post chaos
- In addition to checking the recovery and health of the cluster and components under test, Kraken takes in a profile with Prometheus expressions to validate, alerts, and exits with a non-zero return code depending on the severity set. This feature can be used to determine pass/fail or alert on abnormalities observed in the cluster based on the metrics.
- Kraken also provides the ability to check if any critical alerts are firing in the cluster post chaos and passes or fails accordingly.

Information on enabling and leveraging this feature can be found [here](docs/SLOs_validation.md)
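
Each entry in such a profile pairs a PromQL expression with a description and a severity, following the shape of the alert file shown later on this page; one illustrative entry:

```
- expr: increase(etcd_server_leader_changes_seen_total[2m]) > 0
  description: etcd leader changes observed
  severity: critical
```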
### OCM / ACM integration
@@ -109,10 +113,7 @@ Kraken supports injecting faults into [Open Cluster Management (OCM)](https://op


### Roadmap
Following is a list of enhancements we are planning to add support for in Kraken. Of course any help/contributions are greatly appreciated.
- [Ability to visualize the metrics that are being captured by Kraken and stored in Elasticsearch](https://github.com/redhat-chaos/krkn/issues/124)
- Continue to improve [Chaos Testing Guide](https://redhat-chaos.github.io/krkn) in terms of adding best practices, test environment recommendations and scenarios to make sure the OpenShift platform, as well as the applications running on top of it, are resilient and performant under chaotic conditions.
- Support for running all the scenarios of Kraken on Kubernetes distribution - see https://github.com/redhat-chaos/krkn/issues/185, https://github.com/redhat-chaos/krkn/issues/186
Enhancements being planned can be found in the [roadmap](ROADMAP.md).


### Contributions
ROADMAP.md (new file, 11 lines)
@@ -0,0 +1,11 @@
## Krkn Roadmap

Following is a list of enhancements we are planning to add support for in Krkn. Of course any help/contributions are greatly appreciated.

- [Ability to run multiple chaos scenarios in parallel under load to mimic real world outages](https://github.com/redhat-chaos/krkn/issues/424)
- [Centralized storage for chaos experiments artifacts](https://github.com/redhat-chaos/krkn/issues/423)
- [Support for causing DNS outages](https://github.com/redhat-chaos/krkn/issues/394)
- [Support for pod level network traffic shaping](https://github.com/redhat-chaos/krkn/issues/393)
- [Ability to visualize the metrics that are being captured by Kraken and stored in Elasticsearch](https://github.com/redhat-chaos/krkn/issues/124)
- Support for running all the scenarios of Kraken on Kubernetes distribution - see https://github.com/redhat-chaos/krkn/issues/185, https://github.com/redhat-chaos/krkn/issues/186
- Continue to improve [Chaos Testing Guide](https://redhat-chaos.github.io/krkn) in terms of adding best practices, test environment recommendations and scenarios to make sure the OpenShift platform, as well as the applications running on top of it, are resilient and performant under chaotic conditions.
@@ -1,11 +1,65 @@
- expr: avg_over_time(histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket[2m]))[5m:]) > 0.01
  description: 5 minutes avg. etcd fsync latency on {{$labels.pod}} higher than 10ms {{$value}}
# etcd

- expr: avg_over_time(histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket[2m]))[10m:]) > 0.01
  description: 10 minutes avg. 99th etcd fsync latency on {{$labels.pod}} higher than 10ms. {{$value}}s
  severity: warning

- expr: avg_over_time(histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket[2m]))[10m:]) > 1
  description: 10 minutes avg. 99th etcd fsync latency on {{$labels.pod}} higher than 1s. {{$value}}s
  severity: error

- expr: avg_over_time(histogram_quantile(0.99, rate(etcd_network_peer_round_trip_time_seconds_bucket[5m]))[5m:]) > 0.1
  description: 5 minutes avg. etcd network peer round trip on {{$labels.pod}} higher than 100ms {{$value}}
  severity: info
- expr: avg_over_time(histogram_quantile(0.99, rate(etcd_disk_backend_commit_duration_seconds_bucket[2m]))[10m:]) > 0.03
  description: 10 minutes avg. 99th etcd commit latency on {{$labels.pod}} higher than 30ms. {{$value}}s
  severity: warning

- expr: increase(etcd_server_leader_changes_seen_total[2m]) > 0
- expr: rate(etcd_server_leader_changes_seen_total[2m]) > 0
  description: etcd leader changes observed
  severity: critical
  severity: warning

# API server
- expr: avg_over_time(histogram_quantile(0.99, sum(irate(apiserver_request_duration_seconds_bucket{apiserver="kube-apiserver", verb=~"POST|PUT|DELETE|PATCH", subresource!~"log|exec|portforward|attach|proxy"}[2m])) by (le, resource, verb))[10m:]) > 1
  description: 10 minutes avg. 99th mutating API call latency for {{$labels.verb}}/{{$labels.resource}} higher than 1 second. {{$value}}s
  severity: error

- expr: avg_over_time(histogram_quantile(0.99, sum(irate(apiserver_request_duration_seconds_bucket{apiserver="kube-apiserver", verb=~"LIST|GET", subresource!~"log|exec|portforward|attach|proxy", scope="resource"}[2m])) by (le, resource, verb, scope))[5m:]) > 1
  description: 5 minutes avg. 99th read-only API call latency for {{$labels.verb}}/{{$labels.resource}} in scope {{$labels.scope}} higher than 1 second. {{$value}}s
  severity: error

- expr: avg_over_time(histogram_quantile(0.99, sum(irate(apiserver_request_duration_seconds_bucket{apiserver="kube-apiserver", verb=~"LIST|GET", subresource!~"log|exec|portforward|attach|proxy", scope="namespace"}[2m])) by (le, resource, verb, scope))[5m:]) > 5
  description: 5 minutes avg. 99th read-only API call latency for {{$labels.verb}}/{{$labels.resource}} in scope {{$labels.scope}} higher than 5 seconds. {{$value}}s
  severity: error

- expr: avg_over_time(histogram_quantile(0.99, sum(irate(apiserver_request_duration_seconds_bucket{apiserver="kube-apiserver", verb=~"LIST|GET", subresource!~"log|exec|portforward|attach|proxy", scope="cluster"}[2m])) by (le, resource, verb, scope))[5m:]) > 30
  description: 5 minutes avg. 99th read-only API call latency for {{$labels.verb}}/{{$labels.resource}} in scope {{$labels.scope}} higher than 30 seconds. {{$value}}s
  severity: error

# Control plane pods
- expr: up{apiserver=~"kube-apiserver|openshift-apiserver"} == 0
  description: "{{$labels.apiserver}} {{$labels.instance}} down"
  severity: warning

- expr: up{namespace=~"openshift-etcd"} == 0
  description: "{{$labels.namespace}}/{{$labels.pod}} down"
  severity: error

- expr: up{namespace=~"openshift-.*(kube-controller-manager|scheduler|controller-manager|sdn|ovn-kubernetes|dns)"} == 0
  description: "{{$labels.namespace}}/{{$labels.pod}} down"
  severity: warning

- expr: up{job=~"crio|kubelet"} == 0
  description: "{{$labels.node}}/{{$labels.job}} down"
  severity: warning

- expr: up{job="ovnkube-node"} == 0
  description: "{{$labels.instance}}/{{$labels.pod}} {{$labels.job}} down"
  severity: warning

# Service sync latency
- expr: histogram_quantile(0.99, sum(rate(kubeproxy_network_programming_duration_seconds_bucket[2m])) by (le)) > 10
  description: 99th Kubeproxy network programming latency higher than 10 seconds. {{$value}}s
  severity: warning

# Prometheus alerts
- expr: ALERTS{severity="critical", alertstate="firing"} > 0
  description: Critical prometheus alert. {{$labels.alertname}}
  severity: warning
@@ -6,14 +6,9 @@ kraken:
    signal_state: RUN # Will wait for the RUN signal when set to PAUSE before running the scenarios, refer docs/signal.md for more details
    signal_address: 0.0.0.0 # Signal listening address
    port: 8081 # Signal port
    litmus_install: True # Installs specified version, set to False if it's already setup
    litmus_version: v1.13.6 # Litmus version to install
    litmus_uninstall: False # If you want to uninstall litmus if failure
    litmus_uninstall_before_run: True # If you want to uninstall litmus before a new run starts
    chaos_scenarios: # List of policies/chaos scenarios to load
        - arcaflow_scenarios:
            - scenarios/arcaflow/cpu-hog/input.yaml
            - scenarios/arcaflow/io-hog/input.yaml
            - scenarios/arcaflow/memory-hog/input.yaml
        - container_scenarios: # List of chaos pod scenarios to load
            - - scenarios/openshift/container_etcd.yml
@@ -31,13 +26,6 @@ kraken:
            - scenarios/openshift/openshift-kube-apiserver.yml
        - time_scenarios: # List of chaos time scenarios to load
            - scenarios/openshift/time_scenarios_example.yml
        - litmus_scenarios: # List of litmus scenarios to load
            - - scenarios/openshift/templates/litmus-rbac.yaml
              - scenarios/openshift/node_cpu_hog_engine.yaml
            - - scenarios/openshift/templates/litmus-rbac.yaml
              - scenarios/openshift/node_mem_engine.yaml
            - - scenarios/openshift/templates/litmus-rbac.yaml
              - scenarios/openshift/node_io_engine.yaml
        - cluster_shut_down_scenarios:
            - - scenarios/openshift/cluster_shut_down_scenario.yml
              - scenarios/openshift/post_action_shut_down.py
@@ -62,7 +50,7 @@ cerberus:
performance_monitoring:
    deploy_dashboards: False # Install a mutable grafana and load the performance dashboards. Enable this only when running on OpenShift
    repo: "https://github.com/cloud-bulldozer/performance-dashboards.git"
    kube_burner_binary_url: "https://github.com/cloud-bulldozer/kube-burner/releases/download/v0.9.1/kube-burner-0.9.1-Linux-x86_64.tar.gz"
    kube_burner_binary_url: "https://github.com/cloud-bulldozer/kube-burner/releases/download/v1.7.0/kube-burner-1.7.0-Linux-x86_64.tar.gz"
    capture_metrics: False
    config_path: config/kube_burner.yaml # Define the Elasticsearch url and index name in this config
    metrics_profile_path: config/metrics-aggregated.yaml
@@ -21,7 +21,7 @@ COPY --from=azure-cli /usr/local/bin/az /usr/bin/az
RUN yum install epel-release -y && \
    yum install -y git python39 python3-pip jq gettext && \
    python3.9 -m pip install -U pip && \
    git clone https://github.com/redhat-chaos/krkn.git --branch v1.2.0 /root/kraken && \
    git clone https://github.com/redhat-chaos/krkn.git --branch v1.3.1 /root/kraken && \
    mkdir -p /root/.kube && cd /root/kraken && \
    pip3.9 install -r requirements.txt
@@ -1,16 +1,16 @@
## Alerts
## SLOs validation

Pass/fail based on metrics captured from the cluster is important in addition to checking the health status and recovery. Kraken supports:

### Checking for critical alerts
If enabled, the check runs at the end of each scenario and Kraken exits in case critical alerts are firing to allow the user to debug. You can enable it in the config:
### Checking for critical alerts post chaos
If enabled, the check runs at the end of each scenario (post chaos) and Kraken exits in case critical alerts are firing to allow the user to debug. You can enable it in the config:

```
performance_monitoring:
    check_critical_alerts: False # When enabled will check prometheus for critical alerts firing post chaos
```

### Alerting based on the queries defined by the user
### Validation and alerting based on the queries defined by the user during chaos
Takes PromQL queries as input and modifies the return code of the run to determine pass/fail. It's especially useful in case of automated runs in CI where the user won't be able to monitor the system. It uses [Kube-burner](https://kube-burner.readthedocs.io/en/latest/) under the hood. This feature can be enabled in the [config](https://github.com/redhat-chaos/krkn/blob/main/config/config.yaml) by setting the following:
@@ -7,7 +7,6 @@ The engine uses containers to execute plugins and runs them either locally in Do
#### Hog scenarios:
- [CPU Hog](arcaflow_scenarios/cpu_hog.md)
- [Memory Hog](arcaflow_scenarios/memory_hog.md)
- [I/O Hog](arcaflow_scenarios/io_hog.md)


### Prerequisites
@@ -1,21 +0,0 @@
# I/O Hog
This scenario is based on the arcaflow [arcaflow-plugin-stressng](https://github.com/arcalot/arcaflow-plugin-stressng) plugin.
The purpose of this scenario is to create disk pressure on a particular node of the Kubernetes/OpenShift cluster for a time span.
The scenario allows attaching a node path to the pod as a `hostPath` volume.
To enable this plugin add the pointer to the scenario input file `scenarios/arcaflow/io-hog/input.yaml` as described in the Usage section.
This scenario takes a list of objects named `input_list` with the following properties:

- **kubeconfig:** *string* the kubeconfig needed by the deployer to deploy the sysbench plugin in the target cluster
- **namespace:** *string* the namespace where the scenario container will be deployed
  **Note:** this parameter will be automatically filled by kraken if the `kubeconfig_path` property is correctly set
- **node_selector:** *key-value map* the node label that will be used as `nodeSelector` by the pod to target a specific cluster node
- **duration:** *string* stop the stress test after N seconds. One can also specify the units of time in seconds, minutes, hours, days or years with the suffix s, m, h, d or y.
- **target_pod_folder:** *string* the path in the pod where the volume is mounted
- **target_pod_volume:** *object* the `hostPath` volume definition in the [Kubernetes/OpenShift](https://docs.openshift.com/container-platform/3.11/install_config/persistent_storage/using_hostpath.html) format, that will be attached to the pod as a volume
- **io_write_bytes:** *string* writes N bytes for each hdd process. The size can be expressed as % of free space on the file system or in units of Bytes, KBytes, MBytes and GBytes using the suffix b, k, m or g
- **io_block_size:** *string* size of each write in bytes. Size can be from 1 byte to 4m.

To perform several load tests simultaneously in the same run (e.g. stress two or more nodes in the same run), add another item to the `input_list` with the same properties (and possibly different values, e.g. different node_selectors to schedule the pod on different nodes). To reduce (or increase) the parallelism, change the `parallelism` value in the `workload.yaml` file.
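
Putting those properties together, a hypothetical `input.yaml` for this scenario could look like the following; every value is an illustrative placeholder, not a default shipped with the repo:

```
input_list:
  - kubeconfig: ""                # filled in automatically by kraken when kubeconfig_path is set
    namespace: default
    node_selector:
      kubernetes.io/hostname: worker-0
    duration: 60s                 # stop the stress test after 60 seconds
    target_pod_folder: /hog-data
    target_pod_volume:            # hostPath volume attached to the pod
      name: hog-volume
      hostPath:
        path: /tmp/hog-data
    io_write_bytes: 1g            # write 1 GByte per hdd process
    io_block_size: 1m             # 1 MByte per write
```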
@@ -27,14 +27,6 @@ The prometheus url/route and bearer token are automatically obtained in case of
**signal_address**: Address to listen/post the signal state to
**port**: Port to listen/post the signal state to

## Litmus Variables
Litmus installation specifics if you are running one of the hog scenarios. See the [litmus doc](litmus_scenarios.md) for more information on these types of scenarios
**litmus_install**: Installs the specified version of litmus, set to False if it's already set up
**litmus_version**: Litmus version to install
**litmus_uninstall**: If you want to uninstall litmus on failure
**litmus_uninstall_before_run**: If you want to uninstall litmus before a new run starts, True or False values


## Chaos Scenarios

**chaos_scenarios**: List of different types of chaos scenarios you want to run with paths to their specific yaml file configurations (a minimal sketch follows the list below)
@@ -48,7 +40,6 @@ Chaos scenario types:
- plugin_scenarios
- node_scenarios
- time_scenarios
- litmus_scenarios
- cluster_shut_down_scenarios
- namespace_scenarios
- zone_outages
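
Each scenario type maps to a list of scenario config paths under `chaos_scenarios` in the main config; a minimal sketch (paths are illustrative; compare the config.yaml hunk earlier on this page):

```
chaos_scenarios:
    - node_scenarios:
        - scenarios/openshift/aws_node_scenarios.yml   # illustrative path
    - time_scenarios:
        - scenarios/openshift/time_scenarios_example.yml
```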
@@ -155,7 +155,6 @@ Let us take a look at how to run the chaos scenarios on your OpenShift clusters
  - Helps understand if the application/system components have reserved resources to not get disrupted because of rogue applications, or get performance throttled.
  - CPU Hog ([Documentation](https://github.com/redhat-chaos/krkn-hub/blob/main/docs/node-cpu-hog.md), [Demo](https://asciinema.org/a/452762))
  - Memory Hog ([Documentation](https://github.com/redhat-chaos/krkn-hub/blob/main/docs/node-memory-hog.md), [Demo](https://asciinema.org/a/452742?speed=3&theme=solarized-dark))
  - IO Hog ([Documentation](https://github.com/redhat-chaos/krkn-hub/blob/main/docs/node-io-hog.md))

- Time Skewing ([Documentation](https://github.com/redhat-chaos/krkn-hub/blob/main/docs/time-scenarios.md))
  - Manipulate the system time and/or date of specific pods/nodes.
@@ -1,41 +0,0 @@
### Litmus Scenarios
Kraken consumes [Litmus](https://github.com/litmuschaos/litmus) under the hood for some scenarios

Official Litmus documentation and specifics of Litmus resources can be found [here](https://docs.litmuschaos.io/docs/next/getstarted/)


#### Litmus Chaos Custom Resources
There are 3 custom resources that are created during each Litmus scenario. Below is a description of the resources:
* ChaosEngine: A resource to link a Kubernetes application or Kubernetes node to a ChaosExperiment. ChaosEngine is watched by Litmus' Chaos-Operator which then invokes Chaos-Experiments.
* ChaosExperiment: A resource to group the configuration parameters of a chaos experiment. ChaosExperiment CRs are created by the operator when experiments are invoked by ChaosEngine.
* ChaosResult: A resource to hold the results of a chaos-experiment. The Chaos-exporter reads the results and exports the metrics into a configured Prometheus server.

### Understanding Litmus Scenarios

To run Litmus scenarios we need to apply 3 different resources/yaml files to our cluster.
1. **Chaos experiments** contain the actual chaos details of a scenario.

   i. This is installed automatically by Kraken (does not need to be specified in kraken scenario configuration).

2. **Service Account**: should be created to allow chaosengine to run experiments in your application namespace. Usually it sets just enough permissions to a specific namespace to be able to run the experiment properly.

   i. This can be defined using either a link to a yaml file or a downloaded file in the scenarios folder.

3. **Chaos Engine** connects the application instance to a Chaos Experiment. This is where you define the specifics of your scenario, i.e. the node or pod name you want to cause chaos within (a minimal sketch follows this list).

   i. This is a downloaded yaml file in the scenarios folder. A full list of scenarios can be found [here](https://hub.litmuschaos.io/)

**NOTE**: By default, all chaos experiments will be installed based on the version you give in the config file.
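
For orientation, a minimal ChaosEngine sketch in the Litmus v1alpha1 format; names, namespace, and env values are placeholders, so consult the hub entries linked above for real definitions:

```
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: node-cpu-hog-engine          # placeholder engine name
  namespace: default
spec:
  chaosServiceAccount: litmus-admin  # the Service Account from step 2
  experiments:
    - name: node-cpu-hog             # the Chaos Experiment from step 1
      spec:
        components:
          env:
            - name: TOTAL_CHAOS_DURATION
              value: "60"
```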
Adding a new Litmus based scenario is as simple as adding references to 2 new yaml files (the Service Account and Chaos Engine files for your scenario) in the Kraken config.


### Supported scenarios

The following are the start of scenarios for which a chaos scenario config exists today.

Scenario | Description | Working
--- | --- | ---
[Node CPU Hog](https://github.com/redhat-chaos/krkn/blob/main/scenarios/node_cpu_hog_engine.yaml) | Chaos scenario that hogs up the CPU on a defined node for a specific amount of time. | :heavy_check_mark:
[Node Memory Hog](https://github.com/redhat-chaos/krkn/blob/main/scenarios/node_mem_engine.yaml) | Chaos scenario that hogs up the memory on a defined node for a specific amount of time. | :heavy_check_mark:
[Node IO Hog](https://github.com/redhat-chaos/krkn/blob/main/scenarios/node_io_engine.yaml) | Chaos scenario that hogs up the IO on a defined node for a specific amount of time. | :heavy_check_mark:
@@ -40,7 +40,7 @@ def scrape_metrics(
            distribution, prometheus_url, prometheus_bearer_token
        )
    else:
        logging.error("Looks like proemtheus url is not defined, exiting")
        logging.error("Looks like prometheus url is not defined, exiting")
        sys.exit(1)
    command = (
        "./kube-burner index --uuid "
@@ -24,13 +24,15 @@ def initialize_clients(kubeconfig_path):
    global dyn_client
    global custom_object_client
    try:
        config.load_kube_config(kubeconfig_path)
        if kubeconfig_path:
            config.load_kube_config(kubeconfig_path)
        else:
            config.load_incluster_config()
        api_client = client.ApiClient()
        k8s_client = config.new_client_from_config(config_file=kubeconfig_path)
        cli = client.CoreV1Api(k8s_client)
        batch_cli = client.BatchV1Api(k8s_client)
        custom_object_client = client.CustomObjectsApi(k8s_client)
        dyn_client = DynamicClient(k8s_client)
        cli = client.CoreV1Api(api_client)
        batch_cli = client.BatchV1Api(api_client)
        custom_object_client = client.CustomObjectsApi(api_client)
        dyn_client = DynamicClient(api_client)
        watch_resource = watch.Watch()
    except ApiException as e:
        logging.error("Failed to initialize kubernetes client: %s\n" % e)
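
Condensed into a self-contained sketch, the new `initialize_clients` flow looks like this (the `DynamicClient` import path is an assumption; krkn may import it from the openshift client instead):

```
from kubernetes import client, config, watch
from kubernetes.dynamic import DynamicClient  # assumed import path

def initialize_clients(kubeconfig_path=None):
    # Fall back to in-cluster config when no kubeconfig path is supplied,
    # then derive every typed client from one shared ApiClient.
    if kubeconfig_path:
        config.load_kube_config(kubeconfig_path)
    else:
        config.load_incluster_config()
    api_client = client.ApiClient()
    return {
        "core": client.CoreV1Api(api_client),
        "batch": client.BatchV1Api(api_client),
        "custom_objects": client.CustomObjectsApi(api_client),
        "dynamic": DynamicClient(api_client),
        "watch": watch.Watch(),
    }
```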
@@ -1,5 +1,5 @@
import kraken.invoke.command as runcommand
import kraken.kubernetes.client as kubecli
import krkn_lib_kubernetes
import logging
import time
import sys
@@ -8,8 +8,16 @@ import yaml
import kraken.cerberus.setup as cerberus


# krkn_lib_kubernetes
# Inject litmus scenarios defined in the config
def run(scenarios_list, config, litmus_uninstall, wait_duration, litmus_namespace):
def run(
    scenarios_list,
    config,
    litmus_uninstall,
    wait_duration,
    litmus_namespace,
    kubecli: krkn_lib_kubernetes.KrknLibKubernetes
):
    # Loop to run the scenarios starts here
    for l_scenario in scenarios_list:
        start_time = int(time.time())
@@ -35,16 +43,16 @@ def run(scenarios_list, config, litmus_uninstall, wait_duration, litmus_namespac
                sys.exit(1)
            for expr in experiment_names:
                expr_name = expr["name"]
                experiment_result = check_experiment(engine_name, expr_name, litmus_namespace)
                experiment_result = check_experiment(engine_name, expr_name, litmus_namespace, kubecli)
                if experiment_result:
                    logging.info("Scenario: %s has been successfully injected!" % item)
                else:
                    logging.info("Scenario: %s was not successfully injected, please check" % item)
                    if litmus_uninstall:
                        delete_chaos(litmus_namespace)
                        delete_chaos(litmus_namespace, kubecli)
                    sys.exit(1)
        if litmus_uninstall:
            delete_chaos(litmus_namespace)
            delete_chaos(litmus_namespace, kubecli)
        logging.info("Waiting for the specified duration: %s" % wait_duration)
        time.sleep(wait_duration)
        end_time = int(time.time())
@@ -86,7 +94,8 @@ def deploy_all_experiments(version_string, namespace):
    )


def wait_for_initialized(engine_name, experiment_name, namespace):
# krkn_lib_kubernetes
def wait_for_initialized(engine_name, experiment_name, namespace, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):

    chaos_engine = kubecli.get_litmus_chaos_object(kind='chaosengine', name=engine_name,
                                                   namespace=namespace).engineStatus
@@ -110,10 +119,17 @@ def wait_for_initialized(engine_name, experiment_name, namespace):
        return True


def wait_for_status(engine_name, expected_status, experiment_name, namespace):
# krkn_lib_kubernetes
def wait_for_status(
    engine_name,
    expected_status,
    experiment_name,
    namespace,
    kubecli: krkn_lib_kubernetes.KrknLibKubernetes
):

    if expected_status == "running":
        response = wait_for_initialized(engine_name, experiment_name, namespace)
        response = wait_for_initialized(engine_name, experiment_name, namespace, kubecli)
        if not response:
            logging.info("Chaos engine never initialized, exiting")
            return False
@@ -140,12 +156,13 @@ def wait_for_status(engine_name, expected_status, experiment_name, namespace):


# Check status of experiment
def check_experiment(engine_name, experiment_name, namespace):
# krkn_lib_kubernetes
def check_experiment(engine_name, experiment_name, namespace, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):

    wait_response = wait_for_status(engine_name, "running", experiment_name, namespace)
    wait_response = wait_for_status(engine_name, "running", experiment_name, namespace, kubecli)

    if wait_response:
        wait_for_status(engine_name, "completed", experiment_name, namespace)
        wait_for_status(engine_name, "completed", experiment_name, namespace, kubecli)
    else:
        sys.exit(1)
@@ -166,7 +183,8 @@ def check_experiment(engine_name, experiment_name, namespace):


# Delete all chaos engines in a given namespace
def delete_chaos_experiments(namespace):
# krkn_lib_kubernetes
def delete_chaos_experiments(namespace, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):

    if kubecli.check_if_namespace_exists(namespace):
        chaos_exp_exists = runcommand.invoke_no_exit("kubectl get chaosexperiment")
@@ -176,7 +194,8 @@ def delete_chaos_experiments(namespace):


# Delete all chaos engines in a given namespace
def delete_chaos(namespace):
# krkn_lib_kubernetes
def delete_chaos(namespace, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):

    if kubecli.check_if_namespace_exists(namespace):
        logging.info("Deleting all litmus run objects")
@@ -190,7 +209,8 @@ def delete_chaos(namespace):
        logging.info(namespace + " namespace doesn't exist")


def uninstall_litmus(version, litmus_namespace):
# krkn_lib_kubernetes
def uninstall_litmus(version, litmus_namespace, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):

    if kubecli.check_if_namespace_exists(litmus_namespace):
        logging.info("Uninstalling Litmus operator")
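
Since every public helper in this module now takes the library client explicitly, a caller builds one `KrknLibKubernetes` instance and threads it through; a hypothetical driver (the module path and constructor arguments are assumptions, not confirmed by this diff):

```
import yaml
import krkn_lib_kubernetes
import kraken.litmus.common_litmus as common_litmus  # assumed module path

# Assumed constructor signature; check the krkn-lib documentation.
kubecli = krkn_lib_kubernetes.KrknLibKubernetes(kubeconfig_path="~/.kube/config")

with open("config/config.yaml") as f:
    config = yaml.safe_load(f)

common_litmus.run(
    ["scenarios/openshift/node_cpu_hog_engine.yaml"],
    config,
    litmus_uninstall=True,
    wait_duration=60,
    litmus_namespace="litmus",
    kubecli=kubecli,
)
```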
@@ -1,10 +1,15 @@
import random
import logging
import kraken.kubernetes.client as kubecli

import krkn_lib_kubernetes

# krkn_lib_kubernetes
# Pick a random managedcluster with specified label selector
def get_managedcluster(managedcluster_name, label_selector, instance_kill_count):
def get_managedcluster(
        managedcluster_name,
        label_selector,
        instance_kill_count,
        kubecli: krkn_lib_kubernetes.KrknLibKubernetes):

    if managedcluster_name in kubecli.list_killable_managedclusters():
        return [managedcluster_name]
    elif managedcluster_name:
@@ -25,10 +30,12 @@ def get_managedcluster(managedcluster_name, label_selector, instance_kill_count)


# Wait until the managedcluster status becomes Available
def wait_for_available_status(managedcluster, timeout):
# krkn_lib_kubernetes
def wait_for_available_status(managedcluster, timeout, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
    kubecli.watch_managedcluster_status(managedcluster, "True", timeout)


# Wait until the managedcluster status becomes Not Available
def wait_for_unavailable_status(managedcluster, timeout):
# krkn_lib_kubernetes
def wait_for_unavailable_status(managedcluster, timeout, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
    kubecli.watch_managedcluster_status(managedcluster, "Unknown", timeout)
@@ -5,7 +5,7 @@ import logging
import sys
import yaml
import html
import kraken.kubernetes.client as kubecli
import krkn_lib_kubernetes
import kraken.managedcluster_scenarios.common_managedcluster_functions as common_managedcluster_functions
@@ -13,9 +13,11 @@ class GENERAL:
    def __init__(self):
        pass


# krkn_lib_kubernetes
class managedcluster_scenarios():
    def __init__(self):
    kubecli: krkn_lib_kubernetes.KrknLibKubernetes
    def __init__(self, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
        self.kubecli = kubecli
        self.general = GENERAL()

    # managedcluster scenario to start the managedcluster
@@ -31,16 +33,16 @@ class managedcluster_scenarios():
                    args="""kubectl scale deployment.apps/klusterlet --replicas 3 &
                            kubectl scale deployment.apps/klusterlet-registration-agent --replicas 1 -n open-cluster-management-agent""")
            )
            kubecli.create_manifestwork(body, managedcluster)
            self.kubecli.create_manifestwork(body, managedcluster)
            logging.info("managedcluster_start_scenario has been successfully injected!")
            logging.info("Waiting for the specified timeout: %s" % timeout)
            common_managedcluster_functions.wait_for_available_status(managedcluster, timeout)
            common_managedcluster_functions.wait_for_available_status(managedcluster, timeout, self.kubecli)
        except Exception as e:
            logging.error("managedcluster scenario exiting due to Exception %s" % e)
            sys.exit(1)
        finally:
            logging.info("Deleting manifestworks")
            kubecli.delete_manifestwork(managedcluster)
            self.kubecli.delete_manifestwork(managedcluster)

    # managedcluster scenario to stop the managedcluster
    def managedcluster_stop_scenario(self, instance_kill_count, managedcluster, timeout):
@@ -55,16 +57,16 @@ class managedcluster_scenarios():
                    args="""kubectl scale deployment.apps/klusterlet --replicas 0 &&
                            kubectl scale deployment.apps/klusterlet-registration-agent --replicas 0 -n open-cluster-management-agent""")
            )
            kubecli.create_manifestwork(body, managedcluster)
            self.kubecli.create_manifestwork(body, managedcluster)
            logging.info("managedcluster_stop_scenario has been successfully injected!")
            logging.info("Waiting for the specified timeout: %s" % timeout)
            common_managedcluster_functions.wait_for_unavailable_status(managedcluster, timeout)
            common_managedcluster_functions.wait_for_unavailable_status(managedcluster, timeout, self.kubecli)
        except Exception as e:
            logging.error("managedcluster scenario exiting due to Exception %s" % e)
            sys.exit(1)
        finally:
            logging.info("Deleting manifestworks")
            kubecli.delete_manifestwork(managedcluster)
            self.kubecli.delete_manifestwork(managedcluster)

    # managedcluster scenario to stop and then start the managedcluster
    def managedcluster_stop_start_scenario(self, instance_kill_count, managedcluster, timeout):
@@ -94,7 +96,7 @@ class managedcluster_scenarios():
                template.render(managedcluster_name=managedcluster,
                                args="""kubectl scale deployment.apps/klusterlet --replicas 3""")
            )
            kubecli.create_manifestwork(body, managedcluster)
            self.kubecli.create_manifestwork(body, managedcluster)
            logging.info("start_klusterlet_scenario has been successfully injected!")
            time.sleep(30)  # until https://github.com/open-cluster-management-io/OCM/issues/118 gets solved
        except Exception as e:
@@ -102,7 +104,7 @@ class managedcluster_scenarios():
            sys.exit(1)
        finally:
            logging.info("Deleting manifestworks")
            kubecli.delete_manifestwork(managedcluster)
            self.kubecli.delete_manifestwork(managedcluster)

    # managedcluster scenario to stop the klusterlet
    def stop_klusterlet_scenario(self, instance_kill_count, managedcluster, timeout):
@@ -116,7 +118,7 @@ class managedcluster_scenarios():
                template.render(managedcluster_name=managedcluster,
                                args="""kubectl scale deployment.apps/klusterlet --replicas 0""")
            )
            kubecli.create_manifestwork(body, managedcluster)
            self.kubecli.create_manifestwork(body, managedcluster)
            logging.info("stop_klusterlet_scenario has been successfully injected!")
            time.sleep(30)  # until https://github.com/open-cluster-management-io/OCM/issues/118 gets solved
        except Exception as e:
@@ -124,7 +126,7 @@ class managedcluster_scenarios():
            sys.exit(1)
        finally:
            logging.info("Deleting manifestworks")
            kubecli.delete_manifestwork(managedcluster)
            self.kubecli.delete_manifestwork(managedcluster)

    # managedcluster scenario to stop and start the klusterlet
    def stop_start_klusterlet_scenario(self, instance_kill_count, managedcluster, timeout):
@@ -1,26 +1,29 @@
import yaml
import logging
import time
import krkn_lib_kubernetes
from kraken.managedcluster_scenarios.managedcluster_scenarios import managedcluster_scenarios
import kraken.managedcluster_scenarios.common_managedcluster_functions as common_managedcluster_functions
import kraken.cerberus.setup as cerberus


# Get the managedcluster scenarios object of specified cloud type
def get_managedcluster_scenario_object(managedcluster_scenario):
    return managedcluster_scenarios()
# krkn_lib_kubernetes
def get_managedcluster_scenario_object(managedcluster_scenario, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
    return managedcluster_scenarios(kubecli)

# Run defined scenarios
def run(scenarios_list, config, wait_duration):
# krkn_lib_kubernetes
def run(scenarios_list, config, wait_duration, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
    for managedcluster_scenario_config in scenarios_list:
        with open(managedcluster_scenario_config, "r") as f:
            managedcluster_scenario_config = yaml.full_load(f)
            for managedcluster_scenario in managedcluster_scenario_config["managedcluster_scenarios"]:
                managedcluster_scenario_object = get_managedcluster_scenario_object(managedcluster_scenario)
                managedcluster_scenario_object = get_managedcluster_scenario_object(managedcluster_scenario, kubecli)
                if managedcluster_scenario["actions"]:
                    for action in managedcluster_scenario["actions"]:
                        start_time = int(time.time())
                        inject_managedcluster_scenario(action, managedcluster_scenario, managedcluster_scenario_object)
                        inject_managedcluster_scenario(action, managedcluster_scenario, managedcluster_scenario_object, kubecli)
                        logging.info("Waiting for the specified duration: %s" % (wait_duration))
                        time.sleep(wait_duration)
                        end_time = int(time.time())
@@ -29,7 +32,8 @@ def run(scenarios_list, config, wait_duration):


# Inject the specified managedcluster scenario
def inject_managedcluster_scenario(action, managedcluster_scenario, managedcluster_scenario_object):
# krkn_lib_kubernetes
def inject_managedcluster_scenario(action, managedcluster_scenario, managedcluster_scenario_object, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
    # Get the managedcluster scenario configurations
    run_kill_count = managedcluster_scenario.get("runs", 1)
    instance_kill_count = managedcluster_scenario.get("instance_count", 1)
@@ -42,7 +46,7 @@ def inject_managedcluster_scenario(action, managedcluster_scenario, managedclust
    else:
        managedcluster_name_list = [managedcluster_name]
    for single_managedcluster_name in managedcluster_name_list:
        managedclusters = common_managedcluster_functions.get_managedcluster(single_managedcluster_name, label_selector, instance_kill_count)
        managedclusters = common_managedcluster_functions.get_managedcluster(single_managedcluster_name, label_selector, instance_kill_count, kubecli)
        for single_managedcluster in managedclusters:
            if action == "managedcluster_start_scenario":
                managedcluster_scenario_object.managedcluster_start_scenario(run_kill_count, single_managedcluster, timeout)
@@ -1,14 +1,23 @@
import time
import random
import logging
import kraken.kubernetes.client as kubecli
import krkn_lib_kubernetes
import kraken.cerberus.setup as cerberus
import kraken.post_actions.actions as post_actions
import yaml
import sys


def run(scenarios_list, config, wait_duration, failed_post_scenarios, kubeconfig_path):
# krkn_lib_kubernetes
def run(
    scenarios_list,
    config,
    wait_duration,
    failed_post_scenarios,
    kubeconfig_path,
    kubecli: krkn_lib_kubernetes.KrknLibKubernetes
):

    for scenario_config in scenarios_list:
        if len(scenario_config) > 1:
            pre_action_output = post_actions.run(kubeconfig_path, scenario_config[1])
@@ -69,12 +78,12 @@ def run(scenarios_list, config, wait_duration, failed_post_scenarios, kubeconfig
            logging.error("Failed to run post action checks: %s" % e)
            sys.exit(1)
        else:
            failed_post_scenarios = check_active_namespace(killed_namespaces, wait_time)
            failed_post_scenarios = check_active_namespace(killed_namespaces, wait_time, kubecli)
        end_time = int(time.time())
        cerberus.publish_kraken_status(config, failed_post_scenarios, start_time, end_time)


def check_active_namespace(killed_namespaces, wait_time):
# krkn_lib_kubernetes
def check_active_namespace(killed_namespaces, wait_time, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
    active_namespace = []
    timer = 0
    while timer < wait_time and killed_namespaces:
@@ -4,14 +4,15 @@ import time
import sys
import os
import random
import krkn_lib_kubernetes
from jinja2 import Environment, FileSystemLoader
import kraken.cerberus.setup as cerberus
import kraken.kubernetes.client as kubecli
import kraken.node_actions.common_node_functions as common_node_functions


# krkn_lib_kubernetes
# Reads the scenario config and introduces traffic variations in Node's host network interface.
def run(scenarios_list, config, wait_duration):
def run(scenarios_list, config, wait_duration, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
    failed_post_scenarios = ""
    logging.info("Running the Network Chaos tests")
    for net_config in scenarios_list:
@@ -32,11 +33,11 @@ def run(scenarios_list, config, wait_duration):
            node_name_list = [test_node]
            nodelst = []
            for single_node_name in node_name_list:
                nodelst.extend(common_node_functions.get_node(single_node_name, test_node_label, test_instance_count))
                nodelst.extend(common_node_functions.get_node(single_node_name, test_node_label, test_instance_count, kubecli))
            file_loader = FileSystemLoader(os.path.abspath(os.path.dirname(__file__)))
            env = Environment(loader=file_loader, autoescape=True)
            pod_template = env.get_template("pod.j2")
            test_interface = verify_interface(test_interface, nodelst, pod_template)
            test_interface = verify_interface(test_interface, nodelst, pod_template, kubecli)
            joblst = []
            egress_lst = [i for i in param_lst if i in test_egress]
            chaos_config = {
@@ -68,7 +69,7 @@ def run(scenarios_list, config, wait_duration):
            if test_execution == "serial":
                logging.info("Waiting for serial job to finish")
                start_time = int(time.time())
                wait_for_job(joblst[:], test_duration + 300)
                wait_for_job(joblst[:], kubecli, test_duration + 300)
                logging.info("Waiting for wait_duration %s" % wait_duration)
                time.sleep(wait_duration)
                end_time = int(time.time())
@@ -78,7 +79,7 @@ def run(scenarios_list, config, wait_duration):
            if test_execution == "parallel":
                logging.info("Waiting for parallel job to finish")
                start_time = int(time.time())
                wait_for_job(joblst[:], test_duration + 300)
                wait_for_job(joblst[:], kubecli, test_duration + 300)
                logging.info("Waiting for wait_duration %s" % wait_duration)
                time.sleep(wait_duration)
                end_time = int(time.time())
@@ -88,10 +89,11 @@ def run(scenarios_list, config, wait_duration):
            sys.exit(1)
        finally:
            logging.info("Deleting jobs")
            delete_job(joblst[:])
            delete_job(joblst[:], kubecli)


def verify_interface(test_interface, nodelst, template):
# krkn_lib_kubernetes
def verify_interface(test_interface, nodelst, template, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
    pod_index = random.randint(0, len(nodelst) - 1)
    pod_body = yaml.safe_load(template.render(nodename=nodelst[pod_index]))
    logging.info("Creating pod to query interface on node %s" % nodelst[pod_index])
@@ -115,14 +117,16 @@ def verify_interface(test_interface, nodelst, template):
        kubecli.delete_pod("fedtools", "default")


def get_job_pods(api_response):
# krkn_lib_kubernetes
def get_job_pods(api_response, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
    controllerUid = api_response.metadata.labels["controller-uid"]
    pod_label_selector = "controller-uid=" + controllerUid
    pods_list = kubecli.list_pods(label_selector=pod_label_selector, namespace="default")
    return pods_list[0]


def wait_for_job(joblst, timeout=300):
# krkn_lib_kubernetes
def wait_for_job(joblst, kubecli: krkn_lib_kubernetes.KrknLibKubernetes, timeout=300):
    waittime = time.time() + timeout
    count = 0
    joblen = len(joblst)
@@ -134,25 +138,26 @@ def wait_for_job(joblst, timeout=300):
                count += 1
                joblst.remove(jobname)
        except Exception:
            logging.warn("Exception in getting job status")
            logging.warning("Exception in getting job status")
        if time.time() > waittime:
            raise Exception("Starting pod failed")
        time.sleep(5)


def delete_job(joblst):
# krkn_lib_kubernetes
def delete_job(joblst, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
    for jobname in joblst:
        try:
            api_response = kubecli.get_job_status(jobname, namespace="default")
            if api_response.status.failed is not None:
                pod_name = get_job_pods(api_response)
                pod_name = get_job_pods(api_response, kubecli)
                pod_stat = kubecli.read_pod(name=pod_name, namespace="default")
                logging.error(pod_stat.status.container_statuses)
                pod_log_response = kubecli.get_pod_log(name=pod_name, namespace="default")
                pod_log = pod_log_response.data.decode("utf-8")
                logging.error(pod_log)
        except Exception:
            logging.warn("Exception in getting job status")
            logging.warning("Exception in getting job status")
        api_response = kubecli.delete_job(name=jobname, namespace="default")
@@ -2,10 +2,13 @@ import sys
import logging
import kraken.invoke.command as runcommand
import kraken.node_actions.common_node_functions as nodeaction
import krkn_lib_kubernetes


# krkn_lib_kubernetes
class abstract_node_scenarios:

    kubecli: krkn_lib_kubernetes.KrknLibKubernetes
    def __init__(self, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
        self.kubecli = kubecli
    # Node scenario to start the node
    def node_start_scenario(self, instance_kill_count, node, timeout):
        pass
@@ -42,7 +45,7 @@ class abstract_node_scenarios:
            logging.info("Starting stop_kubelet_scenario injection")
            logging.info("Stopping the kubelet of the node %s" % (node))
            runcommand.run("oc debug node/" + node + " -- chroot /host systemctl stop kubelet")
            nodeaction.wait_for_unknown_status(node, timeout)
            nodeaction.wait_for_unknown_status(node, timeout, self.kubecli)
            logging.info("The kubelet of the node %s has been stopped" % (node))
            logging.info("stop_kubelet_scenario has been successfully injected!")
        except Exception as e:
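
The refactored cloud providers are expected to hand the client up to this base class; a minimal sketch of a hypothetical subclass (mirroring the `super().__init__(kubecli)` call that the aws_node_scenarios hunk below adds):

```
import krkn_lib_kubernetes
from kraken.node_actions.abstract_node_scenarios import abstract_node_scenarios

class my_cloud_node_scenarios(abstract_node_scenarios):  # hypothetical provider
    def __init__(self, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
        super().__init__(kubecli)  # stores self.kubecli for the wait_for_* helpers
        # provider-specific SDK client setup would go here
```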
@@ -1,5 +1,6 @@
import sys
import time
import krkn_lib_kubernetes
from aliyunsdkcore.client import AcsClient
from aliyunsdkecs.request.v20140526 import DescribeInstancesRequest, DeleteInstanceRequest
from aliyunsdkecs.request.v20140526 import StopInstanceRequest, StartInstanceRequest, RebootInstanceRequest
@@ -179,9 +180,9 @@ class Alibaba:
        logging.info("ECS %s is released" % instance_id)
        return True


# krkn_lib_kubernetes
class alibaba_node_scenarios(abstract_node_scenarios):
    def __init__(self):
    def __init__(self, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
        self.alibaba = Alibaba()

    # Node scenario to start the node
@@ -193,7 +194,7 @@ class alibaba_node_scenarios(abstract_node_scenarios):
            logging.info("Starting the node %s with instance ID: %s " % (node, vm_id))
            self.alibaba.start_instances(vm_id)
            self.alibaba.wait_until_running(vm_id, timeout)
            nodeaction.wait_for_ready_status(node, timeout)
            nodeaction.wait_for_ready_status(node, timeout, self.kubecli)
            logging.info("Node with instance ID: %s is in running state" % node)
            logging.info("node_start_scenario has been successfully injected!")
        except Exception as e:
@@ -213,7 +214,7 @@ class alibaba_node_scenarios(abstract_node_scenarios):
            self.alibaba.stop_instances(vm_id)
            self.alibaba.wait_until_stopped(vm_id, timeout)
            logging.info("Node with instance ID: %s is in stopped state" % vm_id)
            nodeaction.wait_for_unknown_status(node, timeout)
            nodeaction.wait_for_unknown_status(node, timeout, self.kubecli)
        except Exception as e:
            logging.error("Failed to stop node instance. Encountered following exception: %s. " "Test Failed" % e)
            logging.error("node_stop_scenario injection failed!")
@@ -248,8 +249,8 @@ class alibaba_node_scenarios(abstract_node_scenarios):
            instance_id = self.alibaba.get_instance_id(node)
            logging.info("Rebooting the node with instance ID: %s " % (instance_id))
            self.alibaba.reboot_instances(instance_id)
            nodeaction.wait_for_unknown_status(node, timeout)
            nodeaction.wait_for_ready_status(node, timeout)
            nodeaction.wait_for_unknown_status(node, timeout, self.kubecli)
            nodeaction.wait_for_ready_status(node, timeout, self.kubecli)
            logging.info("Node with instance ID: %s has been rebooted" % (instance_id))
            logging.info("node_reboot_scenario has been successfully injected!")
        except Exception as e:
@@ -2,7 +2,7 @@ import sys
 import time
 import boto3
 import logging
-import kraken.kubernetes.client as kubecli
+import krkn_lib_kubernetes
 import kraken.node_actions.common_node_functions as nodeaction
 from kraken.node_actions.abstract_node_scenarios import abstract_node_scenarios

@@ -150,9 +150,10 @@ class AWS:
             )
             sys.exit(1)


+# krkn_lib_kubernetes
 class aws_node_scenarios(abstract_node_scenarios):
-    def __init__(self):
+    def __init__(self, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
+        super().__init__(kubecli)
         self.aws = AWS()

     # Node scenario to start the node
@@ -164,7 +165,7 @@ class aws_node_scenarios(abstract_node_scenarios):
             logging.info("Starting the node %s with instance ID: %s " % (node, instance_id))
             self.aws.start_instances(instance_id)
             self.aws.wait_until_running(instance_id)
-            nodeaction.wait_for_ready_status(node, timeout)
+            nodeaction.wait_for_ready_status(node, timeout, self.kubecli)
             logging.info("Node with instance ID: %s is in running state" % (instance_id))
             logging.info("node_start_scenario has been successfully injected!")
         except Exception as e:
@@ -184,7 +185,7 @@ class aws_node_scenarios(abstract_node_scenarios):
             self.aws.stop_instances(instance_id)
             self.aws.wait_until_stopped(instance_id)
             logging.info("Node with instance ID: %s is in stopped state" % (instance_id))
-            nodeaction.wait_for_unknown_status(node, timeout)
+            nodeaction.wait_for_unknown_status(node, timeout, self.kubecli)
         except Exception as e:
             logging.error("Failed to stop node instance. Encountered following exception: %s. " "Test Failed" % (e))
             logging.error("node_stop_scenario injection failed!")
@@ -200,10 +201,10 @@ class aws_node_scenarios(abstract_node_scenarios):
             self.aws.terminate_instances(instance_id)
             self.aws.wait_until_terminated(instance_id)
             for _ in range(timeout):
-                if node not in kubecli.list_nodes():
+                if node not in self.kubecli.list_nodes():
                     break
                 time.sleep(1)
-            if node in kubecli.list_nodes():
+            if node in self.kubecli.list_nodes():
                 raise Exception("Node could not be terminated")
             logging.info("Node with instance ID: %s has been terminated" % (instance_id))
             logging.info("node_termination_scenario has been successfuly injected!")
@@ -222,8 +223,8 @@ class aws_node_scenarios(abstract_node_scenarios):
             instance_id = self.aws.get_instance_id(node)
             logging.info("Rebooting the node %s with instance ID: %s " % (node, instance_id))
             self.aws.reboot_instances(instance_id)
-            nodeaction.wait_for_unknown_status(node, timeout)
-            nodeaction.wait_for_ready_status(node, timeout)
+            nodeaction.wait_for_unknown_status(node, timeout, self.kubecli)
+            nodeaction.wait_for_ready_status(node, timeout, self.kubecli)
             logging.info("Node with instance ID: %s has been rebooted" % (instance_id))
             logging.info("node_reboot_scenario has been successfuly injected!")
         except Exception as e:
@@ -3,7 +3,7 @@ import time
 from azure.mgmt.compute import ComputeManagementClient
 from azure.identity import DefaultAzureCredential
 import logging
-import kraken.kubernetes.client as kubecli
+import krkn_lib_kubernetes
 import kraken.node_actions.common_node_functions as nodeaction
 from kraken.node_actions.abstract_node_scenarios import abstract_node_scenarios
 import kraken.invoke.command as runcommand
@@ -121,9 +121,10 @@ class Azure:
         logging.info("Vm %s is terminated" % vm_name)
         return True


+# krkn_lib_kubernetes
 class azure_node_scenarios(abstract_node_scenarios):
-    def __init__(self):
+    def __init__(self, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
+        super().__init__(kubecli)
         logging.info("init in azure")
         self.azure = Azure()

@@ -136,7 +137,7 @@ class azure_node_scenarios(abstract_node_scenarios):
             logging.info("Starting the node %s with instance ID: %s " % (vm_name, resource_group))
             self.azure.start_instances(resource_group, vm_name)
             self.azure.wait_until_running(resource_group, vm_name, timeout)
-            nodeaction.wait_for_ready_status(vm_name, timeout)
+            nodeaction.wait_for_ready_status(vm_name, timeout, self.kubecli)
             logging.info("Node with instance ID: %s is in running state" % node)
             logging.info("node_start_scenario has been successfully injected!")
         except Exception as e:
@@ -156,7 +157,7 @@ class azure_node_scenarios(abstract_node_scenarios):
             self.azure.stop_instances(resource_group, vm_name)
             self.azure.wait_until_stopped(resource_group, vm_name, timeout)
             logging.info("Node with instance ID: %s is in stopped state" % vm_name)
-            nodeaction.wait_for_unknown_status(vm_name, timeout)
+            nodeaction.wait_for_unknown_status(vm_name, timeout, self.kubecli)
         except Exception as e:
             logging.error("Failed to stop node instance. Encountered following exception: %s. " "Test Failed" % e)
             logging.error("node_stop_scenario injection failed!")
@@ -172,10 +173,10 @@ class azure_node_scenarios(abstract_node_scenarios):
             self.azure.terminate_instances(resource_group, vm_name)
             self.azure.wait_until_terminated(resource_group, vm_name, timeout)
             for _ in range(timeout):
-                if vm_name not in kubecli.list_nodes():
+                if vm_name not in self.kubecli.list_nodes():
                     break
                 time.sleep(1)
-            if vm_name in kubecli.list_nodes():
+            if vm_name in self.kubecli.list_nodes():
                 raise Exception("Node could not be terminated")
             logging.info("Node with instance ID: %s has been terminated" % node)
             logging.info("node_termination_scenario has been successfully injected!")
@@ -194,8 +195,8 @@ class azure_node_scenarios(abstract_node_scenarios):
             vm_name, resource_group = self.azure.get_instance_id(node)
             logging.info("Rebooting the node %s with instance ID: %s " % (vm_name, resource_group))
             self.azure.reboot_instances(resource_group, vm_name)
-            nodeaction.wait_for_unknown_status(vm_name, timeout)
-            nodeaction.wait_for_ready_status(vm_name, timeout)
+            nodeaction.wait_for_unknown_status(vm_name, timeout, self.kubecli)
+            nodeaction.wait_for_ready_status(vm_name, timeout, self.kubecli)
             logging.info("Node with instance ID: %s has been rebooted" % (vm_name))
             logging.info("node_reboot_scenario has been successfully injected!")
         except Exception as e:
@@ -1,5 +1,6 @@
 import kraken.node_actions.common_node_functions as nodeaction
 from kraken.node_actions.abstract_node_scenarios import abstract_node_scenarios
+import krkn_lib_kubernetes
 import logging
 import openshift as oc
 import pyipmi
@@ -104,9 +105,10 @@ class BM:
         while self.get_ipmi_connection(bmc_addr, node_name).get_chassis_status().power_on:
             time.sleep(1)


+# krkn_lib_kubernetes
 class bm_node_scenarios(abstract_node_scenarios):
-    def __init__(self, bm_info, user, passwd):
+    def __init__(self, bm_info, user, passwd, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
+        super().__init__(kubecli)
         self.bm = BM(bm_info, user, passwd)

     # Node scenario to start the node
@@ -118,7 +120,7 @@ class bm_node_scenarios(abstract_node_scenarios):
             logging.info("Starting the node %s with bmc address: %s " % (node, bmc_addr))
             self.bm.start_instances(bmc_addr, node)
             self.bm.wait_until_running(bmc_addr, node)
-            nodeaction.wait_for_ready_status(node, timeout)
+            nodeaction.wait_for_ready_status(node, timeout, self.kubecli)
             logging.info("Node with bmc address: %s is in running state" % (bmc_addr))
             logging.info("node_start_scenario has been successfully injected!")
         except Exception as e:
@@ -140,7 +142,7 @@ class bm_node_scenarios(abstract_node_scenarios):
             self.bm.stop_instances(bmc_addr, node)
             self.bm.wait_until_stopped(bmc_addr, node)
             logging.info("Node with bmc address: %s is in stopped state" % (bmc_addr))
-            nodeaction.wait_for_unknown_status(node, timeout)
+            nodeaction.wait_for_unknown_status(node, timeout, self.kubecli)
         except Exception as e:
             logging.error(
                 "Failed to stop node instance. Encountered following exception: %s. "
@@ -163,8 +165,8 @@ class bm_node_scenarios(abstract_node_scenarios):
             logging.info("BMC Addr: %s" % (bmc_addr))
             logging.info("Rebooting the node %s with bmc address: %s " % (node, bmc_addr))
             self.bm.reboot_instances(bmc_addr, node)
-            nodeaction.wait_for_unknown_status(node, timeout)
-            nodeaction.wait_for_ready_status(node, timeout)
+            nodeaction.wait_for_unknown_status(node, timeout, self.kubecli)
+            nodeaction.wait_for_ready_status(node, timeout, self.kubecli)
             logging.info("Node with bmc address: %s has been rebooted" % (bmc_addr))
             logging.info("node_reboot_scenario has been successfuly injected!")
         except Exception as e:
@@ -2,14 +2,14 @@ import time
 import random
 import logging
 import paramiko
-import kraken.kubernetes.client as kubecli
+import krkn_lib_kubernetes
 import kraken.invoke.command as runcommand

 node_general = False


 # Pick a random node with specified label selector
-def get_node(node_name, label_selector, instance_kill_count):
+def get_node(node_name, label_selector, instance_kill_count, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
     if node_name in kubecli.list_killable_nodes():
         return [node_name]
     elif node_name:
@@ -29,20 +29,21 @@ def get_node(node_name, label_selector, instance_kill_count):
     return nodes_to_return


+# krkn_lib_kubernetes
 # Wait until the node status becomes Ready
-def wait_for_ready_status(node, timeout):
+def wait_for_ready_status(node, timeout, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
     resource_version = kubecli.get_node_resource_version(node)
     kubecli.watch_node_status(node, "True", timeout, resource_version)


+# krkn_lib_kubernetes
 # Wait until the node status becomes Not Ready
-def wait_for_not_ready_status(node, timeout):
+def wait_for_not_ready_status(node, timeout, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
     resource_version = kubecli.get_node_resource_version(node)
     kubecli.watch_node_status(node, "False", timeout, resource_version)


+# krkn_lib_kubernetes
 # Wait until the node status becomes Unknown
-def wait_for_unknown_status(node, timeout):
+def wait_for_unknown_status(node, timeout, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
     resource_version = kubecli.get_node_resource_version(node)
     kubecli.watch_node_status(node, "Unknown", timeout, resource_version)
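With the module-level client gone from common_node_functions, each wait helper takes the handle at the call site. A short usage sketch (the kubeconfig path and node name are placeholders; the constructor call matches the one in run_kraken.py further down):

```python
import krkn_lib_kubernetes
import kraken.node_actions.common_node_functions as nodeaction

# Placeholder kubeconfig location.
kubecli = krkn_lib_kubernetes.KrknLibKubernetes(kubeconfig_path="~/.kube/config")

# Block until the node reports Ready, or time out after 300 seconds.
nodeaction.wait_for_ready_status("worker-0", 300, kubecli)
```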
@@ -1,5 +1,6 @@
 import kraken.node_actions.common_node_functions as nodeaction
 from kraken.node_actions.abstract_node_scenarios import abstract_node_scenarios
+import krkn_lib_kubernetes
 import logging
 import sys
 import docker
@@ -36,7 +37,8 @@ class Docker:


 class docker_node_scenarios(abstract_node_scenarios):
-    def __init__(self):
+    def __init__(self, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
+        super().__init__(kubecli)
         self.docker = Docker()

     # Node scenario to start the node
@@ -47,7 +49,7 @@ class docker_node_scenarios(abstract_node_scenarios):
             container_id = self.docker.get_container_id(node)
             logging.info("Starting the node %s with container ID: %s " % (node, container_id))
             self.docker.start_instances(node)
-            nodeaction.wait_for_ready_status(node, timeout)
+            nodeaction.wait_for_ready_status(node, timeout, self.kubecli)
             logging.info("Node with container ID: %s is in running state" % (container_id))
             logging.info("node_start_scenario has been successfully injected!")
         except Exception as e:
@@ -66,7 +68,7 @@ class docker_node_scenarios(abstract_node_scenarios):
             logging.info("Stopping the node %s with container ID: %s " % (node, container_id))
             self.docker.stop_instances(node)
             logging.info("Node with container ID: %s is in stopped state" % (container_id))
-            nodeaction.wait_for_unknown_status(node, timeout)
+            nodeaction.wait_for_unknown_status(node, timeout, self.kubecli)
         except Exception as e:
             logging.error("Failed to stop node instance. Encountered following exception: %s. " "Test Failed" % (e))
             logging.error("node_stop_scenario injection failed!")
@@ -97,8 +99,8 @@ class docker_node_scenarios(abstract_node_scenarios):
             container_id = self.docker.get_container_id(node)
             logging.info("Rebooting the node %s with container ID: %s " % (node, container_id))
             self.docker.reboot_instances(node)
-            nodeaction.wait_for_unknown_status(node, timeout)
-            nodeaction.wait_for_ready_status(node, timeout)
+            nodeaction.wait_for_unknown_status(node, timeout, self.kubecli)
+            nodeaction.wait_for_ready_status(node, timeout, self.kubecli)
             logging.info("Node with container ID: %s has been rebooted" % (container_id))
             logging.info("node_reboot_scenario has been successfuly injected!")
         except Exception as e:
@@ -1,7 +1,7 @@
 import sys
 import time
 import logging
-import kraken.kubernetes.client as kubecli
+import krkn_lib_kubernetes
 import kraken.node_actions.common_node_functions as nodeaction
 from kraken.node_actions.abstract_node_scenarios import abstract_node_scenarios
 from googleapiclient import discovery
@@ -133,8 +133,10 @@ class GCP:
         return True


+# krkn_lib_kubernetes
 class gcp_node_scenarios(abstract_node_scenarios):
-    def __init__(self):
+    def __init__(self, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
+        super().__init__(kubecli)
         self.gcp = GCP()

     # Node scenario to start the node
@@ -146,7 +148,7 @@ class gcp_node_scenarios(abstract_node_scenarios):
             logging.info("Starting the node %s with instance ID: %s " % (node, instance_id))
             self.gcp.start_instances(zone, instance_id)
             self.gcp.wait_until_running(zone, instance_id, timeout)
-            nodeaction.wait_for_ready_status(node, timeout)
+            nodeaction.wait_for_ready_status(node, timeout, self.kubecli)
             logging.info("Node with instance ID: %s is in running state" % instance_id)
             logging.info("node_start_scenario has been successfully injected!")
         except Exception as e:
@@ -167,7 +169,7 @@ class gcp_node_scenarios(abstract_node_scenarios):
             self.gcp.stop_instances(zone, instance_id)
             self.gcp.wait_until_stopped(zone, instance_id, timeout)
             logging.info("Node with instance ID: %s is in stopped state" % instance_id)
-            nodeaction.wait_for_unknown_status(node, timeout)
+            nodeaction.wait_for_unknown_status(node, timeout, self.kubecli)
         except Exception as e:
             logging.error("Failed to stop node instance. Encountered following exception: %s. " "Test Failed" % (e))
             logging.error("node_stop_scenario injection failed!")
@@ -183,10 +185,10 @@ class gcp_node_scenarios(abstract_node_scenarios):
             self.gcp.terminate_instances(zone, instance_id)
             self.gcp.wait_until_terminated(zone, instance_id, timeout)
             for _ in range(timeout):
-                if node not in kubecli.list_nodes():
+                if node not in self.kubecli.list_nodes():
                     break
                 time.sleep(1)
-            if node in kubecli.list_nodes():
+            if node in self.kubecli.list_nodes():
                 raise Exception("Node could not be terminated")
             logging.info("Node with instance ID: %s has been terminated" % instance_id)
             logging.info("node_termination_scenario has been successfuly injected!")
@@ -205,7 +207,7 @@ class gcp_node_scenarios(abstract_node_scenarios):
             instance_id, zone = self.gcp.get_instance_id(node)
             logging.info("Rebooting the node %s with instance ID: %s " % (node, instance_id))
             self.gcp.reboot_instances(zone, instance_id)
-            nodeaction.wait_for_ready_status(node, timeout)
+            nodeaction.wait_for_ready_status(node, timeout, self.kubecli)
             logging.info("Node with instance ID: %s has been rebooted" % instance_id)
             logging.info("node_reboot_scenario has been successfuly injected!")
         except Exception as e:
@@ -1,4 +1,5 @@
 import logging
+import krkn_lib_kubernetes
 from kraken.node_actions.abstract_node_scenarios import abstract_node_scenarios


@@ -6,9 +7,10 @@ class GENERAL:
     def __init__(self):
         pass


+# krkn_lib_kubernetes
 class general_node_scenarios(abstract_node_scenarios):
-    def __init__(self):
+    def __init__(self, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
+        super().__init__(kubecli)
         self.general = GENERAL()

     # Node scenario to start the node
@@ -1,6 +1,7 @@
 import sys
 import time
 import logging
+import krkn_lib_kubernetes
 import kraken.invoke.command as runcommand
 import kraken.node_actions.common_node_functions as nodeaction
 from kraken.node_actions.abstract_node_scenarios import abstract_node_scenarios
@@ -86,9 +87,9 @@ class OPENSTACKCLOUD:
                 return node_name
             counter += 1


+# krkn_lib_kubernetes
 class openstack_node_scenarios(abstract_node_scenarios):
-    def __init__(self):
+    def __init__(self, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
         self.openstackcloud = OPENSTACKCLOUD()

     # Node scenario to start the node
@@ -100,7 +101,7 @@ class openstack_node_scenarios(abstract_node_scenarios):
             openstack_node_name = self.openstackcloud.get_instance_id(node)
             self.openstackcloud.start_instances(openstack_node_name)
             self.openstackcloud.wait_until_running(openstack_node_name, timeout)
-            nodeaction.wait_for_ready_status(node, timeout)
+            nodeaction.wait_for_ready_status(node, timeout, self.kubecli)
             logging.info("Node with instance ID: %s is in running state" % (node))
             logging.info("node_start_scenario has been successfully injected!")
         except Exception as e:
@@ -120,7 +121,7 @@ class openstack_node_scenarios(abstract_node_scenarios):
             self.openstackcloud.stop_instances(openstack_node_name)
             self.openstackcloud.wait_until_stopped(openstack_node_name, timeout)
             logging.info("Node with instance name: %s is in stopped state" % (node))
-            nodeaction.wait_for_ready_status(node, timeout)
+            nodeaction.wait_for_ready_status(node, timeout, self.kubecli)
         except Exception as e:
             logging.error("Failed to stop node instance. Encountered following exception: %s. " "Test Failed" % (e))
             logging.error("node_stop_scenario injection failed!")
@@ -134,8 +135,8 @@ class openstack_node_scenarios(abstract_node_scenarios):
             logging.info("Rebooting the node %s" % (node))
             openstack_node_name = self.openstackcloud.get_instance_id(node)
             self.openstackcloud.reboot_instances(openstack_node_name)
-            nodeaction.wait_for_unknown_status(node, timeout)
-            nodeaction.wait_for_ready_status(node, timeout)
+            nodeaction.wait_for_unknown_status(node, timeout, self.kubecli)
+            nodeaction.wait_for_ready_status(node, timeout, self.kubecli)
             logging.info("Node with instance name: %s has been rebooted" % (node))
             logging.info("node_reboot_scenario has been successfuly injected!")
         except Exception as e:
@@ -2,6 +2,7 @@ import yaml
 import logging
 import sys
 import time
+import krkn_lib_kubernetes
 from kraken.node_actions.aws_node_scenarios import aws_node_scenarios
 from kraken.node_actions.general_cloud_node_scenarios import general_node_scenarios
 from kraken.node_actions.az_node_scenarios import azure_node_scenarios
@@ -18,27 +19,29 @@ node_general = False


 # Get the node scenarios object of specfied cloud type
-def get_node_scenario_object(node_scenario):
+# krkn_lib_kubernetes
+def get_node_scenario_object(node_scenario, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
     if "cloud_type" not in node_scenario.keys() or node_scenario["cloud_type"] == "generic":
         global node_general
         node_general = True
-        return general_node_scenarios()
+        return general_node_scenarios(kubecli)
     if node_scenario["cloud_type"] == "aws":
-        return aws_node_scenarios()
+        return aws_node_scenarios(kubecli)
     elif node_scenario["cloud_type"] == "gcp":
-        return gcp_node_scenarios()
+        return gcp_node_scenarios(kubecli)
     elif node_scenario["cloud_type"] == "openstack":
-        return openstack_node_scenarios()
+        return openstack_node_scenarios(kubecli)
     elif node_scenario["cloud_type"] == "azure" or node_scenario["cloud_type"] == "az":
-        return azure_node_scenarios()
+        return azure_node_scenarios(kubecli)
     elif node_scenario["cloud_type"] == "alibaba" or node_scenario["cloud_type"] == "alicloud":
-        return alibaba_node_scenarios()
+        return alibaba_node_scenarios(kubecli)
     elif node_scenario["cloud_type"] == "bm":
         return bm_node_scenarios(
-            node_scenario.get("bmc_info"), node_scenario.get("bmc_user", None), node_scenario.get("bmc_password", None)
+            node_scenario.get("bmc_info"), node_scenario.get("bmc_user", None), node_scenario.get("bmc_password", None),
+            kubecli
         )
     elif node_scenario["cloud_type"] == "docker":
-        return docker_node_scenarios()
+        return docker_node_scenarios(kubecli)
     else:
         logging.error(
             "Cloud type " + node_scenario["cloud_type"] + " is not currently supported; "
@@ -49,16 +52,17 @@ def get_node_scenario_object(node_scenario):


 # Run defined scenarios
-def run(scenarios_list, config, wait_duration):
+# krkn_lib_kubernetes
+def run(scenarios_list, config, wait_duration, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
     for node_scenario_config in scenarios_list:
         with open(node_scenario_config, "r") as f:
             node_scenario_config = yaml.full_load(f)
             for node_scenario in node_scenario_config["node_scenarios"]:
-                node_scenario_object = get_node_scenario_object(node_scenario)
+                node_scenario_object = get_node_scenario_object(node_scenario, kubecli)
                 if node_scenario["actions"]:
                     for action in node_scenario["actions"]:
                         start_time = int(time.time())
-                        inject_node_scenario(action, node_scenario, node_scenario_object)
+                        inject_node_scenario(action, node_scenario, node_scenario_object, kubecli)
                         logging.info("Waiting for the specified duration: %s" % (wait_duration))
                         time.sleep(wait_duration)
                         end_time = int(time.time())
@@ -67,7 +71,7 @@ def run(scenarios_list, config, wait_duration):


 # Inject the specified node scenario
-def inject_node_scenario(action, node_scenario, node_scenario_object):
+def inject_node_scenario(action, node_scenario, node_scenario_object, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
     generic_cloud_scenarios = ("stop_kubelet_scenario", "node_crash_scenario")
     # Get the node scenario configurations
     run_kill_count = node_scenario.get("runs", 1)
@@ -83,7 +87,7 @@ def inject_node_scenario(action, node_scenario, node_scenario_object):
     else:
         node_name_list = [node_name]
     for single_node_name in node_name_list:
-        nodes = common_node_functions.get_node(single_node_name, label_selector, instance_kill_count)
+        nodes = common_node_functions.get_node(single_node_name, label_selector, instance_kill_count, kubecli)
         for single_node in nodes:
             if node_general and action not in generic_cloud_scenarios:
                 logging.info("Scenario: " + action + " is not set up for generic cloud type, skipping action")
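get_node_scenario_object is the one place where the injected client fans out to every per-cloud class. A minimal sketch of a call against the new signature (the scenario dict is trimmed to the single key the factory inspects, and it assumes the function and a `kubecli` instance are already in scope):

```python
# Hypothetical minimal scenario config; only cloud_type matters to the factory.
node_scenario = {"cloud_type": "aws"}

# The factory forwards kubecli into aws_node_scenarios(kubecli).
scenario_object = get_node_scenario_object(node_scenario, kubecli)
```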
@@ -5,7 +5,7 @@ import arcaflow_plugin_kill_pod

 import kraken.cerberus.setup as cerberus
 import kraken.post_actions.actions as post_actions
-import kraken.kubernetes.client as kubecli
+import krkn_lib_kubernetes
 import time
 import yaml
 import sys
@@ -66,8 +66,8 @@ def run(kubeconfig_path, scenarios_list, config, failed_post_scenarios, wait_dur
         cerberus.publish_kraken_status(config, failed_post_scenarios, start_time, end_time)
     return failed_post_scenarios


-def container_run(kubeconfig_path, scenarios_list, config, failed_post_scenarios, wait_duration):
+# krkn_lib_kubernetes
+def container_run(kubeconfig_path, scenarios_list, config, failed_post_scenarios, wait_duration, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
     for container_scenario_config in scenarios_list:
         if len(container_scenario_config) > 1:
             pre_action_output = post_actions.run(kubeconfig_path, container_scenario_config[1])
@@ -78,7 +78,7 @@ def container_run(kubeconfig_path, scenarios_list, config, failed_post_scenarios
             for cont_scenario in cont_scenario_config["scenarios"]:
                 # capture start time
                 start_time = int(time.time())
-                killed_containers = container_killing_in_pod(cont_scenario)
+                killed_containers = container_killing_in_pod(cont_scenario, kubecli)

                 if len(container_scenario_config) > 1:
                     try:
@@ -90,7 +90,7 @@ def container_run(kubeconfig_path, scenarios_list, config, failed_post_scenarios
                         sys.exit(1)
                 else:
                     failed_post_scenarios = check_failed_containers(
-                        killed_containers, cont_scenario.get("retry_wait", 120)
+                        killed_containers, cont_scenario.get("retry_wait", 120), kubecli
                     )

                 logging.info("Waiting for the specified duration: %s" % (wait_duration))
@@ -104,7 +104,7 @@ def container_run(kubeconfig_path, scenarios_list, config, failed_post_scenarios
     logging.info("")


-def container_killing_in_pod(cont_scenario):
+def container_killing_in_pod(cont_scenario, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
     scenario_name = cont_scenario.get("name", "")
     namespace = cont_scenario.get("namespace", "*")
     label_selector = cont_scenario.get("label_selector", None)
@@ -153,11 +153,11 @@ def container_killing_in_pod(cont_scenario):
             if container_name != "":
                 if c_name == container_name:
                     killed_container_list.append([selected_container_pod[0], selected_container_pod[1], c_name])
-                    retry_container_killing(kill_action, selected_container_pod[0], selected_container_pod[1], c_name)
+                    retry_container_killing(kill_action, selected_container_pod[0], selected_container_pod[1], c_name, kubecli)
                     break
             else:
                 killed_container_list.append([selected_container_pod[0], selected_container_pod[1], c_name])
-                retry_container_killing(kill_action, selected_container_pod[0], selected_container_pod[1], c_name)
+                retry_container_killing(kill_action, selected_container_pod[0], selected_container_pod[1], c_name, kubecli)
                 break
         container_pod_list.remove(selected_container_pod)
         killed_count += 1
@@ -165,7 +165,7 @@ def container_killing_in_pod(cont_scenario):
     return killed_container_list


-def retry_container_killing(kill_action, podname, namespace, container_name):
+def retry_container_killing(kill_action, podname, namespace, container_name, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
     i = 0
     while i < 5:
         logging.info("Killing container %s in pod %s (ns %s)" % (str(container_name), str(podname), str(namespace)))
@@ -181,7 +181,7 @@ def retry_container_killing(kill_action, podname, namespace, container_name):
         continue


-def check_failed_containers(killed_container_list, wait_time):
+def check_failed_containers(killed_container_list, wait_time, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):

     container_ready = []
     timer = 0
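container_run keeps its old positional parameters and simply appends the client. A hedged sketch of the new call shape (the kubeconfig path and scenario file name are placeholders; each scenarios_list entry is a list whose optional second element is a post-action file, as the code above shows):

```python
failed_post_scenarios = pod_scenarios.container_run(
    "~/.kube/config",                        # kubeconfig_path (placeholder)
    [["scenarios/container_scenario.yml"]],  # illustrative scenario file list
    {},                                      # parsed kraken config
    [],                                      # failed_post_scenarios accumulator
    60,                                      # wait_duration in seconds
    kubecli,                                 # KrknLibKubernetes instance
)
```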
@@ -31,7 +31,7 @@ def process_prom_query(query):
             logging.error("Failed to get the metrics: %s" % e)
             sys.exit(1)
     else:
-        logging.info("Skipping the prometheus query as the prometheus client couldn't " "be initilized\n")
+        logging.info("Skipping the prometheus query as the prometheus client couldn't " "be initialized\n")

 # Get prometheus details
 def instance(distribution, prometheus_url, prometheus_bearer_token):
@@ -3,14 +3,14 @@
 import re
 import sys
 import time

+import krkn_lib_kubernetes
 import yaml

 from ..cerberus import setup as cerberus
-from ..kubernetes import client as kubecli


-def run(scenarios_list, config):
+# krkn_lib_kubernetes
+def run(scenarios_list, config, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
     """
     Reads the scenario config and creates a temp file to fill up the PVC
     """
@@ -213,7 +213,8 @@ def run(scenarios_list, config):
                 namespace,
                 container_name,
                 mount_path,
-                file_size_kb
+                file_size_kb,
+                kubecli
             )
             sys.exit(1)

@@ -233,7 +234,8 @@ def run(scenarios_list, config):
             namespace,
             container_name,
             mount_path,
-            file_size_kb
+            file_size_kb,
+            kubecli
         )

         end_time = int(time.time())
@@ -245,6 +247,7 @@ def run(scenarios_list, config):
     )


+# krkn_lib_kubernetes
 def remove_temp_file(
     file_name,
     full_path,
@@ -252,7 +255,8 @@ def remove_temp_file(
     namespace,
     container_name,
     mount_path,
-    file_size_kb
+    file_size_kb,
+    kubecli: krkn_lib_kubernetes.KrknLibKubernetes
 ):
     command = "rm -f %s" % (str(full_path))
     logging.debug("Remove temp file from the PVC command:\n %s" % command)
@@ -4,10 +4,10 @@ import sys
 import yaml
 import logging
 import time
+import krkn_lib_kubernetes
 from multiprocessing.pool import ThreadPool

 from ..cerberus import setup as cerberus
-from ..kubernetes import client as kubecli
 from ..post_actions import actions as post_actions
 from ..node_actions.aws_node_scenarios import AWS
 from ..node_actions.openstack_node_scenarios import OPENSTACKCLOUD
@@ -40,7 +40,8 @@ def multiprocess_nodes(cloud_object_function, nodes):


 # Inject the cluster shut down scenario
-def cluster_shut_down(shut_down_config):
+# krkn_lib_kubernetes
+def cluster_shut_down(shut_down_config, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
     runs = shut_down_config["runs"]
     shut_down_duration = shut_down_config["shut_down_duration"]
     cloud_type = shut_down_config["cloud_type"]
@@ -125,8 +126,9 @@ def cluster_shut_down(shut_down_config):

     logging.info("Successfully injected cluster_shut_down scenario!")

+# krkn_lib_kubernetes

-def run(scenarios_list, config, wait_duration):
+def run(scenarios_list, config, wait_duration, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
     failed_post_scenarios = []
     for shut_down_config in scenarios_list:
         if len(shut_down_config) > 1:
@@ -138,7 +140,7 @@ def run(scenarios_list, config, wait_duration):
             shut_down_config_scenario = \
                 shut_down_config_yaml["cluster_shut_down_scenario"]
             start_time = int(time.time())
-            cluster_shut_down(shut_down_config_scenario)
+            cluster_shut_down(shut_down_config_scenario, kubecli)
             logging.info(
                 "Waiting for the specified duration: %s" % (wait_duration)
             )
@@ -5,14 +5,12 @@
 import sys
 import yaml
 import random

+import krkn_lib_kubernetes
 from ..cerberus import setup as cerberus
-from ..kubernetes import client as kubecli
 from ..invoke import command as runcommand


-def pod_exec(pod_name, command, namespace, container_name):
-    i = 0
+# krkn_lib_kubernetes
+def pod_exec(pod_name, command, namespace, container_name, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
+    for i in range(5):
         response = kubecli.exec_cmd_in_pod(
             command,
@@ -41,7 +39,8 @@ def node_debug(node_name, command):
     return response


-def get_container_name(pod_name, namespace, container_name=""):
+# krkn_lib_kubernetes
+def get_container_name(pod_name, namespace, kubecli: krkn_lib_kubernetes.KrknLibKubernetes, container_name=""):

     container_names = kubecli.get_containers_in_pod(pod_name, namespace)
     if container_name != "":
@@ -63,7 +62,8 @@ def get_container_name(pod_name, namespace, container_name=""):
     return container_name


-def skew_time(scenario):
+# krkn_lib_kubernetes
+def skew_time(scenario, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
     skew_command = "date --set "
     if scenario["action"] == "skew_date":
         skewed_date = "00-01-01"
@@ -134,13 +134,17 @@ def skew_time(scenario):
                 selected_container_name = get_container_name(
                     pod[0],
                     pod[1],
-                    container_name
+                    kubecli,
+                    container_name,
                 )
                 pod_exec_response = pod_exec(
                     pod[0],
                     skew_command,
                     pod[1],
-                    selected_container_name
+                    selected_container_name,
+                    kubecli,
                 )
                 if pod_exec_response is False:
                     logging.error(
@@ -154,13 +158,15 @@ def skew_time(scenario):
                 selected_container_name = get_container_name(
                     pod,
                     scenario["namespace"],
+                    kubecli,
                     container_name
                 )
                 pod_exec_response = pod_exec(
                     pod,
                     skew_command,
                     scenario["namespace"],
-                    selected_container_name
+                    selected_container_name,
+                    kubecli
                 )
                 if pod_exec_response is False:
                     logging.error(
@@ -216,7 +222,8 @@ def string_to_date(obj_datetime):
     return datetime.datetime(datetime.MINYEAR, 1, 1)


-def check_date_time(object_type, names):
+# krkn_lib_kubernetes
+def check_date_time(object_type, names, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
     skew_command = "date"
     not_reset = []
     max_retries = 30
@@ -256,7 +263,8 @@ def check_date_time(object_type, names):
                 pod_name[0],
                 skew_command,
                 pod_name[1],
-                pod_name[2]
+                pod_name[2],
+                kubecli
             )
             pod_datetime = string_to_date(pod_datetime_string)
             while not (
@@ -271,7 +279,8 @@ def check_date_time(object_type, names):
                     pod_name[0],
                     skew_command,
                     pod_name[1],
-                    pod_name[2]
+                    pod_name[2],
+                    kubecli
                 )
                 pod_datetime = string_to_date(pod_datetime)
                 counter += 1
@@ -289,14 +298,15 @@ def check_date_time(object_type, names):
     return not_reset


-def run(scenarios_list, config, wait_duration):
+# krkn_lib_kubernetes
+def run(scenarios_list, config, wait_duration, kubecli: krkn_lib_kubernetes.KrknLibKubernetes):
     for time_scenario_config in scenarios_list:
         with open(time_scenario_config, "r") as f:
             scenario_config = yaml.full_load(f)
             for time_scenario in scenario_config["time_scenarios"]:
                 start_time = int(time.time())
-                object_type, object_names = skew_time(time_scenario)
-                not_reset = check_date_time(object_type, object_names)
+                object_type, object_names = skew_time(time_scenario, kubecli)
+                not_reset = check_date_time(object_type, object_names, kubecli)
                 if len(not_reset) > 0:
                     logging.info("Object times were not reset")
                 logging.info(
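One subtlety worth spelling out: get_container_name now takes the client third, ahead of the optional container_name, while pod_exec appends it last. A sketch of the two call shapes (the pod and namespace names are placeholders):

```python
# kubecli comes before the optional container_name here...
name = get_container_name("example-pod-0", "example-ns", kubecli, "")

# ...but is the final argument here.
response = pod_exec("example-pod-0", "date", "example-ns", name, kubecli)
```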
@@ -37,3 +37,5 @@ prometheus_api_client
 ibm_cloud_sdk_core
 ibm_vpc
 pytest
+
+krkn-lib-kubernetes >= 0.1.1
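A quick way to confirm the new requirements.txt pin resolves in an installed environment (a standard-library check; assumes the distribution name matches the requirement line):

```python
from importlib.metadata import version

# Expect 0.1.1 or newer once the updated requirements are installed.
print(version("krkn-lib-kubernetes"))
```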
@@ -8,7 +8,6 @@ import optparse
 import pyfiglet
 import uuid
 import time
-import kraken.kubernetes.client as kubecli
 import kraken.litmus.common_litmus as common_litmus
 import kraken.time_actions.common_time_functions as time_actions
 import kraken.performance_dashboards.setup as performance_dashboards
@@ -26,12 +25,13 @@ import kraken.arcaflow_plugin as arcaflow_plugin
 import server as server
 import kraken.prometheus.client as promcli
 from kraken import plugins
+from krkn_lib_kubernetes import KrknLibKubernetes

 KUBE_BURNER_URL = (
     "https://github.com/cloud-bulldozer/kube-burner/"
     "releases/download/v{version}/kube-burner-{version}-Linux-x86_64.tar.gz"
 )
-KUBE_BURNER_VERSION = "0.9.1"
+KUBE_BURNER_VERSION = "1.7.0"


 # Main function
@@ -91,14 +91,19 @@ def main(cfg):
     check_critical_alerts = config["performance_monitoring"].get("check_critical_alerts", False)

     # Initialize clients
-    if not os.path.isfile(kubeconfig_path):
+    if (not os.path.isfile(kubeconfig_path) and
+            not os.path.isfile("/var/run/secrets/kubernetes.io/serviceaccount/token")):
         logging.error(
             "Cannot read the kubeconfig file at %s, please check" % kubeconfig_path
         )
         sys.exit(1)
     logging.info("Initializing client to talk to the Kubernetes cluster")
-    os.environ["KUBECONFIG"] = str(kubeconfig_path)
-    kubecli.initialize_clients(kubeconfig_path)
+    try:
+        kubeconfig_path
+        os.environ["KUBECONFIG"] = str(kubeconfig_path)
+        kubecli = KrknLibKubernetes(kubeconfig_path=kubeconfig_path)
+    except NameError:
+        kubecli.initialize_clients(None)

     # find node kraken might be running on
     kubecli.find_kraken_node()
@@ -120,9 +125,6 @@ def main(cfg):
         logging.info(
             "Publishing kraken status at http://%s:%s" % (server_address, port)
         )
-        logging.info(
-            "Publishing kraken status at http://%s:%s" % (server_address, port)
-        )
         server.start_server(address, run_signal)

     # Cluster info
@@ -213,6 +215,7 @@ def main(cfg):
                     failed_post_scenarios,
                     wait_duration,
                 )
+            # krkn_lib_kubernetes
             elif scenario_type == "container_scenarios":
                 logging.info("Running container scenarios")
                 failed_post_scenarios = pod_scenarios.container_run(
@@ -221,26 +224,30 @@ def main(cfg):
                     config,
                     failed_post_scenarios,
                     wait_duration,
+                    kubecli
                 )

             # Inject node chaos scenarios specified in the config
+            # krkn_lib_kubernetes
             elif scenario_type == "node_scenarios":
                 logging.info("Running node scenarios")
-                nodeaction.run(scenarios_list, config, wait_duration)
+                nodeaction.run(scenarios_list, config, wait_duration, kubecli)

             # Inject managedcluster chaos scenarios specified in the config
+            # krkn_lib_kubernetes
             elif scenario_type == "managedcluster_scenarios":
                 logging.info("Running managedcluster scenarios")
                 managedcluster_scenarios.run(
-                    scenarios_list, config, wait_duration
+                    scenarios_list, config, wait_duration, kubecli
                 )

             # Inject time skew chaos scenarios specified
             # in the config
+            # krkn_lib_kubernetes
             elif scenario_type == "time_scenarios":
                 if distribution == "openshift":
                     logging.info("Running time skew scenarios")
-                    time_actions.run(scenarios_list, config, wait_duration)
+                    time_actions.run(scenarios_list, config, wait_duration, kubecli)
                 else:
                     logging.error(
                         "Litmus scenarios are currently "
@@ -256,13 +263,14 @@ def main(cfg):
                     if litmus_install:
                         # Remove Litmus resources
                         # before running the scenarios
-                        common_litmus.delete_chaos(litmus_namespace)
+                        common_litmus.delete_chaos(litmus_namespace, kubecli)
                         common_litmus.delete_chaos_experiments(
-                            litmus_namespace
+                            litmus_namespace,
+                            kubecli
                         )
                         if litmus_uninstall_before_run:
                             common_litmus.uninstall_litmus(
-                                litmus_version, litmus_namespace
+                                litmus_version, litmus_namespace, kubecli
                             )
                         common_litmus.install_litmus(
                             litmus_version, litmus_namespace
@@ -277,6 +285,7 @@ def main(cfg):
                         litmus_uninstall,
                         wait_duration,
                         litmus_namespace,
+                        kubecli
                     )
                 else:
                     logging.error(
@@ -286,10 +295,12 @@ def main(cfg):
                     sys.exit(1)

             # Inject cluster shutdown scenarios
+            # krkn_lib_kubernetes
             elif scenario_type == "cluster_shut_down_scenarios":
-                shut_down.run(scenarios_list, config, wait_duration)
+                shut_down.run(scenarios_list, config, wait_duration, kubecli)

             # Inject namespace chaos scenarios
+            # krkn_lib_kubernetes
             elif scenario_type == "namespace_scenarios":
                 logging.info("Running namespace scenarios")
                 namespace_actions.run(
@@ -298,6 +309,7 @@ def main(cfg):
                     wait_duration,
                     failed_post_scenarios,
                     kubeconfig_path,
+                    kubecli
                 )

             # Inject zone failures
@@ -313,14 +325,16 @@ def main(cfg):
                 )

             # PVC scenarios
+            # krkn_lib_kubernetes
             elif scenario_type == "pvc_scenarios":
                 logging.info("Running PVC scenario")
-                pvc_scenario.run(scenarios_list, config)
+                pvc_scenario.run(scenarios_list, config, kubecli)

             # Network scenarios
+            # krkn_lib_kubernetes
             elif scenario_type == "network_chaos":
                 logging.info("Running Network Chaos")
-                network_chaos.run(scenarios_list, config, wait_duration)
+                network_chaos.run(scenarios_list, config, wait_duration, kubecli)

     # Check for critical alerts when enabled
     if check_critical_alerts:
@@ -375,9 +389,9 @@ def main(cfg):
         sys.exit(1)

     if litmus_uninstall and litmus_installed:
-        common_litmus.delete_chaos(litmus_namespace)
-        common_litmus.delete_chaos_experiments(litmus_namespace)
-        common_litmus.uninstall_litmus(litmus_version, litmus_namespace)
+        common_litmus.delete_chaos(litmus_namespace, kubecli)
+        common_litmus.delete_chaos_experiments(litmus_namespace, kubecli)
+        common_litmus.uninstall_litmus(litmus_version, litmus_namespace, kubecli)

     if failed_post_scenarios:
         logging.error(
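The net effect of the entry-point change: one KrknLibKubernetes instance is built up front, accepting either a kubeconfig on disk or an in-cluster service-account token, and is then threaded into every runner. A condensed sketch of the new bootstrap (the kubeconfig path is a placeholder; kraken normally reads it from its config file):

```python
import os

from krkn_lib_kubernetes import KrknLibKubernetes

kubeconfig_path = "~/.kube/config"  # placeholder
token = "/var/run/secrets/kubernetes.io/serviceaccount/token"
if not os.path.isfile(os.path.expanduser(kubeconfig_path)) and not os.path.isfile(token):
    raise SystemExit("Cannot read the kubeconfig file at %s, please check" % kubeconfig_path)

# Single client instance handed to nodeaction.run, shut_down.run, time_actions.run, etc.
kubecli = KrknLibKubernetes(kubeconfig_path=kubeconfig_path)
```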
@@ -1,11 +0,0 @@
----
-deployer:
-  connection: {}
-  type: kubernetes
-log:
-  level: debug
-logged_outputs:
-  error:
-    level: error
-  success:
-    level: debug
@@ -1,19 +0,0 @@
-input_list:
-  - duration: 30s
-    io_block_size: 1m
-    io_workers: 1
-    io_write_bytes: 10m
-    target_pod_folder: /data
-    target_pod_volume:
-      hostPath:
-        path: /
-      name: node-volume
-    node_selector: { }
-    # node selector example
-    # node_selector:
-    #   kubernetes.io/hostname: master
-    kubeconfig: ""
-    namespace: default
-
-# duplicate this section to run simultaneous stressors in the same run
@@ -1,136 +0,0 @@
-input:
-  root: RootObject
-  objects:
-    RootObject:
-      id: RootObject
-      properties:
-        kubeconfig:
-          display:
-            description: The complete kubeconfig file as a string
-            name: Kubeconfig file contents
-          type:
-            type_id: string
-          required: true
-        namespace:
-          display:
-            description: The namespace where the container will be deployed
-            name: Namespace
-          type:
-            type_id: string
-          required: true
-        node_selector:
-          display:
-            description: kubernetes node name where the plugin must be deployed
-          type:
-            type_id: map
-            values:
-              type_id: string
-            keys:
-              type_id: string
-          required: true
-        duration:
-          display:
-            name: duration the scenario expressed in seconds
-            description: stop stress test after T seconds. One can also specify the units of time in
-              seconds, minutes, hours, days or years with the suffix s, m, h, d or y
-          type:
-            type_id: string
-          required: true
-        io_workers:
-          display:
-            description: number of workers
-            name: start N workers continually writing, reading and removing temporary files
-          type:
-            type_id: integer
-          required: true
-        io_block_size:
-          display:
-            description: single write size
-            name: specify size of each write in bytes. Size can be from 1 byte to 4MB.
-          type:
-            type_id: string
-          required: true
-        io_write_bytes:
-          display:
-            description: Total number of bytes written
-            name: write N bytes for each hdd process, the default is 1 GB. One can specify the size
-              as % of free space on the file system or in units of Bytes, KBytes, MBytes and
-              GBytes using the suffix b, k, m or g
-          type:
-            type_id: string
-          required: true
-        target_pod_folder:
-          display:
-            description: Target Folder
-            name: Folder in the pod where the test will be executed and the test files will be written
-          type:
-            type_id: string
-          required: true
-        target_pod_volume:
-          display:
-            name: kubernetes volume definition
-            description: the volume that will be attached to the pod. In order to stress
-              the node storage only hosPath mode is currently supported
-          type:
-            type_id: object
-            id: k8s_volume
-            properties:
-              name:
-                display:
-                  description: name of the volume (must match the name in pod definition)
-                type:
-                  type_id: string
-                required: true
-              hostPath:
-                display:
-                  description: hostPath options expressed as string map (key-value)
-                type:
-                  type_id: map
-                  values:
-                    type_id: string
-                  keys:
-                    type_id: string
-                required: true
-          required: true
-
-steps:
-  kubeconfig:
-    plugin: quay.io/arcalot/arcaflow-plugin-kubeconfig:latest
-    input:
-      kubeconfig: !expr $.input.kubeconfig
-  stressng:
-    plugin: quay.io/arcalot/arcaflow-plugin-stressng:latest
-    step: workload
-    input:
-      StressNGParams:
-        timeout: !expr $.input.duration
-        cleanup: "true"
-        workdir: !expr $.input.target_pod_folder
-        items:
-          - stressor: hdd
-            hdd: !expr $.input.io_workers
-            hdd_bytes: !expr $.input.io_write_bytes
-            hdd_write_size: !expr $.input.io_block_size
-
-    deploy:
-      type: kubernetes
-      connection: !expr $.steps.kubeconfig.outputs.success.connection
-      pod:
-        metadata:
-          namespace: !expr $.input.namespace
-          labels:
-            arcaflow: stressng
-        spec:
-          nodeSelector: !expr $.input.node_selector
-          pluginContainer:
-            imagePullPolicy: Always
-            volumeMounts:
-              - mountPath: /data
-                name: node-volume
-          volumes:
-            - !expr $.input.target_pod_volume
-
-outputs:
-  success:
-    stressng: !expr $.steps.stressng.outputs.success
@@ -1,113 +0,0 @@
-input:
-  root: RootObject
-  objects:
-    RootObject:
-      id: RootObject
-      properties:
-        input_list:
-          type:
-            type_id: list
-            items:
-              id: input_item
-              type_id: object
-              properties:
-                kubeconfig:
-                  display:
-                    description: The complete kubeconfig file as a string
-                    name: Kubeconfig file contents
-                  type:
-                    type_id: string
-                  required: true
-                namespace:
-                  display:
-                    description: The namespace where the container will be deployed
-                    name: Namespace
-                  type:
-                    type_id: string
-                  required: true
-                node_selector:
-                  display:
-                    description: kubernetes node name where the plugin must be deployed
-                  type:
-                    type_id: map
-                    values:
-                      type_id: string
-                    keys:
-                      type_id: string
-                  required: true
-                duration:
-                  display:
-                    name: duration the scenario expressed in seconds
-                    description: stop stress test after T seconds. One can also specify the units of time in
-                      seconds, minutes, hours, days or years with the suffix s, m, h, d or y
-                  type:
-                    type_id: string
-                  required: true
-                io_workers:
-                  display:
-                    description: number of workers
-                    name: start N workers continually writing, reading and removing temporary files
-                  type:
-                    type_id: integer
-                  required: true
-                io_block_size:
-                  display:
-                    description: single write size
-                    name: specify size of each write in bytes. Size can be from 1 byte to 4MB.
-                  type:
-                    type_id: string
-                  required: true
-                io_write_bytes:
-                  display:
-                    description: Total number of bytes written
-                    name: write N bytes for each hdd process, the default is 1 GB. One can specify the size
-                      as % of free space on the file system or in units of Bytes, KBytes, MBytes and
-                      GBytes using the suffix b, k, m or g
-                  type:
-                    type_id: string
-                  required: true
-                target_pod_folder:
-                  display:
-                    description: Target Folder
-                    name: Folder in the pod where the test will be executed and the test files will be written
-                  type:
-                    type_id: string
-                  required: true
-                target_pod_volume:
-                  display:
-                    name: kubernetes volume definition
-                    description: the volume that will be attached to the pod. In order to stress
-                      the node storage only hosPath mode is currently supported
-                  type:
-                    type_id: object
-                    id: k8s_volume
-                    properties:
-                      name:
-                        display:
-                          description: name of the volume (must match the name in pod definition)
-                        type:
-                          type_id: string
-                        required: true
-                      hostPath:
-                        display:
-                          description: hostPath options expressed as string map (key-value)
-                        type:
-                          type_id: map
-                          values:
-                            type_id: string
-                          keys:
-                            type_id: string
-                        required: true
-                  required: true
-steps:
-  workload_loop:
-    kind: foreach
-    items: !expr $.input.input_list
-    workflow: sub-workflow.yaml
-    parallelism: 1000
-outputs:
-  success:
-    workloads: !expr $.steps.workload_loop.outputs.success.data