Compare commits

...

5 Commits

Author SHA1 Message Date
Tullio Sebastiani
067969a81a docker version update (#502) 2023-10-03 11:49:20 -04:00
Naga Ravi Chaitanya Elluri
972ac12921 Bump krkn-lib version (#499) 2023-10-03 17:00:07 +02:00
Tullio Sebastiani
ea813748ae added OCP/K8S functionalities split in the roadmap (#498) 2023-09-26 00:59:20 -04:00
Tullio Sebastiani
782d04c1b1 Prints the telemetry json after sending it to the webservice (#479)
* prints telemetry json after sending it to the service
* deserialized base64 parameters
* json output even if telemetry collection is disabled
2023-09-25 12:00:08 -04:00
Naga Ravi Chaitanya Elluri
2fb58f9897 Update roadmap with completed and new items 2023-09-21 15:33:44 -04:00
5 changed files with 27 additions and 17 deletions

View File

@@ -2,10 +2,12 @@
 Following are a list of enhancements that we are planning to work on adding support in Krkn. Of course any help/contributions are greatly appreciated.
-- [Ability to run multiple chaos scenarios in parallel under load to mimic real world outages](https://github.com/redhat-chaos/krkn/issues/424)
-- [Centralized storage for chaos experiments artifacts](https://github.com/redhat-chaos/krkn/issues/423)
-- [Support for causing DNS outages](https://github.com/redhat-chaos/krkn/issues/394)
-- [Support for pod level network traffic shaping](https://github.com/redhat-chaos/krkn/issues/393)
-- [Ability to visualize the metrics that are being captured by Kraken and stored in Elasticsearch](https://github.com/redhat-chaos/krkn/issues/124)
-- Support for running all the scenarios of Kraken on Kubernetes distribution - see https://github.com/redhat-chaos/krkn/issues/185, https://github.com/redhat-chaos/krkn/issues/186
-- Continue to improve [Chaos Testing Guide](https://redhat-chaos.github.io/krkn) in terms of adding best practices, test environment recommendations and scenarios to make sure the OpenShift platform, as well the applications running on top it, are resilient and performant under chaotic conditions.
+- [ ] [Ability to run multiple chaos scenarios in parallel under load to mimic real world outages](https://github.com/redhat-chaos/krkn/issues/424)
+- [x] [Centralized storage for chaos experiments artifacts](https://github.com/redhat-chaos/krkn/issues/423)
+- [ ] [Support for causing DNS outages](https://github.com/redhat-chaos/krkn/issues/394)
+- [ ] [Support for pod level network traffic shaping](https://github.com/redhat-chaos/krkn/issues/393)
+- [ ] [Ability to visualize the metrics that are being captured by Kraken and stored in Elasticsearch](https://github.com/redhat-chaos/krkn/issues/124)
+- [ ] Support for running all the scenarios of Kraken on Kubernetes distribution - see https://github.com/redhat-chaos/krkn/issues/185, https://github.com/redhat-chaos/krkn/issues/186
+- [ ] Continue to improve [Chaos Testing Guide](https://redhat-chaos.github.io/krkn) in terms of adding best practices, test environment recommendations and scenarios to make sure the OpenShift platform, as well the applications running on top it, are resilient and performant under chaotic conditions.
+- [ ] [Switch documentation references to Kubernetes](https://github.com/redhat-chaos/krkn/issues/495)
+- [ ] [OCP and Kubernetes functionalities segregation](https://github.com/redhat-chaos/krkn/issues/497)

View File

@@ -14,7 +14,7 @@ COPY --from=azure-cli /usr/local/bin/az /usr/bin/az
 # Install dependencies
 RUN yum install -y git python39 python3-pip jq gettext wget && \
     python3.9 -m pip install -U pip && \
-    git clone https://github.com/redhat-chaos/krkn.git --branch v1.4.5 /root/kraken && \
+    git clone https://github.com/redhat-chaos/krkn.git --branch v1.4.6 /root/kraken && \
     mkdir -p /root/.kube && cd /root/kraken && \
     pip3.9 install -r requirements.txt && \
     pip3.9 install virtualenv && \

View File

@@ -14,7 +14,7 @@ COPY --from=azure-cli /usr/local/bin/az /usr/bin/az
 # Install dependencies
 RUN yum install -y git python39 python3-pip jq gettext wget && \
     python3.9 -m pip install -U pip && \
-    git clone https://github.com/redhat-chaos/krkn.git --branch v1.4.5 /root/kraken && \
+    git clone https://github.com/redhat-chaos/krkn.git --branch v1.4.6 /root/kraken && \
     mkdir -p /root/.kube && cd /root/kraken && \
     pip3.9 install -r requirements.txt && \
     pip3.9 install virtualenv && \

View File

@@ -37,4 +37,4 @@ prometheus_api_client
 ibm_cloud_sdk_core
 ibm_vpc
 pytest
-krkn-lib >= 1.0.0
+krkn-lib >= 1.2.1

View File

@@ -1,5 +1,5 @@
 #!/usr/bin/env python
+import json
 import os
 import sys
 import yaml
@@ -92,7 +92,7 @@ def main(cfg):
     run_uuid = config["performance_monitoring"].get("uuid", "")
     enable_alerts = config["performance_monitoring"].get("enable_alerts", False)
     alert_profile = config["performance_monitoring"].get("alert_profile", "")
-    check_critical_alerts = config["performance_monitoring"].get("check_critical_alerts", False)
+    check_critical_alerts = config["performance_monitoring"].get("check_critical_alerts", False)
     # Initialize clients
     if (not os.path.isfile(kubeconfig_path) and
@@ -383,15 +383,23 @@ def main(cfg):
     logging.info("")
     # telemetry
+    # in order to print decoded telemetry data even if telemetry collection
+    # is disabled, it's necessary to serialize the ChaosRunTelemetry object
+    # to json, and recreate a new object from it.
     telemetry.collect_cluster_metadata(chaos_telemetry)
+    decoded_chaos_run_telemetry = ChaosRunTelemetry(json.loads(chaos_telemetry.to_json()))
+    logging.info(f"Telemetry data:\n{decoded_chaos_run_telemetry.to_json()}")
     if config["telemetry"]["enabled"]:
         logging.info(f"telemetry data will be stored on s3 bucket folder: {telemetry_request_id}")
         logging.info(f"telemetry upload log: {safe_logger.log_file_name}")
         try:
             telemetry.send_telemetry(config["telemetry"], telemetry_request_id, chaos_telemetry)
-            safe_logger.info("archives download started:")
-            prometheus_archive_files = telemetry.get_ocp_prometheus_data(config["telemetry"], telemetry_request_id)
-            safe_logger.info("archives upload started:")
-            telemetry.put_ocp_prometheus_data(config["telemetry"], prometheus_archive_files, telemetry_request_id)
+            if config["telemetry"]["prometheus_backup"]:
+                safe_logger.info("archives download started:")
+                prometheus_archive_files = telemetry.get_ocp_prometheus_data(config["telemetry"], telemetry_request_id)
+                safe_logger.info("archives upload started:")
+                telemetry.put_ocp_prometheus_data(config["telemetry"], prometheus_archive_files, telemetry_request_id)
         except Exception as e:
             logging.error(f"failed to send telemetry data: {str(e)}")
     else:
@@ -431,7 +439,7 @@ def main(cfg):
         else:
             logging.error("Alert profile is not defined")
             sys.exit(1)
     if litmus_uninstall and litmus_installed:
         common_litmus.delete_chaos(litmus_namespace, kubecli)
         common_litmus.delete_chaos_experiments(litmus_namespace, kubecli)
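The comment in the run_kraken.py diff above describes a serialize-then-rebuild round trip: dump the telemetry object to JSON, then construct a fresh object from the parsed dict so that fields stored base64-encoded come back out decoded, even when telemetry collection is disabled. A minimal sketch of that idea, where `SimpleTelemetry` is a hypothetical stand-in for krkn-lib's `ChaosRunTelemetry` (the field names and base64 handling here are illustrative assumptions, not the real class):

```python
import base64
import json

class SimpleTelemetry:
    """Hypothetical stand-in for ChaosRunTelemetry: built from a plain dict,
    decoding any base64-encoded scenario payloads it finds."""

    def __init__(self, data: dict):
        self.run_uuid = data.get("run_uuid", "")
        self.scenarios = []
        for s in data.get("scenarios", []):
            try:
                # Decode base64 payloads; already-decoded strings fail
                # validation and are passed through unchanged.
                self.scenarios.append(base64.b64decode(s).decode())
            except Exception:
                self.scenarios.append(s)

    def to_json(self) -> str:
        return json.dumps({"run_uuid": self.run_uuid, "scenarios": self.scenarios})

raw = {"run_uuid": "abc-123",
       "scenarios": [base64.b64encode(b"pod outage").decode()]}
telemetry = SimpleTelemetry(raw)
# Round trip: serialize, then rebuild, so the constructor's decoding runs
# again on the serialized form and the logged JSON is human-readable.
decoded = SimpleTelemetry(json.loads(telemetry.to_json()))
print(decoded.to_json())  # scenarios now contain "pod outage", not base64
```

The point of rebuilding rather than logging `telemetry.to_json()` directly is that construction is where the decoding lives, so the second pass guarantees readable output regardless of whether `send_telemetry` was ever called.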