* Added new scenario to fill up a given volumen
* fixing small issues and style
* adding PVC as input param instead of pod name
* small fix
* get container name and volumen name
replace oc with kubectl commands
* adding yaml file to create a pv, pvc and pod to run pvc_scenario
* adding support to match both string for describe command when looking for pod_name
* added support to find the pvc from a given pod
* small fix
* small fix
This commit enables users to start Kraken to act as listener by setting
the signal to PAUSE in the config to get the cluster to a desired test or
run any setup before injecting chaos by setting the signal to RUN. This
helps in cases where we have test cases that need to coordinate the chaos
at a desired time depending on the state of the cluster/test run.
This commit helps the cases where targeting application pods in a
namespace using pod-selector to create an outage fails because of
not being able to validate the selector.
Error message for reference:
error validating data: ValidationError(NetworkPolicy.spec.podSelector):
unknown field "app=dittybopper" in io.k8s.apimachinery.pkg.apis.meta.v1.LabelSelector
This commit enables users to simulate a downtime of an application
by blocking the traffic for the specified duration to see how
it/other components communicating with it behave in case of downtime.
This will avoid querying all namespaces for pods matching the label_selector
if defined as shown in the sample scenario config. This commit also prints a
pointer to the report generated at the end of the run.
- This eases the usage and debuggability by running the fault injection pods in
the same namespace as other resources of litmus. This will also ease the
deletion process and ensure that there are no leftover objects on the cluster.
- This commit also enables users to use the same rbac template for all the litmus
scenarios without having to pull in a specic one for each of the scenarios.
This commit adds support to create zone outage in AWS by denying both
ingress and egress traffic to the instances belonging to a particular
subnet belonging to the zone by tweaking the network acl. This creates
an outage of all the nodes in the zone - both master and workers.
This commit switches the object type from Deployment to Job to be able
to display the status after executing all the scenarios specified in
the Kraken config instead of crashing which is expected in Deployments.
Fixes https://github.com/cloud-bulldozer/kraken/issues/135
Current Kraken integration with Cerberus monitors the cluster as well as the
application health post chaos and pass/fails if they are not healthy after chaos.
This commit adds ability to monitor the user application health during the chaos
and fails the run in case of downtime as it's potentially a downtime in case of
customers environment as well. It is especially useful in case of control plane
failure scenarios including API server, Etcd, Ingress etc.
There are cases where the kubeconfig can be read only like when running
Kraken as a kubernetes deployment. This commit fixes the instances to
use -n flag instead of a namespace context switch.
This commit:
- Adds timeout to avoid operations hanging for long durations.
- Improves exception handling and exits wherever needed.
- Sets KUBECONFIG env var globoally to access the cluster.
This commit modifies the wait time from 60 seconds to 3 seconds between
each of the requests to the API to capture the components state at a more
granular level by default.
This commit:
- Adds support to automate the infrastructure pieces leveraged by Kraken
including Cerberus and Elasticsearch
- Adds a Kraken config that can be used to discover all the infra pieces
automatically without having to tweak the configuration.