Commit Graph

16 Commits

Author SHA1 Message Date
prubenda
41bf815f98 Adding shut down scenario for gcp, az, aws, openstack 2021-06-23 09:00:58 -04:00
Naga Ravi Chaitanya Elluri
e30a4243f6 Add support to alerting on metrics evaluation
This commit enables alerting in Kraken based on the Prometheus queries defined
by the user and modifies the return code of the run to determine pass/fail for
the run.
2021-06-22 15:22:37 -04:00
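A minimal sketch of the pass/fail idea described in this commit, not Kraken's actual code. The alert fields (`expr`, `severity`) and the pre-fetched `query_results` mapping are assumptions made for illustration; no real Prometheus client call is shown.

```python
def evaluate_alerts(alerts, query_results):
    """Return (fired_alerts, exit_code) for a finished run.

    alerts: list of dicts with hypothetical keys "expr" and "severity".
    query_results: maps each expr to the samples the query returned
    (an empty list means the alert condition did not match).
    """
    fired = [a for a in alerts if query_results.get(a["expr"])]
    # Assumption: only "error" severity alerts fail the run.
    exit_code = 1 if any(a["severity"] == "error" for a in fired) else 0
    return fired, exit_code


# Hypothetical user-defined alerts and canned query results.
alerts = [
    {"expr": "etcd_leader_changes > 2", "severity": "error"},
    {"expr": "api_latency_p99 > 1", "severity": "warning"},
]
results = {"etcd_leader_changes > 2": [], "api_latency_p99 > 1": [{"value": 1.4}]}

fired, code = evaluate_alerts(alerts, results)
print(f"{len(fired)} alert(s) fired, exit code {code}")
```

In this sketch a warning-level alert is reported but does not change the return code; whether Kraken treats severities this way is not stated in the commit message.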
Naga Ravi Chaitanya Elluri
7e8f0450d6 Add support to scrape and index metrics
This commit:
- Enables Kraken to leverage kube-burner to scrape metrics from
  Prometheus and index them into Elasticsearch. This way we can
  take a look at the metrics in Grafana long term even after the
  cluster is terminated.
- Enables separation of operations based on distribution with
  OpenShift as the default option. One of the use cases is to
  capture Prometheus instance details as it's installed by default
  while it's optional for Kubernetes.
2021-06-21 14:55:50 -04:00
Naga Ravi Chaitanya Elluri
a7e28ca490 Add support to deploy performance dashboards
This commit enables performance monitoring on the cluster when
running Kraken, making it possible to observe how the cluster reacts
to failures. It's important to make sure the cluster is healthy in
terms of both recovery and performance.
2021-02-10 16:06:55 -05:00
prubenda
1fc9683c8c Adding litmus scenario options 2020-12-03 12:45:35 -05:00
Yashashree1997
47847d86cd Adds the ability to run a specific type of scenario multiple times
With the current implementation, all the scenarios of a specific type
(for example, pod scenarios) have to be executed together: all
pod_scenarios are followed by node_scenarios and so on, so an order
such as pod_scenarios -> node_scenarios -> pod_scenarios is not possible.
This commit enables the user to run a specific type of scenario
multiple times. For example, a few pod_scenarios followed by
node_scenarios followed by a few more pod_scenarios.
2020-10-30 10:40:42 -04:00
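The ordering described in this commit can be sketched as an ordered list in which a scenario type may appear more than once. This is an illustrative sketch only: the `chaos_scenarios` layout, the file names, and the `execution_order` helper are hypothetical and not taken from the Kraken source.

```python
# Hypothetical config: an ordered list, so a scenario type can repeat.
chaos_scenarios = [
    {"pod_scenarios": ["etcd.yml", "apiserver.yml"]},
    {"node_scenarios": ["node_stop.yml"]},
    {"pod_scenarios": ["regex_kill.yml"]},
]


def execution_order(scenarios):
    """Flatten the ordered list into (scenario_type, scenario_file) pairs."""
    order = []
    for entry in scenarios:
        for scenario_type, files in entry.items():
            for path in files:
                order.append((scenario_type, path))
    return order


plan = execution_order(chaos_scenarios)
for scenario_type, path in plan:
    print(f"{scenario_type}: {path}")
```

Using a list of single-key mappings, rather than one mapping keyed by type, is what allows pod_scenarios to run both before and after node_scenarios.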
prubenda
6f31519e5f adding time scenario 2020-10-27 08:37:54 -04:00
Naga Ravi Chaitanya Elluri
82743230fe Modify documentation to improve readability
This commit:
- Converts various sections in the readme into individual documents.
- Adds pointers to the public blogs.
- Updates workflow/architecture diagram.
- Adds community info and contributing guidelines.
2020-10-21 15:01:54 -04:00
Mike Fiedler
2e5eac4550 Fix comment in config.yml 2020-10-09 13:20:26 -04:00
prubenda
8f5b688fba working on powerfulseal retry logic 2020-09-11 17:08:31 -04:00
Yashashree Suresh
31f06b861a Added node scenarios to stop and terminate instance
This commit:
- Adds a node scenario to stop and start an instance
- Adds a node scenario to terminate an instance
- Adds a node scenario to reboot an instance
- Adds a node scenario to stop the kubelet
- Adds a node scenario to crash the node
2020-08-27 16:50:42 -04:00
prubenda
0fc82090f2 Adding watch to see if components recovered 2020-08-18 16:26:04 -04:00
prubenda
44e753867f Adding random regex pod kill 2020-07-06 22:00:12 -04:00
prubenda
52e232d0e7 Adding iterations or infinite run of kraken 2020-06-09 10:55:24 -04:00
Yashashree Suresh
f1c145e942 Integrated cerberus for checking cluster health 2020-04-22 23:30:21 -04:00
Naga Ravi Chaitanya Elluri
649134e492 Add initial version of kraken
This commit:
- Adds support to run pod chaos scenarios including killing an etcd,
  ApiServer and kube-apiserver using powerfulseal tool.
- Adds support to create a report with the details about each chaos
  injection along with timestamps. The report is generated in the
  run directory.
- Adds kubernetes package with a bunch of functions which can be
  used later to talk to the kubernetes API to be able to know the
  status of the targeted components/nodes.
2020-04-20 08:57:00 -04:00