70 Commits

Author SHA1 Message Date
nilscan
0734773993 Skip primary check for appmesh 2019-10-02 14:29:48 +13:00
stefanprodan
2ff86fa56e Fix canary weight max value 2019-09-24 10:16:22 +03:00
stefanprodan
77d8e4e4d3 Use the promotion phase in A/B testing and Blue/Green 2019-09-23 22:14:44 +03:00
stefanprodan
800b0475ee Run the canary promotion on a separate stage
After the analysis finishes, Flagger will do the promotion and wait for the primary rollout to finish before routing all the traffic back to it. This ensures a smooth transition to the new version and avoids dropping in-flight requests.
2019-09-23 21:57:24 +03:00
stefanprodan
8282f86d9c Implement confirm-promotion hook
The confirm promotion hooks are executed right before the promotion step. The canary promotion is paused until the hooks return HTTP 200. While the promotion is paused, Flagger will continue to run the metrics checks and load tests.
2019-09-22 13:23:19 +03:00
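The gate described in this commit can be illustrated with a minimal Go sketch. It is not Flagger's implementation: `gatePayload`, `gateOpen` and `newStubHook` are hypothetical names, and the JSON body is an assumption. The only behaviour taken from the commit message is that HTTP 200 is the approval signal.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// gatePayload is a hypothetical request body; the fields are assumptions.
type gatePayload struct {
	Name      string `json:"name"`
	Namespace string `json:"namespace"`
}

// gateOpen posts the payload to the hook URL and reports whether the hook
// answered HTTP 200, the only status treated as approval in this sketch.
func gateOpen(url string, p gatePayload) (bool, error) {
	body, err := json.Marshal(p)
	if err != nil {
		return false, err
	}
	resp, err := http.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK, nil
}

// newStubHook starts a local test server that always answers with status.
func newStubHook(status int) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(status)
	}))
}

func main() {
	closed := newStubHook(http.StatusForbidden)
	defer closed.Close()
	open := newStubHook(http.StatusOK)
	defer open.Close()

	p := gatePayload{Name: "app", Namespace: "test"}
	for _, url := range []string{closed.URL, open.URL} {
		ok, err := gateOpen(url, p)
		fmt.Println("gate open:", ok, "error:", err)
	}
}
```

A caller would invoke `gateOpen` once per analysis loop and keep the promotion paused while it returns false, leaving the metric checks and load tests running in the meantime.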
stefanprodan
a6d86f2e81 Skip mesh routers for B/G when provider is kubernetes 2019-09-22 00:48:42 +03:00
stefanprodan
9d856a4f96 Implement B/G for service mesh providers
Blue/Green steps:
- scale up green
- run conformance tests on green
- run load tests and metric checks on green
- route traffic to green
- promote green spec over blue
- wait for blue rollout
- route traffic to blue
2019-09-21 21:21:33 +03:00
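The Blue/Green steps listed in this commit can be read as an ordered pipeline that stops at the first failure. The Go sketch below only illustrates that ordering. The `step` type, `blueGreenSteps` and `runAll` are hypothetical, and the actions record their names instead of touching a cluster.

```go
package main

import (
	"errors"
	"fmt"
)

// step pairs a label from the commit message with a hypothetical action.
type step struct {
	name string
	run  func() error
}

// blueGreenSteps returns the steps in the order listed in the commit message.
// Each action appends its name to log; failAt, if non-empty, makes that step fail.
func blueGreenSteps(log *[]string, failAt string) []step {
	names := []string{
		"scale up green",
		"run conformance tests on green",
		"run load tests and metric checks on green",
		"route traffic to green",
		"promote green spec over blue",
		"wait for blue rollout",
		"route traffic to blue",
	}
	steps := make([]step, 0, len(names))
	for _, n := range names {
		n := n // capture the current name for the closure
		steps = append(steps, step{name: n, run: func() error {
			if n == failAt {
				return errors.New("step failed: " + n)
			}
			*log = append(*log, n)
			return nil
		}})
	}
	return steps
}

// runAll executes the steps in order and stops at the first error.
func runAll(steps []step) error {
	for _, s := range steps {
		if err := s.run(); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	var done []string
	err := runAll(blueGreenSteps(&done, "route traffic to green"))
	fmt.Println("completed steps:", len(done), "error:", err)
}
```

In this sketch a failure before "route traffic to green" leaves all traffic on blue, which is the property the ordering is meant to show.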
stefanprodan
c31e9e5a96 Use Linkerd metrics for ingress and kubernetes routers 2019-07-30 13:00:28 +03:00
stefanprodan
b2ca0c4c16 Implement finalising state
Set the canary status to finalising after routing the traffic back to the primary. Run one final loop before scaling the canary to zero so that the canary has a chance to process all inflight requests.
2019-07-29 13:52:11 +03:00
stefanprodan
163f5292b0 Push a notification when a canary is waiting for approval 2019-07-25 19:13:22 +03:00
stefanprodan
28e7e89047 Pause or resume analysis on confirmation gate toggle 2019-07-24 16:09:13 +03:00
stefanprodan
04cbacb6e0 Implement confirm rollout gate and hook
The confirm-rollout hooks are executed before the pre-rollout hooks. Flagger will halt the canary rollout until the confirm webhook returns HTTP status 200.
2019-07-24 12:09:39 +03:00
stefanprodan
108bf9ca65 Add initializing canary phase/status condition reason
Fix HPA reconciliation min replicas diff
2019-07-09 17:10:43 +03:00
stefanprodan
438f952128 Implement status conditions
Add Promoted status condition with the following reasons: Initialized, Progressing, Succeeded, Failed
Usage: `kubectl wait canary/app --for=condition=promoted`
Fix: #184
2019-07-09 15:22:56 +03:00
stefanprodan
ad8d02f701 Use Linkerd metrics when NGINX is the mesh ingress
Set the metrics provider to Linkerd Prometheus when using NGINX as Linkerd Ingress. This mitigates the lack of canary metrics in the NGINX controller exporter.
2019-06-30 13:03:27 +03:00
stefanprodan
63cb8a5ba5 Lookup the canary provider field during reconciliation
Override the global provider if one is specified in the canary spec
2019-06-20 14:52:43 +03:00
stefanprodan
bf7ebc9708 Skip readiness check on init for Istio SMI 2019-06-19 11:16:11 +03:00
stefanprodan
98beb1011e Skip primary check on init when using Istio
The deployment will become ready after the ClusterIP services are created
2019-06-19 10:50:55 +03:00
stefanprodan
88c450e3bd Implement port discovery
If port discovery is enabled, Flagger scans the deployment pod template and extracts the container ports, excluding the port specified in the canary service spec and the Istio proxy ports. All the extra ports will be used when generating the ClusterIP services.
2019-06-15 16:34:32 +03:00
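A rough Go sketch of the port scan described in this commit follows. It uses simplified stand-in types rather than the Kubernetes API structs, and it assumes that proxy ports can be recognised by the sidecar container name. Both are assumptions made for illustration, not details confirmed by the commit.

```go
package main

import "fmt"

// container is a simplified stand-in for a pod template container.
type container struct {
	Name  string
	Ports []int32
}

// discoverPorts returns every declared container port except servicePort and
// any port belonging to a container listed in proxyNames (assumed sidecars).
// Duplicate ports are reported once, in first-seen order.
func discoverPorts(containers []container, servicePort int32, proxyNames map[string]bool) []int32 {
	var extra []int32
	seen := map[int32]bool{servicePort: true}
	for _, c := range containers {
		if proxyNames[c.Name] {
			continue
		}
		for _, p := range c.Ports {
			if !seen[p] {
				seen[p] = true
				extra = append(extra, p)
			}
		}
	}
	return extra
}

func main() {
	pod := []container{
		{Name: "app", Ports: []int32{9898, 9999}},
		{Name: "istio-proxy", Ports: []int32{15090}},
	}
	extra := discoverPorts(pod, 9898, map[string]bool{"istio-proxy": true})
	fmt.Println("extra ports:", extra)
}
```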
Olga Mirensky
9618d2ea0d Fix promoting canary when max weight is not a multiple of step 2019-05-23 10:18:19 +10:00
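The arithmetic behind this fix can be shown with a small Go sketch. It is an illustration under assumptions, not the patched code: `advance` is a hypothetical helper that promotes once the weight is greater than or equal to the maximum. With a step of 15 and a maximum of 50, an equality check would never fire, because the weight moves from 45 past 50.

```go
package main

import "fmt"

// advance returns the next canary weight and whether the canary should be
// promoted. Promotion triggers once weight >= max, so a max that is not a
// multiple of step is still reached. The weight is capped at 100.
// step is assumed to be positive; otherwise the weight never advances.
func advance(weight, step, max int) (int, bool) {
	if weight >= max {
		return weight, true
	}
	next := weight + step
	if next > 100 {
		next = 100
	}
	return next, false
}

func main() {
	weight, promote := 0, false
	for !promote {
		fmt.Println("canary weight:", weight)
		weight, promote = advance(weight, 15, 50)
	}
	fmt.Println("promote at weight:", weight)
}
```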
stefanprodan
5b3fd0efca Set Istio request duration to milliseconds 2019-05-15 20:01:27 +03:00
stefanprodan
0032c14a78 Refactor metrics
- add observer interface with builtin metrics functions
- add metrics observer factory
- add prometheus client
- implement the observer interface for istio, envoy and nginx
- remove deprecated istio and app mesh metric aliases (istio_requests_total, istio_request_duration_seconds_bucket, envoy_cluster_upstream_rq, envoy_cluster_upstream_rq_time_bucket)
2019-05-13 17:34:08 +03:00
stefanprodan
f7db0210ea Add nginx ingress controller checks 2019-05-06 18:43:02 +03:00
Yuval Kohavi
156488c8d5 Merge remote-tracking branch 'origin/master' into supergloo-updated 2019-04-17 18:24:41 -04:00
Yuval Kohavi
868482c240 basics seem working! 2019-04-16 15:10:08 -04:00
stefanprodan
6ef72e2550 Make the pod selector configurable
- default labels: app, name and app.kubernetes.io/name
2019-04-15 12:57:25 +03:00
stefanprodan
60f51ad7d5 Move deployer and config tracker to canary package 2019-04-15 11:27:08 +03:00
stefanprodan
edcff9cd15 Execute pre/post rollout webhooks
- halt the canary advancement if pre-rollout hooks are failing
- include the canary status (Succeeded/Failed) in the post-rollout webhook payload
- ignore post-rollout webhook failures
- log pre/post rollout webhook response result
2019-04-13 15:43:23 +03:00
stefanprodan
352ed898d4 Add request success rate and duration metrics alias 2019-04-12 17:00:04 +03:00
stefanprodan
f211e0fe31 Use go templates to render the builtin promql queries 2019-03-31 13:55:14 +03:00
stefanprodan
b2c12c1131 Move observer to metrics package 2019-03-30 11:45:39 +02:00
stefanprodan
48d9a0dede Ensure the status metric is set after a restart 2019-03-28 11:52:13 +02:00
stefanprodan
ca074ef13f Rename router sync to reconcile 2019-03-26 17:12:46 +02:00
stefanprodan
d07925d79d Fix canary status prom metrics 2019-03-25 17:26:22 +02:00
stefanprodan
941be15762 Fix typo in comments 2019-03-23 11:25:31 +02:00
stefanprodan
b4ae060122 Move to weaveworks org 2019-03-20 18:26:04 +02:00
stefanprodan
4b6126dd1a Add Envoy HTTP success rate metric check 2019-03-19 15:52:26 +02:00
stefanprodan
7d340c5e61 Change mesh providers based on cmd flag 2019-03-17 10:52:52 +02:00
Stefan Prodan
1cd0c49872 Merge pull request #88 from stefanprodan/ab-testing
A/B testing - canary with session affinity
2019-03-11 13:55:06 +02:00
stefanprodan
86ea172380 Fix weight metric report 2019-03-08 23:28:45 +02:00
Huy Le
6196f69f4d Create New Job when Canary's Interval changes
- Currently, whenever the Canary analysis interval changes, Flagger does not reflect this in the canary's job.
- This change makes sure the canary analysis interval is updated whenever the Canary object's interval changes.
2019-03-08 10:27:34 -08:00
stefanprodan
d8b847a973 Mention session affinity in docs 2019-03-08 15:05:44 +02:00
stefanprodan
bf1ca293dc Implement fixed routing for canary analysis
Allow A/B testing scenarios where instead of weighted routing the traffic is split between the primary and canary based on HTTP headers or cookies.
2019-03-08 11:54:41 +02:00
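The header and cookie based split described in this commit can be sketched in Go as a rule match with no weights involved. The `match` type, `backendFor`, `abRules` and the two example rules are hypothetical; in a service mesh the matching would be done by the proxy, not by application code.

```go
package main

import (
	"fmt"
	"net/http"
	"regexp"
)

// match is a hypothetical rule: the named header must match the pattern.
type match struct {
	Header  string
	Pattern *regexp.Regexp
}

// backendFor returns "canary" when any rule matches the request headers and
// "primary" otherwise.
func backendFor(h http.Header, rules []match) string {
	for _, r := range rules {
		if r.Pattern.MatchString(h.Get(r.Header)) {
			return "canary"
		}
	}
	return "primary"
}

// abRules returns illustrative rules: Firefox user agents, or requests that
// carry the cookie type=insider, are sent to the canary.
func abRules() []match {
	return []match{
		{Header: "User-Agent", Pattern: regexp.MustCompile(`Firefox`)},
		{Header: "Cookie", Pattern: regexp.MustCompile(`(^|;\s*)type=insider(;|$)`)},
	}
}

// headers builds an http.Header from alternating key and value arguments.
func headers(kv ...string) http.Header {
	h := http.Header{}
	for i := 0; i+1 < len(kv); i += 2 {
		h.Set(kv[i], kv[i+1])
	}
	return h
}

func main() {
	rules := abRules()
	fmt.Println(backendFor(headers("Cookie", "lang=en; type=insider"), rules))
	fmt.Println(backendFor(headers("User-Agent", "curl/7.64.0"), rules))
}
```

Because the decision depends only on the request, the same user keeps reaching the same backend for the whole analysis, which is what gives A/B testing its session affinity.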
stefanprodan
9680ca98f2 Rename service router to Kubernetes router 2019-03-05 02:12:52 +02:00
stefanprodan
42b850ca52 Replace controller routing management with router pkg 2019-03-05 02:04:55 +02:00
stefanprodan
5d81876d07 Make the metric interval optional
- set default value to 1m
2019-02-27 16:03:56 +02:00
stefanprodan
4d61a896c3 Add custom promql queries support 2019-02-27 15:48:31 +02:00
stefanprodan
29cdd43288 Implement skip analysis
When skip analysis is enabled, Flagger checks if the canary deployment is healthy and promotes it without analysing it. If an analysis is underway, Flagger cancels it and runs the promotion.
2019-02-13 15:30:29 +02:00
stefanprodan
5b296e01b3 Detect changes in configs and trigger canary analysis
- restart analysis if a ConfigMap or Secret changes during rollout
- add tests for tracked changes
2019-01-26 12:36:27 +02:00
stefanprodan
bd6d446cb8 Go format scheduler 2019-01-20 14:04:10 +02:00