Commit Graph

193 Commits

Author SHA1 Message Date
Michael Parker
1d5029d607 Merge branch 'event-webhook' of github.com:mrparkers/flagger into event-webhook 2020-01-07 09:39:13 -06:00
Michael Parker
e6d1880c93 use correct event type 2020-01-07 09:38:14 -06:00
Michael Parker
6da533090a Update controller.go 2020-01-06 19:12:39 -06:00
Michael Parker
38dfda9d8f add event-webhook command line flag 2020-01-06 16:35:42 -06:00
Stefan Prodan
968d67a7c3 Merge pull request #386 from mumoshu/envoy-canary-analysis
feat: Support for canary analysis on deployments and services behind Envoy
2019-12-18 19:22:18 +02:00
Yusuke Kuoka
357ef86c8b Differentiate AppMesh observer vs Crossover observer
To not break AppMesh integration.
2019-12-18 22:03:30 +09:00
Yusuke Kuoka
d75ade5e8c Fix envoy dashboard, scheduler, and envoy metrics provider to correctly pass canary analysis and show graphs 2019-12-18 10:55:49 +09:00
Yusuke Kuoka
6661406b75 Metrics provider for deployments and services behind Envoy
Assumes `envoy:smi` as the mesh provider name as I've successfully tested the progressive delivery for Envoy + Crossover with it.

This enhances Flagger to translate it to the metrics provider name of `envoy` for deployment targets, or `envoy:service` for service targets.

The `envoy` metrics provider is equivalent to `appmesh`, as both relies on the same set of standard metrics exposed by Envoy itself.

The `envoy:service` is almost the same as the `envoy` provider, but removing the condition on pod name, as we only need to filter on the backing service name = envoy_cluster_name. We don't consider other Envoy xDS implementations that uses anything that is different to original servicen ames as `envoy_cluster_name`, for now.

Ref #385
2019-11-30 13:03:01 +09:00
stefanprodan
8766523279 Add initialization phase to Kubernetes router
Create Kubernetes services before deployments because Envoy's readiness depends on existing ClusterIPs
2019-11-27 22:15:04 +02:00
Yusuke Kuoka
1ba595bc6f feat: Canary-release anything behind K8s service
Resolves #371

---

This adds the support for `corev1.Service` as the `targetRef.kind`, so that we can use Flagger just for canary analysis and traffic-shifting on existing and pre-created services. Flagger doesn't touch deployments and HPAs in this mode.

This is useful for keeping your full-control on the resources backing the service to be canary-released, including pods(behind a ClusterIP service) and external services(behind an ExternalName service).

Major use-case in my mind are:

- Canary-release a K8s cluster. You create two clusters and a master cluster. In the master cluster, you create two `ExternalName` services pointing to (the hostname of the loadbalancer of the targeted app instance in) each cluster. Flagger runs on the master cluster and helps safely rolling-out a new K8s cluster by doing a canary release on the `ExternalName` service.
- You want annotations and labels added to the service for integrating with things like external lbs(without extending Flagger to support customizing any aspect of the K8s service it manages

**Design**:

A canary release on a K8s service is almost the same as one on a K8s deployment. The only fundamental difference is that it operates only on a set of K8s services.

For example, one may start by creating two Helm releases for `podinfo-blue` and `podinfo-green`, and a K8s service `podinfo`. The `podinfo` service should initially have the same `Spec` as that of  `podinfo-blue`.

On a new release, you update `podinfo-green`, then trigger Flagger by updating the K8s service `podinfo` so that it points to pods or `externalName` as declared in `podinfo-green`. Flagger does the rest. The end result is the traffic to `podinfo` is gradually and safely shifted from `podinfo-blue` to `podinfo-green`.

**How it works**:

Under the hood, Flagger maintains two K8s services, `podinfo-primary` and `podinfo-canary`. Compared to canaries on K8s deployments, it doesn't create the service named `podinfo`, as it is already provided by YOU.

Once Flagger detects the change in the `podinfo` service, it updates the `podinfo-canary` service and the routes, then analyzes the canary. On successful analysis, it promotes the canary service to the `podinfo-primary` service. You expose the `podinfo` service via any L7 ingress solution or a service mesh so that the traffic is managed by Flagger for safe deployments.

**Giving it a try**:

To give it a try, create a `Canary` as usual, but its `targetRef` pointed to a K8s service:

```
apiVersion: flagger.app/v1alpha3
kind: Canary
metadata:
  name: podinfo
spec:
  provider: kubernetes
  targetRef:
    apiVersion: core/v1
    kind: Service
    name: podinfo
  service:
    port: 9898
  canaryAnalysis:
    # schedule interval (default 60s)
    interval: 10s
    # max number of failed checks before rollback
    threshold: 2
    # number of checks to run before rollback
    iterations: 2
    # Prometheus checks based on
    # http_request_duration_seconds histogram
    metrics: []
```

Create a K8s service named `podinfo`, and update it. Now watch for the services `podinfo`, `podinfo-primary`, `podinfo-canary`.

Flagger tracks `podinfo` service for changes. Upon any change, it reconciles `podinfo-primary` and `podinfo-canary` services. `podinfo-canary` always replicate the latest `podinfo`. In contract, `podinfo-primary` replicates the latest successful `podinfo-canary`.

**Notes**:

- For the canary cluster use-case, we would need to write a K8s operator to, e.g. for App Mesh, sync `ExternalName` services to AppMesh `VirtualNode`s. But that's another story!
2019-11-27 09:07:29 +09:00
stefanprodan
9af6ade54d Skip primary check on skip analysis 2019-11-25 23:48:22 +02:00
stefanprodan
4454c9b5b5 Add canary factory for Kubernetes targets
- extract Kubernetes operations to controller interface
- implement controller interface for kind Deployment
2019-11-25 18:45:19 +02:00
stefanprodan
0e9fe8a446 Remove the traffic mention from the custom metrics error log
Fix: #361
2019-11-07 09:36:38 +02:00
stefanprodan
c9bacdfe05 Update Istio to v1.3.3 2019-10-28 15:19:17 +02:00
stefanprodan
d6c5bdd241 Implement metrics server override 2019-10-17 11:37:54 +03:00
stefanprodan
673b6102a7 Add the name label to ClusterIP services and primary deployment 2019-10-09 13:01:15 +03:00
stefanprodan
45df96ff3c Format imports 2019-10-06 12:54:01 +03:00
stefanprodan
2d9098e43c Add target port number and name tests 2019-10-06 10:31:50 +03:00
stefanprodan
625eed0840 Enforce blue/green when using kubernetes networking
Use blue/green with ten iterations and warn that progressive traffic shifting and HTTP headers routing are not compatible with Kubernetes L4 networking.
2019-10-05 17:59:34 +03:00
stefanprodan
37f9151de3 Add traffic mirroring documentation 2019-10-05 16:23:43 +03:00
Stefan Prodan
9a9baadf0e Merge pull request #311 from andrewjjenkins/mirror
Add traffic mirroring for Istio service mesh
2019-10-05 10:34:25 +03:00
Andrew Jenkins
61f8aea7d8 add Traffic Mirroring to Blue/Green deployments
Traffic mirroring for blue/green will mirror traffic for the entire
canary analysis phase of the blue/green deployment.
2019-10-03 14:33:49 -06:00
Andrew Jenkins
e384b03d49 Add Traffic Mirroring for Istio Service Mesh
Traffic mirroring is a pre-stage for canary deployments.  When mirroring
is enabled, at the beginning of a canary deployment traffic is mirrored
to the canary instead of shifted for one canary period.  The service
mesh should mirror by copying the request and sending one copy to the
primary and one copy to the canary; only the response from the primary
is sent to the user.  The response from the canary is only used for
collecting metrics.

Once the mirror period is over, the canary proceeds as usual, shifting
traffic from primary to canary until complete.

Added TestScheduler_Mirroring unit test.
2019-10-03 14:33:49 -06:00
nilscan
0734773993 Skip primary check for appmesh 2019-10-02 14:29:48 +13:00
Andrew Jenkins
655df36913 Extend test SetupMocks() to take arbitrary Canary resources
SetupMocks() currently takes a bool switch that tells it to configure
against either a shifting canary or an A-B canary.  I'll need a third
canary that has mirroring turned on so I changed this to an interface
that just takes the canary to configure (and configs the default
shifting canary if you pass nil).
2019-09-24 16:15:45 -06:00
Andrew Jenkins
2e079ba7a1 Add mirror to router interface and implement for istio router
The mirror option will be used to tell routers to configure traffic
mirroring.  Implement mirror for GetRoutes and SetRoutes for Istio.  For
other routers, GetRoutes always returns mirror == false, and SetRoutes
ignores mirror.

After this change there is no behavior change because no code sets
mirror true (yet).

Enhanced TestIstioRouter_SetRoutes and TestIstioRouter_GetRoutes.
2019-09-24 16:15:45 -06:00
stefanprodan
2ff86fa56e Fix canary weight max value 2019-09-24 10:16:22 +03:00
stefanprodan
fe96af64e9 Add canary phases tests 2019-09-23 22:24:40 +03:00
stefanprodan
77d8e4e4d3 Use the promotion phase in A/B testing and Blue/Green 2019-09-23 22:14:44 +03:00
stefanprodan
800b0475ee Run the canary promotion on a separate stage
After the analysis finishes, Flagger will do the promotion and wait for the primary rollout to finish before routing all the traffic back to it. This ensures a smooth transition to the new version avoiding dropping in-flight requests.
2019-09-23 21:57:24 +03:00
stefanprodan
8282f86d9c Implement confirm-promotion hook
The confirm promotion hooks are executed right before the promotion step. The canary promotion is paused until the hooks return HTTP 200. While the promotion is paused, Flagger will continue to run the metrics checks and load tests.
2019-09-22 13:23:19 +03:00
stefanprodan
a6d86f2e81 Skip mesh routers for B/G when provider is kubernetes 2019-09-22 00:48:42 +03:00
stefanprodan
9d856a4f96 Implement B/G for service mesh providers
Blue/Green steps:
- scale up green
- run conformance tests on green
- run load tests and metric checks on green
- route traffic to green
- promote green spec over blue
- wait for blue rollout
- route traffic to blue
2019-09-21 21:21:33 +03:00
Anton Kislitcyn
f56b6dd6a7 Add annotations prefix for ingresses 2019-09-06 11:36:06 +02:00
stefanprodan
c31e9e5a96 Use Linkerd metrics for ingress and kubernetes routers 2019-07-30 13:00:28 +03:00
stefanprodan
c463b6b231 Add finalising state tests 2019-07-29 14:02:16 +03:00
stefanprodan
b2ca0c4c16 Implement finalising state
Set the canary status to finalising after routing the traffic back to the primary. Run one final loop before scaling the canary to zero so that the canary has a chance to process all inflight requests.
2019-07-29 13:52:11 +03:00
stefanprodan
163f5292b0 Push a notification when a canary is waiting for approval 2019-07-25 19:13:22 +03:00
stefanprodan
28e7e89047 Pause or resume analysis on confirmation gate toggle 2019-07-24 16:09:13 +03:00
stefanprodan
04cbacb6e0 Implement confirm rollout gate and hook
The confirm-rollout hooks are executed before the pre-rollout hooks. Flagger will halt the canary rollout until the confirm webhook returns HTTP status 200.
2019-07-24 12:09:39 +03:00
stefanprodan
9d89e0c83f Log status update error 2019-07-10 09:55:20 +03:00
stefanprodan
108bf9ca65 Add initializing canary phase/status condition reason
Fix HPA reconciliation min replicas diff
2019-07-09 17:10:43 +03:00
stefanprodan
438f952128 Implement status conditions
Add Promoted status condition with the following reasons: Initialized, Progressing, Succeeded, Failed
Usage: `kubectl wait canary/app --for=condition=promoted`
Fix: #184
2019-07-09 15:22:56 +03:00
stefanprodan
10c61daee4 Exit when losing leadership 2019-07-07 13:09:05 +03:00
stefanprodan
61d0216c21 Add traffic routing to notifications 2019-07-06 17:13:09 +03:00
stefanprodan
ba4a2406ba Refactor notifier to allow more implementations 2019-07-06 15:47:12 +03:00
stefanprodan
ad8d02f701 Use Linkerd metrics when NGINX is the mesh ingress
Set the metrics provider to Linkerd Prometheus when using NGINX as Linkerd Ingress. This mitigates the lack of canary metrics in the NGINX controller exporter.
2019-06-30 13:03:27 +03:00
stefanprodan
63cb8a5ba5 Lookup the canary provider field during reconciliation
Override the global provider if one is specified in the canary spec
2019-06-20 14:52:43 +03:00
stefanprodan
bf7ebc9708 Skip readiness check on init for Istio SMI 2019-06-19 11:16:11 +03:00
stefanprodan
98beb1011e Skip primary check on init when using Istio
The deployment will become ready after the ClusterIP are created
2019-06-19 10:50:55 +03:00