Remove initial content drafts.

Marko Anastasov
2019-11-01 13:35:33 +01:00
parent 85cf4e0b5d
commit 8d4aba55d4
12 changed files with 0 additions and 1862 deletions

ch01.md

@@ -1,434 +0,0 @@
# Using Docker for Development and Continuous Delivery: Why and How
Five years ago, in 2013, Solomon Hykes showed a demo of the first
version of Docker during the PyCon conference in Santa Clara.
Since then, containers have spread to seemingly every corner of
the software industry. While Docker (the project and the company)
made containers so popular, they were not the first project to
leverage containers out there; and they are definitely not the last
either.
Five years later, we can hopefully see beyond the hype, as some
powerful, efficient patterns have emerged for leveraging containers to
develop and ship better software, faster.
I spent seven years working for Docker, and I was running containers
in production back when Docker was still dotCloud. I had the privilege
to help many organizations and teams to get started with containers,
and I would like to share a few things here.
First, the kind of benefits that you can expect from implementing containers.
Then, a realistic roadmap that any organization can follow
to attain these benefits.
## What we can expect
Containers *will not* instantly turn our monolithic, legacy applications
into distributed, scalable microservices.
Containers *will not* transform overnight all our software engineers into
"DevOps engineers". (Notably, because DevOps is not defined by our tools
or skills, but rather by a set of practices and cultural changes.)
So what can containers do for us?
### Set up development environments in minutes
One of my favorite demos with Docker (and its companion tool Compose)
is to show how to run a complex app locally, on any machine, in less
than five minutes.
It boils down to:
```bash
git clone https://github.com/jpetazzo/dockercoins
cd dockercoins
docker-compose up
```
You can run these three lines on any machine where Docker is installed
(Linux, macOS, Windows), and in a few minutes, you will get the
DockerCoins demo app up and running. I wrote DockerCoins in 2015; it
has multiple components written in Python, Ruby, and Node.js, as well
as a Redis store. Years later, without changing anything in the code,
we can still bring it up with the same three commands.
This means that onboarding a new team member, or switching from one project
to another, can now be quick and reliable. It doesn't matter if
DockerCoins is using Python 2.7 and Node.js 8 while your other apps
are using Python 3 and Node.js 10, or if your system is using
different versions of these languages altogether; each container is perfectly isolated
from the others and from the host system.
We will see how to get there.
### Deploy easily in the cloud or on premises
After we build container images, we can run them consistently on any
server environment. Automating server installation would usually
require steps (and domain knowledge) specific to our infrastructure.
For instance, if we are using AWS EC2, we may use AMI (Amazon Machine
Images), but these images are different (and built differently) from
the ones used on Azure, Google Cloud, or a private OpenStack cluster.
Configuration management systems (like Ansible, Chef, Puppet, or Salt)
help us by describing our servers and their configuration as manifests
that live in version-controlled source repositories. This helps, but
writing these manifests is no easy task, and they don't guarantee
reproducible execution. These manifests have to be adapted when
switching distros, distro versions, and sometimes even when switching
from one cloud provider to another, because they use different network
interface or disk naming schemes, for instance.
Once we have installed the Docker Engine (the most
popular option), it can run any container image and effectively
abstract these environment discrepancies.
The ability to stage up new environments easily and reliably
gives us exactly what we need to set up continuous integration
and continuous deployment. We will see how to get there.
Ultimately, it means that these advanced techniques (such as
blue/green deployments or immutable infrastructure)
become accessible to us, instead of being the privilege
of larger organizations able to spend
a lot of time to build their perfect custom tooling.
I'm (enviously) looking at you, Netflix!
## How we get there
I'm now going to share with you a roadmap that works for
organizations and teams of all sizes, regardless of their
existing knowledge of containers. Even better, this roadmap
will give you tangible benefits at each step, so that the gains
realized give you more confidence in the whole process.
Sounds too good to be true?
Here is the quick overview, before I dive into the details:
1. Write one Dockerfile.
(We will pick the one service where this will have the most impact.)
2. Write more Dockerfiles.
(The goal is to get a whole application in containers.)
3. Write a Compose file.
(Now anyone can get this app running on their machine in minutes.)
4. Make sure that all developers are on board.
(Do they all have a Docker setup in good condition?)
5. Use this to facilitate QA and end-to-end testing.
6. Automate this process: congratulations, you are now doing
*continuous deployment to staging.*
7. The last logical step is *continuous deployment to production.*
Each step is a self-contained iteration. Some steps are easy,
others are more work; but each of them will improve your workflow.
### Writing our first Dockerfile
A good candidate for our first Dockerfile is a service that is a
pain in the neck to build, and moves quickly. For instance, that new
Rails app that we're building, and where we're adding or updating
dependencies every few days as we're adding features. Pure Ruby
dependencies are fine, but as soon as we rely on a system library,
we will hit the infamous "works on my machine (not on yours)"
problem, between the developers who are on macOS, and those
who are on Linux, for instance. Docker will help with that.
Another good candidate is an application that we are refactoring
or updating, and where we want to make sure that we are using
the latest version of the language or framework; without breaking
the environment for everything else.
If we have a component that is tricky enough to require a tool
like Vagrant to run on our developers' machines, it's also a good
hint that Docker can help there. While Vagrant is an amazing product,
there are many scenarios where maintaining a Dockerfile is easier
than maintaining a Vagrantfile; and running Docker is also easier
and lighter than running Vagrant boxes.
There are various ways to write our first Dockerfile, and none
of them is inherently right or wrong. Some people prefer to
follow the existing environment as closely as possible. We're
currently using PHP 7.2 with Apache 2.4, and have some very specific
Apache configuration and `.htaccess` files? Sure, we can put that
in containers. But if we prefer to start anew from our `.php`
files, serve them with PHP FPM, and host the static assets from
a separate NGINX container (an incredibly powerful and scalable
combination!), that's fine too. Either way, the [official PHP images](
https://hub.docker.com/r/_/php/) have us covered.
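As an illustration, here is what a first Dockerfile built on the official
PHP image could look like. This is a minimal sketch; the extension and the
document root are placeholders, not taken from a real project.
```Dockerfile
# Minimal sketch of a first Dockerfile, based on the official PHP+Apache image
FROM php:7.2-apache
# Install a system-level extension the app might need (placeholder)
RUN docker-php-ext-install pdo_mysql
# Copy the application code into Apache's document root
COPY . /var/www/html/
```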
During this phase, we'll want to make sure that the team working
on that service has Docker installed on their machine, but only
a few people will have to meddle with Docker at this point. They
will be leveling the field for everyone else.
Once we have a working Dockerfile for that app, we can start
using this container image as the official development environment
for this specific service or component. If we picked a fast-moving
one, we will see the benefits very quickly, since library and
other dependency upgrades will now be completely seamless.
Rebuilding the entire environment with a different language
version now becomes effortless; and if we realize after a difficult
upgrade that the new version doesn't work as well, rolling back is
just as easy and instantaneous (because Docker keeps a cache of
previous image builds around).
### Writing more Dockerfiles
The next step is to get an entire application in containers.
Don't get me wrong: we are not talking about production (yet),
and even if your first experiments go so well that you want to roll
out some containers to production, you can do so selectively,
only for some components. In particular, it is advised to keep
databases and other stateful services outside of containers until
you gain more operational experience.
But in development, we want everything in containers, including
the precious databases (because the ones sitting on our developers'
machines don't, or shouldn't, contain any precious data anyway).
We will probably have to write a few more Dockerfiles, but for
standard services like Redis, MySQL, PostgreSQL, MongoDB, and many more,
we will be able to use standard images from the Docker Hub.
These images often come with special provisions to make them
easy to extend and customize; for instance the [official PostgreSQL image](
https://hub.docker.com/r/_/postgres/) will automatically run
`.sql` files placed in the suitable directory (to pre-load
our database with table structure or sample data).
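For example, here is a hedged sketch of how a Compose service could take
advantage of that; the file name and password are illustrative.
```yaml
version: "3"
services:
  db:
    image: postgres:10
    environment:
      POSTGRES_PASSWORD: example    # placeholder credential, for development only
    volumes:
      # Any .sql file in this directory is executed on the first startup
      - ./db/schema.sql:/docker-entrypoint-initdb.d/schema.sql
```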
Once we have Dockerfiles (or images) for all the components
of a given application, we're ready for the next step.
### Writing a Compose file
A Dockerfile makes it easy to build and run a single container;
a Compose file makes it easy to build and run a stack of multiple containers.
So once each component runs correctly in a container, we can
describe the whole application with a Compose file.
This gives us the very simple workflow that we mentioned earlier:
```bash
git clone https://github.com/jpetazzo/dockercoins
cd dockercoins
docker-compose up
```
Compose will analyze the file `docker-compose.yml`, pull the
required images, and build the ones that need to be built. Then it will
create a private bridge network for the application, and start
all the containers in that network. Why use a private network
for the application? Isn't that a bit overkill?
Since Compose will create a new network for each app that it starts,
this lets us run multiple apps next to each other (or multiple
versions of the same app) without any risk of interference.
This pairs with Docker's service discovery mechanism, which relies
on DNS. When an application needs to connect to, say, a Redis server,
it doesn't need to specify the IP address of the Redis server,
or its FQDN. Instead, it can just use `redis` as the server host
name. For instance, in PHP:
```php
$redis = new Redis();
$redis->connect('redis', 6379);
```
Docker will make sure that the name `redis` resolves to the IP
address of the Redis container *in the current network*.
So multiple applications can each have a `redis` service, and
the name `redis` will resolve to the right one in each network.
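Putting it all together, a minimal Compose file for such an application
could look like the following sketch; the service names, images, and ports
are illustrative.
```yaml
version: "3"
services:
  web:
    build: .              # built from the Dockerfile in the repository
    ports:
      - "8080:80"         # publish the web frontend on the host
  redis:
    image: redis:4        # reachable as "redis" on the app's private network
```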
### Profit!
Once we have that Compose file, it's a good time to make sure
that everyone is on board; i.e. that all our developers have
a working installation of Docker. Windows and Mac users will
find this particularly easy thanks to [Docker Desktop](
https://www.docker.com/products/docker-desktop).
Our team will need to know a few Docker and Compose
commands; but in many scenarios, they will be fine if they only
know `docker-compose up --build`. This command will make sure that
all images are up-to-date, and run the whole application, showing
its log in the terminal. If we want to stop the app, all we have
to do is hit `Ctrl-C`.
At this point, we are already benefiting immensely from
Docker and containers: everyone gets a consistent development
environment, up and running in minutes, independently of the host
system.
For simple applications that don't need to span multiple servers,
this would almost be good enough for production; but we don't have
to go there yet, as there are other areas where we can take advantage of
Docker without the high stakes associated with production.
### End-to-end testing and QA
When we want to automate a task, it's a good idea to start
by having it done by a human, and write down the necessary steps.
In other words: do things manually first, but document them.
Then, these instructions can be given to another person,
who will execute them. That person will probably ask us some
clarifying questions, which will allow us to refine our manual
instructions.
Once these manual instructions are perfectly accurate, we can
turn them into a program (a simple script will often suffice)
that we can then execute automatically.
My suggestion is to follow these principles to deploy test
environments, and execute CI (Continuous Integration) or
end-to-end testing (depending on the kind of tests that you
use in your organization). Even if you don't have automated
testing, I guess that you have *some* kind of testing
happening before you ship a feature (even if it's just
someone messing around with the app in staging before your
users see it).
In practice, this means that we will document (and then
automate) the deployment of our application, so that anyone
can get it up and running by running a script.
The example that we gave above involved 3 lines, but in
a real application, we might have other steps. On the first
run, we probably want to populate the database with initial
objects; on subsequent runs, we might have to run *database
migrations* (when a release changes the database schema).
Our final deployment scripts will certainly have more than
3 lines, but they will also be way simpler (to write and to run)
than full-blown configuration management manifests, VM images, and so on.
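To give an idea, such a deployment script could look like this minimal
sketch; the migration command is a placeholder for whatever your
application actually provides.
```bash
#!/bin/sh
# Illustrative deployment script: build, start, then initialize the database.
set -e
docker-compose pull          # fetch the latest base and service images
docker-compose build         # rebuild our own images
docker-compose up -d         # start (or update) the whole stack
# Placeholder: run database migrations inside the app container
docker-compose run --rm web rake db:migrate
```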
If we have a QA team, they are now empowered to test new
releases without relying on someone else to deploy the code
for them!
(Don't get me wrong: I'm fully aware that many QA teams are
perfectly capable of deploying code themselves; but as projects
grow in complexity, we tend to get more specialized in our
respective roles, and it's not realistic to expect our whole
QA team to have both solid testing skills and "5 years of
experience with Capistrano, Puppet, Terraform".)
If you're doing any kind of unit testing or end-to-end
testing, you can now automate these tasks as well, by following the
same principle as we did to automate the deployment process.
We now have a whole sequence of actions: building
images, starting containers, executing initialization or
migration hooks, running tests ... From now on, we will call
this the *pipeline*, because all these actions have
to happen in a specific order, and if one of them fails,
we don't execute the subsequent stages.
### Continuous Deployment to staging
The next step is to hook our pipeline to our source repository,
to run it automatically on our code when we push changes to
the repository.
If we're using a system like GitHub or GitLab, we can set it up
to notify us (through a webhook) each time someone opens (or
updates) a pull request. We could also monitor a specific branch,
or a specific set of branches.
Each time there are relevant changes, our pipeline will automatically:
- build new images,
- run unit tests on these images (if applicable),
- deploy them in a temporary environment,
- run end-to-end tests on the application,
- make the application available for human testing.
If we had to build this from scratch, this would certainly
be a lot of work; but with the roadmap that I described,
we can get there one step at a time, while enjoying concrete
benefits at each step.
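As a rough sketch, the whole pipeline can start its life as a single script
like the one below, before being wired into a proper CI system; the test
commands and script names are placeholders.
```bash
#!/bin/sh
# Illustrative pipeline: each stage aborts the run if it fails.
set -e
docker-compose build                      # build new images
docker-compose run --rm web npm test      # unit tests (placeholder command)
docker-compose up -d                      # deploy to a temporary environment
./run-e2e-tests.sh                        # end-to-end tests (placeholder script)
echo "Staging environment is up; ready for human testing"
```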
Note that we still don't require container orchestration
for all of this to work. If our application (in a staging
environment) can fit on a single machine, we don't need to
worry about setting up a cluster (yet). In fact, thanks to
Docker's layer system, running side-by-side images that share
a common ancestry (which *will* be the case for images
corresponding to successive versions of the same component)
is very disk- and memory-efficient; so there is a good
chance that we will be able to run many copies of our
app on a single Docker Engine.
But this is also the right time to start looking into
orchestration, and platforms like Docker Swarm or Kubernetes.
Again, I'm not suggesting that we roll that out straight to
production; but that we use one of these orchestrators
to deploy the staging versions of our application.
This will give us a low-risk environment where we can
ramp up our skills on container orchestration and scheduling,
while having the same level of complexity (minus the volume
of requests and data) as our production environment.
### Continuous Deployment to production
It might be a while before we go from the previous
stage to the next, because we need to build confidence
and operational experience.
However, at this point, we already have a *continuous
deployment pipeline* that takes every pull request
(or every change in a specific branch or set of branches)
and deploys the code on a staging cluster, in a fully
automated way.
Of course, we need to learn how to collect logs and metrics,
and how to face minor incidents and major outages;
but eventually, we will be ready to extend our pipeline
all the way to the production environment.
### More containers, less risks
Independently of our CI/CD pipeline, we may want to use
containers in production for other reasons. Containers can
help us to reduce the risks associated with a new release.
When we start a new version of our app (by running the
corresponding image), if something goes wrong, rolling back
is very easy. All we have to do is stop the container, and
restart the previous version. The image for the previous
version will still be around and will start immediately.
This is way safer than attempting a code rollback, especially
if the new version involved some dependency upgrades. Are
we sure that we can downgrade to the previous version?
Is it still available on the package repositories?
If we are using containers, we don't have to worry about
that, since our container image is available and ready.
This pattern is sometimes called *immutable infrastructure*,
because instead of changing our services, we deploy new ones.
Initially, immutable infrastructure was implemented with virtual
machines: each new release meant starting a new
fleet of virtual machines. Containers make this pattern even easier
to apply.
As a result, we can deploy with more confidence, because
we know that if something goes wrong, we can easily go back
to the previous version.
## Final words TBD

ch02.md

@@ -1,534 +0,0 @@
# How Kubernetes simplifies zero-downtime deployment
When getting started with Kubernetes, one of the first commands
that we learn and use is generally `kubectl run`. Folks who have
experience with Docker tend to compare it to `docker run`, and
think: "Ah, this is how I can simply run a container!"
As it turns out, when one uses Kubernetes, one doesn't simply
run a container.
Let's look at what happens after running a very basic `kubectl run`
command:
```bash
$ kubectl run web --image=nginx
deployment.apps/web created
```
Alright! Then we check what was created on our cluster, and ...
```bash
$ kubectl get all
NAME                       READY   STATUS    RESTARTS   AGE
pod/web-65899c769f-dhtdx   1/1     Running   0          11s

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   46s

NAME                  DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/web   1         1         1            1           11s

NAME                             DESIRED   CURRENT   READY   AGE
replicaset.apps/web-65899c769f   1         1         1       11s
$
```
Instead of getting a *container*, we got a whole zoo of unknown beasts:
- a *deployment* (called `web` in this example),
- a *replicaset* (`web-65899c769f`),
- a *pod* (`web-65899c769f-dhtdx`).
(Note: we can ignore the *service* named `kubernetes` in the example
above; that one already did exist before our `kubectl run` command.)
"I just wanted a container! Why do I get three different objects?"
We are going to explore the roles of these different objects, and
explain how they are essential to zero-downtime deployments in Kubernetes.
This is the kind of situation where at first glance, we wonder
"what's the point of this?", but once we get the full picture,
we will understand the role and purpose of each component.
(In fact, a lot of people end up thinking that, had they been tasked
with designing the system, they would have come up with something quite similar!)
## Containers and pods
In Kubernetes, the smallest unit of deployment is not a container;
it's a *pod*. A pod is just a group of containers (it can be a group
of *one* container) that run on the same machine and share a few
things.
For instance, the containers within a pod can communicate with each
other over `localhost`. From a network perspective, all the processes
in these containers are local.
But we can never create a standalone container: the closest we can do
is create a *pod*, with a single container in it.
That's what happens here: when we tell Kubernetes, "create me some
NGINX!", we're really saying, "I would like a pod, in which there
should be a single container, using the `nginx` image."
Alright, then, why don't we just have a pod? Why the replica set and
deployment?
## Declarative and imperative
Kubernetes is a *declarative* system (as opposed to an *imperative* system).
This means that we can't give it *orders*.
We can't say, "run this container." All we can do is describe
*what we want to have*, and wait for Kubernetes to take action to *reconcile*
what we have with what we want to have.
In other words, we can say, "I would like a 40-foot-long blue container
with yellow doors," and Kubernetes will find such a container for us.
If it doesn't exist, it will build it; if there is already one but it's green
with red doors, it will paint it for us; if there is already a container
of the right size and color, Kubernetes will do nothing, since *what we have*
already matches *what we want*.
In software container terms, we can say, "I would like a pod named `web`,
in which there should be a single container, that will run the `nginx` image."
If that pod doesn't exist yet, Kubernetes will create it. If that pod
already exists and matches our spec, Kubernetes doesn't need to do anything.
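Expressed as a Kubernetes manifest, that request is a minimal sketch like this:
```yaml
# Minimal pod specification: one pod, one container, running the nginx image
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
  - name: web
    image: nginx
```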
With that in mind, how do we scale our `web` application, so that it runs
in multiple containers or pods?
## Pods and replica sets
If all we have is a pod, and we want more identical pods, all we can do
is get back to Kubernetes, and ask it, "I would like a pod named `web2`,
with the following specification: ..." and re-use the same specification
as before. Then repeat as many times as we want to have pods.
This is rather inconvenient, because it is now our job to keep track of
all these pods, and to make sure that they are all in sync and use the
same specification.
To make things simpler, Kubernetes gives us a higher level construct,
the *replica set*. The specification of a replica set looks very much like
the specification of a pod, except that it carries a number, indicating how
many replicas (pods with that particular specification) we want.
So we tell Kubernetes, "I would like a replica set named `web`, which
should have 3 pods, all matching the following specification: ..." and
Kubernetes will accordingly make sure that there are exactly 3 matching pods.
If we start from scratch, the 3 pods will be created. If we already have 3 pods,
nothing is done, because *what we have* already matches *what we want*.
Replica sets are particularly relevant for scaling and high availability.
For scaling, because we can update an existing replica set to change the
desired number of replicas. As a consequence, Kubernetes will create or
delete pods, so that there ends up being exactly the desired number.
For high availability, because Kubernetes will continuously monitor
what's going on on the cluster, and it will ensure that no matter what happens,
we still have the desired number. If a node goes down, taking one of the `web` pods
with it, Kubernetes creates another pod to replace it. If it turns out that the
node wasn't down, but merely unreachable or unresponsive for a while, when it
comes back, we may have one extra pod. Kubernetes will then terminate a pod to make
sure that we still have exactly the requested number.
What happens, however, if we want to change the definition of the *pod*
within our replica set? For instance, if we want to switch the image that we
are using?
Remember: the mission of the replica set is, "make sure that there are N pods
matching this specification." What happens if we change that definition?
Suddenly, there are zero pods matching the new specification.
As a result, Kubernetes immediately creates N pods matching that new specification.
The old pods just stay around, and will remain until we clean them up manually.
It would be nice if these pods could be removed cleanly; and if the creation
of new pods could happen in a more gradual manner.
## Replica sets and deployments
This is exactly the role of *deployments*. At first glance, the specification
for a deployment looks very much like the one for a replica set: it features
a pod specification, and a number of replicas. (And a few additional parameters
that we will discuss a bit later!)
Deployments, however, don't create (or delete) pods directly.
They delegate that work to one or more replica sets.
When we create a deployment, it creates a replica set, using the exact
pod specification that we gave it.
When we update a deployment and adjust the number of replicas, it
passes that update down to the replica set.
Things get interesting when we need to update the pod specification itself.
For instance, we might want to change the image to use (because we're
releasing a new version), or the application's parameters (through
command-line arguments, environment variables, or configuration files).
When we update the pod specification, the deployment creates
a *new* replica set with the updated pod specification.
That replica set has an initial size of zero.
Then, the size of that replica set is progressively increased,
while decreasing the size of the other replica set. We could imagine that
we have a sound mixing board in front of us, and we are going to fade in
(turn up the volume) on the new replica set, while we fade out (turn down
the volume) on the old one.
During the whole process, requests are sent to pods of both the old and new
replica sets, without any downtime for our users.
That's the big picture, but there are many little details that make
this process even more robust.
## Broken deployments and readiness probes
If we roll out a broken version, it could bring the entire application down
(one pod at a time!), as Kubernetes will steadily replace our old pods
with the new (broken) version, one at a time.
Unless we use *readiness probes*.
A readiness probe is a test that we add to a container specification.
It's a binary test that can only say "IT WORKS" or "IT DOESN'T," and
will get executed at regular intervals. (By default, every 10 seconds.)
Kubernetes uses the result of that test to know if the container (and the
pod that it's a part of) is ready to receive traffic. When we roll out
a new version, Kubernetes will wait for the new pod to mark itself as
"ready" before moving on to the next one.
If a pod never reaches the ready state (because the readiness probe keeps
failing), Kubernetes will never move on to the next. The deployment stops,
and our application keeps running with the old version until we address
the issue.
Note: if there is *no* readiness probe, the container is
considered ready as soon as it starts. So make sure
that you define a readiness probe if you want to leverage that feature!
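Here is a hedged sketch of what a readiness probe looks like in a container
specification; the path, port, and timings are illustrative.
```yaml
# Container specification excerpt with a readiness probe
containers:
- name: web
  image: nginx
  readinessProbe:
    httpGet:
      path: /healthz          # placeholder health-check endpoint
      port: 80
    initialDelaySeconds: 5    # wait a bit before the first probe
    periodSeconds: 10         # then probe every 10 seconds
```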
## Rollbacks
At any point in time, during the rolling update or even later, we
can tell Kubernetes: "Hey, I changed my mind; please go back to the
previous version of that deployment." It will immediately switch
the roles of the "old" and "new" replica sets. From that point, it
will increase the size of the old replica set (up to the nominal
size of the deployment), while decreasing the size of the other one.
Generally speaking, this is not limited to two "old" and "new"
replica sets. Under the hood, there is one replica set that is
considered "up-to-date" and that we can think of as the "target"
replica set. That's the one that we're trying to move to; that's
the one that Kubernetes will progressively scale up. Simultaneously,
there can be any number of other replica sets, corresponding to older versions.
As an example, we might run version 1 of an application over 10
replicas. Then we start rolling out version 2. At some point, we
might have 7 pods running version 1, and 3 pods running version 2.
We might then decide to release version 3 without waiting for
version 2 to be fully deployed (because it fixes an issue that we
hadn't noticed earlier). And while version 3 is being deployed,
we might decide, after all, to go back to version 1. Kubernetes
will merely adjust the sizes of the replica sets (corresponding
to versions 1, 2, and 3 of the application) accordingly.
## MaxSurge and MaxUnavailable
Kubernetes doesn't exactly update our deployment one pod at a time.
Earlier, we said that deployments had "a few extra parameters": these
parameters include *MaxSurge* and *MaxUnavailable*, and they
indicate the pace at which the update should proceed.
We could imagine two strategies when rolling out new versions.
We could be very conservative about our application availability,
and decide to start new pods *before* shutting down old ones.
Only after a new pod is up, running, and ready do we terminate an old one.
This, however, implies that we have some spare capacity available on
our cluster. It might be the case that we can't afford to run any
extra pods, because our cluster is full to the brim, and that we
prefer to shut down an old pod before starting a new one.
*MaxSurge* indicates how many extra pods we are willing to run
during a rolling update, while *MaxUnavailable* indicates how many
pods we can lose during the rolling update. Both parameters
are specific to a deployment (in other words, each deployment can
have different values for them). Both parameters can be expressed
as an absolute number of pods, or as a percentage of the deployment
size; and both parameters can be zero (but not at the same time).
Let's see a few typical values for MaxSurge and MaxUnavailable,
and what they mean.
Setting MaxUnavailable to 0 means, "do not shut down any old pod
before a new one is up and ready to serve traffic."
Setting MaxSurge to 100% means, "immediately start all the new
pods" (implying that we have enough spare capacity on our cluster,
and that we want to go as fast as possible).
In Kubernetes 1.12, the default values for both parameters are 25%,
meaning that when updating a deployment of size 100, 25 new pods
are immediately created, while 25 old pods are shut down. Each time
a new pod comes up (and is marked ready), another old pod can
be shut down. Each time an old pod has completed its shutdown
(and its resources have been freed), another new pod can be created.
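In the deployment specification, these parameters live under the rolling
update strategy; a sketch with the default values looks like this:
```yaml
# Deployment specification excerpt (illustrative)
spec:
  replicas: 100
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%           # at most 25 extra pods during the update
      maxUnavailable: 25%     # at most 25 pods missing at any given time
```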
## Demo time!
It is easy to see these parameters in action. We don't need to
write custom YAML, define readiness probes, or anything like that.
All we have to do is to tell a deployment to use an invalid
image; for instance an image that doesn't exist. The containers
will never be able to come up, and Kubernetes will never mark
them as "ready."
If you have a Kubernetes cluster (a one-node cluster like
minikube or Docker Desktop is fine), you can run the following commands
in different terminals to watch what is going to happen:
- `kubectl get pods -w`
- `kubectl get replicasets -w`
- `kubectl get deployments -w`
- `kubectl get events -w`
Then, create, scale, and update a deployment with the following commands:
```bash
kubectl run web --image=nginx
kubectl scale deployment web --replicas=10
kubectl set image deployment web nginx=that-image-does-not-exist
```
We see that the deployment is stuck, but 80% of the application's capacity
is still available.
If we run `kubectl rollout undo deployment web`, Kubernetes will
go back to the initial version (running the `nginx` image).
## Selectors and labels
It turns out that when we said earlier, "the job of a replica set
is to make sure that there are exactly N pods matching the right
specification", that's not exactly what's going on.
Actually, the replica set doesn't look at
the pods' specifications, but only at their labels. In other words, it
doesn't matter if the pods are running `nginx` or `redis` or whatever;
all that matters is that they have the right labels. (In our examples
above, these labels would look like `run=web` and `pod-template-hash=xxxyyyzzz`.)
A replica set has a *selector*, which is a logical expression
that "selects" (just like a `SELECT` query in SQL) a number of pods.
The replica set makes sure that there is the right number of pods,
creating or deleting pods if necessary; but it doesn't change
existing pods.
Just in case you're wondering: yes, it is absolutely possible to manually
create pods with these labels, but running a different image (or with
different settings), and thereby fool our replica set.
At first, this could sound like a big potential problem. In practice,
though, it is very unlikely that we would accidentally pick
the "right" (or "wrong", depending on the perspective) labels,
because they involve a hash of the pod's specification, which is
essentially random.
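For reference, here is a sketch of the relevant part of a replica set
specification, showing the selector as a simple label query; the labels
are illustrative.
```yaml
# ReplicaSet specification excerpt (illustrative labels)
spec:
  replicas: 3
  selector:
    matchLabels:
      run: web
```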
Selectors are also used by *services*, which act as the load balancers
for Kubernetes traffic, internal and external. We can create a service
for the `web` deployment with the following command:
```bash
kubectl expose deployment web --port=80
```
The service will have its own internal IP address
(denoted by the name `ClusterIP`),
and connections to this IP address on port 80 will be load-balanced
across all the pods of this deployment.
In fact, these connections will be load-balanced across all the pods
matching the service's selector. In that case, that selector will be
`run=web`.
When we edit the deployment and trigger a rolling update, a new
replica set is created. This replica set will create pods, whose
labels will include (among others) `run=web`. As such, these pods
will receive connections automatically.
This means that during a rollout, the deployment doesn't reconfigure
or inform the load balancer that pods are started and stopped.
It happens automatically through the *selector* of the service
associated with the load balancer.
(If you're wondering how probes and healthchecks play into this:
a pod is added as a valid *endpoint* for a service only if all its
containers pass their readiness check. In other words, a pod starts
receiving traffic only once it's actually ready for it.)
## Advanced rollout strategies
Sometimes, we want even more control when we roll out a new version.
Two popular techniques which you might have heard about are
*blue/green deployment* and *canary deployment*.
In blue/green deployment, we want to instantly switch over
all the traffic from the old version to the new, instead of doing it
progressively as explained previously. There could be a few
reasons for us to do that, including:
- we don't want a mix of old and new requests, and we want the
break from one version to the next to be as clean as possible;
- we are updating multiple components (say, web frontend and API
backend) together, and we don't want the new version of the
web frontend to talk to the old version of the API backend or
vice versa;
- if something goes wrong, we want the ability to revert as fast
as possible, without even waiting for the old set of containers
to restart.
We can achieve blue/green deployment by creating multiple
deployments (in the Kubernetes sense), and then switching from
one to another by changing the *selector* of our service.
This is easier than it sounds!
The following commands will create two deployments `blue` and
`green`, respectively using the `nginx` and `httpd` container
images:
```bash
kubectl create deployment blue --image=nginx
kubectl create deployment green --image=httpd
```
Then, we create a service called `web`, which initially won't
send traffic anywhere:
```bash
kubectl create service clusterip web --tcp=80
```
Now, we can update the selector of service `web` by
running `kubectl edit service web`. This will retrieve the
definition of service `web` from the Kubernetes API, and open
it in a text editor. Look for the section that says:
```yaml
  selector:
    app: web
```
... and replace `web` with `blue` or `green`, to your liking.
Save and exit. `kubectl` will push our updated definition back
to the Kubernetes API, and voilà! Service `web` is now sending
traffic to the corresponding deployment.
(You can verify for yourself by retrieving the IP address of
that service with `kubectl get svc web` and connecting to that
IP address with `curl`.)
The modification that we did with a text editor can also be
done entirely from the command line, using (for instance)
`kubectl patch` as follows:
```bash
kubectl patch service web -p '{"spec": {"selector": {"app": "green"}}}'
```
The advantage of blue/green deployment is that the traffic
switch is almost instantaneous, and we can roll back to the
previous version just as fast, by updating the service
definition again.
## Canary deployment
*Canary deployment* alludes to the canaries that were used in
coal mines, to detect dangerous concentrations of toxic gas like
carbon monoxide. The miners would carry a canary in a cage.
Canaries are more sensitive to toxic gas than humans.
If the canary passed out, it meant that the miners had reached
a dangerous area and should head back before *they* passed out, too.
How does that map to software deployment?
Sometimes, we can't afford (or don't want) to affect all our users
with a flawed version, even for a brief period of time. So instead,
we do a partial rollout of the new version. For instance, we deploy
a couple of replicas running the new version; or we send 1% of our
users to that new version.
Then, we compare metrics between the current version and the canary
that we just deployed. If the metrics are similar, we can proceed.
If latency, error rates, or anything else looks wrong, we roll back.
This technique, which would otherwise be fairly involved to set up, ends up
being relatively straightforward thanks to Kubernetes' native
labels and selectors.
It's worth noting that in the previous example, we changed
the service's selector, but it is also possible to change the pods'
labels.
For instance, if a service's selector is set to look for pods
with the label `status=enabled`, we can apply such a label
to a specific pod with:
```bash
kubectl label pod frontend-aabbccdd-xyz status=enabled
```
We can apply labels *en masse* as well, for instance:
```bash
kubectl label pods -l app=blue,version=v1.5 status=enabled
```
And we can remove them just as easily:
```bash
kubectl label pods -l app=blue,version=v1.4 status-
```
## Conclusions
We saw a few techniques that can be used to deploy with more
confidence. Some of these techniques simply reduce the downtime
caused by the deployment itself, meaning that we can deploy
more often, without being afraid of affecting our users.
Some of these techniques give us a safety belt, preventing
a bad version from taking down our service. And some others
give us extra peace of mind, like hitting the "SAVE" button
in a video game before trying a particularly difficult sequence,
knowing that if something goes wrong, we can always go back where
we were.
Kubernetes makes it possible for developers and operations teams
to leverage these techniques, which leads to safer deployments.
If the risk associated with deployments is lower, it means that
we can deploy more often, incrementally, and see more easily
the results of our changes as we implement them; instead of
deploying once a week or month, for instance.
The end result is a higher development velocity, lower time-to-market
for fixes and new features, as well as better availability of our
applications. Which was the whole point of implementing containers
in the first place.


@@ -1,3 +0,0 @@
# Semaphore + Docker features overview
WRITEME

ch04.md

@@ -1,891 +0,0 @@
# Implementing a production-grade CI/CD pipeline with Semaphore2.0 and Kubernetes
While developing software, you tend to follow a familiar set of steps when building features locally and running builds. Linting your code and running tests fall into this category as well. Automating all of this lets you take your mind off repetitive tasks and focus on what's important: writing business logic and creating value.
Doing all of this by hand is fine if you're alone, but in a team it quickly becomes painful. It also raises the issue of having to dedicate team members to making sure all tests and builds succeed. That's stress you don't need. Be lazy instead: everything that can be automated should be.
## What we're building
This article will show you how to build a production-ready CI/CD pipeline, by using a simple multi-container application that works with [Docker Compose](https://docs.docker.com/compose/). I'll start slow and first explain the application, how it works, how to run the tests and build the Docker images. Once that's covered I'll move on to the Kubernetes setup, where I'll show how to create a cluster on AWS with [KOPS](https://github.com/kubernetes/kops), and deploy the containers with [Kompose](http://kompose.io/).
I'm going to assume you have an AWS account, the [AWS CLI](https://aws.amazon.com/cli/) installed on your machine, and credentials set up. Also make sure to [have Kubectl installed](https://kubernetes.io/docs/tasks/tools/install-kubectl/). If you've never heard of KOPS, that's fine. [Here](https://github.com/kubernetes/kops#installing) are the install instructions.
The bulk of this walkthrough will cover the benefits of having a CI/CD pipeline and setting up a production-ready workflow with [Semaphore2.0](https://id.semaphoreci.com). The pipeline itself will automate everything you need. Once you push a new version of your application to GitHub, a webhook will trigger Semaphore2.0, and the pipeline will spring to life! The end goal is for it to build the Docker images, run all tests, push the images to Docker Hub and finally deploy everything to your cluster. But, it should only deploy automatically from the `master` branch.
Sounds fun, let's jump in!
## Step 1 - Get to know the multi-container system
The [system as a whole](https://github.com/adnanrahic/boilerplate-api) has 3 distinct containers. First of all, a MongoDB database for persistent storage, then a Node.js API and an Nginx front end for serving HTML pages. It's configured with Docker Compose so all containers communicate with each other through networks. The MongoDB container will have a volume for persisting data even if the container is destroyed.
I tend to say, _the easiest way to learn is to follow along_, and today is no different. [Fork the repo](https://github.com/adnanrahic/boilerplate-api), clone the fork to your local machine and tag along.
### Build and test locally
What would a regular day look like while developing features for a code base like this? Well, once you're done with a task, you'd most likely first just run *ye olde* smoke test. Run everything and curl the endpoint to make sure it works. I did make it a bit easier by creating a tiny [bash script](https://github.com/adnanrahic/boilerplate-api/blob/master/smoke_test.sh).
To build the app you'll use a `docker-compose.yml` file. Here's the one from the repo.
```yaml
# docker-compose.yml
version: "3"
services:
  mongo:
    image: mongo:4.0
    volumes:
      - mongo:/data/db
    networks:
      backend:
        aliases:
          - mongo
    ports:
      - "27017"
  api:
    build:
      context: ./
      dockerfile: Dockerfile
    image: "${API_IMAGE}:${TAG}"
    networks:
      backend:
        aliases:
          - api
      frontend:
        aliases:
          - api
    depends_on:
      - mongo
    ports:
      - "3000:3000"
  client:
    build:
      context: ./client
      dockerfile: Dockerfile
    image: "${CLIENT_IMAGE}:${TAG}"
    networks:
      - frontend
    depends_on:
      - api
    ports:
      - "80:80"
networks:
  backend:
  frontend:
volumes:
  mongo:
```
As you can see, it takes three environment variables as values for the images. Luckily, Docker Compose can use an `.env` file to import environment variables. Here's what the `.env` file should look like. Keep in mind, you need to keep it in the root of your project.
```
SECRET=secretkey
NODE_ENV=dev
DB=mongodb://mongo:27017/boilerplate_api
PORT=3000
API_IMAGE=<username/api_image>
CLIENT_IMAGE=<username/client_image>
TAG=latest
```
_**Note**: Make sure to add this file to the `.gitignore`._
Here, `<username/api_image>` and `<username/client_image>` are placeholders for your Docker Hub username and the name you give the images. Another nice thing to do is to export these values to your terminal. This'll make your life easier as well.
```bash
$ export API_IMAGE=<username/api_image>
$ export CLIENT_IMAGE=<username/client_image>
```
After checking out the **docker-compose.yml** file, you'll need to make sure the **Dockerfile**s are nice and tidy. The Dockerfile for the Node.js API looks like this.
```Dockerfile
# API Dockerfile
FROM alpine:3.8 AS builder
WORKDIR /usr/src/app
RUN apk add --no-cache --update nodejs nodejs-npm
COPY package.json package-lock.json ./
RUN npm install
FROM alpine:3.8
WORKDIR /usr/src/app
RUN apk add --no-cache --update nodejs nodejs-npm
COPY . .
COPY --from=builder /usr/src/app/node_modules ./node_modules
EXPOSE 3000
CMD node app.js
```
It's using the builder pattern to minimize the footprint of the production image. This means it'll use an intermediary image to install dependencies and then copy them over to the final image, keeping the final image small in the process.
The client's **Dockerfile** is a lot simpler.
```Dockerfile
FROM nginx:1.14-alpine
COPY . /usr/share/nginx/html
COPY ./default.conf /etc/nginx/conf.d/default.conf
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
```
A super simple Nginx server. With all of that out of the way, you can run the build, and trigger the smoke tests.
```bash
$ docker-compose up --build -d
$ ./smoke_test.sh localhost:3000/api # back end
$ ./smoke_test.sh localhost/api # front end
$ docker-compose down
```
Then you'd move on to running unit and integration tests. For this you'll need to change the `NODE_ENV` value to `test` in the `.env` file. Once you've changed it, run the commands below.
```bash
$ docker-compose build
$ docker-compose run api node test-unit.js bash
$ docker-compose run api node test-integration.js bash
```
First, trigger the build again to make sure the new value in the `.env` file gets applied. The `docker-compose run` command starts the `api` service with a custom command: instead of running the Node.js server, it runs the Mocha tests. In short, you're overriding the default command from the Dockerfile.
### Push to Docker Hub
Only once the tests are all wrapped up would you build and push the images to Docker Hub. Change the `NODE_ENV` value in the `.env` file to `prod`, and go ahead and run the build once again.
```bash
$ docker-compose build
```
You'll have a fresh set of images ready to push to Docker Hub. You've already exported the `$API_IMAGE` and `$CLIENT_IMAGE` values, which contain your username and image names, so the push commands are easy to run. Go ahead and run them now to push your images to Docker Hub.
```bash
$ docker push $API_IMAGE
$ docker push $CLIENT_IMAGE
```
### Configure the Kubernetes cluster on AWS with KOPS
[KOPS](https://github.com/kubernetes/kops#what-is-kops) is a tool that helps you create and manage cluster resources. It's the `kubectl` for clusters, and surprisingly easy to set up. By running a couple of commands you can have a cluster running in no time at all. After [installing KOPS](https://github.com/kubernetes/kops#installing), open up a terminal window and run this set of commands.
```bash
## Choose a name for your organization
export ORGANIZATION_NAME=<your_org_name>
## Create a state store ##
export BUCKET_NAME=${ORGANIZATION_NAME}-kops-state-store
aws s3api create-bucket \
  --bucket ${BUCKET_NAME} \
  --region eu-west-1 \
  --create-bucket-configuration LocationConstraint=eu-west-1
aws s3api put-bucket-versioning \
  --bucket ${BUCKET_NAME} \
  --versioning-configuration Status=Enabled
## Create a cluster ##
export KOPS_CLUSTER_NAME=${ORGANIZATION_NAME}.k8s.local
export KOPS_STATE_STORE=s3://${BUCKET_NAME}
# Define cluster config
kops create cluster \
  --master-count=1 \
  --master-size=t2.micro \
  --node-count=1 \
  --node-size=t2.micro \
  --zones=eu-west-1a \
  --name=${KOPS_CLUSTER_NAME}
# Apply and create cluster
kops update cluster --name ${KOPS_CLUSTER_NAME} --yes
# Validate cluster is running
kops validate cluster
```
The `kops create cluster` command will create the initial configuration for the cluster while the `kops update cluster` command will create the resources on AWS. Run the `kops validate cluster` command to check if it's running and working like it should.
You'll also need to export the `kubectl` configuration to a file.
```bash
# export kubectl config file
KUBECONFIG=${HOME}/${KOPS_CLUSTER_NAME}-kubeconfig.yaml kops export kubecfg --name ${KOPS_CLUSTER_NAME} --state ${KOPS_STATE_STORE}
```
You'll use this file to interact with your cluster. It'll be saved in your `${HOME}` directory. The command will also set the `kubectl` context to use the configuration so you can interact with your cluster right away from your terminal.
If by any chance you need to edit your configuration, use the edit command. It lets you change the cluster configuration.
```bash
# Run this only if you want to edit config
kops edit cluster --name ${KOPS_CLUSTER_NAME}
```
### Deploy containers to the Kubernetes cluster
To check if the resources work the way they should, I tend to use [Kompose](http://kompose.io/) to test configurations and generate my Kubernetes resource files. It's a rather simple tool that lets you simulate the behavior of Docker Compose, but run it against a Kubernetes cluster.
After you install Kompose, you can run the same commands you're used to with Docker Compose, and deploy resources from a `docker-compose.yml` file in much the same way.
With the `up` command you deploy your Dockerized application to a Kubernetes cluster.
```bash
$ kompose up
```
The `down` command deletes instantiated services/deployments from a Kubernetes cluster.
```bash
$ kompose down
```
While the `convert` command converts a Docker Compose file into Kubernetes YAML resource files.
```bash
$ kompose convert
```
Let's check out a sample Kompose file. The file itself needs to be edited slightly to work with Kompose, compared to what you're used to with Docker Compose. Let me show you. I've named [this file](https://github.com/adnanrahic/boilerplate-api/blob/master/docker-kompose.yml) `docker-kompose.yml`, and use it only to run Kompose. Check it out below.
```yaml
# docker-kompose.yml
version: "3"
services:
  mongo:
    environment:
      - GET_HOSTS_FROM=dns          # Add environment variable to map DNS values
    image: mongo:4.0
    ports:
      - "27017"                     # Expose the port within the Kubernetes cluster
    labels:
      kompose.volume.size: 1Gi      # Add a label for volume size
    volumes:
      - mongo:/data/db              # Map the volume to the data folder in the container
  api:
    environment:
      - SECRET=secretkey
      - NODE_ENV=prod
      - DB=mongodb://mongo:27017/boilerplate_api
      - PORT=3000
      - GET_HOSTS_FROM=dns
    image: <username/api_image>     # replace with your api_image
    deploy:
      replicas: 1                   # Add number of replicas the Kubernetes Deployment will have
    ports:
      - "3000"                      # Expose the port within the Kubernetes cluster
    labels:
      kompose.service.type: LoadBalancer   # Add a label to make sure the Kubernetes Service will be of type LoadBalancer
  client:
    environment:
      - GET_HOSTS_FROM=dns
    image: <username/client_image>  # replace with your client_image
    deploy:
      replicas: 1                   # Add number of replicas the Kubernetes Deployment will have
    ports:
      - "80:80"                     # Expose port publicly
    labels:
      kompose.service.type: LoadBalancer   # Add a label to make sure the Kubernetes Service will be of type LoadBalancer
volumes:
  mongo:
```
It's a strange mix of both a Kubernetes resource file and a Docker Compose file. But, it's rather easy to understand.
The only crucial thing you need to remember is the `GET_HOSTS_FROM=dns` environment variable. It makes sure the services can interact with each other inside the Kubernetes cluster. A typical Kubernetes cluster has a DNS service used to find service host info. With this environment variable, you tell the services where to grab info about each other. In short, it makes sure all services can find and talk to each other through their service names. Meaning, the `mongo` service will be reachable under the DNS name `mongo`, so you'll be able to access the database by connecting to `mongodb://mongo:27017/boilerplate_api` from your Node.js API, which is pretty cool.
Once you have an edited Kompose file, you're ready to deploy it. Use the `-f` flag to choose the `docker-kompose.yml` file to run with the `kompose` command.
```bash
$ kompose up -f docker-kompose.yml
```
It'll take a few seconds for the persistent volume claim to create a volume and the pods to spin up. Because the service type is set to `LoadBalancer`, AWS will handle the load balancing automatically with the [AWS ELB](https://aws.amazon.com/elasticloadbalancing/) service. This process always takes a while, so you'll have to wait for it to finish before you can access your app.
To make sure it works, use `kubectl` to get all services and their external IPs.
```bash
$ kubectl get all
```
If the external IPs have been allocated, the Load Balancers have been started and are running like they should. To get the whole URL, run these two commands.
```bash
$ kubectl describe svc api | grep External\ IPs:
$ kubectl describe svc client | grep External\ IPs:
```
Once you're happy, and sure it works, use the `kompose convert` command to generate YAML files.
```bash
$ kompose convert -f docker-kompose.yml
```
Seven new files should have appeared. One persistent volume claim for MongoDB, and three pairs of deployments and services. If you encountered any issues with the resources not starting properly while running Kompose, stop Kompose entirely and run `kubectl apply` one YAML file at a time.
```bash
$ kompose -f docker-kompose.yml down
```
Once it's stopped, first apply the persistent volume claim, then everything else.
```bash
$ kubectl apply -f mongo-persistentvolumeclaim.yaml
$ kubectl apply -f mongo-deployment.yaml
$ kubectl apply -f mongo-service.yaml
$ kubectl apply -f api-deployment.yaml
$ kubectl apply -f api-service.yaml
$ kubectl apply -f client-deployment.yaml
$ kubectl apply -f client-service.yaml
```
Awesome! You have a fully functional Kubernetes cluster running with a client application, API and MongoDB database.
But, you don't really want to deploy things from your local machine every time there's a new version, do you? So, what should you do? Use a continuous integration and delivery tool to run your builds and tests automatically. You need that peace of mind. Trust me.
## Step 2 - Set up Semaphore2.0
SemaphoreCI has been on the market for the last 6 years. However, they just [recently released a new, revamped version](https://www.producthunt.com/posts/semaphore-2-0) called [Semaphore2.0](https://semaphoreci.com/blog/2018/11/06/semaphore-2-0-launched.html), and it's awesome! The new software comes with powerful and fully [customizable CI/CD pipelines](https://docs.semaphoreci.com/article/64-customizing-your-pipeline) and stages. You can also set up [parallel execution](https://docs.semaphoreci.com/article/62-concepts) of your tests and other jobs. Control flow switches are also included as well as [secrets and dependency management](https://docs.semaphoreci.com/article/66-environment-variables-and-secrets).
Let's go ahead and [sign up](https://id.semaphoreci.com/) so we can get started. Semaphore2.0 is integrated with GitHub meaning you can easily log in with your GitHub profile.
![login](images/login.png)
After you've logged in, create an organization.
![create-org](images/create-org.png)
This will take you to a list of projects where you'll see a short guide on how to install the Semaphore CLI, connect the organization and initialize a project. Remember to **run these commands in the project directory of your cloned repo**.
![list-of-projects](images/list-of-projects.png)
Install Semaphore CLI:
```bash
$ curl https://storage.googleapis.com/sem-cli-releases/get.sh | bash
```
Connect to your organization:
```bash
$ sem connect <your-org>.semaphoreci.com <ID>
```
Add your first project:
```bash
$ sem init
```
After running this command you'll see a **.semaphore** folder with a **semaphore.yml** file in it. Don't touch anything yet; just add and commit the files and push them to GitHub.
```bash
$ git add .semaphore/semaphore.yml && git commit -m "First pipeline" && git push
```
Open up your browser again and navigate to Semaphore 2.0. You can see the pipeline running. It should look something like this.
![first-pipeline](images/first-pipeline.png)
There you have it! Your initial pipeline is running fine. Let's edit the **semaphore.yml** file and create a proper build workflow.
## Step 3 - Add a build workflow
In order to run your application in different environments, you need to have a [secrets system](https://docs.semaphoreci.com/article/66-environment-variables-and-secrets) in place. Luckily, Semaphore 2.0 has an incredibly simple system for creating and managing secrets: first you create a YAML file, then you run a command to create a secret from it.
### Add `sem` secrets
Start by adding a few files. First a **test.variables.yml**.
```yaml
apiVersion: v1beta
kind: Secret
metadata:
name: test-variables
data:
env_vars:
- name: SECRET
value: secretkey
- name: NODE_ENV
value: test
- name: DB
value: mongodb://mongo:27017/boilerplate_api_test
- name: PORT
value: 3000
- name: API_IMAGE
value: <username/api_image>
- name: CLIENT_IMAGE
value: <username/client_image>
```
Run the command to create the secret.
```bash
$ sem create -f test.variables.yml
```
Repeat the same steps with the **prod.variables.yml**.
```yaml
apiVersion: v1beta
kind: Secret
metadata:
name: prod-variables
data:
env_vars:
- name: SECRET
value: secretkey
- name: NODE_ENV
value: prod
- name: DB
value: mongodb://mongo:27017/boilerplate_api
- name: PORT
value: 3000
- name: API_IMAGE
value: <username/api_image>
- name: CLIENT_IMAGE
value: <username/client_image>
```
Run another command to create the production variables secret.
```bash
$ sem create -f prod.variables.yml
```
Secrets added, check! Add the secret files to your **.gitignore** and make sure **NOT** to push them to GitHub. I've made that mistake once too often. *sweating* :neutral_face:
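If you haven't set that up yet, something along these lines keeps the secret definition files out of version control; adjust the file names if yours differ.
```bash
# Append the local secret files to .gitignore so they never end up in the repo.
cat <<EOF >> .gitignore
test.variables.yml
prod.variables.yml
EOF
```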
To make sure the secrets were saved to Semaphore 2.0, you can always list them.
```bash
$ sem get secrets
```
This should list all secrets you've saved.
```bash
# Output #
prod-variables 1m
test-variables 1m
```
### Configure the build workflow in the semaphore.yml
The initial pipeline will have a build block and two test blocks. First create the build block, then move on to the tests.
Feel free to delete everything from your **semaphore.yml** file, and paste this in.
```yaml
version: v1.0
name: Build & Test
agent:
machine:
type: e1-standard-2
os_image: ubuntu1804
blocks:
# Docker only
- name: "Build and cache images"
task:
jobs:
- name: docker-compose build && cache store
commands:
- checkout
- export GIT_HASH=$(git log --format=format:'%h' -1)
- export TAG=${SEMAPHORE_GIT_BRANCH}_${GIT_HASH}_${SEMAPHORE_WORKFLOW_ID}
- docker pull mongo:4.0
- ./generate-node-env.sh
- docker-compose build
- mkdir docker_images
- docker save $API_IMAGE:$TAG -o docker_images/api_image_${TAG}.tar
- docker save $CLIENT_IMAGE:$TAG -o docker_images/client_image_${TAG}.tar
- docker save mongo:4.0 -o docker_images/mongo_4.0.tar
- cache store docker-images-$SEMAPHORE_WORKFLOW_ID docker_images
secrets:
- name: prod-variables
```
*__Note__: The `./generate-node-env.sh` file is a simple bash script that grabs environment variables from the shell and creates a `.env` file, which the Node.js API needs in order to run. This gives the app access to those variables through `process.env`.*
```bash
cat <<EOF > .env
SECRET=$SECRET
NODE_ENV=$NODE_ENV
DB=$DB
PORT=$PORT
API_IMAGE=$API_IMAGE
CLIENT_IMAGE=$CLIENT_IMAGE
EOF
```
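To see what the script produces, you can do a local dry run with throwaway values; in the pipeline, the real values come from the `sem` secrets above.
```bash
# Local dry run of generate-node-env.sh with placeholder values (not the real secrets).
export SECRET=secretkey NODE_ENV=test DB=mongodb://mongo:27017/boilerplate_api_test PORT=3000
export API_IMAGE=username/api_image CLIENT_IMAGE=username/client_image
./generate-node-env.sh
cat .env
```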
In this step you're pulling a MongoDB image from Docker Hub and building the API and client images with Docker Compose. But here's the catch: you'll also save the Docker images in the [Semaphore 2.0 cache](https://docs.semaphoreci.com/article/68-caching-dependencies). This lets later blocks grab the images from the cache instead of wasting bandwidth and time pushing and pulling them from Docker Hub every time.
Next up, adding tests.
## Step 4 - Add tests
Stay in the **semaphore.yml** file for now; you're not done here. The tests should run only if the build block succeeds. That's actually pretty easy, because blocks run sequentially and the pipeline stops as soon as one fails. Add two more blocks under the build block.
```yaml
- name: "Smoke tests"
task:
jobs:
- name: CURL /api
commands:
- checkout
- export GIT_HASH=$(git log --format=format:'%h' -1)
- export TAG=${SEMAPHORE_GIT_BRANCH}_${GIT_HASH}_${SEMAPHORE_WORKFLOW_ID}
- cache restore docker-images-$SEMAPHORE_WORKFLOW_ID
- ls -l docker_images
- docker load -i docker_images/api_image_${TAG}.tar
- docker load -i docker_images/client_image_${TAG}.tar
- docker load -i docker_images/mongo_4.0.tar
- docker images
- ./generate-node-env.sh
- docker-compose up -d --build
- sleep 1
- ./smoke_test.sh localhost:3000/api
- ./smoke_test.sh localhost
- ./smoke_test.sh localhost/api
secrets:
- name: prod-variables
- name: "Unit & Integration tests"
task:
jobs:
- name: npm run lint
commands:
- checkout
- cache restore node_modules-$SEMAPHORE_GIT_BRANCH-$(checksum package-lock.json),node_modules-$SEMAPHORE_GIT_BRANCH-,node_modules
- npm i
- npm run lint
- cache store node_modules-$SEMAPHORE_GIT_BRANCH-$(checksum package-lock.json) node_modules
- name: npm run test-unit
commands:
- checkout
- export GIT_HASH=$(git log --format=format:'%h' -1)
- export TAG=${SEMAPHORE_GIT_BRANCH}_${GIT_HASH}_${SEMAPHORE_WORKFLOW_ID}
- cache restore docker-images-$SEMAPHORE_WORKFLOW_ID
- ls -l docker_images
- docker load -i docker_images/api_image_${TAG}.tar
- docker load -i docker_images/client_image_${TAG}.tar
- docker load -i docker_images/mongo_4.0.tar
- docker images
- ./generate-node-env.sh
- docker-compose build
- docker-compose run api node test-unit.js bash
- name: npm run test-integration
commands:
- checkout
- export GIT_HASH=$(git log --format=format:'%h' -1)
- export TAG=${SEMAPHORE_GIT_BRANCH}_${GIT_HASH}_${SEMAPHORE_WORKFLOW_ID}
- cache restore docker-images-$SEMAPHORE_WORKFLOW_ID
- ls -l docker_images
- docker load -i docker_images/api_image_${TAG}.tar
- docker load -i docker_images/client_image_${TAG}.tar
- docker load -i docker_images/mongo_4.0.tar
- docker images
- ./generate-node-env.sh
- docker-compose build
- docker-compose run api node test-integration.js bash
secrets:
- name: test-variables
```
The smoke tests block loads the Docker images from the cache and runs them with Docker Compose. The tiny `smoke_test.sh` script then makes sure the endpoints are responsive. If nothing fails, the pipeline moves on to the unit and integration tests.
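The repository ships its own `smoke_test.sh`; if you're curious what such a script boils down to, here's a minimal sketch that just checks for an HTTP 200 response. Treat it as an approximation, not a copy of the script in the repo.
```bash
#!/bin/bash
# smoke_test.sh (sketch) -- fail the job if the given endpoint doesn't answer with HTTP 200.
url=$1
status=$(curl -s -o /dev/null -w "%{http_code}" "http://${url}")
if [ "$status" != "200" ]; then
  echo "Smoke test failed for ${url} (HTTP ${status})"
  exit 1
fi
echo "Smoke test passed for ${url}"
```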
If everything works as expected, Semaphore 2.0 should shine in a nice bright green.
![build-and-test](images/build-and-test.png)
With these two blocks added, you're done with the build and test pipeline. Next up, we'll jump into [promotions](https://docs.semaphoreci.com/article/67-deploying-with-promotions), what they are, and how they work.
## Step 5 - Push Docker images with promotions
A promotion lets you create a new pipeline that continues from a previous one, either automatically when the previous pipeline finishes, or manually if you choose to trigger it yourself through the UI.
Let me show you. At the bottom of the **semaphore.yml** file add this tiny snippet.
```yaml
promotions:
- name: Push Images
pipeline_file: push-images.yml
auto_promote_on:
- result: passed
```
This will look for a **push-images.yml** file and automatically trigger it if the **semaphore.yml** pipeline has passed. Pretty cool!
Add the **push-images.yml** file in the same **.semaphore** directory as the **semaphore.yml** file, and paste in some tasty YAML.
```yaml
version: v1.0
name: Push images to Docker Hub
agent:
machine:
type: e1-standard-2
os_image: ubuntu1804
blocks:
- name: "Push Images"
task:
jobs:
- name: docker push images
commands:
- checkout
- export GIT_HASH=$(git log --format=format:'%h' -1)
- export TAG=${SEMAPHORE_GIT_BRANCH}_${GIT_HASH}_${SEMAPHORE_WORKFLOW_ID}
- docker login -u $DOCKER_USER -p $DOCKER_PASS
- cache restore docker-images-$SEMAPHORE_WORKFLOW_ID
- ./build_image_if_not_exists.sh
- echo "Push API image"
- docker push $API_IMAGE:$TAG
- echo "Push CLIENT image"
- docker push $CLIENT_IMAGE:$TAG
- docker images
- cache delete docker-images-$SEMAPHORE_WORKFLOW_ID
secrets:
- name: docker-secrets
- name: prod-variables
```
This pipeline can get a bit tricky if you don't watch out. If it's triggered automatically once **semaphore.yml** finishes, it has access to the cached Docker images. But if you trigger it manually before the previous pipeline has had a chance to cache the images, it will fail. To work around this, I've added a simple bash script called `build_image_if_not_exists.sh`: it checks whether the `docker_images` directory exists, loads the images if it does, and builds them otherwise.
```bash
# build_image_if_not_exists.sh
# If the cache restore produced a docker_images directory, load the images from it.
if [ -d "docker_images" ]; then
  docker load -i docker_images/api_image_${TAG}.tar
  docker load -i docker_images/client_image_${TAG}.tar
else
  # No cached images; build them, but only if they aren't already present locally.
  if [[ "$(docker images -q $API_IMAGE:$TAG 2> /dev/null)" == "" ]]; then
    docker-compose -f docker-compose.build.yml build
  fi
fi
```
Only now are you ready to push the images to Docker Hub. Finally! Once they're safe and sound on Docker Hub, you can delete the cache entry to free up space.
One last thing to do now: we can't push images to our Docker Hub account without credentials, right? Add a `sem` secret to hold your Docker Hub credentials. Name the file **docker.secrets.yml**.
```yaml
apiVersion: v1beta
kind: Secret
metadata:
name: docker-secrets
data:
env_vars:
- name: DOCKER_USER
value: <username>
- name: DOCKER_PASS
value: <password>
```
Run the create command once again and don't forget to add this YAML file to the **.gitignore**.
```bash
$ sem create -f docker.secrets.yml
```
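A quick aside on the `docker login` line in **push-images.yml**: if you'd rather not pass the password as a plain command-line argument, Docker can read it from standard input instead. This is an optional variation, not something the pipeline above requires.
```bash
# Optional: log in without putting the password on the command line or in shell history.
echo "$DOCKER_PASS" | docker login -u "$DOCKER_USER" --password-stdin
```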
Finally, add and commit all the files you created, except `docker.secrets.yml`, push them to GitHub, and check out Semaphore 2.0. The workflow should show you something similar to this.
![push-images](images/push-images.png)
## Step 6 - Deploy to production
I'm getting hyped; you're close to the end now. The last step is to create a pipeline that deploys all the changes to our production cluster. The easiest way to implement this is by adding another promotion.
First, we need to edit the **api-deployment.yaml** and **client-deployment.yaml**. They're currently not using the `$TAG` environment variable to choose which version of the Docker image to deploy. Let's fix this.
Open the **api-deployment.yaml** first, and under `spec -> template -> spec -> containers -> image` add `:${TAG}` after the image name.
```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
annotations:
kompose.cmd: kompose -f docker-kompose.yml convert
kompose.service.type: LoadBalancer
kompose.version: 1.1.0 (36652f6)
creationTimestamp: null
labels:
io.kompose.service: api
name: api
spec:
replicas: 1
strategy: {}
template:
metadata:
creationTimestamp: null
labels:
io.kompose.service: api
spec:
containers:
- env:
- name: DB
value: mongodb://mongo:27017/boilerplate_api
- name: GET_HOSTS_FROM
value: dns
- name: NODE_ENV
value: prod
- name: PORT
value: "3000"
- name: SECRET
value: secretkey
image: <username/api_image>:${TAG} # <= ADD THE $TAG HERE
name: api
ports:
- containerPort: 3000
resources: {}
restartPolicy: Always
status: {}
```
Do the exact same thing with the **client-deployment.yaml**.
```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
annotations:
kompose.cmd: kompose -f docker-kompose.yml convert
kompose.service.type: LoadBalancer
kompose.version: 1.1.0 (36652f6)
creationTimestamp: null
labels:
io.kompose.service: client
name: client
spec:
replicas: 1
strategy: {}
template:
metadata:
creationTimestamp: null
labels:
io.kompose.service: client
spec:
containers:
- env:
- name: GET_HOSTS_FROM
value: dns
image: <username/client_image>:${TAG} # <= ADD THE $TAG HERE
name: client
ports:
- containerPort: 80
resources: {}
restartPolicy: Always
status: {}
```
That's it! Now, move on to the deployment pipeline. At the bottom of the **push-images.yml** file add this snippet.
```yaml
promotions:
- name: Deploy Production
pipeline_file: deploy-prod.yml
auto_promote_on:
- result: passed
branch:
- master
```
It'll trigger the **deploy-prod.yml** pipeline automatically, but only from the `master` branch. Follow up by creating the **deploy-prod.yml** file and pasting this in.
```yaml
version: v1.0
name: Deploy to Production K8S Cluster
agent:
machine:
type: e1-standard-2
os_image: ubuntu1804
blocks:
- name: "Deploy"
task:
jobs:
- name: kubectl apply
commands:
- checkout
- export GIT_HASH=$(git log --format=format:'%h' -1)
- export TAG=${SEMAPHORE_GIT_BRANCH}_${GIT_HASH}_${SEMAPHORE_WORKFLOW_ID}
- docker pull $CLIENT_IMAGE:$TAG
- docker pull $API_IMAGE:$TAG
- envsubst '${TAG}' <api-deployment.yaml > api-deployment.prod.yaml
- envsubst '${TAG}' <client-deployment.yaml > client-deployment.prod.yaml
- kubectl apply -f api-deployment.prod.yaml
- kubectl apply -f client-deployment.prod.yaml
secrets:
- name: kubeconfig
- name: prod-variables
```
As you can see, you first pull the images, then use [**envsubst**](https://www.systutorials.com/docs/linux/man/1-envsubst/) to replace the `${TAG}` placeholder you added above with the value of the `$TAG` environment variable, producing production-ready versions of the Kubernetes deployment files you generated with Kompose.
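If you haven't used **envsubst** before, here's a quick illustration of the restricted substitution; the image name and tag below are just placeholder values.
```bash
# Only ${TAG} is substituted; any other ${...} occurrences in the input are left untouched.
export TAG=master_ab12cd3_0bd7c806
echo 'image: username/api_image:${TAG}' | envsubst '${TAG}'
# Output: image: username/api_image:master_ab12cd3_0bd7c806
```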
For the pipeline to be able to talk to our cluster, you need to add a **kubeconfig** file so that **kubectl** knows where the cluster is and has the proper credentials to interact with it.
Remember the kops export command we ran in step 1? It created a file with the **kubectl** configuration, and that file should be located in your `$HOME` directory. If you forgot to run it, no worries; run the command now, then create the **kubeconfig** `sem` secret.
```bash
$ KUBECONFIG=${HOME}/${KOPS_CLUSTER_NAME}-kubeconfig.yaml kops export kubecfg --name ${KOPS_CLUSTER_NAME} --state ${KOPS_STATE_STORE}
```
Let's take a look at this **kubeconfig** file. Here's what it should look like.
```bash
$ cat ${HOME}/${KOPS_CLUSTER_NAME}-kubeconfig.yaml
# Output
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: <redacted>
server: https://<ORGANIZATION_NAME>-k8s-local-<ID>-<ID>.<REGION>.elb.amazonaws.com
name: <ORGANIZATION_NAME>.k8s.local
contexts:
- context:
cluster: <ORGANIZATION_NAME>.k8s.local
user: <ORGANIZATION_NAME>.k8s.local
name: <ORGANIZATION_NAME>.k8s.local
current-context: <ORGANIZATION_NAME>.k8s.local
kind: Config
preferences: {}
users:
- name: <ORGANIZATION_NAME>.k8s.local
user:
as-user-extra: {}
client-certificate-data: <redacted>
client-key-data: <redacted>
password: <redacted>
username: admin
- name: <ORGANIZATION_NAME>.k8s.local-basic-auth
user:
as-user-extra: {}
password: <redacted>
username: admin
```
Finally, go ahead and add another `sem` secret.
```bash
$ sem create secret kubeconfig \
--file ${HOME}/${KOPS_CLUSTER_NAME}-kubeconfig.yaml:/home/semaphore/.kube/config
```
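Before relying on it in the pipeline, you can quickly verify that the exported file actually reaches the cluster; this is just a local sanity check.
```bash
# Point kubectl explicitly at the exported kubeconfig and list the cluster nodes.
kubectl --kubeconfig "${HOME}/${KOPS_CLUSTER_NAME}-kubeconfig.yaml" get nodes
```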
You're all set. The workflow is complete; it consists of three pipelines connected by two promotions. Here's what the final result looks like.
![semaphore-workflow2.gif](images/semaphore-workflow2.gif)
![semaphore-workflow](images/semaphore-workflow.png)
## Wrapping up
In the end, there's not much else to do except enjoy what you just built, in all its awe-inspiring glory. Every time a commit gets pushed to your repo, Semaphore 2.0 will run all builds and tests, push images, and deploy them to your cluster. Being a lazy developer is awesome.
By opening up your fork on GitHub and checking the commits, you'll see green check marks alongside them, indicating successful builds. Clicking on one of them will take you to Semaphore 2.0, where you can check the builds out in more detail.
Automation has made developing software so much more enjoyable. By automating boring, repetitive tasks, you free up your time to focus on what's important: creating real value and developing business logic. This saves you a ton of time, money, and, of course, headaches.
All the code above is on GitHub and you can check it out [right here](https://github.com/adnanrahic/boilerplate-api). Feel free to give it a star if you like it. Make sure to also give the Semaphore peeps some love by [following their blog](https://semaphoreci.com/blog) if you want to read more about CI/CD. Check out their [Slack community](https://join.slack.com/t/semaphorecommunity/shared_invite/enQtMzk1MzI5NjE4MjI5LWY3Nzk4ZGM2ODRmMDVjYmIwZGFhMWI0ZDYyOWIxMGI1ZjFlODU1OTZiZWM3OGVkZjBmMWRiNWYzNjA4MjM2MTA) if you have any questions. You can also [sign up](https://id.semaphoreci.com/) and check out their [documentation](https://docs.semaphoreci.com/). It's pretty awesome.
*Hope you guys and girls enjoyed reading this as much as I enjoyed writing it. Do you think this tutorial will be of help to someone? Do not hesitate to share.*