My Journey into The World of Containers, Docker, CoreOS and Kubernetes -- Part 2

Continuing my series of blog posts on Containers, Docker, CoreOS and Kubernetes, in this second part I will discuss CoreOS and Kubernetes, along with running a cluster on Vagrant and AWS. I will also discuss the use of confd for configuration management.

Introduction

In Part One of the series I discussed taking a traditional application stack, “containerizing” it, and using Docker Compose to bring up the stack locally on your laptop running Docker. In this post I will quickly go over why I chose CoreOS and Kubernetes, and how the application stack was made ready to run on Kubernetes. I will also cover confd, a lightweight configuration management tool, and how we use it to manage the nginx.conf file in our Nginx containers. I will then show how to get CoreOS and Kubernetes running locally using Vagrant. The final piece I will cover in this post is using AWS’s Elastic File System (EFS) as the storage backend for Kubernetes Persistent Volumes, and then launching a cluster on EC2 using kube-aws.

CoreOS

CoreOS, as they put it, is:

“A lightweight Linux operating system designed for clustered deployments providing automation, security, and scalability for your most critical applications”

It pretty much is a clustered OS for running your containers on. When I first started researching CoreOS, I went to my trusty source of well-written how-tos at DigitalOcean. They have a great nine-part series on CoreOS, etcd (key-value store), fleet, and confd. The series is almost two years old, so I figured it might be a tad dated, but I gave it a whirl anyway. Working through the series you learn a lot about CoreOS and its underlying features, but when I got to the section on fleet I said to myself, “wow, this is a lot of work to get a service built.” With fleet there are a lot of low-level tasks you need to set up in your fleet unit files in order to define your services. While researching whether there was a better way to manage fleet on CoreOS, I found a project from CenturyLink Labs called Panamax that looked very slick, but the project appeared to be inactive. Wondering what gives, I dug a little deeper and found that CoreOS has pretty much abandoned fleet in favor of Kubernetes, as they put it on their GitHub page:

“This project is quite low-level, and is designed as a foundation for higher order orchestration. Those looking for more complex scheduling requirements or a first-class container orchestration system should check out Kubernetes.”

Kubernetes

K8s Introduction

Well, OK then, let’s take a look at Kubernetes, abbreviated as K8s. I would only go through DigitalOcean’s CoreOS tutorial if you have the time; for our purposes I would focus on Kubernetes. DigitalOcean’s Kubernetes tutorial was also almost two years old, so it was time to find some newer content. I found a great free course from Udacity, Scalable Microservices with Kubernetes, taught by Kelsey Hightower, who works for Google, is highly active in the Kubernetes space, and has some great tools out on GitHub. The course uses Google’s Container Engine as the underlying compute, but the material explains Kubernetes excellently. It may take you a day or two to finish depending on your time, but I would highly recommend taking it, and it looks like it will be part of a future DevOps Nanodegree program offered by Udacity.

After learning about Kubernetes I still wanted to keep CoreOS in the mix, as I liked its underlying features as a clustered environment. I also didn’t want to rely on a cloud provider’s container service such as AWS’s ECS or Google’s Container Engine; I wanted to build an environment that could be hosted on any cloud provider or in an in-house data center. The next step was to get CoreOS with Kubernetes up locally on my laptop through Vagrant for development purposes. Luckily CoreOS has this covered for us, and I will go into more detail below.

Kompose to Kubernetes Intro

Next up: getting the docker-compose-ready application stack from my previous blog post into Kubernetes. Hmmm, there must be a tool out there to convert my docker-compose.yml file to Kubernetes objects. Of course there is: Kelsey has a tool called “compose2kube” on GitHub. Unfortunately, at the time the tool was not building correctly (fixed in a later commit), so I had to find another tool that worked. I found the “kompose” tool from the folks at Skippbox. But first we need to make a few modifications to the docker-compose.yml file to make it more Kubernetes friendly.

Below is a file diff: on the left is our original, on the right is the file we will use with kompose. The first difference is the image we will be using. When running “docker-compose up” we can mount local storage from our laptop into the running container, so we can bypass building a new Docker image every time we update our code for testing. In production it’s best to have your code baked into the image, so that the image is as “ephemeral” as possible. In Docker’s words:

“By “ephemeral,” we mean that it can be stopped and destroyed and a new one built and put in place with an absolute minimum of set-up and configuration.”

We want to know that when we launch a container with a certain tag, we know exactly what is in that image every time: packages, code, config, files, etc. When you update your code, package versions, and so on, you build a new image with a new tag (i.e. build number). But before we build the new images we need to modify our code in one place, the configuration.php file; I will explain why below.

Left: Original docker-compose.yml file. Right: kompose docker-compose.yml file.

Updating our Application Stack Code

configuration.php

If we look at the YAML file, I removed the “networks” configuration, since kompose does not support it and Kubernetes networking works a little differently than Docker’s. If you look at our original on the left above, there is an alias of “mariadb” associated with our “db” service. That alias does not exist in Kubernetes, so we have to get the MariaDB container’s IP or hostname some other way. Luckily, Kubernetes exposes each service’s address as environment variables inside every container. Because these variables point to the service’s stable cluster IP rather than to an individual container, they remain valid even as containers are killed and brought up on new hosts. Below you can see we update our “$servername” variable to “getenv(‘DB_SERVICE_HOST’)”; every service in the environment gets this variable, along with others that I will show you how to retrieve later on. Relatedly, there are no application links in Kubernetes because we connect containers to services; more can be read here on the subject.
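
For example, with a service named “db”, every container started after the service exists gets variables along these lines (the address shown here is hypothetical):

DB_SERVICE_HOST=10.3.0.21
DB_SERVICE_PORT=3306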

 configuration.php

I have done this work for you already in this repository. If you take a look at it, we have our code folder along with a docker_build folder which contains the “Dockerfile” for each service in the stack. We structured it this way to move the build context for “docker build” to the root of the stack’s repository, so that when we build our php and web containers we can reference the code folder in the root directory of the repo; “docker build” does not let you reference folders outside the build context. We then have an individual folder for each service of the stack with its related files. Ignore the JSON files for now, but if we look at the “web” folder we see two folders, “confd” and “scripts.” If you remember from Part 1, I wanted to make our Nginx configuration more dynamic, so while we are updating our code we are going to take care of that using confd.

confd, etcd and Redis

From the confd GitHub page, confd is a lightweight configuration management tool focused on:

  • keeping local configuration files up-to-date using data stored in etcd, consul, dynamodb, redis, vault, zookeeper or env vars and processing template resources.
  • reloading applications to pick up new config file changes.

We are going to use confd in our example to manage the “nginx.conf” file on our “web” containers. On the backend, confd can use many data stores; luckily etcd is part of CoreOS, but we won’t use it in our example just yet. Why, you may ask? Because the tool from CoreOS (kube-aws) that brings our CoreOS – Kubernetes cluster up in AWS does not currently support a highly available etcd cluster. It is in the works in the project but has not been released yet. So for our example we will use Redis as the data store for confd; when kube-aws is updated I will definitely return and switch confd over to etcd, since it is better to use resources already deployed than to run additional resources for Redis.

If we take a look at the “confd” folder we see two subdirectories, “conf.d” and “templates.” Inside “conf.d” is a file called “counter-nginx.toml,” which is our template resource config. Let’s go over the file: first we prefix our key lookups in Redis with “/counter”, then we define our template file, “nginx.tmpl”, which resides in the “templates” subdirectory. We then define the destination of the rendered file along with its owner and file permissions. Next we define which keys we want to look up inside our template file. The last two entries are our check and reload commands for Nginx. The check command tests the rendered config file; what is interesting here is the “{{.src}}” Go template action. Before confd lays down the rendered nginx.conf file, it creates a temporary nginx.conf and has Nginx test that temporary config, which is referenced by {{.src}}; if the test fails, confd will not put the file in the running location. If the test succeeds, confd lays the file down in the running location and reloads Nginx.
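
As a rough sketch of what that template resource looks like (the authoritative file is in the repo; the destination and Nginx binary paths here are assumptions):

[template]
prefix = "/counter"
src = "nginx.tmpl"
dest = "/etc/nginx/nginx.conf"
owner = "root"
mode = "0644"
keys = [
  "/subdomain",
  "/domain",
]
check_cmd = "/usr/sbin/nginx -t -c {{.src}}"
reload_cmd = "/usr/sbin/nginx -s reload"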

If we now look at our “nginx.tmpl” file under the “templates” subdirectory, we see it is a normal Nginx config file. The reason we are not separating the server block out into its own config file included from nginx.conf is confd: when performing the check command, confd can only check a single file. In other words, confd cannot combine the running nginx.conf with a rendered temporary server.conf; if we split them, the check command would error out because all it would see is the server block and nothing else in the config. According to a comment on an issue raised in the GitHub project, Kelsey wants to keep confd simple and have one template resource map to a single output file. The only section confd acts on in our example is “server_name”, where we pull the “/subdomain” and “/domain” keys from Redis with the “{{getv "/subdomain"}}.{{getv "/domain"}}” actions. Everything else in this template file is left untouched by confd.
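
The templated piece of nginx.tmpl is essentially just the server_name line, roughly like this (the listen port is an assumption):

server {
    listen 80;
    server_name {{getv "/subdomain"}}.{{getv "/domain"}};
    # ...the rest of the server block is plain Nginx config, untouched by confd
}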

The last piece of confd is a script that runs at the container’s launch to first create the initial nginx.conf file, then monitor Redis for changes and apply them to nginx.conf. In the script we export a “REDIS” variable, which gets its value from environment variables we define in our Kubernetes “web-deployment.json” file. The first action is an until loop that runs confd to set the initial configuration file; once that succeeds, it moves on to running confd in the background to watch for changes. Next it starts Nginx, and last but not least we need to keep the script running in the foreground so the container does not exit (unless there is an error, which we handle at the beginning of the script), so we just tail the Nginx logs.
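
The actual confd-watch script lives in the repo’s “scripts” folder; a minimal sketch of the pattern, assuming the Redis host and port are passed in as REDIS_HOST and REDIS_PORT and the usual Nginx log paths, looks something like this:

#!/bin/bash
set -eo pipefail

# Redis backend address, built from env vars set in web-deployment.json (names assumed)
export REDIS="$REDIS_HOST:$REDIS_PORT"

# Keep retrying until confd renders the initial nginx.conf successfully
until confd -onetime -backend redis -node "$REDIS"; do
  echo "waiting for confd to render the initial nginx.conf"
  sleep 5
done

# Watch Redis for changes in the background and re-render/reload as needed
confd -interval 10 -backend redis -node "$REDIS" &

# Start Nginx (daemonized), then keep the container in the foreground by tailing its logs
nginx
tail -f /var/log/nginx/access.log /var/log/nginx/error.log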

After we build our Nginx image with “docker build,” we will run it locally on our laptop with “docker run -i -t image /bin/bash” (i.e. docker run -i -t evergreenitco/corp_website_example_nginx:1.0.7 /bin/bash) so we can add the script as “/usr/local/bin/confd-watch” and make the file executable. Then, in our “web-deployment.json” file, we define “confd-watch” as the command to run when the container starts. Exit your bash session in the container, commit the changes using “docker commit,” and push the image to Docker Hub. Now also build the MariaDB and PHP images and push those up as well, using our Dockerfiles under the “docker_build” folder.
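
The run/commit/push sequence looks roughly like this (the container ID and new tag are placeholders):

docker run -i -t evergreenitco/corp_website_example_nginx:1.0.7 /bin/bash
# inside the container: create /usr/local/bin/confd-watch, chmod +x it, then exit
docker commit <container_id> evergreenitco/corp_website_example_nginx:1.0.8
docker push evergreenitco/corp_website_example_nginx:1.0.8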

The “docker build” command using our “docker_build” folder is:

docker build -f docker_build/web/Dockerfile -t evergreenitco/corp_website_example_nginx:1.0.13 .

With our images created and the image definitions updated, we can remove the volume definitions (we will create a volume definition in Kubernetes later for MariaDB’s persistent storage in AWS).

kompose convert

So let’s give this a whirl:

kompose convert --file docker-compose.yml

We can see it creates deployment and service JSON files for each service.

 kompose convert output

Now that our Kubernetes objects are created, we need to make some edits. First we want to increase our replicas, the number of pod copies running for each service; let’s up it to 2 in our deployment files, except for our db service which we will keep at 1. Next I added some labels to our services identifying which app is running and what tier the service belongs to. I also added some resource requests and limits for our pods; more can be read here about resource requests and limits. In our “web-deployment.json” file I also added our “confd-watch” command, added environment variables to define our Redis host and port, and defined a NodePort to access the web service; a rough sketch of these additions is below. After that, time to get our stack up locally using Vagrant.
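
Roughly, the additions look like this (shown as YAML fragments for readability; the repo files are JSON, and the Redis variable names, address and resource numbers here are assumptions):

# web-svc: fixed NodePort for reaching the web service
spec:
  type: NodePort
  ports:
  - port: 80
    nodePort: 30100

# web-deployment: replicas, confd-watch command, Redis env vars and resource requests/limits
spec:
  replicas: 2
  template:
    spec:
      containers:
      - name: web
        image: evergreenitco/corp_website_example_nginx:1.0.13
        command: ["/usr/local/bin/confd-watch"]
        env:
        - name: REDIS_HOST
          value: "192.168.56.100"   # static IP of the development Redis server (assumed)
        - name: REDIS_PORT
          value: "6379"
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 250m
            memory: 256Mi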

Kubernetes – CoreOS Vagrant Environment

Configure Vagrant

CoreOS already has a Vagrant template ready for us to use. Follow the steps in the document to get your local environment up, except we will make some additions to the multi-node Vagrantfile. I found an easy way to forward ports on Vagrant to our NodePorts in Kubernetes from a blog post by Shaun Domingo; just add those additions to the file (sketched below) and we are good to go. I would also recommend increasing the worker count to 2 and giving each worker more memory than the default 512 MB in the “config.rb” file; my laptop has 12GB of RAM, so I gave each worker 2GB, more than enough to run multiple containers inside the environment. I also do not use the Calico network policy in either my Vagrant or AWS environments; I will play with Calico and go over my findings in a future blog post. Now run “vagrant up” and all should be well in the world.
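
For reference, the additions look something like this (variable names follow the CoreOS sample config.rb of the time, and the exact loop variables inside the multi-node Vagrantfile differ slightly; treat this as the idea rather than a drop-in):

# config.rb: two workers with 2GB of RAM each
$worker_count = 2
$worker_vm_memory = 2048

# Vagrantfile, inside the worker definition loop: forward each worker's
# NodePort to a unique port on the host (30100, 30101, ...)
worker.vm.network "forwarded_port", guest: 30100, host: 30100 + i - 1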

On a side note, if your nodes or pods start acting funny after coming back up from a “vagrant halt,” first try deleting the pods, and if that doesn’t help just do a “vagrant destroy” and rebuild the environment.

Configure Redis

On your laptop get a virtual Redis server running, and make sure it is connected to the same host-only adapter as your CoreOS-Kubernetes cluster. In the redis.conf file I turned off “protected-mode” and commented out “requirepass” since this is just for development purposes. Also set a static IP on the host-only adapter so it stays constant for our Kubernetes environment variables. Now, using “redis-cli”, set your keys for “/counter/domain” and “/counter/subdomain”.
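
For example:

redis-cli SET /counter/subdomain "counter"
redis-cli SET /counter/domain "evergreenitco.com"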

kubectl

Now that we have our Vagrant environment up and a supporting Redis server, let’s make sure everything is cool in Kubernetes and grab hold of the ship’s wheel. Run the following command to see the nodes in our CoreOS-Kubernetes cluster:

kubectl get nodes

You should see one controller node and two worker nodes. The fourth box created by Vagrant is an etcd server. Below, the first node shows scheduling disabled; this is our controller node, meaning no pods will be deployed on that server, only on our two worker nodes.

 kubectl get nodes output

Now that our Kubernetes environment is up and ready, let’s start deploying the application stack. When deploying your services and pods, apply your service file first, then your deployment file; this ensures the service exists before we deploy our pods. So let’s deploy the database service first, then the database deployment. Below we issue the command “kubectl apply -f”, first with our “db-svc.json” file and then the “db-deployment.json” file. Do the same for the php and web services in the same order as the db service.

 kubectl apply -f
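
For reference, the full sequence is (assuming the php and web files follow the same naming as the db files):

kubectl apply -f db-svc.json
kubectl apply -f db-deployment.json
kubectl apply -f php-svc.json
kubectl apply -f php-deployment.json
kubectl apply -f web-svc.json
kubectl apply -f web-deployment.json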

Now let’s take a look at our pods deploying in the Kubernetes cluster with the “kubectl get pods” command below. In the first image the status shows the containers are creating, which means the container images are being downloaded from Docker Hub; if you take a look at your network interface it should be pretty busy downloading, and depending on your connection this may take a few minutes. One by one you should see the status of the pods change to Running.

“kubectl get pods” containers creating

“kubectl get pods” containers created and running

Other commands for inspecting our environment are “kubectl get deployments” and “kubectl describe deployments”; the get command gives a quick summary of our deployments, while the describe command gives more detail.

“kubectl get deployments” & “kubectl describe deployments” output

We can do the same with our services using “kubectl get services” and “kubectl describe services.” One thing to note is the “NodePort” for our web service, set to 30100/TCP; this is the port we forwarded in our Vagrantfile so we can test our web application.

“kubectl get services” & “kubectl describe services” output

So now we should be able to connect to our web service through the NodePort set on both Kubernetes worker nodes. To hit the first worker node, browse to http://127.0.0.1:30100 on your laptop and you should see the website come up; to hit the service running on the second worker node, browse to http://127.0.0.1:30101 and you should see the site come up and our counter numbers increase. If you later need to update your service or deployment JSON or YAML files, say with a newer Nginx image, and want to roll it out to Kubernetes, just run “kubectl apply -f” again and then use the commands above to watch your updates roll out to the environment.


Accessing application stack through “NodePort” and Vagrant Port Forwarding

A few more helpful kubectl commands are:

  • kubectl describe pods
  • kubectl describe nodes
  • kubectl logs _pod_name_
  • kubectl exec _pod_name_ env (List environment variables in pod)
  • kubectl exec -ti _pod_name_ -- bash (Open bash terminal in pod)

Kubernetes – CoreOS AWS Environment

Using AWS’s EFS Service to Provide Persistent NFS Storage for CoreOS & Kubernetes

Introduction

All code for deploying in AWS is found here.

When deploying containers to any production environment, you will have data that needs to persist across container life-cycle events, such as your database files. Containers are ephemeral in nature, and by default so is the data inside them. There are two ways of handling this: either have your containers map directly to external storage, be it NFS or iSCSI, or have the underlying container management software manage the persistent storage for you. Kubernetes currently supports the following Persistent Volume types:

  • GCEPersistentDisk
  • AWSElasticBlockStore
  • AzureFile
  • FC (Fibre Channel)
  • NFS
  • iSCSI
  • RBD (Ceph Block Device)
  • CephFS
  • Cinder (OpenStack block storage)
  • Glusterfs
  • VsphereVolume
  • HostPath (single node testing only – local storage is not supported in any way and WILL NOT WORK in a multi-node cluster)

Since I want to use AWS’s Elastic File System (EFS) service, we are going to take a sort of hybrid approach. The reason is that with EFS, AWS gives you a mount target for each Availability Zone you create the EFS file system in. To keep costs down and performance up, we want each node to mount only the mount target in its own AZ.

EFS Mount Points

In Kubernetes there is currently no easy way I found for a node or pod to discover which AZ it is in, especially because you cannot retrieve the AWS instance metadata from the metadata URI (http://169.254.169.254/latest/meta-data/) inside Kubernetes when it is running on top of CoreOS. Since CoreOS itself has access to the instance metadata, why not have a script run on boot, parse the metadata to find which AZ the instance is in, and then mount the corresponding EFS mount target? We can then use a HostPath Persistent Volume in Kubernetes that points at the EFS storage mounted on CoreOS. You might be saying to yourself, “hey, it says above that HostPath is not supported in a multi-node cluster.” True, it wouldn’t work if you were using local storage on the node itself, but since this is an NFS mount that multiple nodes can connect to, it should work like normal NFS storage in a clustered environment. There are others out there running it this way, and that is how I found the cloud-config below to mount the EFS storage on CoreOS.

Configure NFS Mount on CoreOS

I found this Gist on GitHub, cleaned it up a bit, and removed sections that are not required according to the CoreOS documentation. You will add the config to the “cloud-config-worker” file produced by the kube-aws tool described below. This mounts the EFS storage on just the worker nodes, which is all we care about since that is where the pods run. Just edit “fs-XYZ” and “AMAZON_REGION” in the “What=” section with your relevant information. When the worker node boots, it runs an executable that parses the AZ from the instance metadata URI using curl and sed and places it in a variable called “$AZ_ZONE”.
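
The cleaned-up cloud-config is in the AWS repo linked above; purely as an illustration of the idea (I sketch it here as a small script run from a oneshot unit rather than the systemd mount unit the repo actually uses, and fs-XYZ / AMAZON_REGION are placeholders):

#cloud-config
write_files:
  - path: /opt/bin/mount-efs.sh
    permissions: "0755"
    owner: root
    content: |
      #!/bin/bash
      # Discover this instance's Availability Zone from the EC2 metadata service,
      # then mount the matching EFS mount target on /mnt/efs.
      AZ_ZONE=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)
      mkdir -p /mnt/efs
      mount -t nfs4 "${AZ_ZONE}.fs-XYZ.efs.AMAZON_REGION.amazonaws.com:/" /mnt/efs

coreos:
  units:
    - name: mount-efs.service
      command: start
      content: |
        [Unit]
        Description=Mount EFS for this instance's Availability Zone
        After=network-online.target
        Requires=network-online.target

        [Service]
        Type=oneshot
        RemainAfterExit=yes
        ExecStart=/opt/bin/mount-efs.sh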

I will go over how to create and mount Persistent Volumes after we launch our CoreOS–Kubernetes stack.

kube-aws Tool

The folks at CoreOS have a tool called kube-aws that renders a CloudFormation template to deploy a CoreOS and Kubernetes cluster on AWS. Follow the steps from CoreOS, but do not launch the stack yet; we need to edit a few things. First, edit the “cloud-config-worker” file and add our additions from above to mount the EFS storage on the worker nodes. Second, I made some changes to the CloudFormation template exported from the kube-aws tool with the command “kube-aws up --export”; this does not actually bring up the stack, it just exports a CloudFormation JSON template file. I added configuration for bringing up a Multi-AZ Redis cluster, and locked down SSH access to the worker nodes to just my local VPC CIDR instead of 0.0.0.0/0. Take a look at it here, diff it against yours, and modify it to fit your environment. Then, instead of using “kube-aws up”, just import the template and launch the stack using the CloudFormation console or CLI.

Kubernetes Persistent Volumes

Create Persistent Volume

Now with our CoreOS–Kubernetes stack up on AWS, let’s see if our EFS/NFS storage mounted. SSH into one of your worker nodes and run “df -h”; you should see the EFS mount target for the correct AZ mounted at “/mnt/efs”. SSH into the other worker, which should be running in a different AZ, and it should be mounted to the correct mount target for its AZ.

EFS Mounted on CoreOS Worker Node in Availability Zone us-west-2b

EFS Mounted on CoreOS Worker Node in Availability Zone us-west-2a

Now we can use this NFS mount to create HostPath Persistent Volumes. Let’s look at creating a PersistentVolume and a PersistentVolumeClaim. In the “db” folder of my repository, alongside our service and deployment files, there are two new files, “db-nfs-pv.yaml” and “db-nfs-pvc.yaml”. The “db-nfs-pv.yaml” file creates our PersistentVolume. We give this PersistentVolume a name of “db-nfs-pv” with a capacity of 10GB; we could give it a larger number, but I will show you why you may not want to do that when we make our PersistentVolumeClaim. We give it an Access Mode of “ReadWriteOnce” because we will be using the “HostPath” plugin. We set a Reclaim Policy of “Recycle”, which means that when we delete the PVC, the PV will have a basic scrub performed on it. Finally, we create our “HostPath” mount pointing to “/mnt/efs/coreos/db-nfs-pv”; this full path does not have to exist beforehand, and if “/coreos/db-nfs-pv” is not present under “/mnt/efs” Kubernetes will create it automatically.
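
Sketched out, db-nfs-pv.yaml looks roughly like this (the authoritative file is in the repo):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: db-nfs-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  hostPath:
    path: /mnt/efs/coreos/db-nfs-pv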

So lets create the PersistentVolume on Kubernetes with the command:

kubectl apply -f db-nfs-pv.yaml

Then let’s take a look at the PV in Kubernetes with the command “kubectl get pv”:

PersistentVolume in Kubernetes

We see it was created and its Status is “Available” to be claimed by a PVC.

Let’s now take a look at a PersistentVolumeClaim.

Create Persistent Volume Claim

If we look at our “db-nfs-pvc.yaml” file, we can see it is very similar to creating other Kubernetes resources. We give it a name of “db-nfs-pvc”, then we look for a PersistentVolume with an Access Mode of “ReadWriteOnce” and request a storage capacity of 3GB. We run the same “kubectl apply -f” command as above, but with our “db-nfs-pvc.yaml” file.
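
For reference, db-nfs-pvc.yaml is roughly:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-nfs-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi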

Now if we run “kubectl get pvc” and “kubectl get pv”, we can see the PVC is created and Bound (my guess is that it reports a capacity of “0” because it can’t handle EFS advertising 8.0 exabytes of storage), and the PV now shows the CLAIM. Our claim is only 3GB, which means up to 7GB is potentially wasted; from what I have read from others, this is the expected behavior in Kubernetes. I still need to test what happens if the container uses more than the capacity specified in the PVC, but I want to do that once I have better monitoring of the Kubernetes-CoreOS cluster in place, which I will go over in the next part of this series.

PersistentVolumeClaim in Kubernetes

Mounting Our Persistent Volume Claim In MariaDB container

Now that we have our PV and PVC set, it’s time to modify our deployment JSON file to mount the PVC in the container. Below we add “volumes” and “volumeMounts” configurations to our “db-deployment.json” file. We first give the persistentVolumeClaim a name of “dbpd” in the file and specify the claimName of “db-nfs-pvc”, which is the name of our PVC. Then, under the container section, we create the volumeMount and mount it at “/var/lib/mysql”.
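
The relevant fragment of the pod template in db-deployment.json looks roughly like this (the container name is an assumption, and surrounding fields are trimmed for brevity):

"spec": {
  "containers": [
    {
      "name": "db",
      "volumeMounts": [
        { "name": "dbpd", "mountPath": "/var/lib/mysql" }
      ]
    }
  ],
  "volumes": [
    {
      "name": "dbpd",
      "persistentVolumeClaim": { "claimName": "db-nfs-pvc" }
    }
  ]
}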

Now bring up the stack like we did in Vagrant, applying the svc file first, then the deployment, for db, php and web. Once the stack is up and you run “kubectl describe services web”, you should get the Load Balancer Ingress address for reaching our web service. Below I will discuss the downsides of how Kubernetes implements this ingress.

kubectl get services web output

Now, to test the PV, delete the db deployment with “kubectl delete -f db-deployment.json”, check that the pod is gone, and then reapply the file. You should see that the counter numbers on the webpage continue where they left off, because deleting the deployment did not delete the PVC. The data on the PV is only deleted when the PVC is deleted using “kubectl delete -f db-nfs-pvc.yaml”.

Kubernetes AWS Elastic Load Balancer

As we can see, for each web service where we specify “type: LoadBalancer”, Kubernetes allocates a Classic ELB, which cannot perform routing based on URL path patterns; it can only do very basic Layer 4 TCP and Layer 7 HTTP/S (supporting only the X-Forwarded-For header). With many public-facing web services, this would quickly become expensive. Kubernetes recognized this and created the concept of Ingress Controllers: a simple Nginx or HAProxy deployment on Kubernetes, exposed to the Internet, that is in charge of routing traffic to the correct backend service. This is currently in beta, but more can be read about it here, and I plan to experiment with it and show my findings in a future post.

Conclusion

As we have seen, CoreOS has made it really easy to deploy a CoreOS–Kubernetes cluster on Vagrant and AWS. I have also shown how easy it is to use Amazon’s EFS service for persistent storage in Kubernetes.

As I learn more about containerization I can see how powerful and flexible it is as a platform, but as with any new technology there are a few minor inconveniences, such as the AWS ELB limitations. With the community so strongly behind Docker, Kubernetes and other container technologies, all I see is a bright future ahead for containerization.

In my next posts I want to build upon what we have done so far with the following topics:

  • Pod Health Checks
  • Logging and Monitoring (Fluentd & Elasticsearch with Kibana, Heapster, Slack Integration)
  • Horizontal Pod Autoscaling
  • Moving the DB away from a single pod to a MariaDB Galera Cluster on Kubernetes.
  • Ingress Controllers
  • Playing with different storage types and scenarios.
  • Moving from Redis to etcd
  • Other things I think of or come across

Till next time, Cheers!
