Containers, Docker, and Kubernetes Part 2

What is Kubernetes and how does it make containerized infrastructure easy?

Steel Container On Dock by Freestocks.org is licensed under Creative Commons Zero (CC0)

In Part 1 of this series I touched on containers, Docker, and how these technologies are rapidly redefining operations and infrastructure across the industry. However, just knowing about containers and Docker isn’t enough to know how to apply these technologies to your stack. Here, in Part 2 of this blog series, I will go over Kubernetes, the container orchestration tool I’ve chosen to provide the support structure for fully moving to a containerized infrastructure.

Google is an avid user of containers, running billions of them on hundreds of thousands of servers over many years. Over time they’ve built up internal tools to help manage this massive infrastructure, a tool suite they call Borg. Over the past few years, many on the Borg team have taken the lessons learned along the way and applied them in a new orchestration tool they call Kubernetes, releasing it to the public as an open source project.

Kubernetes, like Borg, is a suite of tools and services that work together to provide answers to all of the questions I posed at the end of Part 1. This is a complex system with many moving parts, but it is production ready and has already been heavily tested and used by many companies outside of Google. Learning such a system is not trivial, but the Kubernetes project has some fantastic documentation. Every aspect of the system is covered, including multiple examples, suggestions, and possible error cases. They even provide an in-browser interactive tutorial that I highly recommend running through.

All that said, you’ll quickly realize that there are many parts of Kubernetes to grok, but that’s okay. You don’t need to understand every aspect of the system to successfully use it. For this post I’m going to go over what I’ve learned about Kubernetes, its basic concepts, and what pieces I currently use.

Resources

At the highest level, a Kubernetes cluster will consist of many Resources. These resources can be defined in JSON or YAML, though I personally prefer YAML as I find it easier to both read and write, and it supports commenting sections of the configuration.

I won’t cover all of the available Resource types, as that list is quite large and growing, but will instead cover the Resources that you need to know to set up an application in the Kubernetes way. Specifically, the resources I’ll cover here are: Pod, Deployment, Service, and Namespace. I’ll also cover a few other concepts and layers that these resources manage or make use of.

Node

First off is the Node. This is nothing more than the server or virtual machine on which the Kubernetes cluster is running. A Node provides the computing resources necessary to run your containers.

Pod

The lowest level of abstraction that you’ll work with in Kubernetes is the Pod. A Pod consists of one or more containers that run on the same Node and have shared resources. Containers in a pod are able to communicate with each other via localhost, providing a way to run an application consisting of several tightly-knit containers in a scalable fashion.

The Pod is the immutable layer of Kubernetes. Pods are never updated, but instead are shut down, thrown away, and replaced. They can be started and stopped manually, but that is not common in practice. The configuration and management of Pods in the cluster will almost always be managed by a Deployment.
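To make this concrete, here is a minimal standalone Pod manifest (a sketch for illustration only; in practice you would let a Deployment create and manage Pods like this for you):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    role: web
spec:
  containers:
  # A single-container Pod; additional containers listed here
  # would share the Pod's network and be reachable via localhost.
  - name: nginx
    image: nginx:1.7.9
    ports:
    - containerPort: 80
```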

Deployment

Deployments are the workhorse of managing and running a Kubernetes cluster. A Deployment is where all of the heavy lifting happens to run and manage Pods. A Deployment’s configuration specifies how many Pods should be running, what those Pods look like, and how Pods are started and shut down, whether during a rollout or in response to issues with a Node or the cluster.

(Technically, a Deployment hands off some of that work to a ReplicaSet it creates for you, but that isn’t necessary to understand at this time.)

An example Deployment that brings up three nginx Pods:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      role: web
  template:
    metadata:
      labels:
        role: web
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80

Service

Having multiple Deployments each managing many Pods and containers is great, but how do you get a web request from the Internet into your Rails application? This is where Services come in. Services provide the entry point and exposure for your Deployments and Pods, both to other containers in your cluster and to the Internet itself. A NodePort Service gives Pods a stable address for internal access, and can also expose a high-numbered port (30000 - 32767 by default) on every Node that maps down into the containers. If you are on Amazon Web Services (AWS) or Google Kubernetes Engine (GKE), you can instead make use of the LoadBalancer Service type. LoadBalancer Services work with your cloud provider to provision an actual load balancer, configured with the proper rules to forward traffic into your cluster.

This may be hard to follow, so here’s an example use case to help show how Services are used. Let’s say we are running nginx as a front-end to our Rails application, and we need Redis to be available internally only. Assuming we’re on GKE or AWS, we want a load balancer to point to nginx, nginx to point to Rails, and Rails to have access to Redis. The labels in each Service’s selector field are used by Kubernetes to hook up the Service to its matching Deployment.

##
# nginx
# Listen to the world on port 80 and 443
##
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    role: web
spec:
  type: LoadBalancer
  ports:
  - name: http
    port: 80
    targetPort: 80
  - name: https
    port: 443
    targetPort: 443
  selector:
    # Find all Resources that are tagged with the "role: web" label
    # In our case, it will find the nginx Deployment mentioned above
    role: web
---
##
# Rails
# Listen for traffic on 8080 so we don't have to run as root.
##
apiVersion: v1
kind: Service
metadata:
  name: rails
  labels:
    role: rails
spec:
  type: NodePort
  ports:
  - port: 8080
    targetPort: 8080
  selector:
    role: rails
---
##
# Redis
##
apiVersion: v1
kind: Service
metadata:
  name: redis
  labels:
    role: redis
spec:
  type: NodePort
  ports:
  - port: 6379
    targetPort: 6379
  selector:
    role: redis

Namespace

On top of all of this you can provide a Namespace. A Namespace is nothing more than a text identifier you can use to encapsulate your infrastructure. Kubernetes makes use of Namespaces internally to segregate its own services (kube-dns, kube-proxy, etc.) from your application by putting them in the kube-system namespace. If you don’t provide a Namespace, Kubernetes will put your resources in the default namespace, and for most use cases that will suffice. However, if you are running, for example, an infrastructure used by multiple different teams, using Namespaces can help provide a logical separation as well as prevent collisions and confusion.

An example Namespace:

apiVersion: v1
kind: Namespace
metadata:
  name: my-app
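Once the Namespace exists, a resource opts into it via its metadata. As a sketch, here is the Redis Service from above placed into the my-app namespace:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: redis
  # Resources without an explicit namespace land in "default".
  namespace: my-app
  labels:
    role: redis
spec:
  type: NodePort
  ports:
  - port: 6379
    targetPort: 6379
  selector:
    role: redis
```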

Labels

Kubernetes makes heavy use of labels to both tag and find resources across your clusters. You can see the use of labels in the examples above, where the Deployment has a labels section and the Services have matching selectors. With labels, Kubernetes will link up matching Resources into the full stack for you, resulting in: Service -> Deployment -> Pod -> Container.

Like I said at the beginning, Kubernetes is a large ecosystem that’s getting larger with each release, but once you understand these initial Resources and how to use them, continuing your education on the other Resources that Kubernetes provides gets much easier. For some next steps, I recommend looking into the following:

  • Every application has information that needs to stay secure (database credentials, etc.). Secrets are how Kubernetes takes your sensitive information and makes it available to Pods and containers.
  • DaemonSet for when you want to make sure that some or all Nodes run a copy of a Pod.
  • Job for when you have one-off tasks that you need to run on the cluster (for example, we use a Job to run database migrations).
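As a small taste of that last item, a Job manifest looks much like a Deployment but runs its Pod to completion instead of keeping it alive. This is a sketch only; the image name and command here are hypothetical placeholders, not the actual migration setup:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
spec:
  template:
    spec:
      containers:
      - name: migrate
        # Placeholder image and command for illustration.
        image: my-app:latest
        command: ["rake", "db:migrate"]
      # Jobs run to completion, so the Pod must not restart on success.
      restartPolicy: Never
```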

In Part 3 of this series I will dive into more technical details of setting up, configuring, and managing your own Kubernetes cluster!


To skip around to other parts of this blog series, use the links below.

Part 1 - Looking at containerized infrastructure

Part 3 - How to get started with a Kubernetes configuration


Jason is a senior developer who has worked in the front-end, back-end, and everything in between. He has a deep understanding of all things code and can craft solutions for any problem. Jason leads development of our hosted CMS, Harmony.
