Introduction to Containers and Container Security
Introduction to Kubernetes
Following Docker's impact on the software industry, Google open-sourced Kubernetes in 2014. Kubernetes is a container orchestration platform, designed for running and managing containers at scale. Kubernetes makes deployment of a containerized application to a cluster of machines easy. It also has integrated support for service discovery, load balancing, and scaling.
Kubernetes is a Greek word that means helmsman or pilot, helm being a ship's steering wheel. Speaking of helms, there is another tool usually used with Kubernetes called Helm. Helm makes writing Kubernetes manifests more DRY (don't repeat yourself) and reusable by separating the common parts of the manifests into templates. As you can imagine, the Kubernetes ship can get really wet at times.
Some of the benefits of using Kubernetes are:
- Automation: Kubernetes makes deploying an application to a cluster of machines easy. So, instead of SSHing into the machine, getting the latest version of the code (e.g. with git pull), building the application, and running it, you can just run a single command and Kubernetes will take care of the rest.
- High Availability: Kubernetes can run multiple instances of the same application. If one of the instances fails, Kubernetes will restart it. Servers can fail, containers will die, but you can't kill the helmsman that easily!
- Out of the Box: Kubernetes comes with a lot of features out of the box. It has support for service discovery, load balancing, and scaling. You can also extend it with plugins and custom resources.
Now that we've learned how awesome Kubernetes is as a whole, let's learn about its pieces.
Kubernetes Terminology
In this part, we're going to discuss the different building blocks of a Kubernetes application. As Kubernetes is already overwhelming for beginners, we're going to skip details about how it operates.
Let's start with a list of basic Kubernetes concepts:
- Pod: A pod is the smallest unit of deployment in Kubernetes. It's usually used to run a single container. It's also used to run multiple containers that are tightly coupled and need to share resources.
- Service: A service is an abstraction that defines a logical set of pods and a policy to access them. It gives those pods a stable address, inside the cluster by default and, depending on its type, from the outside world as well.
- Deployment: A deployment is a declarative way to manage pods. It's used to manage pods that are part of the same application, and to scale them up and down. If you create a deployment with 3 replicas, Kubernetes will make sure that there are always 3 pods running.
- Namespace: A namespace is a way to divide cluster resources between multiple users. It's used to separate different environments like production and staging.
- Ingress: An ingress is an API object that manages external access to the services in a cluster. It's used to expose a service to the outside world.
These terms sound overwhelmingly complicated, because they are. But they'll make more sense when we start using them. So, let's do it!
Getting Started with Kubernetes
A Kubernetes cluster usually consists of multiple machines. There are two types of machines in a Kubernetes cluster:
- Master node (called a control plane node in newer documentation): It's the machine that runs the Kubernetes control plane. It's responsible for managing the cluster.
- Worker node: It's the machine that runs the Kubernetes worker. It's responsible for running the containers.
If one of the worker machines fails, Kubernetes can reschedule its pods onto the remaining machines. This is what makes the cluster highly available.
To test Kubernetes locally (on a single machine), there are the following options:
- Docker Desktop: This tool is a cross-platform GUI for Docker. It comes with a single-node Kubernetes cluster.
- Minikube: It's a tool that runs a single-node Kubernetes cluster inside a virtual machine. It's CLI-based and can be used on Linux, macOS, and Windows.
- Kind: Stands for Kubernetes in Docker. It's a tool that runs Kubernetes clusters whose nodes are Docker containers. By default, it creates a single-node cluster.
At the time of writing, there are other options to run Kubernetes locally as well, like MicroK8s, K3s, and K0s.
And of course there is the option of running Kubernetes on a cloud provider like AWS, GCP, or Azure. Or perhaps smaller cloud providers like DigitalOcean or Linode, for a more cost-effective solution.
Now that we have a Kubernetes cluster running, let's get our hands dirty with some Kubernetes commands.
Kubernetes Hello World
Before starting with the example, we need to have kubectl installed. We mentioned how to install it in the technical requirements section.
kubectl can connect to one or more Kubernetes clusters, so it has a concept of contexts. To connect to a different cluster, we need to switch contexts. We can list the available contexts with the following command:
$ kubectl config get-contexts
The output on my machine is something like this:
CURRENT NAME CLUSTER AUTHINFO
docker-desktop docker-desktop docker-desktop
* minikube minikube minikube
Let's switch to the docker-desktop context:
$ kubectl config use-context docker-desktop
If you want to use a different cluster, you can of course use the context of your choice.
Now let's create a pod with the kubectl run command:
$ kubectl run hello-world --image=hello-world
The output should be something like this:
pod/hello-world created
We can list the pods with the kubectl get pods command:
$ kubectl get pods
The output should be something like this:
NAME READY STATUS RESTARTS AGE
hello-world 0/1 CrashLoopBackOff 3 (26s ago) 72s
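The pod is in the CrashLoopBackOff state. This is expected, as the hello-world image doesn't have a process that keeps running. It just prints a message and exits. Pods created with kubectl run are restarted whenever their container exits, so Kubernetes keeps restarting it, waiting a little longer each time.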
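The "back-off" in the status name refers to the growing delay between restarts. Here is a minimal sketch of that schedule, assuming the delays described in the Kubernetes documentation (10 seconds, doubling each time, capped at 5 minutes). The function name backoff_delays is made up for illustration, and the exact timings may differ between versions:

```shell
# Print the assumed back-off delay (in seconds) before each of the first N restarts.
# Assumption: 10-second base, doubling each time, capped at 300 seconds.
backoff_delays() {
  count=$1
  delay=10
  cap=300
  i=0
  out=""
  while [ "$i" -lt "$count" ]; do
    out="$out$delay "
    delay=$((delay * 2))
    if [ "$delay" -gt "$cap" ]; then
      delay=$cap
    fi
    i=$((i + 1))
  done
  # Trim the trailing space before printing.
  echo "${out% }"
}

backoff_delays 7
```

Under these assumptions the delays grow quickly at first and then stay at the cap, which is why a crash-looping pod restarts less and less often.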
We can get more information about the pod with the kubectl describe pod command:
$ kubectl describe pod hello-world
The output should be something like this:
Events:
Type Reason Age Message
---- ------ ---- -------
Normal Scheduled 37s Successfully assigned hello-world to docker-desktop
Normal Pulled 34s Successfully pulled image "hello-world" in 1.7959s
Normal Pulled 33s Successfully pulled image "hello-world" in 1.2793s
Normal Pulling 18s Pulling image "hello-world"
Normal Created 17s Created container hello-world
Normal Started 17s Started container hello-world
Normal Pulled 17s Successfully pulled image "hello-world" in 1.1725s
Warning BackOff 5s Back-off restarting failed container
We can look at the logs of the pod with the kubectl logs command:
$ kubectl logs hello-world
The output should be something like this:
Hello from Docker!
The rest of the output is omitted, as it's the same output we got when we ran the docker run hello-world command.
Now let's delete the pod with the kubectl delete pod command:
$ kubectl delete pod hello-world
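If you'd rather see the pod finish instead of crash-looping, one option is to tell Kubernetes not to restart it. The following is a minimal sketch that writes a pod manifest with restartPolicy set to Never. The file name hello-world-pod.yaml is chosen only for illustration:

```shell
# Write a pod manifest that runs the hello-world image exactly once.
# The quoted 'EOF' keeps the shell from expanding anything inside the manifest.
cat > hello-world-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: hello-world
spec:
  restartPolicy: Never
  containers:
    - name: hello-world
      image: hello-world
EOF

# To try it on your cluster:  kubectl apply -f hello-world-pod.yaml
# To clean up afterwards:     kubectl delete -f hello-world-pod.yaml
```

With this policy, the pod should end up in the Completed state after printing its message, instead of being restarted.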
In real-world applications, we usually don't create pods directly. We use deployments instead, and we describe those deployments in YAML manifests. Going a step further, we usually don't write the deployment manifests by hand either. We use Helm charts instead, which are also made of YAML files.
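To make the idea of a manifest concrete, here is a minimal sketch of one for a deployment named hello-world that runs a busybox loop. The file name is chosen only for illustration, and the manifest is an assumption about what a hand-written version could look like, not something taken from a real project:

```shell
# Write a deployment manifest with one replica of a busybox container
# that prints "Hello World" every second.
cat > hello-world-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
  labels:
    app: hello-world
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
        - name: busybox
          image: busybox
          command: ["/bin/sh", "-c", "while true; do echo Hello World; sleep 1; done"]
EOF

# To try it on your cluster:  kubectl apply -f hello-world-deployment.yaml
```

If you apply it, delete it again with kubectl delete -f hello-world-deployment.yaml before continuing, since the next step creates a deployment with the same name.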
Now let's create a deployment with the kubectl create deployment command:
$ kubectl create deployment hello-world --image=busybox -- /bin/sh -c \
"while true; do echo Hello World; sleep 1; done"
Then let's list the main Kubernetes resources in the current namespace with the kubectl get all command:
$ kubectl get all
The output should be something like this:
NAME READY STATUS RESTARTS AGE
pod/hello-world-57d969cbb9-tkdg4 1/1 Running 0 9s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 214d
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/hello-world 1/1 1 1 10s
NAME DESIRED CURRENT READY AGE
replicaset.apps/hello-world-57d969cbb9 1 1 1 10s
You can see there is a pod, a deployment, a service, and a replicaset. Let's look into each one of them in more detail.
Pod
You can think of a pod as a container. It's called a pod (and not a container) in Kubernetes terminology, because it could potentially run multiple containers. But in most cases, it runs only one container.
To get more information about the pod, we can use the kubectl describe pod command:
$ kubectl describe pod hello-world-57d969cbb9-tkdg4
You should use your own pod name here. The output should be something like this:
Name: hello-world-57d969cbb9-tkdg4
Namespace: default
Priority: 0
Service Account: default
Node: docker-desktop/192.168.65.4
Start Time: Sun, 17 Sep 2023 23:24:17 +0200
Labels: app=hello-world
pod-template-hash=57d969cbb9
Annotations: <none>
Status: Running
IP: 10.1.1.0
IPs:
IP: 10.1.1.0
Controlled By: ReplicaSet/hello-world-57d969cbb9
Containers:
busybox:
Container ID: docker://b683e2c5613ceb50dfc3166bed0506394cc4dc8688
Image: busybox
Image ID: docker-pullable://busybox@sha256:b683e2c5613cecc4dc8688
Port: <none>
Host Port: <none>
Command:
/bin/sh
-c
while true; do echo Hello World; sleep 1; done
State: Running
Started: Sun, 17 Sep 2023 23:24:19 +0200
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-qtjkm
There is even more description on the terminal, but we copied only up to the Containers section. You can see that the pod has a container called busybox. It's the container we created with the kubectl create deployment command.
You can also see the command. The labels and annotations are listed as well, and they help us identify the pod.
We could also have selected the pod by its label instead of its name:
$ kubectl describe pod -l app=hello-world
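The -l flag selects objects whose labels contain every key=value pair in the selector. A small sketch of that matching rule in plain shell follows. The function matches_selector is hypothetical and only handles the simple equality form used here, not the full selector syntax:

```shell
# Succeed if every key=value pair in the selector appears among the labels.
# Both arguments are comma-separated key=value lists.
matches_selector() {
  labels=",$1,"
  selector=$2
  old_ifs=$IFS
  IFS=,
  for term in $selector; do
    case "$labels" in
      *",$term,"*) ;;                      # this pair is present, keep checking
      *) IFS=$old_ifs; return 1 ;;         # a pair is missing, no match
    esac
  done
  IFS=$old_ifs
  return 0
}

pod_labels="app=hello-world,pod-template-hash=57d969cbb9"
if matches_selector "$pod_labels" "app=hello-world"; then
  echo "selected"
else
  echo "not selected"
fi
```

The pod above carries both labels, so it is selected by app=hello-world alone as well as by a selector that names both labels.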
Deployment
Deployments are ways to manage pods. They are declarative, so we don't need to create pods directly. In a deployment, we specify the number of replicas we want, and Kubernetes will make sure that there are always that many replicas running.
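The idea of "making sure there are always that many replicas" can be pictured as a loop that compares the desired count with the running count and acts on the difference. The following is a purely illustrative sketch of one such comparison. The function name reconcile is made up, and the real controllers are considerably more involved:

```shell
# Compare the desired replica count with the number of running pods
# and print the action a controller would take. Purely illustrative.
reconcile() {
  desired=$1
  running=$2
  if [ "$running" -lt "$desired" ]; then
    echo "create $((desired - running)) pod(s)"
  elif [ "$running" -gt "$desired" ]; then
    echo "delete $((running - desired)) pod(s)"
  else
    echo "nothing to do"
  fi
}

reconcile 3 1   # one pod running, three wanted
reconcile 3 3   # already at the desired count
```

Because we only state the desired number, the same comparison covers a crashed pod, a failed machine, and a manual scale-up.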
To get more information about the deployment, we can use the kubectl describe deployment command:
$ kubectl describe deployment hello-world
The output should be something like this:
Name: hello-world
Namespace: default
CreationTimestamp: Sun, 17 Sep 2023 23:24:17 +0200
Labels: app=hello-world
Annotations: deployment.kubernetes.io/revision: 1
Selector: app=hello-world
Replicas: 1 desired | 1 updated | 1 total | 1 available
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=hello-world
Containers:
busybox:
Image: busybox
Port: <none>
Host Port: <none>
Command:
/bin/sh
-c
while true; do echo Hello World; sleep 1; done
Environment: <none>
Mounts: <none>
Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets: <none>
NewReplicaSet: hello-world-57d969cbb9 (1/1 replicas created)
Events: <none>
The deployment has information like the number of replicas, the strategy for updating the replicas, and the pod template.
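The two percentages in RollingUpdateStrategy are turned into pod counts before an update starts. As a worked sketch, assuming the documented rounding rules (max surge is rounded up, max unavailable is rounded down), the arithmetic looks like this. The function names are made up for illustration:

```shell
# Max surge given as a percentage is rounded up to a whole pod.
max_surge() {
  replicas=$1
  percent=$2
  echo $(( (replicas * percent + 99) / 100 ))
}

# Max unavailable given as a percentage is rounded down to a whole pod.
max_unavailable() {
  replicas=$1
  percent=$2
  echo $(( replicas * percent / 100 ))
}

for n in 1 4 10; do
  echo "replicas=$n surge=$(max_surge "$n" 25) unavailable=$(max_unavailable "$n" 25)"
done
```

For the single-replica deployment above, this gives a surge of 1 and an unavailable count of 0. In other words, during an update Kubernetes may start one extra pod and should not take the existing one down before the new one is ready.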
Service
Services are logical objects that make pods reachable at a stable address, from inside the cluster and, depending on the service type, from outside it. I say "logical" because services don't have a physical representation. They are just some rules that Kubernetes uses to route traffic to the pods.
To get more information about the service, we can use the kubectl describe service command:
$ kubectl describe service kubernetes
The output should be something like this:
Name: kubernetes
Namespace: default
Labels: component=apiserver
provider=kubernetes
Annotations: <none>
Selector: <none>
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.96.0.1
IPs: 10.96.0.1
Port: https 443/TCP
TargetPort: 6443/TCP
Endpoints: 192.168.65.4:6443
Session Affinity: None
Events: <none>
The service contains details such as the IP address, port, and endpoints. The endpoints are the addresses the service routes traffic to. They are usually pods, but for the built-in kubernetes service shown here, the endpoint is the API server.
Here the type is ClusterIP, which is the default type for services. It means that the service is only accessible from inside the cluster.
We can change the type to NodePort to make the service accessible from outside the cluster.
Replicaset
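Replicasets are lower-level than deployments. A replicaset only keeps a given number of identical pods running, while a deployment creates and replaces replicasets to roll out new versions. We rarely create replicasets directly. They are used by deployments to manage pods.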
To get more information about the replicaset, we can use the kubectl describe replicaset command:
$ kubectl describe replicaset hello-world-57d969cbb9
The output should be something like this:
Name: hello-world-57d969cbb9
Namespace: default
Selector: app=hello-world,pod-template-hash=57d969cbb9
Labels: app=hello-world
pod-template-hash=57d969cbb9
Annotations: deployment.kubernetes.io/desired-replicas: 1
deployment.kubernetes.io/max-replicas: 2
deployment.kubernetes.io/revision: 1
Controlled By: Deployment/hello-world
Replicas: 1 current / 1 desired
Pods Status: 1 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=hello-world
pod-template-hash=57d969cbb9
Containers:
busybox:
Image: busybox
Port: <none>
Host Port: <none>
Command:
/bin/sh
-c
while true; do echo Hello World; sleep 1; done
Environment: <none>
Mounts: <none>
Volumes: <none>
Events: <none>
The replicaset has information like the number of replicas, the pod template, and the deployment that manages it.
Loosely speaking, only the pods correspond to running processes. The deployment, the replicaset, and the service are logical objects that describe how those pods are managed and reached.
Kubernetes Hello World with Helm
A common way of packaging Kubernetes applications is using Helm charts. Helm charts are packages of Kubernetes manifests.
First, make sure helm is installed on your machine. Then let's clone the following git repository that contains a link shortener application:
$ git clone https://github.com/aerabi/link-shortener-js.git
Then let's go to the link-shortener-js directory:
$ cd link-shortener-js
At the root of the repository, there is a directory called chart. Let's take a look at it:
$ tree chart
The output should be something like this:
chart
├── Chart.yaml
├── templates
│   ├── _helpers.tpl
│   ├── deployment.yaml
│   ├── ingress.yaml
│   └── service.yaml
└── values.yaml
2 directories, 6 files
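The files under templates contain placeholders that Helm fills in from values.yaml. To illustrate the idea only, the sketch below uses sed as a stand-in for Helm's template engine. The template line and the value are hypothetical, and real charts are rendered by Helm itself, not by sed:

```shell
# A hypothetical one-line template, similar in spirit to a line in a chart template.
template='replicas: {{ .Values.replicaCount }}'

# A hypothetical value, standing in for an entry in values.yaml.
replica_count=2

# Replace the placeholder with the value (sed stands in for Helm here).
rendered=$(printf '%s\n' "$template" | sed "s/{{ \.Values\.replicaCount }}/$replica_count/")
echo "$rendered"
```

Keeping the values separate from the templates is what lets the same chart be installed with different settings.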
Let's run the following command to install the application in the link-shortener namespace:
$ kubectl create namespace link-shortener
$ helm install link-shortener-js chart --namespace link-shortener
The output should be something like this:
NAME: link-shortener-js
LAST DEPLOYED: Sun Sep 24 11:55:14 2023
NAMESPACE: link-shortener
STATUS: deployed
REVISION: 1
TEST SUITE: None
Let's list everything in the link-shortener namespace:
$ kubectl get all -n link-shortener
The output should be something like this:
NAME READY STATUS RESTARTS AGE
pod/link-shortener-js-5458ff4bb-nsrx2 1/1 Running 0 2m17s
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/link-shortener-js 10.102.219.230 <none> 3000/TCP 2m17s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/link-shortener-js 1/1 1 1 2m17s
NAME DESIRED CURRENT READY AGE
replicaset.apps/link-shortener-js-5458ff4bb 1 1 1 2m17s
Let's describe the service:
$ kubectl describe service link-shortener-js -n link-shortener
The output should be something like this:
Name: link-shortener-js
Namespace: link-shortener
Labels: app.kubernetes.io/instance=link-shortener-js
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=link-shortener-js
app.kubernetes.io/version=hashmap
helm.sh/chart=link-shortener-js-0.1.0
Annotations: meta.helm.sh/release-name: link-shortener-js
meta.helm.sh/release-namespace: link-shortener
Selector: app.kubernetes.io/instance=link-shortener-js,
app.kubernetes.io/name=link-shortener-js
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.102.219.230
IPs: 10.102.219.230
Port: api 3000/TCP
TargetPort: 3000/TCP
Endpoints: 10.1.1.1:3000
Session Affinity: None
Events: <none>
As you can see, the service is only accessible from inside the cluster. To make it accessible from outside the cluster, we need to change the service type to NodePort.
Go to the chart/values.yaml file and change the service.type value to NodePort.
imagePullSecrets: []
nameOverride: ""
fullnameOverride: ""
service:
  type: NodePort # <-- Change this line
  ports:
    internal:
      name: "api"
      number: 3000
Then run the following command to upgrade the application:
$ helm upgrade link-shortener-js chart --namespace link-shortener
The output should be something like this:
Release "link-shortener-js" has been upgraded. Happy Helming!
NAME: link-shortener-js
LAST DEPLOYED: Sun Sep 24 12:24:18 2023
NAMESPACE: link-shortener
STATUS: deployed
REVISION: 2
TEST SUITE: None
Let's list everything in the link-shortener namespace again:
$ kubectl get all -n link-shortener
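You can see that the service is now of type NodePort. On my machine, port 3000 is mapped to node port 32045, so the service is reachable on that port. Kubernetes picks the node port for you (by default from the range 30000-32767), so use the port shown in your own output.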
$ curl localhost:32045
The output should be the following:
Hello World!
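The port number will most likely be different on your machine. As a small sanity check, assuming the default NodePort range of 30000-32767 (a cluster administrator can change it), the following sketch tests whether a port could have been assigned as a node port. The function name is made up for illustration:

```shell
# Succeed if the given port lies in the default NodePort range (30000-32767).
in_default_node_port_range() {
  port=$1
  [ "$port" -ge 30000 ] && [ "$port" -le 32767 ]
}

if in_default_node_port_range 32045; then
  echo "32045 is in the default NodePort range"
fi

# On a live cluster, the assigned port could be read with:
#   kubectl get service link-shortener-js -n link-shortener \
#     -o jsonpath='{.spec.ports[0].nodePort}'
```

Port 3000, by contrast, falls outside this range, which is why the service needs a separate node port to be reachable from outside the cluster.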
Voilà! You have a Kubernetes application running. Next stop is talking about container security.
Exercises
- In the example above, investigate the Chart.yaml file. What's the name of the chart? What's the version of the chart?
- In the example above, investigate the templates/deployment.yaml file. What's the name of the deployment? What's the name of the pod template? What's the name of the container? To find out, you should investigate the _helpers.tpl file as well. The _helpers.tpl file is a template that is used by the other templates. It's usually used to define reusable parts of the templates.
- The templates/deployment.yaml file has a replicas field. The value is {{ .Values.replicaCount }}, which means it reads the value from the values.yaml file. What's the value of replicaCount in the values.yaml file? Change the value to 2 and upgrade the application with the helm upgrade command. What would you expect to be the output of the kubectl get all command?