Kubernetes 101: Beginner’s Cookbook for Seamless Container Orchestration

Mastering Kubernetes: Simplifying Container Orchestration with Essential Concepts and Deployment Techniques

If you’re familiar with the name Kubernetes but find the learning curve daunting, you’re not alone. Kubernetes is a robust orchestrator designed to streamline application deployment and management across clusters of machines. However, its complexity can be overwhelming, even for experienced developers. The good news is that mastering Kubernetes unlocks a world of possibilities, as many of its fundamental concepts are transferable to other orchestrators like Docker Swarm.

In this comprehensive article, we aim to demystify Kubernetes by breaking down its most commonly used concepts into relatable system administration terms. By bridging the gap between familiar concepts and Kubernetes, we’ll guide you through deploying a simple web server and highlight the interactions between different resources within the cluster. Additionally, we’ll provide insights into typical command-line interactions to streamline your Kubernetes workflow.

While our focus is primarily on the developer side of Kubernetes, we’ll also provide valuable resources for cluster administration to ensure a well-rounded understanding of the platform. Take the first step towards mastering Kubernetes and unlock the full potential of container orchestration.

Terminology and concepts

Understanding Kubernetes Architecture: Demystifying Cluster Components and Node Roles

In the realm of Kubernetes, the fundamental building block is the cluster itself, which encapsulates all the necessary components. Within this cluster, there are two crucial types of nodes: the Control Plane and the Worker Nodes.

The Control Plane is a set of processes responsible for managing cluster resources, scheduling workloads, monitoring health, and more. Typically, a Kubernetes cluster runs multiple control plane nodes to ensure availability and distribute the workload efficiently. As a developer, your primary interaction point with Kubernetes will be the API server, which every client (including the kubectl command line) talks to.

On the other hand, Worker Nodes are the hosts that run a local Kubernetes agent called the kubelet and a networking process known as kube-proxy. The kubelet executes instructions from the control plane on the local container runtime, such as containerd or CRI-O (Docker Engine was the usual choice on older clusters), ensuring the proper functioning of containers. Meanwhile, kube-proxy takes care of directing network traffic to the appropriate pods within the cluster.

By grasping the underlying architecture of Kubernetes, including the roles and responsibilities of the Control Plane and Worker Nodes, you’ll gain a solid foundation for effectively deploying and managing your applications within the cluster.

Namespaces
After some time, a Kubernetes cluster may become huge and heavily used. To keep things well organized, Kubernetes provides the concept of a Namespace. A namespace is basically a virtual cluster inside the actual cluster.

Most resources are contained inside a namespace and are therefore unaware of resources from other namespaces. Only a few kinds of resources are completely agnostic of namespaces; these typically define computational power or storage sources (e.g. Nodes and PersistentVolumes). However, how much of them a namespace can consume can be limited using ResourceQuotas.

Namespace-aware resources are always contained in a namespace, as Kubernetes creates and uses a namespace named default if nothing is specified.
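As a rough illustration of the two ideas above, the following sketch declares a namespace and caps what it may consume. The names team-a and team-a-quota, as well as every limit value, are assumptions chosen for the example and do not come from an existing cluster.

```yaml
# Hypothetical namespace for one team; the name is an assumption.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
---
# Caps the total amount of resources that objects in team-a may request.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    pods: "10"
    requests.cpu: "4"
    requests.memory: 8Gi
    persistentvolumeclaims: "5"
```

Any namespaced resource can then opt into this namespace by setting namespace: team-a in its metadata.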

Namespace Organization

There is no silver bullet for how to use namespaces, as it largely depends on your organization and needs. However, some common usages are:

Divide the cluster by team or project, to avoid naming conflicts and help allocate resources.
Divide the cluster by environment (e.g. dev, staging, prod), to keep a consistent architecture.
Deploy with more granularity (e.g. blue/green deployment), to quickly fall back on an untouched working environment in case of an issue.
Further reading:

Namespace Documentation

Manage The Cluster Namespaces

Glossary
Kubernetes did a great job of remaining agnostic of any technology in its design. This means two things: it can handle multiple technologies under the hood, and there is a whole new terminology to learn.

Fortunately, these concepts are pretty straightforward and can most of the time be compared to a basic element of classic system infrastructure. The table below summarizes the mapping for the most basic concepts. The comparison is not a hundred per cent accurate; it is here to help you understand the need behind each concept.

| Abstraction Layer | Physical Layer | Uses Namespace | Description |
|---|---|---|---|
| Pod | Container | ✅ | A Pod is the smallest unit of work in Kubernetes. It is generally equivalent to one application container, but it can be composed of several. |
| ReplicaSet | Load Balancing | ✅ | A ReplicaSet tracks and maintains the number of instances expected and running for a given pod. |
| Deployment | – | ✅ | A Deployment tracks and maintains the required configuration for a pod and its ReplicaSet. |
| StatefulSet | – | ✅ | A StatefulSet is similar to a Deployment, with guarantees on the start order and volume binding to keep state consistent over time. |
| Node | Host | ❌ | A Node is a physical or virtual machine that is ready to host pods. |
| Service | Network | ✅ | A Service defines an entry point to a set of pods that logically belong together. |
| Ingress | Reverse Proxy | ✅ | An Ingress publishes Services outside the cluster. |
| Cluster | Datacenter | ❌ | A Cluster is the set of available nodes, including the Kubernetes control plane. |
| Namespace | – | ➖ | A Namespace defines an isolated pseudo-cluster in the current cluster. |
| StorageClass | Disk | ❌ | A StorageClass configures filesystem sources that can be used to dynamically create PersistentVolumes. |
| PersistentVolume | Disk Partition | ❌ | A PersistentVolume describes any kind of filesystem ready to be mounted on a pod. |
| PersistentVolumeClaim | – | ✅ | A PersistentVolumeClaim binds a PersistentVolume to a pod, which can then actively use it while running. |
| ConfigMap | Environment Variables | ✅ | A ConfigMap defines widely accessible properties. |
| Secret | Secured Env. Var. | ✅ | A Secret defines widely accessible properties with potential encryption and access limitations. |
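To make the Service and Ingress rows more concrete, here is a minimal sketch of how the two could be declared. The resource names, the example.local host, and the assumption that the target pods carry an app: webserver label listening on port 80 are all hypothetical. An Ingress also only takes effect if an ingress controller is installed in the cluster.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-web-service        # hypothetical name
  namespace: default
spec:
  # Routes traffic to every pod carrying this label
  selector:
    app: webserver
  ports:
    - name: http
      protocol: TCP
      port: 80                # port exposed by the Service
      targetPort: 80          # port the container listens on
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-web-ingress        # hypothetical name
  namespace: default
spec:
  rules:
    - host: example.local     # hypothetical host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-web-service
                port:
                  number: 80
```

The Service finds its pods through labels only, which is why consistent labeling matters as soon as several resources have to work together.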
Further reading:

Official Kubernetes Glossary

Official Concepts Documentation

Definition files
The resources in Kubernetes are created in a declarative fashion, and while it is possible to configure your application deployment through the command line, a good practice is to keep track of the resource definitions in a versioned environment. Sometimes named GitOps, this practice is not specific to Kubernetes: it is widely applied to delivery systems and backed by the DevOps movement.

To this end, Kubernetes offers a YAML representation of the resource declaration, whose structure can be summarized as follows:

| Field | File type | Content |
|---|---|---|
| apiVersion | All files | Version to use while parsing the file. |
| kind | All files | Type of resource that the file is describing. |
| metadata | All files | Resource identification and labeling. |
| data | Data centric files (Secret, ConfigMap) | Content entry point for data mapping. |
| spec | Most files (Pod, Deployment, Ingress, …) | Content entry point for resource configuration. |

Watch out: some resources, such as StorageClass, do not use a single entry point as described above.
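As an illustration of that exception, the sketch below shows what a StorageClass could look like. The name local-storage is an assumption, and the provisioner shown is the one commonly used for statically provisioned local volumes; your cluster may need a different one.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage         # hypothetical name
# No spec or data entry here: the configuration sits at the root of the file
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
```

The apiVersion, kind, and metadata fields are still present; only the single content entry point is missing.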

Further reading:

Guide on apiVersion

Yaml Specifications

Metadata and labels
The metadata entry is critical while creating any resource as it will enable Kubernetes and yourself to easily identify and select the resource.

In this entry, you will define a name and a namespace (defaults to default), thanks to which the control plane will automatically be able to tell if the file is a new addition to the cluster or the revision of a previously loaded file.

On top of those elements, you can define a labels section. It is composed of a set of key-value pairs that narrow down the context and content of your resource. Those labels can later be used in almost any CLI command through Selectors. As those entries are not used in the core behavior of Kubernetes, you can use any name you want, although Kubernetes recommends a set of common labels.

Finally, you can also create an annotations section, which is almost identical to labels but cannot be used to select resources. Annotations can be read by tools and applications to trigger behaviors or simply add data to ease debugging.

# metadata narrows down selection and identifies the resource
metadata:
  # The name entry is required and used to identify the resource
  name: my-resource
  namespace: my-namespace-or-default
  # labels is optional but often needed for resource selection
  labels:
    app: application-name
    category: back
  # annotations is optional and not needed for the configuration of Kubernetes
  # (annotation values must be strings, hence the quotes)
  annotations:
    version: "4.2"

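For reference, the recommended labels mentioned above share the app.kubernetes.io/ prefix. The following sketch shows how they could be applied; every value is an assumption made up for this example.

```yaml
metadata:
  name: my-resource
  labels:
    # Keys are the common labels recommended by Kubernetes;
    # all values below are hypothetical.
    app.kubernetes.io/name: nginx
    app.kubernetes.io/instance: nginx-ws
    app.kubernetes.io/version: "1.25"
    app.kubernetes.io/component: webserver
    app.kubernetes.io/part-of: simple-web
    app.kubernetes.io/managed-by: kubectl
```

With such labels in place, a selector like kubectl get pods -l app.kubernetes.io/name=nginx would list only the matching pods.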
Further reading:

Naming and Identification

Labels and Selectors

Annotations

Data centric configuration files
Those files define key-value mappings that can be used later in other resources. Usually, those resources (i.e. Secrets and ConfigMaps) are loaded before anything else, as it is more likely than not that your infrastructure files depend on them.

apiVersion: v1
# kind defines the resource described in this file
kind: ConfigMap
metadata:
  name: my-config
data:
  # data configures the data to load
  configuration_key: "configuration_value"
  properties_entry: |
    # Any multiline content is accepted
    multiline_config=true

Infrastructure centric configuration files
Those files define the infrastructure to deploy on the cluster, potentially using content from the data files.

apiVersion: v1
# kind defines the resource described in this file
kind: Pod
metadata:
  name: my-web-server
spec:
  # spec is a domain specific description of the resource.
  # The specification entries will be very different from one kind to another

Resources definition
In this section, we will take a closer look at the configuration of the most used resources in a Kubernetes application. This is also an opportunity to showcase the interactions between resources.

At the end of the section, we will have a running Nginx server and will be able to contact the server from outside the cluster. The following diagram summarizes the intended state:

[Figure: Intended Deployment]

ConfigMap
A ConfigMap holds properties that can be used later in your other resources.

apiVersion: v1
kind: ConfigMap
metadata:
  name: simple-web-config
  namespace: default
data:
  configuration_key: "Configuration value"

The configuration defined above can then be selected from another resource definition with the following snippet:

valueFrom:
  configMapKeyRef:
    name: simple-web-config
    key: configuration_key

Note: ConfigMaps are only available in the namespace in which they are defined.
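When a container needs every entry of a ConfigMap rather than a single key, the envFrom field can load them all at once. The sketch below assumes the simple-web-config ConfigMap defined above exists in the default namespace; the pod name is hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-configured-pod     # hypothetical name
  namespace: default
spec:
  containers:
    - name: web
      image: nginx
      # Every key of the ConfigMap becomes an environment variable
      envFrom:
        - configMapRef:
            name: simple-web-config
```

Here the container would see an environment variable named configuration_key. Selecting keys one by one with configMapKeyRef remains preferable when you want to rename the variable or expose only part of the configuration.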

Further reading:

ConfigMap Documentation

Secret
All sensitive data (e.g. API keys, passphrases, …) should be put in Secret files. By default, the data is simply held as base64-encoded values, which is an encoding and not encryption. However, Kubernetes offers ways to mitigate leakage risks, such as restricting access with Role-Based Access Control (RBAC) or encrypting Secrets at rest.

The Secret file defines a type key at its root, which can be used to add validation on the keys declared in the data entry. By default, the type is set to Opaque which does not validate the entries at all.

apiVersion: v1
kind: Secret
metadata:
  name: simple-web-secrets
# Opaque can hold generic secrets, so no validation will be done.
type: Opaque
data:
  # Secrets should be encoded in base64
  secret_configuration_key: "c2VjcmV0IHZhbHVl"

The secret defined above can then be selected from another resource definition with the following snippet:

valueFrom:
  secretKeyRef:
    name: simple-web-secrets
    key: secret_configuration_key

Note: Secrets are only available in the namespace in which they are defined.
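If encoding values by hand is inconvenient, a Secret also accepts a stringData entry holding plain text, which the API server encodes for you when the resource is stored. The following sketch is meant to be equivalent to the Secret above, assuming the encoded value stands for the text "secret value".

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: simple-web-secrets
type: Opaque
# stringData takes plain text and is converted to base64 data on write
stringData:
  secret_configuration_key: "secret value"
```

Keep in mind that a file written this way contains the secret in clear text, so it deserves even more care before being committed to a repository.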

Further reading:

Secrets Documentation

Available Secret Types

Pod
A Pod definition file is fairly straightforward but can grow large because of the number of configuration options available. For each container, the name and image fields are the only mandatory ones, but you will commonly use:

ports to declare the ports the container listens on.
env to define the environment variables to load on the container.
command and args to customize the container startup sequence (they override the image’s entrypoint and default arguments).
Pods are usually not created as standalone resources on Kubernetes, as the best practice is to use a Pod as part of a higher-level definition (e.g. Deployment). In those cases, the Pod file’s content is simply embedded in the other resource’s file.

apiVersion: v1
kind: Pod
metadata:
  name: my-web-server
spec:
  # containers is a list of container definitions to embed in the pod
  containers:
    - name: web
      image: nginx
      ports:
        - name: web
          containerPort: 80
          protocol: TCP
      env:
        - name: SOME_CONFIG
          # Sets the value from the ConfigMap entry referenced below
          valueFrom:
            configMapKeyRef:
              name: simple-web-config
              key: configuration_key
        - name: SOME_SECRET
          # Sets the value from the Secret entry referenced below
          valueFrom:
            secretKeyRef:
              name: simple-web-secrets
              key: secret_configuration_key

Note: Pods are only available in the namespace in which they are defined.
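The startup customization mentioned earlier can be sketched as follows. The pod name, the busybox image, and the GREETING variable are assumptions picked for the example; command replaces the image’s entrypoint and args replaces its default arguments.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-one-shot-pod       # hypothetical name
spec:
  # The container exits after printing, so it should not be restarted
  restartPolicy: Never
  containers:
    - name: hello
      image: busybox
      # command overrides the image entrypoint
      command: ["sh", "-c"]
      # args overrides the image default arguments;
      # $GREETING is expanded by the shell inside the container
      args: ["echo $GREETING from the pod"]
      env:
        - name: GREETING
          value: "Hello"
```

Once the pod has completed, its output should be readable with kubectl logs my-one-shot-pod.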

Further reading:

Pod Documentation

Advanced Pod Configuration

Fields available in Pod entry

Deployment
The Deployment is generally used as the atomic working unit since it will automatically:

Create a pod definition based on the template entry.
Create a ReplicaSet on pods selected by the selector entry, with the value of replicas as a count of pods that should be running.
The following file requests 3 instances of an Nginx server running at all times. The file may look a bit heavy, but most of it is the Pod definition copied from above.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-web-server-deployment
  namespace: default
  labels:
    app: webserver
spec:
  # selector should retrieve the Pod defined below, and possibly more
  selector:
    matchLabels:
      app: webserver
      instance: nginx-ws-deployment
  # replicas asks for 3 pods running in parallel at all times
  replicas: 3
  # The content of
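
The definition above is cut off before its template entry. As a hedged sketch only, a complete version could look like the following, assuming the template simply embeds the Nginx container from the Pod section and carries the labels expected by the selector. The template content is an assumption, not the original file.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-web-server-deployment
  namespace: default
  labels:
    app: webserver
spec:
  selector:
    matchLabels:
      app: webserver
      instance: nginx-ws-deployment
  replicas: 3
  # Assumed template: the Pod definition without apiVersion and kind
  template:
    metadata:
      # Must include every label listed in selector.matchLabels,
      # otherwise the API server rejects the Deployment
      labels:
        app: webserver
        instance: nginx-ws-deployment
    spec:
      containers:
        - name: web
          image: nginx
          ports:
            - name: web
              containerPort: 80
              protocol: TCP
```

The important design point is the link between selector.matchLabels and the template labels: the Deployment manages exactly the pods whose labels match, so the two sections have to stay in sync.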