Kubernetes is Google’s open-source orchestration tool for managing containers, comparable to Mesos and Docker Swarm. It was officially launched in 2014, when Google saw an opportunity to drive customers to its cloud services. Kubernetes became very popular for managing container runtimes such as Docker, and it is capable of orchestrating complex applications: load balancing, security, networking, storage and much more. A Kubernetes cluster offers many features and benefits, such as automated application deployment, scaling and management, and it is very handy for automating many tedious manual tasks on the Linux operating system.
Kubernetes is a cluster, which means it plans, monitors and manages workloads. Workloads in this case are Linux containers. The infrastructure components of a Kubernetes cluster are master and worker nodes. The master node represents the logic: it decides how to move load between nodes, places containers based on capacity, and much more. Worker nodes host the containers and represent the data part of the Kubernetes cluster. The components of a Kubernetes cluster are:
ETCD – database which stores information in key-value format
Kube scheduler – selects the correct node on which to place a pod, based on policy, constraints, load and capacity
Kube-controller-manager – manages namespaces, workloads, service accounts and much more
Node controller – joins new worker nodes to the cluster, handles and removes unavailable nodes
Replication controller – maintains the defined number of replicas in the cluster
Kube API server – takes inputs from external users and internal components, creates and modifies the existing cluster, and orchestrates all components in the cluster
Docker – container runtime engine
Kubelet – agent installed on every node; executes commands from the Kube API server, such as creation and destruction of containers
Kube proxy – network proxy that establishes communication between pods and maintains network rules
Caption 1 Kubernetes architecture
Etcd is the Kubernetes brain, used as a distributed, reliable key-value store which is simple, secure and fast. It stores all configuration, the status of running workloads, roles, configs, secrets, nodes and much more. Etcd is distributed across all defined master nodes and replication is done by the RAFT protocol. The service listens on port 2379. It can be managed using the etcdctl tool, which is installed separately.
The Kube API server is the crucial component on master nodes: it authenticates requests from external users and executes commands by delegating instructions to the kubelet service installed on worker nodes. It is the only component that interacts with the etcd database, and it also takes input from internal Kubernetes components such as the controller manager and the scheduler.
The basic workflow looks like this:
- Authenticate the user who sends a request to the Kube API server
- Validate the request
- Get information from the etcd database
- Get information from the kube scheduler on which node to put the pod
- Send instructions to the kubelet component on the worker node
- Kubelet sends the information to the container runtime (Docker)
- When kubelet sends information back, the etcd database is updated with the new information
List api server process:
#ps aux | grep kube-apiserver
Kube controller manager
Kube controller manager has two main components: the node controller and the replication controller.
It manages nodes and containers and takes appropriate action: it creates and destroys nodes and pods. The node controller monitors nodes every 5 seconds. If it stops receiving heartbeats, a node is marked as unreachable after 40 seconds. After the node becomes unreachable, the controller waits another 5 minutes, then removes the node and provisions its pods onto another one.
Node monitoring period – 5s
Node monitor grace period – 40s
Pod eviction timeout – 5min
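These intervals are configurable via flags on the kube-controller-manager. As an illustration (not taken from the original text), in a kubeadm-style cluster the controller manager runs as a static pod, and the relevant part of its manifest with the default values might look like this; the file path and flag names are assumptions based on common setups:

```yaml
# /etc/kubernetes/manifests/kube-controller-manager.yaml (excerpt, illustrative)
spec:
  containers:
  - name: kube-controller-manager
    command:
    - kube-controller-manager
    - --node-monitor-period=5s          # how often the node controller checks nodes
    - --node-monitor-grace-period=40s   # how long before a node is marked unreachable
    - --pod-eviction-timeout=5m0s       # how long before pods are evicted from the node
```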
The replication controller is responsible for monitoring the status of replica sets and maintaining the defined number of replicas.
Get system namespace pods:
#kubectl get pods -n kube-system
Kube controller manager process:
#ps aux | grep kube-controller-manager
The kube scheduler is responsible for scheduling pods on nodes. It is used only for the decision, not the execution: it does not actually place pods but informs the Kube API server. It determines which nodes are suitable for the placement of new pods.
Kubelet is the agent running on each worker node which implements all activities on the node: it builds pods, unloads pods, registers the node to the Kubernetes cluster and sends reports. It is effectively the builder, taking commands from the Kube API server.
Get kubelet process:
#ps aux | grep kubelet
Kube proxy is a network proxy component installed on every worker node. When you build a pod network, you cannot be sure that a pod will keep its IP address each time it is recreated. So we create a logical construct called a service, which always holds a static IP address; pods are then accessed through the service. Kube-proxy comes into play by creating rules that forward traffic to each service, implemented as iptables rules on Linux machines.
Kube proxy provides virtual IP services, using iptables inside the nodes to route traffic to the target pod. The iptables rules capture traffic and redirect it to backend services; the Linux kernel capability netfilter handles the traffic. So why a proxy and not DNS? Well, some DNS clients do not respect the defined TTLs and cache results even when the records have expired.
Kube-proxy supports three modes: userspace proxy mode, iptables proxy mode (the default), and IPVS proxy mode.
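The mode can be chosen in the kube-proxy configuration. A minimal sketch of a KubeProxyConfiguration, shown here as an illustration (the value "ipvs" is just an example; leaving mode empty falls back to the iptables default):

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"   # "" (defaults to iptables), "iptables", or "ipvs"
```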
Get system namespace pods:
#kubectl get pods -n kube-system
#kubectl get daemonset -n kube-system
Kubernetes has a built-in DNS service called CoreDNS. It works directly with the Kubernetes API and creates DNS records for each service.
Pods are the smallest deployment unit you can create on a Kubernetes cluster. In virtualization terminology, pods are like virtual machines running a defined service. If you remember the Docker world, you have containers; in Kubernetes, containers are encapsulated in pods. Usually there is a one-to-one relationship between containers and pods, but there can be more than one container inside a pod. That is the case, for example, when you run a main container and a helper container in the same pod.
Docker is by default configured to get images from Docker Hub. In other words, if you run the command kubectl run nginx, Docker will check Docker Hub by default. Nice to start with, but it is not recommended to pull images from a public repository.
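To pull from a private registry instead, you can reference the full registry path in the image field of the pod spec. A sketch, where the registry host and image path are made-up examples:

```yaml
# excerpt of a pod spec pulling from a private registry (host is illustrative)
spec:
  containers:
  - name: app
    image: registry.example.com/myteam/nginx:1.25   # fully qualified image reference
```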
Get all running pods:
#kubectl get pods
Everything you create in Kubernetes is in the form of objects, described in yaml format. Pods are also created as objects:
apiVersion: v1 # version of the Kubernetes API
kind: Pod # type of object
metadata:
  name: app # name of the pod
spec:
  containers:
  - name: app
    image: centos # image to get from the repository
Command to create pod:
#kubectl create -f pod-example.yml
Command to display running pods:
#kubectl get pods
Command to check pod details:
#kubectl describe pod myapp-pod
The replica set is the object responsible for defining how many pod replicas you will run in the cluster. It ensures high availability, load balancing and scaling.
The replica set is the newer version of the replication controller.
Let’s check object example:
apiVersion: apps/v1
kind: ReplicaSet # type of object
metadata:
  name: myapp-replicaset
spec:
  replicas: 3 # number of pod replicas
  selector:
    matchLabels:
      app: myapp
  template: # here you paste the pod.yaml content (metadata and spec)
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: app
        image: centos
Create replica set object:
#kubectl create -f replicaset-definition.yml
Get replicaset information:
#kubectl get replicaset
Labels and selectors
Labels and selectors match a replica set with its pod(s). This is how the replica set knows to which pods it will apply its settings. If matching pods are already created, the replica set will not create new ones.
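The matching described above can be sketched like this; the label key and value (app: myapp) are example names:

```yaml
# excerpt of a replica set: selector and pod template labels must agree
kind: ReplicaSet
spec:
  selector:
    matchLabels:
      app: myapp      # the selector...
  template:
    metadata:
      labels:
        app: myapp    # ...must match the labels in the pod template
```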
Scaling is the procedure of changing the defined number of pod replicas. It can be done in several ways:
Change the replica set yaml file manually, then replace it:
#kubectl replace -f replicaset.yml
Change by passing the replicas parameter:
#kubectl scale --replicas=6 -f replicaset-definition.yml
The deployment object sits at the highest level in the Kubernetes hierarchy. This means that by deploying the deployment object, you can include the pod, the replica set and the deployment in one shot.
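A minimal deployment.yml could look like this; the object name, labels and replica count are example values:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deployment
spec:
  replicas: 3            # replica set created automatically with this count
  selector:
    matchLabels:
      app: myapp
  template:              # pod template, deployed in the same shot
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: app
        image: centos
```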
Create deployment object:
# kubectl create -f deployment.yml
Check the deployment object (it will automatically create a replica set):
#kubectl get deployments
Check all created objects:
#kubectl get all
Caption 2 Deployment object
Notice that you can create pod and replica set objects separately, but creating a deployment object will create all of these objects in one go.
Imagine two or more projects installed on a Kubernetes cluster. There should be some way to organize the cluster into isolated sections. We can say a namespace is an isolated virtual project bound to the pods and all infrastructure related to one project. Once you create a Kubernetes cluster, several namespaces exist out of the box; among them, the kube-system namespace runs the pods which deliver various internal cluster services: networking, DNS, the control plane and much more. It is possible to establish communication between different namespaces.
The default DNS zone in the cluster is cluster.local. So the full name of a service called web in the dev namespace looks like this:
web.dev.svc.cluster.local
cluster.local – default DNS zone
svc – service object
dev – name of the namespace
web – name of the service
Get pods from the kube-system namespace:
#kubectl get pods --namespace=kube-system
Create a pod in the dev namespace:
#kubectl create -f pod.yml --namespace=dev
Create a namespace from a yaml file:
#kubectl create -f namespace.yml
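The namespace.yml referenced above can be as simple as the following sketch; the namespace name dev is an example:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: dev   # name of the namespace to create
```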
Switch between namespaces:
#kubectl config set-context $(kubectl config current-context) --namespace=dev
Get pods from all namespaces:
#kubectl get pods --all-namespaces
Resource quota objects are used to set constraints in order to limit resource consumption, such as CPU and memory. This is very useful, because if you do not define a resource quota, the namespace has unlimited resources at its disposal, which can lead to unexpected behavior.
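A sketch of such a resourcequota.yaml; the object name and the concrete request/limit values are example figures:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-quota
spec:
  hard:
    requests.cpu: "1"       # CPU guaranteed to the namespace
    requests.memory: 1Gi    # memory guaranteed to the namespace
    limits.cpu: "2"         # upper bound on CPU
    limits.memory: 2Gi      # upper bound on memory
```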
In the example above we specified hard limits for the namespace, which means those values cannot be exceeded. Requests are the resources guaranteed to the namespace so that it can even start; limits are the upper bound of resources the namespace can get. Memory limits are usually set in mebibytes (Mi) and gibibytes (Gi). CPU is always set in millicores:
1000 millicores (1000m) = 1 CPU core
Create resource quota object for dev namespace:
#kubectl apply -f resourcequota.yaml --namespace=dev
Services are Kubernetes objects used to expose a pod or a group of pods as a network service with a static, constant IP address and a fully qualified domain name. The need for services arose because you can’t rely on a pod’s IP address. What does that mean? Well, a pod gets its IP from the cluster network (CNI), and you can’t guarantee that this IP address will stay constant. Pods are restarted, recreated and destroyed, and each of these actions can change the IP address.
ClusterIP: (default service type; exposes pods only inside the Kubernetes cluster)
NodePort: (exposes the service on each node’s IP; you are able to contact the service from outside)
LoadBalancer: (exposes the service with a custom load balancer or a cloud provider IP address)
ClusterIP is the default network setup inside the Kubernetes cluster. Communication happens only inside the cluster; it is not possible to expose traffic to the external world directly. It is good for debugging a service, or for connecting to it directly from your desktop or laptop through the Kubernetes proxy.
Caption 3 ClusterIP architecture
apiVersion: v1
kind: Service
metadata:
  name: myservice
  namespace: mynamespace
spec:
  type: ClusterIP
  selector:
    app: myapp
  ports:
  - port: 80
This is the service myservice in the namespace mynamespace, exposed on port 80 with a cluster IP. Notice the selector, with which we connect the application to the service.
Create service with clusterip:
#kubectl apply -f clusterip.yaml
Start the Kubernetes Proxy:
# kubectl proxy --port=80
NodePort opens a specific port on each node and forwards external traffic directly to your service. There are several disadvantages to this setup: only one service per port, you can only use ports between 30000 and 32767, and if a node IP changes you have to modify the node IP configuration.
Caption 4 Node port architecture
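A NodePort service can be sketched like this; the service name, selector and port numbers are example values, and nodePort must fall in the 30000-32767 range mentioned above:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myservice
spec:
  type: NodePort
  selector:
    app: myapp
  ports:
  - port: 80          # port of the service inside the cluster
    targetPort: 80    # port on the pod
    nodePort: 30080   # port opened on every node
```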
A load balancer is the most convenient way to expose traffic to the external world. Instead of using the internal proxy service to route traffic, you can let a software or hardware load balancer do the job. The network load balancer has a unique IP address which forwards traffic to your service.
Caption 5 Load balancer architecture
apiVersion: v1
kind: Service
metadata:
  name: myservice
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
status:
  loadBalancer:
    ingress:
    - ip: 192.0.2.127 # external IP assigned by the load balancer
Kubernetes ingress traffic
Once created, services are not exposed to the external world. The Kubernetes ingress object allows HTTP and HTTPS communication from outside to internal Kubernetes services. Routing is controlled by rules which you define; in other words, you define which external source can access which Kubernetes service. Ingress controllers run as system pods and are usually installed as an NGINX or HAProxy service.
Caption 6 Ingress controller behind load balancer
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myingress
spec:
  rules:
  - http:
      paths:
      - path: /test
        pathType: Prefix
        backend:
          service:
            name: myservice
            port:
              number: 80
This is an ingress object which defines how the URL path /test should connect to the backend service myservice.
Imperative vs declarative approach
In the Kubernetes world you can do things in two ways: imperative or declarative mode. Imperative mode tells the system what to do step by step, and how to do it. Declarative mode specifies only the desired end state, with no specific details on how to reach it.
This is example of imperative mode:
- Provision virtual machine ‘Database server’
- Install MySQL client and MySQL server package
- Install and configure MySQL database
- Configure root user
- Start MySQL-server instance
Declarative mode will look like this:
- VM: Database server
- Package: MySQL, MySQL-server
- Install: yes
- User: root
- Start: yes
Imperative mode in Kubernetes means running plain commands which you run once and forget:
#kubectl create deployment centos --image=centos
#kubectl run centos --image=centos
#kubectl edit deployment centos
The last command will just edit and change the Kubernetes object in memory. A better approach is to use the replace command: kubectl replace -f centosdeployment.yml
Declarative mode uses command kubectl apply:
#kubectl apply -f centos.yml
The command will check the existing configuration and figure out what changes have to be made in the system. If the object does not exist, it will create one; it does not throw an error that the pod does not exist.
Run all configuration files at once: kubectl apply -f /path/to/config-files
kubectl apply command
It manages objects in a declarative way. There are three representations of a Kubernetes object for the same instance:
local file -> last applied configuration -> live object configuration
The local file is the local yaml file you run the command on. The last applied configuration is a JSON-format copy which is used to decide on changes. The live object configuration lives in memory, and the Kubernetes cluster uses it as the current configuration. When you use kubectl apply, the local file is converted to JSON and stored as the last applied configuration.
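The last applied configuration is stored on the live object itself, in an annotation. Viewed with kubectl get -o yaml, it looks roughly like this sketch (the object name and container details are example values):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
  annotations:
    # JSON copy of the local file, written by kubectl apply
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Pod","metadata":{"name":"app"},"spec":{"containers":[{"image":"centos","name":"app"}]}}
```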
Why do we need last applied configuration file?
If you have data which is missing in the local file but present in the last applied configuration, it has to be removed from the live configuration. In other words, the last applied configuration helps determine what data has been removed from the local file and therefore has to be removed from the live object.