
Friday, June 5, 2020

Containers Vs Pods

A common source of confusion among developers and administrators is the difference between a pod and a container. We use the term container with Docker, and the term pod with Kubernetes or OpenShift. So what exactly is the difference between a pod and a container?

In simple terms, look at the Docker logo: it shows a whale.

What is a group of whales called? A pod. If a single whale is a container, a group of whales, or containers, is a pod. Though a pod can contain multiple containers, the way they work together inside the pod is different. In this article, we will see what containers and pods are.

Containers are not new 
Containers are not new at all, and Docker did not start the container revolution. One of the first container-like implementations came with the FreeBSD operating system, which introduced the “Jails” concept around 2000. Solaris introduced the concept of “Zones” around 2004, and many other implementations followed. In 2006, Google came up with “process containers”, designed for limiting, accounting for, and isolating the resource usage (CPU, memory, disk I/O, network) of a collection of processes. It was renamed “control groups (cgroups)” a year later and eventually merged into Linux kernel 2.6.24. In 2013, Google released a container stack called “LMCTFY”, and Docker came into the picture the same year. Docker made it easy for developers to build containers and run them in production.
 
How are containers Created?
Though container implementations have been introduced by various organizations, the core components for creating containers are available in the Linux kernel itself. Using these core components, we can create containers without any container runtime. Moreover, every container runtime available today, such as Docker and rkt, uses the same core components under the hood. They include chroot, namespaces, cgroups, capabilities, and UnionFS. A detailed introduction to these components is provided in the “anatomy of containers” article.

Containers are just normal processes that run with some extra features of the Linux kernel, chiefly namespaces and cgroups. Namespaces give a process its own view of system resources such as the process tree, network, and hostname, and hide everything outside that view from it. With this separate view, the process gets its own execution environment to run in. Namespaces cover:
Hostname
Process Tree
File System
Network Interfaces
Inter-process communication
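
To make the list above concrete, here is a minimal Python sketch that prints the namespaces the current process belongs to. It assumes a Linux host, where the kernel exposes each namespace as a symlink under /proc/<pid>/ns. The function name list_namespaces is our own and not part of any library, and it returns an empty result where /proc is not available.

```python
import os


def list_namespaces(pid="self"):
    """Return {namespace_name: identifier} for a process, read from /proc.

    Returns an empty dict when /proc/<pid>/ns cannot be read (for example
    on a non-Linux system, or for a process that does not exist).
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return result
    for name in names:
        try:
            # Each entry is a symlink; its target names the namespace
            # type and carries an identifying number.
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue
    return result


if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"{name:20s} {ident}")
```

Run once on the host and once inside a container, the identifiers for entries such as net and pid would be expected to differ, since the container runtime normally places the container in fresh namespaces.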

While namespaces prevent a process from seeing or interfering with other processes, the process can still consume resources like memory and CPU from the host machine. The host needs a way to control how much of these resources each container can use. Cgroups restrict the memory, CPU, I/O, and network usage of containers on the host machine. By default, a container can use the host's resources without limit until a restriction is set.
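
As a rough illustration of the "unlimited until restricted" behaviour, the sketch below reads the memory limit that cgroups apply to the current process. It assumes cgroup v2 mounted at /sys/fs/cgroup, where the file memory.max holds either the word "max" (no limit) or a byte count. On cgroup v1 hosts the file names differ, and this sketch simply reports "unknown". The function names are our own.

```python
from pathlib import Path


def parse_memory_limit(raw):
    """Interpret the contents of a cgroup v2 memory.max file.

    The file holds either the literal "max" (no limit) or a byte count.
    Returns None for "no limit", otherwise the limit as an int.
    """
    value = raw.strip()
    if value == "max":
        return None
    return int(value)


def read_memory_limit(cgroup_dir="/sys/fs/cgroup"):
    """Return the memory limit in bytes, or None if unlimited or unknown."""
    path = Path(cgroup_dir) / "memory.max"
    try:
        return parse_memory_limit(path.read_text())
    except (OSError, ValueError):
        # File missing (cgroup v1, root cgroup, non-Linux) or unparsable.
        return None


if __name__ == "__main__":
    limit = read_memory_limit()
    if limit is None:
        print("memory limit: unlimited or unknown")
    else:
        print(f"memory limit: {limit} bytes")
```

Inside a container started with a limit, for example docker run --memory=512m, we would expect this to print a byte count rather than "unlimited or unknown", provided the host uses cgroup v2.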

Combining Everything
When a container is created, these core components are attached to it. Each component has a different function and may or may not be attached to a given container, so some containers have only a few namespaces attached rather than all of them. The same components can also be attached to more than one process: we can have multiple processes running in a single namespace. By attaching the same components to multiple processes, we let those processes share an environment. For example, if two processes are attached to the same network namespace, they can communicate with each other over localhost as local processes. Similarly, if two processes are placed under the same memory cgroup restriction, both are bound by that memory limit.

For example, create an nginx container:
[root@ip-172-31-16-91]# docker run -d --name nginx -p 8080:80 nginx
Now start a second container that joins the network and PID namespaces of the first:

[root@ip-172-31-16-91]# docker run -it --name centos --net=container:nginx --pid=container:nginx centos /bin/bash
[root@91202914ab48 /]# ps ux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.2 0.6 10624 6076 ? Ss 06:02 0:00 nginx: master process nginx -g daemon off;
root 34 6.0 0.3 12024 3356 pts/0 Ss 06:03 0:00 /bin/bash

[root@91202914ab48 /]# curl localhost:80
Welcome to nginx!

Welcome to nginx!

If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.

For online documentation and support please refer to
nginx.org.

Commercial support is available at
nginx.com.

Thank you for using nginx.

[root@91202914ab48 /]# exit

If we list the processes from the second container, we see the nginx process running. In this way we have connected two containers through the same namespaces: the second container uses the network and process tree (PID) namespaces of the first. Since the network namespace is shared, the nginx process running in the first container can be reached at “localhost:80” from the second container.
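
The sharing can also be checked from the host. The kernel documentation for namespaces notes that two processes are in the same namespace when the files under /proc/<pid>/ns have the same device and inode numbers, so a small sketch can compare them. The helper names are our own. Reading another user's /proc entries usually needs root, and we assume the host PIDs of the two containers were obtained separately, for example with docker inspect --format '{{.State.Pid}}' nginx.

```python
import os

# Namespace types to compare; newer kernels may expose more (e.g. "time").
NAMESPACES = ("cgroup", "ipc", "mnt", "net", "pid", "user", "uts")


def namespace_key(pid, name):
    """Return (device, inode) identifying a namespace, or None if unreadable."""
    try:
        st = os.stat(f"/proc/{pid}/ns/{name}")  # follows the symlink
    except OSError:
        return None
    return (st.st_dev, st.st_ino)


def shared_namespaces(pid_a, pid_b):
    """List the namespace types that both processes are members of."""
    shared = []
    for name in NAMESPACES:
        key_a = namespace_key(pid_a, name)
        key_b = namespace_key(pid_b, name)
        if key_a is not None and key_a == key_b:
            shared.append(name)
    return shared


if __name__ == "__main__":
    me = os.getpid()
    # A process trivially shares every readable namespace with itself.
    print(shared_namespaces(me, me))
```

For the nginx and centos containers above, the result should include at least net and pid. Other entries may also match if the runtime left both containers in the host's namespace for that type.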

Generally, whenever a container is created, it gets its own separate set of namespaces.

But when we have two containers with shared namespaces, as in the example above, both containers are attached to the same network and PID namespaces.

The same is true of pods. Instead of creating multiple containers by hand and running them on shared namespaces and cgroups, we create a pod with multiple containers. An orchestration platform like Kubernetes automates the creation of the containers inside the pod, with the correct namespaces and cgroups set.

So What are pods anyway?
A pod is a group of one or more containers (for example, Docker or rkt containers) with shared storage and network, plus a spec file describing how to run the containers inside the pod.

The containers inside a pod are always co-located and co-scheduled, and they run in a shared context. They can talk to each other on localhost. The pod is given a single IP address, and all containers inside the pod are reached through that pod IP address. A storage location attached to the pod can be accessed by any number of containers inside it.

We can also think of a pod as a single VM running one or more containers that share one IP address and port space. Containers inside the pod talk to each other on localhost because they share the same network namespace. They can also communicate using standard inter-process communication mechanisms. Containers inside the pod also have access to shared volumes, which are defined in the pod spec file. These volumes can be mounted into one or all of the containers in the pod by defining the mount options in the pod spec file.

Why Pods?
A pod represents a unit of deployment: a single instance of an application, or instances of several different applications that are tightly coupled and share resources. The “one-container-per-pod” model is the most common use case, where you run one instance of your application in a single container in a pod.

The multi-container model comes into the picture when we have applications that are tightly integrated. For example, consider a pod with two containers, a “content puller” and a “web server”, that share a volume and a network.

The volume is mounted into both containers. The content puller fetches content from the content management system and writes it to the volume. The web server reads the content from the volume and serves it to users. Since these two containers serve the single purpose of displaying content to users, we can think of these cooperating processes (containers) as a single unit of service.
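
A pod like this one could be described with a spec along the following lines. This is only a sketch: it builds a Kubernetes Pod manifest as a Python dictionary and prints it as JSON, which kubectl accepts in place of YAML. The pod name, the content-puller image, and the mount paths are placeholders of our own, not taken from a real deployment.

```python
import json


def build_content_pod(name="content-pod"):
    """Build a two-container Pod manifest sharing one emptyDir volume."""
    volume_name = "content"
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # One scratch volume that lives as long as the pod does.
            "volumes": [{"name": volume_name, "emptyDir": {}}],
            "containers": [
                {
                    "name": "content-puller",
                    # Hypothetical image that fetches content from the CMS.
                    "image": "example.com/content-puller:latest",
                    "volumeMounts": [
                        {"name": volume_name, "mountPath": "/data"}
                    ],
                },
                {
                    "name": "web-server",
                    "image": "nginx",
                    "ports": [{"containerPort": 80}],
                    "volumeMounts": [
                        {
                            "name": volume_name,
                            "mountPath": "/usr/share/nginx/html",
                            "readOnly": True,
                        }
                    ],
                },
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_content_pod(), indent=2))
```

Saving the output to a file such as pod.json and running kubectl apply -f pod.json would ask the cluster to create both containers together, with the volume mounted into each.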

Unit of deployment - Since the containers in a pod run together, they are treated as a single unit of deployment. Any operation performed on the pod applies to all the containers running inside it.

Management : Since the containers in a pod represent a single unit of service, management and deployment become much easier. Pods can be deployed in multiple versions, horizontally scaled, and replicated. Colocation (co-scheduling), shared fate (e.g. termination), coordinated replication, resource sharing, and dependency management are handled automatically for containers in a pod.

Ease of use : The orchestration platform takes care of running the pods and their containers. As application developers, we don't need to worry about container exits, signalling, and so on. When a pod goes down, the platform takes care of bringing up a replacement, possibly on a new machine, with its containers and volumes attached as before.

The containers in a pod share fate: they are scheduled, scaled, and terminated together, so the whole stack of applications is up or down as a unit. (In Kubernetes, a container that crashes on its own is restarted in place according to the pod's restart policy.) Since pods are scaled up and down as a unit, it is important to keep them small. When a pod with multiple containers is scaled, every container inside it is replicated whether or not it needs to be. So keep the pod small, with a single container unless more are really needed.
