
Tuesday, June 16, 2020

Docker : Understanding Build Context

Context refers to the source directory used when building the image for a container. Whenever we build an image, we run the build command and pass it the Dockerfile. The Dockerfile contains instructions for building the image, of which ADD, COPY etc. copy or upload files into the image. For those files to be copied into the final image, they need to be passed to the docker engine at build time. This is called passing the context.

What this means is, while building the image, the current directory is passed to the docker engine as the context, or build context. The build context contains all the files in the current directory. Based on the instructions defined in the Dockerfile, the docker engine identifies which files from the build context need to be copied or uploaded into the final image.

Why Build Context?
The idea behind the Docker build context is to share the current directory with the docker engine while building images. Docker lets us run the docker command on one host while the docker engine runs on a different host. In that case, the build request goes to the docker engine running on the other host, which does not have the files that need to be copied or uploaded. So when we run docker build, the current directory is sent to the docker engine wherever it is running, and the build happens there.
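
As a quick illustration, pointing the CLI at a remote engine still works because the context is shipped along with the build request; the host name below is a placeholder, and 2375 is the conventional unencrypted Docker daemon port,

[root@ip-172-31-40-44 testing]# DOCKER_HOST=tcp://remote-docker-host:2375 docker build -t mytestimage .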

Increasing Build Time
We already know that we can use .dockerignore to exclude files that are not needed, or sensitive files that must not be added. But .dockerignore can also be used to exclude large files. Let's say I have a 100 MB file in my current directory along with the Dockerfile; if I build now, it takes more time because the context carries that 100 MB file. So it is very important to exclude large files that the image does not need while building.

It is always advisable to ignore .git directories along with dependencies that are downloaded or built within the image, such as node_modules. These are never used by the application running inside the Docker container and only add overhead to the build process.
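
A minimal .dockerignore covering these cases might look like the following (the entries are illustrative; adjust them to your project),

.git
node_modules
*.log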

Building an Image without a Docker Build Context
Docker provides a way to build an image without providing a build context. We can run the build process as,
[root@ip-172-31-40-44 testing]# docker image build -t mytestimage - < Dockerfile

This makes sure no build context is passed to the Docker engine. The drawback of this approach is that, since the current directory is not sent as the build context, the engine has no files to add or copy. If I run the same Dockerfile as below,
[root@ip-172-31-40-44 testing]# docker image build -t mytestimage - < Dockerfile
Sending build context to Docker daemon 2.048 kB
Step 1/4 : FROM alpine
 ---> a24bb4013296
Step 2/4 : ADD . /app
 ---> 42cdc87f30a2
Removing intermediate container 2b864d39d223
Step 3/4 : COPY test.sh /test.sh
lstat test.sh: no such file or directory
We can see the files are not available and hence the build fails. So the Dockerfile needs to be written in such a way that it downloads files from the internet rather than copying them from the current directory. This way the build context is not shared and the build process is faster.
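
For example, a Dockerfile along these lines needs no build context because ADD can fetch remote URLs directly; the URL is a placeholder for wherever the script is actually hosted,

FROM alpine
# Fetch the script from a remote location instead of the build context
ADD https://example.com/scripts/test.sh /test.sh
RUN chmod +x /test.sh
CMD ["/test.sh"]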

3 Different Ways to provide Docker build Context
Traditional Way : The traditional and most common way is to pass the current directory as the build context. Run the build process as,
[root@ip-172-31-40-44 testing]# docker build -t noignore .

This way the whole current directory is sent to the Docker engine and the build process completes.

Git Way : Docker also supports passing the context from a Git repository URL, as below,
[root@ip-172-31-40-44 testing]# docker build -t gitimage https://github.com/jagadish12/hellonode.git
Sending build context to Docker daemon 94.21 kB
Step 1/5 : FROM node:7-onbuild
Trying to pull repository docker.io/library/node ...
7-onbuild: Pulling from docker.io/library/node
Step 1/1 : ARG NODE_ENV
 ---> Running in 69ced8152b97
Step 1/1 : ENV NODE_ENV $NODE_ENV
 ---> Running in 4024e3b9a752
Step 1/1 : COPY package.json /usr/src/app/
Step 1/1 : RUN npm install && npm cache clean --force
 ---> Running in 5fece08d292e
*******
Step 1/1 : COPY . /usr/src/app
 ---> 3012df4023cd
Removing intermediate container 7cf9a7b2e561
Removing intermediate container 69ced8152b97
Removing intermediate container 4024e3b9a752
Removing intermediate container 30401b0d6c1f
Removing intermediate container 5fece08d292e
Step 2/5 : LABEL maintainer "jagadish.manchala@gmail.com"
 ---> Running in fd6478f09223
 ---> 67303c186c0e
Removing intermediate container fd6478f09223
Step 3/5 : HEALTHCHECK --interval=5s --timeout=5s CMD curl -f http://127.0.0.1:8000 || exit 1
 ---> Running in f10b60da636c
 ---> 587f67798add
Removing intermediate container f10b60da636c
Step 4/5 : USER 1001
 ---> Running in 6e8df2259141
 ---> 3f874c99e3fc
Removing intermediate container 6e8df2259141
Step 5/5 : EXPOSE 8000
 ---> Running in 06bedd422c05
 ---> 350f01446bb3
Removing intermediate container 06bedd422c05
Successfully built 350f01446bb3
In the above case, the build happened by checking out the source code and building the image using the Dockerfile available at the root of the repository.

Tar Way : The build context can also be passed as a gzipped tarball, as below,
[root@ip-172-31-40-44]: docker build -t hellonode http://host.com/hellonode.tar.gz
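
The tarball can also be piped in from the local machine instead of a remote URL; in that case the archive itself (a hypothetical hellonode.tar.gz containing the Dockerfile and sources) becomes the build context,

[root@ip-172-31-40-44]: docker build -t hellonode - < hellonode.tar.gz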

Hope this helps in understanding the Build Context. More to Come.

Docker : Ignoring File from build Using dockerignore


Most of the time we add files to the final build that are not necessary. These files do nothing but consume space in the final image. If the files are large, they slow the build down, and if sensitive files are added to the final build unknowingly, they can cause security issues.

This applies to Docker too. It is very important to make sure certain files are excluded when building the final image: files that are not needed, and files that are sensitive. Docker provides a way to tell the docker engine to ignore certain files while building images.

Introducing .dockerignore

Docker provides a way to prevent unnecessary or sensitive files and directories from being included by mistake in the final image, and this is done by adding a .dockerignore file. The .dockerignore file is stored alongside the Dockerfile so that the listed files are ignored. For instance, we have a directory with the files,

[root@ip-172-31-40-44 testing]# ls
Dockerfile password.txt test.sh

Build the image using,
[root@ip-172-31-40-44 testing]# docker build -t noignore .
Run the image as,
[root@ip-172-31-40-44 testing]# docker run noignore ls /app
Dockerfile
password.txt
test.sh

We can see that all the files are included. Now add the .dockerignore as,
[root@ip-172-31-40-44 testing]# ls -a
.dockerignore Dockerfile password.txt test.sh
 

[root@ip-172-31-40-44 testing]# cat .dockerignore
password.txt

Now build the image using,
[root@ip-172-31-40-44 testing]# docker build -t dockerignore .

Run the image using,
[root@ip-172-31-40-44 testing]# docker run dockerignore ls /app
Dockerfile
test.sh

We can see the image does not contain the password.txt file, which was ignored. The ignore file supports directories and glob-style patterns to define what gets excluded, very similar to .gitignore.
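
For reference, a .dockerignore can combine directory names, wildcards and exceptions; a small illustrative example (the file names are hypothetical),

*.log
secrets/
**/*.tmp
!keep-this.log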

Saturday, June 13, 2020

Weave Scope - Troubleshooting & Monitoring for Docker & Kubernetes

Containers are of an ephemeral nature and are tricky to monitor compared to traditional applications running on virtual servers or bare metal servers. Yet container monitoring is an important capability needed for applications built on modern microservices architectures to ensure optimal performance.

Weave Scope is an advanced container troubleshooting and monitoring tool for Docker, Kubernetes and Amazon ECS. It not only monitors but also provides additional capabilities like mapping application containers. In this article we will look at the basics of Weave Scope and how it works.

Introducing Weave Scope

Weave Scope lets you monitor and control your containerized microservices applications. By providing a visual map of your Docker Containers, you can see the dependencies and communication links between them. Scope automatically detects processes, containers, hosts. No kernel modules, no agents, no special libraries, no coding.

The best feature of Weave Scope is that it automatically generates a map of your application, enabling you to intuitively understand, monitor, and control your containerized, microservices-based application. For instance, if we have multiple front-end and back-end applications connecting to each other, Weave Scope identifies the connections and generates a map for visualization.

Some of the best features of Weave Scope are,
Manage and monitor containers in real time : provides an overview of the container infrastructure, or focus on a specific microservice; helps to easily identify and correct issues in the microservices.

Helps in interacting with the container : we can launch a command line from the dashboard directly into the container for debugging and troubleshooting.

Metadata about the Containers : View contextual metrics, tags, and metadata for your containers

Map your architecture : provides a detailed mapping of linked containers and infrastructure.

Installing Weave Scope : installing Weave Scope is quite easy,
jagadishm@[/Volumes/Work]: sudo wget -O /usr/local/bin/scope \
https://github.com/weaveworks/scope/releases/download/latest_release/scope
jagadishm@[/Volumes/Work]: sudo chmod a+x /usr/local/bin/scope
jagadishm@[/Volumes/Work]: sudo scope launch

The UI is accessible on port 4040 of the Docker host, i.e. http://<docker-host>:4040. As new containers are launched, Scope automatically updates to reflect the live architecture. We can see the dashboard below with its map of linked containers,
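
For example, launching a couple of containers (the names and images here are purely illustrative) will make them show up on the Scope map within a few seconds,

jagadishm@[/Volumes/Work]: docker run -d --name db redis
jagadishm@[/Volumes/Work]: docker run -d --name web -p 8080:80 nginx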

Weave Scope automatically identifies newly created containers and shows them on the dashboard. From the UI you can follow links and explore the details of each container node, including CPU usage, TCP connections and memory load. The UI also allows you to attach to a container and launch a shell prompt inside it. By clicking on a node (a hexagon in Scope) you can find out more information about that container. The container resource details are shown as below,

A container shell can be opened from the options available on the container: clicking any container brings up a popup with Attach, Exec Shell, Restart, Pause and Stop options, as shown in the image below.

Click on “Exec Shell” to open a terminal into the container, where we can log in and perform actions as below,


Hope this helps in getting started with Weave Scope for container monitoring and management.

Serverless - Less Server

Traditional applications, once developed, need to be deployed to servers in order to serve customers. The problem with this approach is that it requires capacity planning, procurement of hardware, installation of software and making it all ready for the application. This normally takes weeks to months. It is time consuming and carries both upfront and ongoing expenses.
 

The introduction of the cloud, with on-demand hardware, minimizes capacity planning and solves many of these issues. There are still CAPEX (capital expenditure) and OPEX (operational expenditure) costs involved, but it cuts deployment time and removes the need for staff to manage hardware. In this article we will see what serverless is and how it can be used.

Introducing Serverless

Serverless can be defined as “an approach that replaces long running machines with ephemeral compute power that comes into existence on request and disappears immediately after use”

Understanding Serverless
Serverless doesn't mean no server but a server which we don't need to manage. Instead of maintaining a server and running the services on that, we delegate the maintenance of the server to a third party and we concentrate on the development of the services that we need to run on those servers. Generally these third parties that manage the servers are Cloud vendors.

It would be very useful if we could concentrate on developing the service rather than managing a server, and this is where serverless comes into the picture. Serverless, or serverless computing, is an execution model in which we run our services on hardware provided by a cloud vendor like AWS, Google or Azure. The hardware is managed by the cloud, and resources are attached and detached based on requirements. The cost is based on the amount of resources the service consumes, which is what makes this model different from others. Normally we buy a server, run our services on it and manage it ourselves, adding memory or CPU when required; with serverless, the management of the server and its resources is handled entirely by the cloud, and all we need to do is run our services on it.

As already said, serverless does not require pre-defined hardware for executing the application; instead, the application triggers an action which causes the hardware to be provisioned and the code to be executed on it. Once the execution is complete, the hardware is released until another action is triggered.

For instance, let's say we have a content management application where users upload an image for the articles they write. In a serverless architecture built with AWS Lambda, the image is first uploaded to an S3 bucket and an event is triggered. The trigger invokes an AWS Lambda function, written in one of the supported programming languages, that resizes and compresses the image so it displays well on multiple devices. The Lambda code, or function, executed by the triggered event runs on hardware provisioned on demand by the cloud provider. Once the execution is complete, the hardware is released and waits for further triggers.

Serverless Providers

Most of the major cloud computing providers have their own serverless offerings. AWS Lambda was the first serverless framework, launched in 2014, and is the most mature one. Lambda supports multiple programming languages like Node.js, Java, Python and C#, and the best part is that it integrates with many other AWS services.

Google Cloud Functions is also available, as are Azure Functions from Microsoft, and OpenWhisk is an open source serverless platform run by IBM. Other serverless options include Iron.io, Webtask etc.

Function as a Service or FaaS
When we say that servers are dynamically managed or created when we want to run the service, the idea is that you write your application code in the form of functions.

FaaS is the flavor of serverless computing in which software developers deploy an individual “function”, action, or piece of business logic. Functions are expected to start within milliseconds, process an individual request, and then exit. The developer does not need to write code that fits the underlying infrastructure and can concentrate on writing the business logic.

One important thing to understand here is that when we deploy a function, we need a way to invoke it, in the form of an event. The event can be anything from an API gateway (an HTTP request), an event from another serverless function, or an event from another cloud service such as S3.

The cloud provider executes the function code on your behalf. The cloud provider also takes care of provisioning and managing the servers to run the code upon invocation.

Pros & Cons

Serverless offers developers many pros, and some cons as well. Here are a few pros.

Pay only for what we use : the first pro is that we don't pay for idle server time; we pay only for the time our code executes on the server. In the meantime the servers sit idle or are used for other executions.

Elasticity : with a serverless architecture, our application can automatically scale up to accommodate spikes in traffic and scale down when there are fewer users. The cloud vendor takes responsibility for scaling the application up and down based on the traffic.

Less time and money spent on management : since most of the infrastructure work, like provisioning hardware and scaling the service up or down, is taken care of by the vendor, and there is no hassle of managing hardware, organizations spend less money, time and resources, leaving them free to focus on the business.

Reduces development time and time to market : serverless architecture gives developers and organizations more time to focus on building the product. The vendor takes care of the hardware, deployment of the services, and managing and scaling them, leaving organizations to focus on building the product and releasing it to the market. There are no operating systems they need to select, secure, or patch.

Microservice approach : microservices are a popular approach where developers build modular software that is more flexible and easier to manage than a monolithic application. With the same approach, developers can build smaller, loosely coupled pieces of software that run as functions.

Here are the Cons
Vendor lock-in & decreased transparency : this is one of the major concerns in moving to serverless in a cloud. The backend is completely managed by the vendor, and once the functions are written, moving them to a different cloud can require major changes to the application. Beyond the code, the other services linked to the functions, such as databases, access management and storage, need a lot of time, money and resources to port to a different cloud.

Supported programming languages : since functions must be written in the languages a platform supports, not every language is available. AWS Lambda directly supports Java, C#, Python, and Node.js (JavaScript); Google Cloud Functions operates with Node.js; Microsoft Azure Functions works with JavaScript, C#, F#, Python, and PHP; and IBM OpenWhisk supports JavaScript, Python, Java, and Swift. For some other languages, such as Scala and Golang, serverless support is still emerging or a work in progress.

Not suitable for long-running tasks : since functions are event-based in nature, they are not a good fit for long-running tasks. The timeout limit on Lambda used to be 5 minutes, but as that was a major barrier for some applications, it was increased, and since Oct 2018 Lambda can run for up to 15 minutes.

On other serverless platforms, the limit varies from 9 minutes to 60 minutes maximum. There are many use cases involving long-running processes, such as video processing, big data analysis, bulk data transformation, batch event processing, very long synchronous requests, and statistical computations, which are not a good fit for serverless computing.

Potentially tough to debug : there are tools that allow remote debugging, and some services (e.g. Azure) provide a mirrored local development environment, but there is still a need for improved tooling.

Hidden Costs : Auto-scaling of function calls often means auto-scaling of cost. This can make it tough to gauge your business expenses.

Need for better tooling : you now have a ton of functions deployed, and it can be tough to keep track of them. This comes down to a need for better tooling: developmental (scripts, frameworks), diagnostic (step-through debugging, local runtimes, cloud debugging), and visualization (user interfaces, analytics, monitoring).

Higher latency in responding to application events : since the hardware sits idle for some time, when a function is triggered the server can take a while to wake up and run it (a cold start).

Learning curve : serverless does have a learning curve in defining software in the form of functions. Converting a monolithic application into microservices and then into functions requires a deep understanding of the architecture and how the pieces work.


Hope this helps in giving a basic understanding of serverless.

Friday, June 5, 2020

Falco - Container Behavior Analysis

End-to-end protection for containers in production is required to avoid steep operational costs and to reduce data breaches. With new container attacks and vulnerabilities appearing all the time, strong runtime security is required for containers.

Runtime container security means vetting all activities within the container application environment, from analysis of container, runtime and host activities to monitoring the protocols and payloads of network connections. Some vulnerabilities can be remediated by benchmarking and vulnerability scanning the host operating system, and container images are scanned for vulnerabilities and remediated before they are run, but there is still a need to monitor and analyse the containers while they run. We need to understand what is happening inside them: what network calls are being made, what directories and drives are being accessed, and so on. It is very important to understand how containers behave while running. This is where container runtime monitoring comes into the picture, and Falco is one such tool. In this blog we learn the basics of Falco and how it can be used.

Introducing Sysdig Falco
Sysdig Falco is a powerful behavioral activity monitoring tool that detects abnormal behavior in your applications and containers. Falco is a cloud native runtime security system that works with both containers and raw Linux hosts. Developed by Sysdig for the Cloud Native Computing Foundation, it works by looking at file changes, network activity, the process table and other data for suspicious behavior, and then sending alerts through a pluggable backend. It inspects events at the system call level of a host through a kernel module or an extended BPF probe.

Falco works on rules that we can edit to identify specific abnormal behaviors, and it comes with 25 rules installed by default.

Installation and configuration
On a CentOS-based machine, install the RPM as below,


Enable the repo using,
[root@ip-172-31-32-147]#dnf config-manager --enable epel

[root@ip-172-31-32-147]# rpm --import https://falco.org/repo/falcosecurity-3672BA8F.asc

[root@ip-172-31-32-147]# curl -s -o /etc/yum.repos.d/falcosecurity.repo https://falco.org/repo/falcosecurity-rpm.repo

[root@ip-172-31-32-147]# yum -y install kernel-devel-$(uname -r)
[root@ip-172-31-32-147]# yum -y install falco

Once the installation is done, a configuration directory /etc/falco is created along with three files:
/etc/falco.yml, /etc/falco_rules.yml and /etc/falco/falco_rules.local.yml.

The file /etc/falco.yml controls logging and several high-level configurations.
The file /etc/falco_rules.yml contains the list of rules used for abnormal behavior checking. It holds a predefined set of rules designed to provide good coverage in a variety of situations.
The file /etc/falco/falco_rules.local.yml is an empty file with some comments. The intent is that additions, modifications and overrides to the main rules file are added here; it can be thought of as an organization's custom rules file.

Run the Service as,
[root@ip-172-31-32-147]# service falco restart

Writing Your First Rule
Falco rules are based on the Sysdig filter syntax. These filters expose a variety of information about system calls and events that take place in the system, and they are organized into classes called “field classes”. The classes can be seen by running “sysdig -l”. Some of the field classes include,
fd : file descriptors
proc : processes
evt : system events
user : users
group : groups
container : container info
k8s : kubernetes events

Falco rules are written in YAML format with some required and optional keys. The basic keys of a rule are,
rule : name of the rule
desc : description of what the rule detects
condition : the logic that triggers a notification
output : the message shown in the notification
priority : logging level for the notification
tags : tags used to categorize rules
enabled : turns the rule on or off
A simple falco rule for checking if a shell was triggered in a container

Rule 1 : Log if a shell is triggered in a container
Create a rule as below,

- rule: Terminal shell in container
  desc: A shell was spawned by a program in a container with an attached terminal.
  condition: >
    spawned_process and container
    and shell_procs and proc.tty != 0
  output: "A shell was spawned in a container with an attached terminal (user=%user.name %container.info shell=%proc.name parent=%proc.pname cmdline=%proc.cmdline terminal=%proc.tty)"
  priority: NOTICE
  tags: [container, shell]

Now add the rule to the /etc/falco_rules.yml file and restart the Falco service. Then start a container and run bash inside it as below,
[root@ip-172-31-32-147]#docker run -d -P --name example2 nginx
[root@ip-172-31-32-147]#docker exec -it example2 bash

Exit the container and check the /var/log/messages file on the host machine. We can see logs like the below,

May 24 13:28:14 ip-172-31-32-147 falco[16586]: 13:28:14.523089250: Notice A shell was spawned in a container with an attached terminal (user=root example2 (id=1546ca8ce5f0) shell=bash parent=runc cmdline=bash terminal=34816)

May 24 13:28:23 ip-172-31-32-147 falco[16586]: 13:28:23.947776826: Warning Shell history had been deleted or renamed (user=root type=openat command=bash fd.name=/root/.bash_history name=/root/.bash_history path= oldpath= example2 (id=1546ca8ce5f0))

The log says that a shell was spawned in the container with an attached terminal. It also gives information about the user who triggered the shell and the name of the container along with the container id. This way we can see what happens inside a container, based on the rules we define.

Rule 2 : Check if other processes are running
Docker best practices recommend running just one process per container, and it can be a security issue if additional processes are running in a container. In our nginx container we want only the nginx process to run, so we want to log any process or job that runs inside the nginx container other than nginx itself. Our rule looks like,

- rule: Unauthorized process on nginx containers
  desc: There is a process running in the nginx container that is not described in the template
  condition: spawned_process and container and container.image startswith nginx and not proc.name in (nginx)
  output: Unauthorized process (%proc.cmdline) running in (%container.id)
  priority: WARNING

Let's understand the rule a little,
spawned_process : a macro identifying that a new process was executed
container : the namespace where it was executed belongs to a container and not the host
container.image startswith nginx : the image name, so you can keep an authorized process list per image
not proc.name in (nginx) : the list of allowed process names

[root@ip-172-31-32-147]# docker run -d -P --name example2 nginx
[root@ip-172-31-32-147]# docker exec -it example2 ls

Run an nginx container and run “ls” inside it. Now when we check the /var/log/messages file, we can see logs like the below,

May 24 13:34:20 ip-172-31-19-104 falco[17179]: 13:34:20.823028587: Warning Unauthorized process (ls) running in (1546ca8ce5f0)
May 24 13:34:20 ip-172-31-19-104 dockerd[15797]: time="2020-05-24T13:34:20.876166289Z" level=error msg="Handler for POST /v1.39/exec/c4180687a5b2dc2b9d58b7e8ca20f5242e0b3a7657751b7a437645c012f2a67a/resize returned error: cannot resize a stopped container: unknown"
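
Building on the same constructs, here is a sketch of one more rule (not part of the setup above) that flags package managers being run inside any container. It assumes the default spawned_process and container macros from the main rules file are available, and it can be appended to the rules file just like the rules above,

- rule: Package manager launched in container
  desc: A package management binary was executed inside a running container
  condition: spawned_process and container and proc.name in (apt, apt-get, yum, dnf, apk)
  output: "Package manager run inside a container (user=%user.name command=%proc.cmdline %container.info)"
  priority: WARNING
  tags: [container, process]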

Though Falco is a good tool for behavior checking, it has a few limitations; it supports integrations with multiple tools, but more work needs to be done on enhancing it. Hope this helps in getting started with Falco.

Containers Vs Pods

A common point of confusion for developers and administrators is the difference between a pod and a container. We use the term container when talking about Docker and the term pod with Kubernetes or OpenShift. So what exactly is the difference between a pod and a container?

In simple terms, look at the Docker logo; it is a whale, as below,
 
What is a group of whales called? A pod. If a single whale is a container, then multiple whales, or containers, form a pod. Though a pod contains multiple containers, the way they work inside the pod is different. In this article, we will see what containers and pods are.

Containers are not new
Containers are not new at all, and Docker did not start the container revolution. The first container-like implementation came with the “Jails” concept in FreeBSD in 1999. Solaris introduced the concept of “Zones” around 2004, and many other implementations followed. In 2006, Google came up with something called “process containers”, designed for limiting, accounting and isolating the resource usage (CPU, memory, disk I/O, network) of a collection of processes. It was renamed “control groups (cgroups)” a year later and eventually merged into Linux kernel 2.6.24. In 2013, Google released a container stack called “LMCTFY”, and Docker came into the picture. Docker made it easy for developers to build and run containers in production.
 
How are containers created?
Though container implementations have been introduced by various organizations, the core components for creating containers live in the Linux kernel itself. Using these core components we can create containers without any container runtime. Moreover, every container runtime available in the market today, such as Docker and rkt, uses the same core components under the hood to create containers. The core components include chroot, namespaces, cgroups, capabilities and UnionFS. A detailed introduction to these components is provided in the “anatomy of containers” article.

Containers are just normal processes that run with some extra Linux kernel features called namespaces and cgroups. Namespaces give a process its own view of system resources like the process tree, network, hostname etc., hiding everything else from other processes and the outside world. With this separate view, processes get their own execution environment. Namespaces include,
Hostname
Process Tree
File System
Network Interfaces
Inter-process communication

While namespaces hide a process from other processes, the process can still consume resources like memory and CPU from the host machine, so these host resources need to be controlled as well. Cgroups were introduced to restrict how much memory, CPU, I/O and network a container can use from the host machine. By default a container gets unlimited resources from the host until restricted.

Combining Everything
A container, when created, has these core components attached to it. Each core component has a different functionality and may or may not be attached to a container; some containers have only a few namespaces attached. Likewise, these components can be attached to one or more processes, and multiple processes can run inside a single namespace. By attaching the same components to multiple processes we can extend their functionality. For example, if multiple processes share a network namespace, they can communicate with each other as local processes. Similarly, if two processes are attached to the same memory cgroup, both are restricted by that memory limit.

For example, create an nginx container as below,
[root@ip-172-31-16-91]# docker run -d --name nginx -p 8080:80 nginx
Now start the second container by attaching some of the namespace components from the first container to the second as below,

[root@ip-172-31-16-91]# docker run -it --name centos --net=container:nginx --pid=container:nginx centos /bin/bash
[root@91202914ab48 /]# ps ux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.2 0.6 10624 6076 ? Ss 06:02 0:00 nginx: master process nginx -g daemon off;
root 34 6.0 0.3 12024 3356 pts/0 Ss 06:03 0:00 /bin/bash

[root@91202914ab48 /]# curl localhost:80



Welcome to nginx!
If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.

For online documentation and support please refer to nginx.org.
Commercial support is available at nginx.com.

Thank you for using nginx.

[root@91202914ab48 /]# exit

If we look at the process list from the second container, we see the nginx process running. In this way we have connected two containers with shared namespaces: the second container has the network and process tree namespaces of the first container attached. Since the network namespace is shared, the nginx process running in the first container can be reached at “localhost:80” from the second container.

Generally, whenever a container is created, we see namespace settings similar to the below,

But when we have 2 containers with shared namespaces we can see both containers attached as below,

This is exactly how pods work. Instead of creating multiple containers ourselves and running them with shared namespaces and cgroups, we create pods with multiple containers, and the orchestration platform that creates the pods, such as Kubernetes, takes care of automatically creating the containers inside each pod with the correct shared namespaces and cgroups set.

So What are pods anyway?
A pod is a group of one or more containers (Docker or rkt containers, for example) with shared storage and network, plus a spec describing how to run the containers inside the pod.

The containers inside a pod are always co-located and co-scheduled, and run in a shared context. They can talk to each other over localhost. The pod is given a single IP address, and all containers inside the pod are reached through that pod IP address. A storage volume attached to the pod can be accessed by any of the containers inside it.

We can also think of a pod as a single VM hosting one or more containers that share a single IP address and port space. Containers inside the pod talk to each other on localhost, since they share the same network namespace, and communicate using standard inter-process communication mechanisms. Containers inside the pod also have access to shared volumes, which are defined in the pod spec file and can be mounted into one or all of the pod's containers by defining mount options in the spec.

Why Pods?
A pod represents a unit of deployment: a single instance of an application, or multiple tightly coupled applications that share resources. The “one-container-per-pod” model is the most common use case, where you run one instance of your application in a container in a pod.

The multi-container model comes into the picture when we have applications that are tightly integrated with each other. For example, consider a pod with two containers as below,

In the above image, we have a pod with a shared volume and network. The volume is mounted into both containers: a “content puller” application and a “web server”. The content puller pulls content from the content management system and places it in the volume; the web server reads the content from the volume and serves it to users. Since these two containers serve the single purpose of displaying content to users, we can think of these cooperating processes (containers) as a single unit of service. A sketch of such a pod spec is shown below.
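
A minimal Kubernetes pod spec along these lines (the image names, paths and the busybox loop standing in for the content puller are purely illustrative) could look like,

apiVersion: v1
kind: Pod
metadata:
  name: content-web
spec:
  volumes:
    - name: content               # shared by both containers
      emptyDir: {}
  containers:
    - name: web-server
      image: nginx
      volumeMounts:
        - name: content
          mountPath: /usr/share/nginx/html
    - name: content-puller        # stand-in for the real content puller
      image: busybox
      command: ["sh", "-c", "while true; do date > /content/index.html; sleep 30; done"]
      volumeMounts:
        - name: content
          mountPath: /content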

Unit of deployment - since multiple containers run in the pod, they are treated as a single unit of deployment, and anything done to the pod applies to the containers running inside it.

Management : since the containers in a pod represent a single unit of service, management and deployment become much easier. Pods can be deployed in multiple versions, horizontally scaled and replicated. Colocation (co-scheduling), shared fate (e.g. termination), coordinated replication, resource sharing, and dependency management are handled automatically for the containers in a pod.

Ease of use : the orchestration platform takes care of running the pods and their containers. Application developers don't need to worry about container exits, signalling etc. When a pod goes down, the platform brings it up on a new machine with the same containers and volumes attached as before.

If one of the containers in a pod goes down, the whole pod goes down, making sure the whole application stack is either up or down at any point in time. Since pods are scaled up and down as a unit, it is very important for a pod to stay small: when a pod with multiple containers is scaled, every container inside it is scaled regardless of use. So keep the pod small and stick to a single container unless more are genuinely needed.

Understanding Docker Socket

While trying to launch a Docker container from another Docker container, I came across the Docker daemon socket. This article provides an introduction to the Docker socket and how it can be used.

Introducing Sockets
The term socket commonly refers to IP sockets. These are generally bound to a port and an address; we send TCP requests to them and get responses back.

Another type of socket is the Unix socket, used for IPC (inter-process communication) and called a Unix domain socket, or UDS. These sockets use the local file system for communication, while IP sockets use the network. Linux provides various utilities for checking and talking to sockets. ss is a command-line utility for checking socket details,
[root@ip-172-31-16-91 ec2-user]# ss -s
Total: 257 (kernel 691)
TCP:   9 (estab 2, closed 1, orphaned 0, synrecv 0, timewait 1/0), ports 0

Transport     Total     IP        IPv6
*              691       -         -        
RAW         0         0         0        
UDP          8         4         4        
TCP          8         6         2        
INET            16    10        6        
FRAG          0         0         0    

The ss utility gives a lot of information about the available sockets, currently open and listening sockets etc.

nc (netcat) is a utility we can use to communicate over a TCP socket. Open 2 terminals and in the first terminal run,
[root@ip-172-31-16-91 ec2-user]# nc -l 3000

This opens a socket on port 3000 and stays listening. In the second terminal run,
[root@ip-172-31-16-91 ec2-user]# nc localhost 3000

The second terminal waits for input; whatever content we enter is displayed on the first terminal as well. This way we can communicate over TCP ports using the nc command.

Introducing Docker daemon sockets
The docker.sock file is a Unix socket that the Docker daemon listens on. It is the main entry point for the Docker API. The socket can also be a TCP socket, but for security reasons Docker defaults to the Unix socket.

Since this is a Unix socket, it uses the local file system for communication. That means Unix sockets are faster, but they are confined to local communication; this is how clients talk to the Docker engine on the same machine. The Docker CLI uses this socket to execute docker commands by default, though this setting can be overridden.
The Docker daemon can listen for Docker Engine API requests via three types of sockets: unix, tcp and fd. The Docker engine listens on the socket for REST API calls, and clients use the socket to send API requests to the server; the CLI is one such client. If you look at the Docker architecture, there are mainly 3 components,
 
Whenever we need to create a container, we use the Docker CLI and run commands. The CLI passes the arguments to the Docker engine through the Docker REST API for container creation, deletion etc. This is where the Unix socket comes into the picture: Docker creates a Unix socket named docker.sock under /var/run, the Docker engine listens on it for all API calls, and the Docker CLI uses it to send API requests to the engine.
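
We can confirm the socket exists on the host by listing it; the exact output below is illustrative, but on a typical install the file shows up as a socket owned by root and the docker group,

[root@ip-172-31-16-91 ec2-user]# ls -l /var/run/docker.sock
srw-rw---- 1 root docker 0 Jun  5 10:21 /var/run/docker.sock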

Curl, a Swiss army knife
Curl is a tool available on all Linux machines. We can use curl as a client and call the Docker REST API to perform operations on the Docker engine. Curl can talk over a Unix socket via the --unix-socket flag, and since the Docker API is exposed over REST, we can use curl to send requests over HTTP.

Let's get started by running a simple container as below,
[root@ip-172-31-16-91]# docker run -d -p 6379:6379 redis:latest
Once the container is up and running, let's grab some details using the socket file as below,
[root@ip-172-31-16-91 ec2-user]# curl --unix-socket /var/run/docker.sock http://localhost/images/json | jq
  [
  {
    "Containers": -1,
    "Created": 1590711840,
    "Id": "sha256:36304d3b4540c5143673b2cefaba583a0426b57c709b5a35363f96a3510058cd",
    "Labels": null,
    "ParentId": "",
    "RepoDigests": [
      "redis@sha256:ec277acf143340fa338f0b1a9b2f23632335d2096940d8e754474e21476eae32"
    ],
    "RepoTags": [
      "redis:latest"
    ],
    "SharedSize": -1,
    "Size": 104120748,
    "VirtualSize": 104120748
  }
]

With this we can see all the images available on the machine. This is the same as the “docker images” command.
List the running containers using,
[root@ip-172-31-16-91]# curl --unix-socket /var/run/docker.sock http://localhost/containers/json | jq
[
  {
    "Id": "56f94a1d4efc95f566c12b743a0f1f8fdc51b4c06ec5262ff1253e9278283412",
    "Names": [
      "/admiring_williamson"
    ],
    "Image": "redis:latest",
    "ImageID": "sha256:36304d3b4540c5143673b2cefaba583a0426b57c709b5a35363f96a3510058cd",
    "Command": "docker-entrypoint.sh redis-server",
    "Created": 1591360805,
    "Ports": [
      {
        "IP": "0.0.0.0",
        "PrivatePort": 6379,
        "PublicPort": 6379,
        "Type": "tcp"
      }
    ],
    "Labels": {},
    "State": "running",
    "Status": "Up 5 minutes",
    "HostConfig": {
      "NetworkMode": "default"
    },
    "NetworkSettings": {
      "Networks": {
        "bridge": {
          "IPAMConfig": null,
          "Links": null,
          "Aliases": null,
          "NetworkID": "ccbed7cd40074c25f6710d7ccd8d9020e03d6788052feaff3681a101a54811f8",
          "EndpointID": "15adb1219880383bf68c833262083202fb135cd45b41da7ceb2a8590726c6456",
          "Gateway": "172.17.0.1",
          "IPAddress": "172.17.0.2",
          "IPPrefixLen": 16,
          "IPv6Gateway": "",
          "GlobalIPv6Address": "",
          "GlobalIPv6PrefixLen": 0,
          "MacAddress": "02:42:ac:11:00:02",
          "DriverOpts": null
        }
      }
    },
    "Mounts": [
      {
        "Type": "volume",
        "Name": "6668a59a3db452a1d463ff117e0a55da54f5b1a587e2a7374cd38d43b64f0eb8",
        "Source": "",
        "Destination": "/data",
        "Driver": "local",
        "Mode": "",
        "RW": true,
        "Propagation": ""
      }
    ]
  }
]

Inspect an image : details about an image (here the redis image, referenced by its ID) can be obtained using,
[root@ip-172-31-16-91]# curl --unix-socket /var/run/docker.sock http://localhost/images/36304d3b4540/json | jq

Write operations with the Unix socket : write operations can also be performed over the Unix socket. Operations like tagging can be done by making a REST call to the socket as below,
[root@ip-172-31-16-91 ec2-user]# curl -i -X POST --unix-socket /var/run/docker.sock "http://localhost/images/36304d3b4540/tag?repo=redis&tag=testing"
HTTP/1.1 201 Created
Api-Version: 1.40
Docker-Experimental: false
Ostype: linux
Server: Docker/19.03.6-ce (linux)
Date: Fri, 05 Jun 2020 12:52:36 GMT
Content-Length: 0
In the above command, I'm tagging the redis image with a testing tag. Now once this command is run successfully, we can see the results using “docker images”. 

Stream events 
The /events API allows streaming of events from the Docker engine. This can be achieved by adding the --no-buffer flag to the curl command so that output is printed as events occur. The command looks like,
[root@ip-172-31-16-91]# curl --no-buffer --unix-socket /var/run/docker.sock http://localhost/events
This puts curl in listening mode, where it displays an event whenever something happens. Now open a new terminal and run a container as below,

[root@ip-172-31-16-91 ec2-user]# docker run -d nginx
Unable to find image 'nginx:latest' locally
latest: Pulling from library/nginx
afb6ec6fdc1c: Already exists
dd3ac8106a0b: Pull complete
8de28bdda69b: Pull complete
a2c431ac2669: Pull complete
e070d03fd1b5: Pull complete
Digest: sha256:883874c218a6c71640579ae54e6952398757ec65702f4c8ba7675655156fcca6
Status: Downloaded newer image for nginx:latest
462a065ae91597289cb3052db7013550b8e23527480253fb7d5fe6bf12fcfc58

Now in the first terminal, under the curl command, we can see the live data stream as below,
{"status":"pull","id":"nginx:latest","Type":"image","Action":"pull","Actor":{"ID":"nginx:latest","Attributes":{"maintainer":"NGINX Docker Maintainers \u003cdocker-maint@nginx.com\u003e","name":"nginx"}},"scope":"local","time":1591361908,"timeNano":1591361908825825763}
{"status":"create","id":"462a065ae91597289cb3052db7013550b8e23527480253fb7d5fe6bf12fcfc58","from":"nginx","Type":"container","Action":"create","Actor":{"ID":"462a065ae91597289cb3052db7013550b8e23527480253fb7d5fe6bf12fcfc58","Attributes":{"image":"nginx","maintainer":"NGINX Docker Maintainers \u003cdocker-maint@nginx.com\u003e","name":"ecstatic_dijkstra"}},"scope":"local","time":1591361908,"timeNano":1591361908912415052}
{"Type":"network","Action":"connect","Actor":{"ID":"ccbed7cd40074c25f6710d7ccd8d9020e03d6788052feaff3681a101a54811f8","Attributes":{"container":"462a065ae91597289cb3052db7013550b8e23527480253fb7d5fe6bf12fcfc58","name":"bridge","type":"bridge"}},"scope":"local","time":1591361908,"timeNano":1591361908947748392}
{"status":"start","id":"462a065ae91597289cb3052db7013550b8e23527480253fb7d5fe6bf12fcfc58","from":"nginx","Type":"container","Action":"start","Actor":{"ID":"462a065ae91597289cb3052db7013550b8e23527480253fb7d5fe6bf12fcfc58","Attributes":{"image":"nginx","maintainer":"NGINX Docker Maintainers \u003cdocker-maint@nginx.com\u003e","name":"ecstatic_dijkstra"}},"scope":"local","time":1591361909,"timeNano":1591361909354445673}

We can see a live stream of data of whatever happens in the docker engine.

Docker socket in a container : if we want to launch a new container from within another container, the socket file needs to be mounted into that container. This increases the attack surface, so be careful: if you mount the Docker socket inside a container, make sure only trusted code runs in that container, otherwise the host running the Docker daemon can easily be compromised, since Docker by default launches all containers as root.
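
As a quick illustration, the socket can be bind-mounted into a container that has the Docker CLI installed (the docker:cli image is just one convenient choice), and docker commands run inside that container are executed against the host's daemon,

[root@ip-172-31-16-91]# docker run --rm -v /var/run/docker.sock:/var/run/docker.sock docker:cli docker ps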

In most installations the Docker socket is owned by a docker group, so users in that group can run docker commands against the socket without root permission. The containers themselves, however, still effectively get root-level power, since the Docker daemon runs as root (it needs root permission to work with namespaces and cgroups).