Monday, June 10, 2019

Understanding Dockerfile - Automating your first Httpd Container


We all know that a container needs to be created from an image. An image can be thought of as a pre-built environment for a service: a Docker image is a collection of files, libraries and configuration files that make up that environment.

A Docker container is created from an image that is either pulled from a public repository or created locally. The usual manual way to create an image is to:
1. Pull a base image (CentOS, Ubuntu, etc.).
2. Create a container from this image.
3. Log in to the container.
4. Install all necessary files, libraries, packages and configuration files.
5. Exit the container.
6. Commit the container using “docker commit <Container ID> <name of the Image>”.
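
The manual workflow above can be sketched as a short shell session. This is only an illustration under a few assumptions: the names manual-demo and manual-demo-image are made up for this sketch, busybox stands in for a larger base image, and writing a marker file stands in for the interactive login and package installation.

```shell
# Hypothetical names, used only in this sketch.
container_name="manual-demo"
image_name="manual-demo-image"

# Only talk to Docker when a daemon is reachable.
if docker info >/dev/null 2>&1; then
    # Pull a base image.
    docker pull busybox
    # Create a container and change its filesystem
    # (a one-off command replaces the interactive login here).
    docker run --name "$container_name" busybox sh -c 'echo "hello" > /hello.txt'
    # Commit the stopped container as a new image.
    docker commit "$container_name" "$image_name"
    # A container started from the new image contains the change.
    docker run --rm "$image_name" cat /hello.txt
    # Remove the container that was used for the commit.
    docker rm "$container_name"
else
    echo "Docker daemon not reachable; skipping"
fi
```

Every change made this way has to be repeated by hand for each new image, which is the part a Dockerfile automates.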

From now on we can create containers from the newly created image. But this process requires manual intervention for adding, modifying and uploading files. What if we could automate it? This is where the Dockerfile helps.

A Dockerfile is a text file that defines a Docker image. It contains instructions that Docker understands, and we use these instructions to define our custom environment inside the image: creating files, downloading packages and so on. In this article we will see how to create a custom httpd image using the Dockerfile instruction set. Below is my directory structure. I have created a directory named build and added the files underneath it.
jagadishm@[/Volumes/Work/httpd]: tree build
.
├── Dockerfile
├── data
│   └── index.html
├── httpd.conf
├── index.html

httpd.conf is my custom configuration file, with the port number changed from 80 to 8080.

index.html is my index page. It contains sample content that is shown when we hit port 8080 inside the container.

data is the directory that will be mounted into the container. It is a host path that gets mounted to a directory inside the container. We want the content to persist, hence the volume.

Writing the Dockerfile
FROM centos
MAINTAINER jagadish <jagadish.manchala@gmail.com>
LABEL owner="jagadish" \
      Version="2.4.6"

ARG user_name=root
USER ${user_name}

WORKDIR /
RUN yum -y install epel-release && \
   yum -y install httpd httpd-tools && \
   yum clean all

VOLUME /var/www/html
RUN rm -rf /etc/httpd/conf/httpd.conf
COPY httpd.conf /etc/httpd/conf/httpd.conf
COPY index.html /var/www/html/
ADD http://www.africau.edu/images/default/sample.pdf /
EXPOSE 8080
HEALTHCHECK --interval=5m --timeout=3s --retries=4 CMD curl -f http://localhost:8080/ || exit 1
ONBUILD RUN mkdir /tmp/tempo
ENV CONTAINER_NAME demoContainer
VOLUME /data
ENTRYPOINT ["/usr/sbin/httpd"]
CMD ["-D", "FOREGROUND"]

The above is the Dockerfile that I have created. Let us go through it line by line.
1. FROM sets the base image for subsequent instructions. In the above example we are building an image based on the centos image. When we build the image, the centos image is downloaded first, unless it is already available locally, in which case the local copy is used.

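2. MAINTAINER sets the Author field of the generated image. Note that Docker now marks MAINTAINER as deprecated; a LABEL with a maintainer key is the recommended replacement.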

3. LABEL adds metadata to an image. A label is a key-value pair. Labels are very useful when we run a large number of containers, for example in a data center. We can list the running containers that carry a given label with “docker ps --filter "label=owner=jagadish"”.
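
As a small sketch of working with labels, the commands below filter containers and images on the owner label from the Dockerfile above and read the value back from the image metadata. The image name httpd is an assumption here: it is the tag used when the image is built at the end of this article.

```shell
# Filter expression: the filter key is "label", its value is the key=value pair set by LABEL.
label_filter="label=owner=jagadish"
image_name="httpd"   # assumption: the tag given to the image at build time

# Only talk to Docker when a daemon is reachable.
if docker info >/dev/null 2>&1; then
    # Running containers that carry the label (containers inherit image labels).
    docker ps --filter "$label_filter"
    # Local images that carry the label.
    docker images --filter "$label_filter"
    # Read a single label value back from the image metadata.
    docker image inspect --format '{{ index .Config.Labels "owner" }}' "$image_name"
else
    echo "Docker daemon not reachable; skipping"
fi
```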

4. ARG defines a variable that users can pass at build time. In the above case I have defined an ARG named user_name with the default value “root”. We can override the default by passing a different value while building the image, using “docker build --build-arg user_name=jenkins”.
ARG user_name=root
USER ${user_name}
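
To see the override in action, here is a minimal sketch that builds the same Dockerfile twice, once with the default and once with --build-arg. Everything in it is an assumption made for the demonstration: the busybox base image, the arg-demo tags, the /tmp/build-user file, and the nobody account, which is expected to exist in the base image.

```shell
# Write a throwaway Dockerfile into a temporary build directory.
build_dir=$(mktemp -d)
cat > "$build_dir/Dockerfile" <<'EOF'
FROM busybox
ARG user_name=root
USER ${user_name}
# Record which user the build step ran as.
RUN whoami > /tmp/build-user
EOF

# Only talk to Docker when a daemon is reachable.
if docker info >/dev/null 2>&1; then
    # Build once with the default value of the ARG ...
    docker build -t arg-demo:default "$build_dir"
    # ... and once with the value overridden on the command line.
    docker build -t arg-demo:nobody --build-arg user_name=nobody "$build_dir"
    # Each image recorded the user that its RUN step executed as.
    docker run --rm arg-demo:default cat /tmp/build-user
    docker run --rm arg-demo:nobody cat /tmp/build-user
else
    echo "Docker daemon not reachable; skipping"
fi
```

The value passed with --build-arg must name a user that exists in the image, otherwise the build stops at the first RUN after USER.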

5. WORKDIR sets the working directory for any RUN, CMD, ENTRYPOINT, COPY and ADD instructions that follow it in the Dockerfile. I have set it to “/”, which means the subsequent commands all run in “/”. Once the image is built and we start a container, this is also the directory we land in. WORKDIR can be used any number of times in a Dockerfile; the last one determines where we land when the container starts.

6. USER sets the user name to use when running the image. Since I have set the user to root, we will be running as root when we start the container. This user is also used for the RUN, CMD and ENTRYPOINT instructions that come after USER in the Dockerfile. COPY and ADD are not affected by USER: the files they add are owned by root unless --chown is given.

7. RUN executes the commands passed as arguments. The commands run at build time in a new layer on top of the current image, and the result is committed into the image. In the above file, we pass the yum commands to RUN.

8. VOLUME creates a mount point with the given name inside the image. The mount point is marked as holding externally mounted volumes from the native host or from other containers. This is one of the Dockerfile instructions for which we supply a value when running the container. VOLUME accepts one or more mount points inside the container, but it does not accept a host path. The host path has to be passed while running the container, which lets users mount a different host path onto the volume each time.

In our case, I have defined the volume “/var/www/html”. This means a mount point /var/www/html is created and users can choose to attach a host path to it. If I attach the host path /root to /var/www/html, files created or deleted on either side show up on the other.

9. COPY copies files into the image. Here I am copying httpd.conf and index.html from the local build directory into the image.

10. ADD also copies files into the image. Its additional capability is that it can download content from a remote URL, whereas COPY only copies local files.

11. EXPOSE tells Docker that containers created from this image listen on the specified network port. In our case we have declared 8080, the port httpd uses inside the container. Like VOLUME, this instruction needs a value at run time: EXPOSE on its own does not publish the port, so we use the publish option (-p) to map container port 8080 to a host port and access the service as if it were running on the local machine.
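
To check what VOLUME and EXPOSE actually record, the image metadata can be inspected, and -P can be used to publish every exposed port on a host port that Docker picks. This is a rough sketch, assuming the image has already been built and tagged httpd; the container name expose-demo is made up for the example.

```shell
image_name="httpd"             # assumption: the tag used when building the image
container_name="expose-demo"   # hypothetical name for a throwaway container

# Only talk to Docker when a daemon is reachable.
if docker info >/dev/null 2>&1; then
    # Mount points declared with VOLUME.
    docker image inspect --format '{{ json .Config.Volumes }}' "$image_name"
    # Ports declared with EXPOSE.
    docker image inspect --format '{{ json .Config.ExposedPorts }}' "$image_name"
    # -P publishes each exposed port on a host port chosen by Docker.
    docker run -d --name "$container_name" -P "$image_name"
    # Show which host port was mapped to which container port.
    docker port "$container_name"
    # Remove the throwaway container again.
    docker rm -f "$container_name"
else
    echo "Docker daemon not reachable; skipping"
fi
```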

12. ENV sets environment variables. The difference between ENV and ARG is that variables set with ARG are only available while the image is being built and not in the running container, whereas variables set with ENV are also available in the running container.
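
The difference can be observed with a small experiment. The sketch below is only a demonstration under assumed names: the busybox base image, the env-demo tag and the build_only variable are not part of the httpd example.

```shell
# Write a throwaway Dockerfile into a temporary build directory.
build_dir=$(mktemp -d)
cat > "$build_dir/Dockerfile" <<'EOF'
FROM busybox
ARG build_only=from-arg
ENV CONTAINER_NAME=demoContainer
# Both variables are visible to commands that run at build time.
RUN echo "build sees ARG=$build_only ENV=$CONTAINER_NAME"
# At run time only the ENV variable is still set.
CMD ["sh", "-c", "echo run sees ARG=[$build_only] ENV=[$CONTAINER_NAME]"]
EOF

# Only talk to Docker when a daemon is reachable.
if docker info >/dev/null 2>&1; then
    docker build -t env-demo "$build_dir"
    # The ARG value is empty here, the ENV value is present.
    docker run --rm env-demo
    # ENV values can also be overridden per container with -e.
    docker run --rm -e CONTAINER_NAME=other env-demo
else
    echo "Docker daemon not reachable; skipping"
fi
```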

13. ONBUILD works differently. It is not executed in this image; instead it registers a trigger instruction to be executed later. The ONBUILD instruction in our example is “ONBUILD RUN mkdir /tmp/tempo”. It is not executed in the current build, but it is executed when another image is built with the current image as its base. Let us understand this with an example.

In the First Dockerfile we have,
[root@ip-172-31-31-127 onbuild]# cat hello1/Dockerfile
FROM busybox
RUN echo "hello world" >> /tmp/hello
ONBUILD RUN mkdir /tmp/onbuild-hello

Now I build the Docker image hello1 from the above Dockerfile, as below,
[root@ip-172-31-31-127 hello1]# docker build -t hello1 .
Sending build context to Docker daemon  2.048kB
Step 1/3 : FROM busybox
latest: Pulling from library/busybox
53071b97a884: Already exists
Digest: sha256:4b6ad3a68d34da29bf7c8ccb5d355ba8b4babcad1f99798204e7abb43e54ee3d
Status: Downloaded newer image for busybox:latest
---> 64f5d945efcc
Step 2/3 : RUN echo "hello world" >> /tmp/hello
---> Running in cadcbc51c329
Removing intermediate container cadcbc51c329
---> c24d27271dee
Step 3/3 : ONBUILD RUN mkdir /tmp/onbuild-hello
---> Running in 827b86347475
Removing intermediate container 827b86347475
---> 1eae88c4a42d
Successfully built 1eae88c4a42d
Successfully tagged hello1:latest

Now if I run the image and look at the contents of the container, the directory defined in the ONBUILD instruction has not been created.

[root@ip-172-31-31-127 hello1]# docker run -it hello1 /bin/sh
/ # ll /tmp
/bin/sh: ll: not found
/ # ls /tmp
hello
/ # exit

Now let us see the second Dockerfile,
[root@ip-172-31-31-127 hello2]# cat Dockerfile
FROM hello1
RUN echo "second Build" >> /tmp/second

In the second Dockerfile, I am using hello1 (created above) as the base image. If we build this image as hello2 and run a container from it, we see the following,
[root@ip-172-31-31-127 hello2]# docker run -it hello2 /bin/sh
/ # ls /tmp
hello          onbuild-hello  second

We can see that the onbuild-hello directory exists in the container created from the image whose base image is hello1. In other words, for an ONBUILD instruction to execute, we need to build an image whose base image contains that ONBUILD instruction.

14. CMD is similar to RUN in that it specifies a command to execute. The difference is that commands passed to RUN are run at build time, whereas the command passed to CMD is run when a container starts. This means CMD lets us specify the default command for the container.

15. ENTRYPOINT is one of the most important instructions in a Dockerfile. It sets the default application that starts whenever we start the container. ENTRYPOINT is used when we want to run the container as an executable.

It is very important to understand the difference between the CMD and ENTRYPOINT instructions.

Our first Dockerfile contains,
[root@ip-172-31-31-127 hello1]# cat Dockerfile
FROM busybox
CMD sleep 200

When we build the image and start a container with “docker run -d hello1”, the container starts and runs until the sleep 200 command completes. We do not get a shell if we attach to this container, because its main process is the sleep command itself: the container is running as an executable.

One important thing about CMD is that if we run the same image with a command at the end, as in “docker run -d hello1 sleep 10”, the CMD instruction is overridden by the command passed at run time.

Our second Dockerfile looks as below,
[root@ip-172-31-31-127 hello2]# cat Dockerfile
FROM busybox
ENTRYPOINT sleep 100

In this case we defined an executable as the entrypoint. When we build the image and run the container, the container starts and runs until sleep 100 is done. Here too we do not get a shell on attaching, since the container is running the executable.

Now if we run the container with a command at the end, as we did with CMD, the command does not replace the ENTRYPOINT the way it replaced CMD. Whether or not we pass a command while running the container, the entrypoint remains the container's entry process. With the shell form used here (“ENTRYPOINT sleep 100”), the extra command is ignored altogether.

The best way is to use CMD and ENTRYPOINT in combination, both in the exec (JSON array) form. When both are used, the value of ENTRYPOINT is the executable that starts with the container, and the values of CMD are passed to it as default arguments. In our case we have,
ENTRYPOINT ["/usr/sbin/httpd"]
CMD ["-D", "FOREGROUND"]

This means that when the container starts, /usr/sbin/httpd is the executable that gets started, and the arguments given in CMD are passed to it. In the end, the command executed when the container starts is “/usr/sbin/httpd -D FOREGROUND”, the combination of the ENTRYPOINT and CMD instructions.
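
The same combination can be tried out on a much smaller image. The following is a sketch only, with made-up pieces: the busybox base image, the entry-demo tag and the echo entrypoint are chosen so that the effective command is visible in the output.

```shell
# Write a throwaway Dockerfile into a temporary build directory.
build_dir=$(mktemp -d)
cat > "$build_dir/Dockerfile" <<'EOF'
FROM busybox
ENTRYPOINT ["echo", "entrypoint got:"]
CMD ["default", "args"]
EOF

# Only talk to Docker when a daemon is reachable.
if docker info >/dev/null 2>&1; then
    docker build -t entry-demo "$build_dir"
    # No arguments: CMD supplies them, so the container runs
    #   echo "entrypoint got:" default args
    docker run --rm entry-demo
    # Arguments after the image name replace CMD only, so the container runs
    #   echo "entrypoint got:" custom
    docker run --rm entry-demo custom
    # --entrypoint replaces the executable itself, so the container runs
    #   ls /tmp
    docker run --rm --entrypoint ls entry-demo /tmp
else
    echo "Docker daemon not reachable; skipping"
fi
```

Applied to the httpd image, this means that anything placed after the image name in “docker run” replaces “-D FOREGROUND”, while /usr/sbin/httpd stays in place unless --entrypoint is used.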

Build the docker image
Now that we understand the Dockerfile instructions, we can build the image using “docker build -t httpd .”. The period at the end of the command points to the current working directory: we are telling Docker to use it as the build context and to look for the Dockerfile there.

Run the Container
Once the image is built, we run the container with “docker run -d -p 18080:8080 --mount src=/root/build/data,target=/var/www/html,type=bind httpd”.

In the above command, we have supplied the host port and host path for the EXPOSE and VOLUME instructions defined in the Dockerfile. We can now access the HTTP server at “localhost:18080”, which forwards the request to port 8080 of the container.
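
One way to confirm that everything is wired up is sketched below. It repeats the run command with a container name added so that the container can be referred to afterwards. The name httpd-demo is made up for this example, and the sketch assumes that no other container is already bound to host port 18080 and that the host path /root/build/data exists.

```shell
container_name="httpd-demo"   # hypothetical name, added so the container can be referenced
host_port=18080

# Only talk to Docker when a daemon is reachable.
if docker info >/dev/null 2>&1; then
    # Same run command as above, plus --name.
    docker run -d --name "$container_name" -p "$host_port":8080 \
        --mount type=bind,src=/root/build/data,target=/var/www/html httpd
    # Show the port mapping (host port to container port 8080).
    docker port "$container_name"
    # Request the index page through the published host port.
    curl -f "http://localhost:$host_port/"
    # State reported by the HEALTHCHECK defined in the Dockerfile.
    docker inspect --format '{{ .State.Health.Status }}' "$container_name"
else
    echo "Docker daemon not reachable; skipping"
fi
```

Because the bind mount replaces /var/www/html, the page served is the index.html from the data directory on the host rather than the one copied into the image.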

Hope this helps in understanding the Dockerfile instructions.
