I recently delivered a presentation at PHP South West about how to put your Docker image on a diet.
The talk mentioned six points that help reduce the size of a Docker image. The majority of the points focused on improving the Dockerfile and the last point was based on using a .dockerignore file. This post will go through each one of these in detail.
The talk was centred around a real Docker image which served a basic Symfony 3 web application. From this I created six GitHub tags and six corresponding Docker image tags (not including the latest tag or master branch).
If you attended the talk I would very much appreciate some feedback on joind.in.
You can skip over this post by viewing the Speaker Deck slides below.
The six Docker tags that I created evolved from the same Dockerfile. Below is a table of file sizes that we achieved. This is also shown on Docker Hub.
Tag 0.1.0 Our large Docker image
This is the start of our journey. As mentioned in the talk, each instruction in a Dockerfile creates a new Docker layer (an intermediate Docker image). Depending on the type of layer, a portion of disk space is required to create the intermediate image. Metadata layers such as labels hardly consume any space, but instructions that install packages can quickly bloat the image.
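As a rough illustration of how instructions map to layers, a Dockerfile written in the unmerged style might look like the sketch below. It is not the actual tag 0.1.0 file: the `ubuntu:16.04` tag and the label value are assumptions made for the example, while the package names are taken from later in this post.

```dockerfile
# Assumed base image tag; the post only says that Ubuntu is used.
FROM ubuntu:16.04

# Metadata layer: takes up next to no space. The value is a placeholder.
LABEL maintainer="someone@example.com"

# Each RUN below creates its own layer, and each layer keeps
# whatever files its command wrote to disk.
RUN apt-get update -y
RUN apt-get install -y git
RUN apt-get install -y apache2
RUN apt-get install -y libapache2-mod-php
RUN a2enmod rewrite
```

Running `docker history` against the built image lists each layer alongside its size, which makes it easy to see which instructions are responsible for the bulk of the image.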
Take a look at tag 0.1.0 of the Dockerfile for more details
Tag 0.2.0 Merging the RUN instructions
The first thing to do is to merge as many RUN instructions as possible. This is done using double ampersands (&&) and a backslash (\).
The ampersands chain commands together, so that each command only runs if the previous one succeeded, and the backslash allows the command to continue on the next line.
This saves 5MB of space. Not a large amount, but we are heading in the right direction, and this is a good demonstration that by altering the Dockerfile we can reduce the file size.
Check out tag 0.2.0 of the Dockerfile to see these changes
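A minimal sketch of the merged form is shown here. The `ubuntu:16.04` tag is an assumption, and the package list is borrowed from the next section rather than copied from the tag 0.2.0 file.

```dockerfile
# Assumed base image tag for the example.
FROM ubuntu:16.04

# One RUN instruction, so one layer: "&&" runs the next command only if
# the previous one succeeded, and "\" continues the instruction on the
# next line.
RUN apt-get update -y \
    && apt-get install -y \
        git \
        apache2 \
        libapache2-mod-php \
    && a2enmod rewrite
```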
Tag 0.3.0 Removing the packages that we don’t need including the build packages
The next task is to look at what packages we are installing. Here we have git, apache2 and php7.0*.
RUN apt-get update -y \
    && \
    apt-get install -y \
    git \
    apache2 \
    libapache2-mod-php \
    php7.0* \
    && \
    a2enmod rewrite
First let’s talk about git. Git is what I term a build package. This is a package that is used to install other packages or application dependencies. For instance, you might need git to install dependencies via Composer. Once those dependencies have been installed there is no need for git. If we need to install something after the Docker image has been built and deployed, then we should bake those changes into a new Docker image. My opinion on build packages is the same for other applications such as make.
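If a build package really is needed while the image is being built, one option is to install it, use it and purge it within the same RUN instruction. The sketch below assumes an `ubuntu:16.04` base image and uses `git --version` purely as a stand-in for a real build step; it is not taken from the talk's Dockerfile.

```dockerfile
# Assumed base image tag for the example.
FROM ubuntu:16.04

# git is installed, used and purged inside a single RUN instruction, so it
# never ends up in a committed layer. "git --version" is only a placeholder
# for a real build step such as fetching dependencies.
RUN apt-get update -y \
    && apt-get install -y git \
    && git --version \
    && apt-get purge -y git \
    && apt-get autoremove -y --purge \
    && rm -rf /var/lib/apt/lists/*
```

Purging git in a later, separate RUN instruction would not have the same effect, because the layer that installed it would still be part of the image.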
Now let’s talk about using php7.0*. Seeing an asterisk when installing packages via a Dockerfile should spark fear into everyone. Here we are saying ‘Install all the things that start with php7.0’. This is very bad. Not only are you installing packages that you don’t know about, but you are probably going to install packages that you don’t need.
By removing these issues the Docker image size is now down to 314MB. This is a vast improvement but we can still reduce the Docker image even further.
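A sketch of the same instruction with git removed and the wildcard replaced by explicit package names could look like this. The PHP packages listed are the ones that appear later in this post; which ones you need depends on your application, and the `ubuntu:16.04` tag is an assumption.

```dockerfile
# Assumed base image tag for the example.
FROM ubuntu:16.04

# Every PHP package is named explicitly instead of matching "php7.0*",
# and git is no longer installed.
RUN apt-get update -y \
    && apt-get install -y \
        apache2 \
        libapache2-mod-php \
        php7.0 \
        php7.0-cli \
        php7.0-xml \
    && a2enmod rewrite
```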
Take a look at the alterations in tag 0.3.0 of the Dockerfile
Tag 0.4.0 Cleaning up after ourselves
We also need to clean up after ourselves. A raft of cached files is generated after running:
RUN apt-get update -y
These cached files are used when processing future apt-get update, install and upgrade commands. We only do this once, so there is no point in keeping these files on a deployed system. So let’s remove them, along with any packages that are no longer needed and their configuration, and remove any temporary files that may have been added during the install process:
RUN apt-get update -y \
    && \
    apt-get install -y \
    apache2 \
    libapache2-mod-php \
    php7.0 php7.0-cli php7.0-xml \
    && \
    a2enmod rewrite \
    && \
    apt-get autoremove -y --purge \
    && \
    rm -rf /var/lib/apt/lists/*
This has reduced the Docker image size to 275MB.
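It is worth spelling out why the clean-up sits in the same RUN instruction as the install. The following sketch, which assumes an `ubuntu:16.04` base image and installs only apache2 to keep things short, contrasts the two approaches.

```dockerfile
# Assumed base image tag for the example.
FROM ubuntu:16.04

# Anti-pattern (left commented out): the package lists are already stored
# in the layer created by the first RUN, so deleting them in a second RUN
# does not shrink the image.
# RUN apt-get update -y && apt-get install -y apache2
# RUN rm -rf /var/lib/apt/lists/*

# Install and clean up in the same RUN so the cached files never reach
# a committed layer.
RUN apt-get update -y \
    && apt-get install -y apache2 \
    && rm -rf /var/lib/apt/lists/*
```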
Here is tag 0.4.0 of the Dockerfile
Tag 0.5.0 Using a smaller base image
The next thing we are going to look at is the base image. In this example we are using Ubuntu, which is a great base image, but it’s full of things that we don’t need.
So what we are going to do is use a smaller base image such as debian:stretch. This brings the Docker image down to 262MB. We can take this to the nth degree and build our own base image from scratch or use an Alpine image. Be warned that creating custom base images can take a lot of time and trial and error to get right.
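In Dockerfile terms the switch is a one-line change to the FROM instruction. The sketch below reuses the RUN instruction from the previous section and assumes that the stretch package repositories are reachable and provide the same package names.

```dockerfile
# Only the base image changes; the RUN instruction stays as before.
FROM debian:stretch

RUN apt-get update -y \
    && apt-get install -y \
        apache2 \
        libapache2-mod-php \
        php7.0 php7.0-cli php7.0-xml \
    && a2enmod rewrite \
    && apt-get autoremove -y --purge \
    && rm -rf /var/lib/apt/lists/*
```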
Have a look at the base image improvement in tag 0.5.0 of the Dockerfile
Tag 0.6.0 Ignoring what you don’t need
At this point our Docker image is half the size of the original image, but we can still improve upon this. When building a Docker image we send a build context along with the Dockerfile.
$ docker build -t howtocodewell/how-to-put-your-docker-image-on-a-diet .
In the above command we are sending the current working directory (.) as the build context. Side note: by default Docker expects to find the Dockerfile at the root of the build context.
There will be a load of things included in the build context that are not required and these can be ignored using a .dockerignore file like so:
.*
/tmp/*
/log
**/*.md
**/vendor/**/Tests
site/build
site/var/cache
site/var/logs
site/var/sessions
site/web/app_dev.php
site/bin/symfony_requirements
The above is an example of the .dockerignore file. Any file or directory that matches these glob patterns is left out of the build context, so it won’t be included in the final Docker image.
In this example we are not including any tests from the vendor directory. The cache and logs are also left out, along with other things such as Markdown files.
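To see how the ignore file affects the image itself, consider a COPY instruction like the one sketched here. The destination path `/var/www/site` is a placeholder chosen for the example and is not taken from the talk's Dockerfile.

```dockerfile
FROM debian:stretch

# COPY reads from the build context, so anything excluded by .dockerignore
# (for example site/var/cache) is never sent to the Docker daemon and
# cannot be copied into the image.
# The destination path is a placeholder for this example.
COPY site/ /var/www/site/
```

Because the excluded files never leave your machine, a tighter .dockerignore file can also make the build itself faster, since less data has to be sent as the build context.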
We can add many more things to this ignore file and this can be tailored to your application.
The .dockerignore file can also be used to prevent any files with sensitive information from being added to the Docker image. In this example we are ignoring the app_dev.php file which means the dev environment cannot be reached when it is deployed.
Take a look at the docker ignore file in tag 0.6.0