
Should I chunk my docker images in many small layers instead of one large?

Submitted by: @import:stackexchange-devops

Problem

When I build Docker images for existing applications, I try to use as few layers as possible and clean up any unwanted files. For example, here is a Dockerfile for Moodle:

```
# Dockerfile for moodle instance.
# Forked from Jonathan Hardison's docker version. https://github.com/jmhardison/docker-moodle
#Original Maintainer Jon Auer

FROM php:7.2-apache

# Replace for later version
ARG VERSION=37
ARG DB_TYPE="all"

VOLUME ["/var/moodledata"]
EXPOSE 80

ENV MOODLE_DB_TYPE="${DB_TYPE}"
# Let the container know that there is no tty
# The key=value form is needed to set several variables in one ENV instruction
ENV DEBIAN_FRONTEND=noninteractive \
    MOODLE_URL=http://0.0.0.0 \
    MOODLE_ADMIN=admin \
    MOODLE_ADMIN_PASSWORD=Admin~1234 \
    MOODLE_ADMIN_EMAIL=admin@example.com \
    MOODLE_DB_HOST="" \
    MOODLE_DB_PASSWORD="" \
    MOODLE_DB_USER="" \
    MOODLE_DB_NAME="" \
    MOODLE_DB_PORT="3306"

COPY ./scripts/entrypoint.sh /usr/local/bin/entrypoint.sh

RUN echo "Build moodle version ${VERSION}" &&\
chmod +x /usr/local/bin/entrypoint.sh &&\
apt-get update && \
if [ $DB_TYPE = 'mysqli' ] || [ $DB_TYPE = 'all' ]; then echo "Setup mysql and mariadb support" && docker-php-ext-install pdo mysqli pdo_mysql; fi &&\
if [ $DB_TYPE = 'pgsql' ] || [ $DB_TYPE = 'all' ]; then echo "Setup postgresql support" &&\
apt-get install -y --no-install-recommends libghc-postgresql-simple-dev &&\
docker-php-ext-configure pgsql -with-pgsql=/usr/local/pgsql &&\
docker-php-ext-install pdo pgsql pdo_pgsql; \
fi &&\
apt-get -f -y install --no-install-recommends rsync unzip netcat libxmlrpc-c++8-dev libxml2-dev libpng-dev libicu-dev libmcrypt-dev libzip-dev &&\
docker-php-ext-install xmlrpc && \
docker-php-ext-install mbstring && \
whereis libzip &&\
docker-php-ext-configure zip --with-libzip=/usr/lib/x86_64-linux-gnu/libzip.so &&\
docker-php-ext-install zip && \
docker-php-ext-install xml && \
docker-php-ext-install intl && \
docker-php-ext-install soap && \
docker-php-ext-install gd
```

Solution

Generally speaking, you want to:

  • Combine multiple related actions/files into a single layer so you don't have tons of layers.
  • Separate out unrelated actions/files which are likely to change independently.
  • Order layers such that those least likely to change occur first in the file.
  • Use multi-stage builds to clean up if you have lots of garbage in your image/layers that does not need to be in the final image.

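If you change any instruction in the Dockerfile, or a file that an instruction copies in, that layer and every layer below it has to be rebuilt; the build cache is useless from that point on. You want to avoid this, or at least push such changes as far down the file as possible.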
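Build arguments follow the same rule. According to the Dockerfile reference, a changed `ARG` value can cause a cache miss at the first `RUN` after its declaration. The Dockerfile in the question declares `VERSION` at the top, so a minimal sketch of the alternative (untested, with the extension list shortened for illustration) might look like this:

```dockerfile
FROM php:7.2-apache

# No build arguments are in scope yet, so this layer can stay cached
# when only VERSION differs between builds.
RUN docker-php-ext-install mysqli pdo_mysql

# Declare the argument as late as possible, right before the step that uses it.
ARG VERSION=37

# This RUN references VERSION; building with a different
# --build-arg VERSION=... should rebuild from here down, not from the top.
RUN echo "Build moodle version ${VERSION}"
```

The same reasoning applies to `COPY`: a file that changes often is better copied after the expensive steps than before them.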

So, rather than trying to combine everything into one layer, you should focus on understanding what is likely to change and how you can reduce the number of layers you have to rebuild when that change happens.
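As a rough sketch of how this could apply to the Moodle Dockerfile from the question, the steps might be grouped by how often they are expected to change. The grouping is an assumption about that project, the package and extension lists are shortened to a subset of the original, and the result is not a tested replacement:

```dockerfile
FROM php:7.2-apache

# Layer 1: OS packages. These change rarely, so they come first.
# The apt lists are removed in the same RUN so they never end up in a layer.
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        rsync unzip libxml2-dev libpng-dev libicu-dev && \
    rm -rf /var/lib/apt/lists/*

# Layer 2: PHP extensions. Related steps are combined into a single layer.
RUN docker-php-ext-install xml intl soap gd

# Layer 3: the entrypoint script. It is the most likely to change, so it
# comes last; editing it leaves the two layers above cached.
COPY ./scripts/entrypoint.sh /usr/local/bin/entrypoint.sh
RUN chmod +x /usr/local/bin/entrypoint.sh
```

Three layers instead of one cost very little in image size here, because the cleanup happens inside the layer that created the files.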

The best-explained example I've seen of this is the one for Spring Boot.

Details are below, but to summarize: they determined that the application dependencies from Maven made up a large share of the image. Most application changes don't require new Maven dependencies, so putting those dependencies in their own layer generally makes builds and image pulls faster, because that layer is reused instead of being rebuilt and pulled again. When only the application code changes, just the later, smaller layers for the app itself are rebuilt and pulled, which is very efficient.

A Better Dockerfile

A Spring Boot fat jar naturally has "layers" because of the way that the jar itself is packaged. If we unpack it first it will already be divided into external and internal dependencies.

Dockerfile

```
FROM openjdk:8-jdk-alpine
VOLUME /tmp
ARG DEPENDENCY=target/dependency
COPY ${DEPENDENCY}/BOOT-INF/lib /app/lib
COPY ${DEPENDENCY}/META-INF /app/META-INF
COPY ${DEPENDENCY}/BOOT-INF/classes /app
ENTRYPOINT ["java","-cp","app:app/lib/*","hello.Application"]
```


There are now 3 layers, with all the application resources in the later 2 layers. If the application dependencies don’t change, then the first layer (from BOOT-INF/lib) will not change, so the build will be faster, and so will the startup of the container at runtime as long as the base layers are already cached.
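The Dockerfile above assumes the fat jar has already been unpacked into target/dependency on the host. A multi-stage build could do that step inside Docker and leave the build tooling out of the final image. The following is only a sketch: the `maven:3-jdk-8` image tag, the `/workspace` directory, and the `pom.xml` plus `src/` project layout are assumptions, not part of the original answer.

```dockerfile
# Stage 1 (hypothetical): build the fat jar and unpack it.
FROM maven:3-jdk-8 AS build
WORKDIR /workspace
# Copy the build file alone first so the dependency download stays cached
# until pom.xml itself changes.
COPY pom.xml .
RUN mvn -B dependency:go-offline
COPY src ./src
RUN mvn -B package -DskipTests && \
    mkdir -p target/dependency && \
    cd target/dependency && \
    jar -xf ../*.jar

# Stage 2: the runtime image from the example above. Only the unpacked
# application is copied over; the Maven cache and sources stay in stage 1.
FROM openjdk:8-jdk-alpine
VOLUME /tmp
ARG DEPENDENCY=/workspace/target/dependency
COPY --from=build ${DEPENDENCY}/BOOT-INF/lib /app/lib
COPY --from=build ${DEPENDENCY}/META-INF /app/META-INF
COPY --from=build ${DEPENDENCY}/BOOT-INF/classes /app
ENTRYPOINT ["java","-cp","app:app/lib/*","hello.Application"]
```

The layer ordering in the final stage is unchanged, so the caching benefit described above still applies.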

There isn't a magic answer to this. It depends on your application, its dependencies, and what is likely to change in your development and deployment environment over time.

Context

StackExchange DevOps Q#10919, answer score: 5
