Docker
Ah... the new world of microservices. Docker stands at the forefront of modular applications, also known as microservices, where each service usually communicates with the others via APIs. This "container" platform is wildly popular because it is OS-agnostic & resource-light, with the only dependency being that Docker itself needs to be installed.
Some facts:
- Most Docker images are Linux-based, usually either Alpine or Debian
- Alpine-based images use apk, while Debian-based ones use apt (see the sketch after this list)
- Docker is not the only container service available, but the most widely used
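As an illustrative sketch (curl is just an example package, not something the rest of this page needs), the equivalent Dockerfile install lines would be:
# Debian/Ubuntu-based image
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
# Alpine-based image
RUN apk add --no-cache curl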
Installation
You may visit Docker's official website for installation instructions for various OSes and architectures. The commands below are for Ubuntu.
# Uninstall old versions
sudo apt-get remove docker docker-engine docker.io containerd runc
sudo apt-get update
sudo apt-get install \
ca-certificates \
curl \
gnupg \
lsb-release
# Add Docker’s official GPG key:
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
# set up the repository:
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install latest Docker Engine
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-compose-plugin
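To confirm the installation worked, a quick sanity check is to run the hello-world image. Optionally, adding your user to the docker group lets you drop the sudo prefix used throughout this page (log out & in again for it to take effect).
# verify the installation
sudo docker run hello-world
# (optional) run docker without sudo
sudo usermod -aG docker $USER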
Basics
There are various nouns that are important in the Docker world.
Noun | Desc |
---|---|
Image | A packaged template of an application & its dependencies |
Container | Launched instance of an application from an image |
Container Registry | A hosting platform to store your images, e.g. Docker Hub, AWS Elastic Container Registry (ECR) |
Dockerfile | A file containing instructions to build an image |
Also, these are the various verbs that are important in the Docker world.
Verb | Desc |
---|---|
build | Copy & install necessary packages & scripts to create an image |
run | Launch a container from an image |
rm or rmi | Remove a container or image respectively |
prune | Clear obsolete images, containers or networks |
Dockerfile
The Dockerfile is the essence of Docker; it contains the instructions on how to build an image. There are four main instructions:
Instruction | Desc |
---|---|
FROM | Base image to build on, pulled from a container registry such as Docker Hub |
RUN | Install dependencies |
COPY | Copy files & scripts |
CMD or ENTRYPOINT | Command to launch the application |
Each instruction in the Dockerfile produces a layer that is cached in sequence, so rebuilding an image does not need to start from the beginning, only from the first instruction where something changed. Therefore it is important to (copy and) install the dependencies first, before copying over the rest of the source code, as shown below.
FROM python:3.7-slim
RUN apt-get update && apt-get -y upgrade \
&& apt-get install ffmpeg libsm6 libxext6 -y \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements-serve.txt .
RUN pip3 install --no-cache-dir --upgrade pip~=22.3.1 \
&& pip install --no-cache-dir -r requirements-serve.txt
COPY . /app
ENTRYPOINT [ "python", "-u", "serve_http.py" ]
Intel-GPU
If the host has an Nvidia GPU, we should make use of it so that inference is much faster; about 10x faster for this example. We will need to choose a base image that has CUDA & cuDNN installed so that the GPU can be utilised.
FROM pytorch/pytorch:1.5.1-cuda10.1-cudnn7-devel
RUN apt-get update && apt-get -y upgrade \
&& apt-get install ffmpeg libsm6 libxext6 -y \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements-serve.txt .
RUN pip3 install --no-cache-dir --upgrade pip~=22.3.1 \
&& pip install --no-cache-dir -r requirements-serve.txt
COPY . /app
ENTRYPOINT [ "python", "-u", "serve_http.py" ]
The run command is:
sudo docker run --gpus all --ipc=host -d -p 5000:5000 --name <containername> <imagename>
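As a quick sanity check (this assumes the host already has the Nvidia drivers and the NVIDIA Container Toolkit installed, which --gpus all relies on), nvidia-smi can be run from the same base image to confirm the GPU is visible inside a container.
sudo docker run --rm --gpus all pytorch/pytorch:1.5.1-cuda10.1-cudnn7-devel nvidia-smi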
ARM-GPU
The ARM architecture requires a little more effort; the Dockerfile below is for the Nvidia Jetson series kits.
# From https://ngc.nvidia.com/catalog/containers/nvidia:l4t-pytorch
# it contains Pytorch v1.5 and torchvision v0.6.0
FROM nvcr.io/nvidia/l4t-pytorch:r32.4.3-pth1.6-py3
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get -y update && apt-get -y upgrade
RUN apt-get install -y wget python3-setuptools python3-pip libfreetype6-dev
# Install OpenCV; from https://github.com/JetsonHacksNano/buildOpenCV
RUN apt-get -y install qt5-default
COPY ./build/OpenCV-4.1.1-dirty-aarch64.sh .
RUN ./OpenCV-4.1.1-dirty-aarch64.sh --prefix=/usr/local/ --skip-license && ldconfig
# Install other Python libraries required by Module
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt
# Copy Python source codes
COPY . .
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && rm OpenCV-4.1.1-dirty-aarch64.sh
ENTRYPOINT [ "python3", "-u", "serve_http.py" ]
The run command is:
sudo docker run -d -p 5000:5000 --runtime nvidia --name <containername> <imagename>
Common Commands
Build
The basic build command is quite straightforward. However, if we have various Docker builds in a repo, it is best to name each Dockerfile with an extension representing its function, e.g. Dockerfile.cpu. For that, we will need to direct Docker to this specific file name.
sudo docker build -t <imagename> .
sudo docker build -t <imagename> . -f Dockerfile.cpu
Run
For an AI microservice in Docker, there are five main flags used when launching the container.
Flag | Desc |
---|---|
-d | Detached mode |
-p 5000:5000 | host-port:container-port |
--log-opt max-size=5m --log-opt max-file=5 | Limit the logs stored by Docker; by default it is unlimited |
--restart always | In case the server crashes & restarts, the container will also restart |
--name <containername> | As a rule of thumb, always name the image & container |
The full command is as such.
sudo docker run -d -p 5000:5000 --log-opt max-size=5m --log-opt max-file=5 --restart always --name <containername> <imagename>
We can also stop, start or restart the container if required.
sudo docker stop <container-name/id>
sudo docker start <container-name/id>
sudo docker restart <container-name/id>
Check Status
This checks the status of each running container; the -a flag includes stopped containers as well.
sudo docker ps
sudo docker ps -a
Shows resource usage for each running container.
sudo docker stats
Clean
To manually remove a container or image, use the commands below. Add -f to forcefully remove a container that is still running.
sudo docker rm <container-id/name>
sudo docker rmi <image-id/name>
Each iteration of rebuilding and relaunching containers will create a lot of intermediate images and stopped containers. These can occupy a lot of space, so using prune can remove all these obsolete items in one go.
There are nuances to pruning specific types. For example, with just image prune, the disk will still eventually be overwhelmed, as directories in /var/lib/docker/overlay2 will not be cleared.
sudo docker image prune
sudo docker container prune
sudo docker volume prune
sudo docker network prune
sudo docker system prune
# force, without the need to reply to yes|no
sudo docker system prune -f
Debug
We can check the console logs of the Flask app by opening docker logs in live mode using -f. The logs are stored in /var/lib/docker/containers/[container-id]/[container-id]-json.log.
sudo docker logs <container_name> -f
sudo docker logs <container_name> -f --tail 20
At times, we might need to check what is inside the container.
sudo docker exec -it <container name/id> bash
We can also run specific commands by defining a new entrypoint. The --rm flag will remove the container after the command completes.
docker run --rm --entrypoint=/bin/sh <image_name> -c "ls -la"
Sometimes, we might want to copy file(s) out to examine them. We can do so with the following commands.
docker cp <container_id>:/path/to/file /path/on/host
docker cp <container_id>:/path/to/directory /path/on/host
Storage
By design, Docker containers do not store persistent data; any data written in a container will not be available once the container is removed. Docker provides three mount types: bind mount, volume, and tmpfs mount (the last is in-memory and does not persist data).
Bind mount maps a local folder on the host directly into the container's file system. We can easily swap a file within the local folder and it will be immediately reflected within the container. This is helpful when we need to swap in a new model after training.
docker run \
-p 5000:5000 \
--mount type=bind,source=/Users/jake/Desktop/data,target=/data,readonly \
--name <containername> <imagename>
Volume mount is the preferred mechanism for persisting files written by a container to the file system. The volume folder is stored in the local filesystem, managed by Docker under /var/lib/docker/volumes.
docker run \
-p 5000:5000 \
--mount type=volume,source=<volumename>,target=/data \
--name <containername> <imagename>
Cmd | Desc |
---|---|
docker volume inspect <volumename> | Inspect the volume; view the mounted directory in Docker |
docker volume ls | View all volumes in Docker |
docker volume rm <volumename> | Delete the volume |
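To find the actual directory on the host backing a volume, the Mountpoint field can be extracted directly; <volumename> is a placeholder.
docker volume inspect --format '{{ .Mountpoint }}' <volumename>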
Ports
Arguably one of the more confusing parts of Docker. Port handling involves two concepts:
- EXPOSE: indicates in the image metadata that this port will be exposed, but it does not actually do anything; it is almost a form of documentation. It is good practice to add it, even though strictly speaking it is not required for the application to work
- Bind (-p): the actual exposure of a container port to the host, so the two can communicate
Each container is a separate environment on its own, so multiple containers can EXPOSE the same port. The same cannot be said for binding: each host port can only be bound once, otherwise you will receive a duplicate-port error.
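As an illustrative sketch (the port numbers and names are placeholders), EXPOSE lives in the Dockerfile and only documents the port, while the actual binding happens at run time with -p.
# in the Dockerfile: documents that the app listens on port 5000, but binds nothing by itself
EXPOSE 5000
# at run time: actually bind host port 5001 to container port 5000
sudo docker run -d -p 5001:5000 --name <containername> <imagename>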
ENTRYPOINT vs CMD
Another confusing pair of instructions are ENTRYPOINT and CMD. The former is fixed, while CMD acts like default arguments which can be overridden at run time. An example is:
ENTRYPOINT ["sleep"]
CMD ["300"]
Of course, if you do not require any flexible tweaking, putting the entire command in either of them is fine.
CMD ["sleep", "300"]
Network
For different containers to communicate with each other, we need to set up a fixed network, as the IP address will change with each instance. We can do so by creating a network and specifying it when the container is launched.
docker network create <name>
docker run -d --network <name> --name=<containera> <imagea>
docker run -d --network <name> --name=<containerb> <imageb>
# list networks
docker network ls
Alternatively, we can launch the containers together using docker-compose, and they will automatically be in the same network.
For sending REST-APIs between docker containers in the same network, the IP address will be http://host.docker.internal for Mac & Windows, and http://172.17.0.1 for Linux.
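Within the same user-defined network, containers can also reach each other directly by container name. A minimal sketch, assuming curl is available inside containerb and containera serves a /predict route on port 5000 (both hypothetical here):
docker exec -it <containerb> curl http://<containera>:5000/predict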
Docker Login
We will at times need to log in to a particular container registry to pull images. To verify that the credentials have been stored, we can look for the config.json file.
# login gitlab
docker login registry.gitlab.com -u $username -p $password
# check login
cat ~/.docker/config.json # ubuntu
cat /root/.docker/config.json # centos
It will output the registries you are logged in to, as well as the (base64-encoded) credentials.
{
"auths": {
"registry.gitlab.com": {
"auth": "base64_encoded_credentials"
}
}
}
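Note that passing -p on the command line leaves the password in the shell history; a slightly safer variant is to pipe it in via --password-stdin.
echo $password | docker login registry.gitlab.com -u $username --password-stdin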
Security Patches
We can improve our container security by running the OS updates inside the image. More information is described here.
FROM debian:buster
RUN apt-get update && apt-get -y upgrade
Optimize Image Size
There are various ways to reduce the image size being built.
Slim Build
For the Python base image, we have many options for the Python version and build type. As a rule of thumb, we can use the slim build as defined below. It has a lot fewer libraries, and can reduce the image by more than 500 MB. Most of the time the container runs fine, though some libraries like opencv can only work with the full image.
Note that the alpine build is the smallest, but more often than not, you will find that a lot of Python library dependencies do not work well there.
FROM python:3.8-slim
Disable Cache
By default, all the Python libraries installed are cached; pip keeps installation files (.whl, etc.) or source files (.tar.gz, etc.) to avoid re-downloading them when they have not expired. However, this is usually not required when building your image.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
Below shows the image size being reduced. The bottom-most is the full Python image, with cache. The middle is the Python slim image, with cache. The top-most is the Python slim image with no cache.
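The resulting image sizes can be verified locally with docker images.
sudo docker images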
Multi-Stage
We can split the Dockerfile build by defining various stages. This is very useful when we only want to build up to a certain stage; e.g. for testing we only need the dependencies, and we can use the command docker build --target <stage-name> -t <image-name> .
FROM python:3.9-slim AS build
# os patches
RUN apt-get update && apt-get -y upgrade \
# for gh-actions path-filter
&& apt-get -y install git \
&& rm -rf /var/lib/apt/lists/*
FROM build AS install
# install python requirements
COPY requirements-test.txt .
RUN pip3 install --no-cache-dir --upgrade pip \
&& pip3 install --no-cache-dir -r requirements-test.txt
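Using the stage names from the example above, building only up to the install stage (e.g. for running tests) could look like the following; <image-name> is a placeholder.
docker build --target install -t <image-name>-test .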
A second benefit is reducing the image size and improving security, by ditching unwanted programs or libraries from earlier stages.
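A minimal sketch of that pattern, reusing the requirements.txt and serve_http.py names from earlier (the prefix path is just illustrative): a heavier builder stage installs the Python packages into a prefix, and the final slim stage copies only that prefix over.
FROM python:3.9 AS builder
COPY requirements.txt .
# install packages into an isolated prefix so they can be copied out cleanly
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

FROM python:3.9-slim
# bring over only the installed packages, leaving build tools behind
COPY --from=builder /install /usr/local
WORKDIR /app
COPY . /app
ENTRYPOINT [ "python", "-u", "serve_http.py" ]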
Linting
Linters are very useful to ensure that your image build is well optimized. One of the more popular linters is hadolint. Below is an example output after scanning with the command hadolint <Dockerfile-name>.
Dockerfile:4 DL3009 info: Delete the apt-get lists after installing something
Dockerfile:5 DL3008 warning: Pin versions in apt get install. Instead of `apt-get install <package>` use `apt-get install <package>=<version>`
Dockerfile:5 DL3015 info: Avoid additional packages by specifying `--no-install-recommends`
Dockerfile:5 DL3059 info: Multiple consecutive `RUN` instructions. Consider consolidation.
Dockerfile:6 DL3059 info: Multiple consecutive `RUN` instructions. Consider consolidation.
Dockerfile:6 DL3015 info: Avoid additional packages by specifying `--no-install-recommends`
Dockerfile:6 DL3008 warning: Pin versions in apt get install. Instead of `apt-get install <package>` use `apt-get install <package>=<version>`
Dockerfile:9 DL3045 warning: `COPY` to a relative destination without `WORKDIR` set.
Dockerfile:10 DL3013 warning: Pin versions in pip. Instead of `pip install <package>` use `pip install <package>==<version>` or `pip install --requirement <requirements file>`
Dockerfile:10 DL3042 warning: Avoid use of cache directory with pip. Use `pip install --no-cache-dir <package>`
Dockerfile:11 DL3059 info: Multiple consecutive `RUN` instructions. Consider consolidation.
Dockerfile:18 DL3009 info: Delete the apt-get lists after installing something
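If you prefer not to install hadolint locally, it can also be run through its Docker image by piping the Dockerfile in.
docker run --rm -i hadolint/hadolint < Dockerfile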
Buildx
Docker introduced a new CLI feature called buildx that makes it possible and easy to build and publish Docker images that work on multiple CPU architectures. This is very important due to the prominence of the Apple M1 (ARM64), where you need to build for x64 (amd64), as well as to build ARM images to leverage cheaper ARM cloud instances (which can be up to 30% cheaper than x64).
You can see the various supported architectures below.
docker buildx ls
# linux/amd64, linux/riscv64, linux/ppc64le, linux/s390x,
# linux/386, linux/arm64/v8, linux/arm/v7, linux/arm/v6
To do that, we can use the buildx build --platform linux/<architecture> command. For ARM, we can omit the version by using linux/arm64.
docker buildx build --platform linux/amd64 -t sassy19a/dummy-flask .
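To build for several platforms in one go and push the multi-arch manifest to a registry, the platforms can be comma-separated and --push added; this assumes you are logged in to the target registry, and multi-platform builds generally need a docker-container builder instance.
# (one-off) create & use a builder that supports multi-platform builds
docker buildx create --use
# build for both architectures and push the multi-arch manifest
docker buildx build --platform linux/amd64,linux/arm64 -t sassy19a/dummy-flask --push .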
In docker-compose this can be done by specifying the platform key.
version: '2.4'
services:
  testbuild:
    build: .../testbuild
    image: testbuild
    platform: linux/arm64/v8
Docker-Compose
When we need to manage multiple containers, it is easier to set the build & run configurations using Docker-Compose, in a yaml file called docker-compose.yml. We need to install it first using sudo apt install docker-compose.
The official Docker blog post gives a good introduction on this.
version: "3.5"
services:
facedetection:
build:
context: ./project
dockerfile: Dockerfile-api
# if Dockerfile name is not changed, can just use below
# build: ./project
container_name: facedetection
ports:
- 5001:5000
logging:
options:
max-size: "10m"
max-file: "5"
deploy:
resources:
limits:
cpus: '0.001'
memory: 50M
volumes:
- type: bind
source: /Users/jake/Desktop/source
target: /model
restart: unless-stopped
maskdetection:
build: ./maskdetection
container_name: maskdetection
ports:
- 5001:5000
logging:
options:
max-size: "10m"
max-file: "5"
environment:
- 'api_url={"asc":"http://172.17.0.1:5001/product-association",
"sml":"http://172.17.0.1:5002/product-similarity",
"trd":"http://172.17.0.1:5003/product-trending",
"psn":"http://172.17.0.1:5004/product-personalised"}'
- 'resultSize=10'
restart: unless-stopped
We can simplify repeated code using anchors; the top-level key holding each anchor must be prefixed with x-.
version: "3.5"
x-common: &common
logging:
options:
max-size: "10m"
max-file: "5"
restart: unless-stopped
services:
facedetection:
build:
context: ./project
dockerfile: Dockerfile-api
# if Dockerfile name is not changed, can just use below
# build: ./project
container_name: facedetection
ports:
- 5001:5000
deploy:
resources:
limits:
cpus: '0.001'
memory: 50M
volumes:
- type: bind
source: /Users/jake/Desktop/source
target: /model
<<: *common
maskdetection:
build: ./maskdetection
container_name: maskdetection
ports:
- 5001:5000
environment:
- 'api_url={"asc":"http://172.17.0.1:5001/product-association",
"sml":"http://172.17.0.1:5002/product-similarity",
"trd":"http://172.17.0.1:5003/product-trending",
"psn":"http://172.17.0.1:5004/product-personalised"}'
- 'resultSize=10'
<<: *common
The commands follow the docker commands closely, with some of the more important ones as follows.
Cmd | Desc |
---|---|
docker-compose build | Build images |
docker-compose pull | Pull images from a registry; requires the image key to be defined |
docker-compose up | Run containers |
docker-compose up servicename | Run specific containers |
docker-compose up -d | Run containers in detached mode |
docker-compose ps | View containers' statuses |
docker-compose stop | Stop containers |
docker-compose start | Start containers |
docker-compose down | Stop & remove all containers |
docker-compose down --rmi all --volumes | Remove all containers, images, and volumes |
Docker Dashboard
Docker on Windows & Mac comes with a Docker Dashboard by default, which gives you an easy GUI to see and manage your images and containers, rather than doing it on the command line.
However, this is lacking in Linux. A great free alternative (with more features) is Portainer. We just need to launch it with Docker using the following commands, and the web-based GUI will be accessible via localhost:9000. After creating a user account, the rest of it is pretty intuitive.
sudo docker volume create portainer_data
sudo docker run -d -p 8000:8000 -p 9000:9000 --name=portainer --restart=always -v /var/run/docker.sock:/var/run/docker.sock -v portainer_data:/data portainer/portainer-ce
The home page, showing the local docker connection, and some summary statistics.
On clicking that, an overview of the local docker is shown.
Entering the container panel, we have various options to control our containers.
We can even go into the container, by clicking the console link.
Blue-Green Deployment
Normal docker or docker-compose deployments require downtime during redeployment, using the docker stop/start <container> or docker restart <container> commands.
To achieve a zero-downtime deployment, we can use a neat trick with nginx's reload functionality, together with docker-compose's --scale flag to bring up a new instance alongside the old one. The details of this blue-green deployment are well explained in these two articles [1, 2].
This is the docker-compose file which contains the api service, as well as nginx. They are tied to the same network so that they can communicate with each other. Note that the network is not strictly required to be explicitly defined; by default, services launched together in a compose file will be within the same network.
An essential part of this file is that we must not define the container name of the api service.
version: "3.2"
services:
api:
build:
context: .
networks:
- bluegreen
nginx:
image: nginx
container_name: nginx
volumes:
- ./nginx-conf.d:/etc/nginx/conf.d
ports:
- 8080:80
networks:
- bluegreen
depends_on:
- api
networks:
bluegreen:
name: bluegreen
This is the nginx config file, stored at the project root as nginx-conf.d/bluegreen.conf. The most important thing about this file is that the proxy link to the api service uses the service name itself.
server {
    listen 80;
    location / {
        proxy_pass http://api:5000;
    }
}
This is the bash script for redeployment. In essence, it will first build or pull the new image into Docker. Then, it will scale the api service to two, bringing up a new container running the new code alongside the old one.
Nginx will then reload to start routing to the new container. After the old container is destroyed, nginx will reload again to stop routing to the old container.
#!/bin/bash
# define container service to redeploy
service_name=api
service_port=5000
# either build new image or pull image
docker-compose build $service_name
# docker-compose pull $service_name
reload_nginx() {
docker exec nginx /usr/sbin/nginx -s reload
}
zero_downtime_deploy() {
old_container_id=$(docker ps -f name=$service_name -q | tail -n1)
# bring a new container online, running new code
# (nginx continues routing to the old container only)
docker-compose up -d --no-deps --scale $service_name=2 --no-recreate $service_name
# wait for new container to be available
new_container_id=$(docker ps -f name=$service_name -q | head -n1)
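# (optional) give the new container a moment to start; a proper health check
# could poll the service on $service_port instead of a fixed sleep
sleep 5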
# start routing requests to the new container (as well as the old)
reload_nginx
# take the old container offline
docker stop $old_container_id
docker rm $old_container_id
docker-compose up -d --no-deps --scale $service_name=1 --no-recreate $service_name
# stop routing requests to the old container
reload_nginx
}
zero_downtime_deploy
Quite an ingenious way to achieve a zero-downtime deployment, I must say.