Fleet job to remove unused docker images

Engagement Engineering, the team that I'm part of at Mozilla, runs two Deis clusters on AWS to host important websites including www.mozilla.org.

Deis is a Heroku-inspired PaaS built on CoreOS and Docker. It's a great open-source project, developed in public, with a great community, and commercially backed by Engine Yard.

Apps on Deis run within Docker containers which run on CoreOS machines that form the Deis cluster. Each new release of your code, i.e. each new deis pull or git push, creates a new Docker image that is stored in the internal Deis Docker Registry and downloaded onto the CoreOS machines that are scheduled to run your code.

Bedrock (the project name for www.mozilla.org) is a big website. Its Docker image weighs about 600 MiB with all the translations, images and Python dependencies. On every new release of Bedrock, which happens almost every day, the machines of our cluster pull and store about half a gigabyte of data just for Bedrock.

Since all the Docker images are stored in the internal Deis Docker Registry anyway, there is no need to keep the stopped older Docker containers or their images on the individual CoreOS instances. If we need to revert to an older release of Bedrock, the machines will pull the needed layers from the Deis Docker Registry again.

We built and deployed a Fleet Service to remove the unused images:

docker-cleanup.service

[Unit]
Description=Clean up unused docker containers and images

[Service]
Type=oneshot
ExecStart=/bin/bash -c 'CONTAINERS=`docker ps -a -q -f status=exited`; if [[ ! -z $CONTAINERS ]]; then docker rm -v $CONTAINERS; fi; IMAGES=`docker images -f dangling=true -q`; if [[ ! -z $IMAGES ]]; then docker rmi $IMAGES; fi; echo "Completed docker image cleanup."'

[Install]
WantedBy=multi-user.target

[X-Fleet]
Global=true

The service checks whether there are any exited containers and removes them. Similarly, it checks for dangling Docker images and removes them too. At the end it prints "Completed docker image cleanup." so that we can verify through Dead Man's Snitch that the job ran.
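For readability, the logic packed into the ExecStart one-liner can be sketched as a shell function. This is only an illustration of the same steps, not the unit we deployed; the function name docker_cleanup is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch of the cleanup steps from the ExecStart one-liner.
# docker_cleanup is a hypothetical name used only for this example.
docker_cleanup() {
  local containers images

  # IDs of all containers in the "exited" state.
  containers=$(docker ps -a -q -f status=exited)
  if [ -n "$containers" ]; then
    # -v also removes the volumes associated with each container.
    # Left unquoted on purpose so that each ID becomes its own argument.
    docker rm -v $containers
  fi

  # IDs of dangling images (not referenced by any tag).
  images=$(docker images -f dangling=true -q)
  if [ -n "$images" ]; then
    docker rmi $images
  fi

  # Marker line that shows the job reached the end.
  echo "Completed docker image cleanup."
}
```

Calling docker_cleanup on a machine does the same work as one run of the service. As in the one-liner, a failing docker rm does not stop the image cleanup from running.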

The Global=true option in the [X-Fleet] section instructs Fleet to run this service on every machine of the cluster.

You can run this service and clean up the images immediately with:

fleetctl start docker-cleanup.service

We also built a timer unit to schedule daily cleanups and keep things clean and tidy without our intervention:

docker-cleanup.timer

[Unit]
Description=Trigger clean up of unused docker containers and images

[Timer]
OnCalendar=daily

[X-Fleet]
Global=true

After loading the cleanup service, load and start the timer:

fleetctl load docker-cleanup.service
fleetctl load docker-cleanup.timer
fleetctl start docker-cleanup.timer

The docker-cleanup.service will then run daily.

Note that fleetctl list-units will list your docker-cleanup.service as inactive and dead, which is OK: a oneshot service is only active while it runs. It will run as expected when the waiting docker-cleanup.timer unit triggers.

core@ip-10-0-0-0 ~ $ fleetctl list-units
UNIT                      MACHINE                 ACTIVE      SUB
docker-cleanup.service    1c919e41.../10.0.0.0    inactive    dead
docker-cleanup.timer      1c919e41.../10.0.0.0    active      waiting
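If you would rather not eyeball that listing, a small helper can check it for you. The sketch below assumes the four-column layout shown above (UNIT, MACHINE, ACTIVE, SUB); check_timer is a hypothetical name, not part of fleetctl.

```shell
#!/usr/bin/env bash
# check_timer: read `fleetctl list-units` output on stdin and succeed only if
# at least one docker-cleanup.timer row exists and every such row has
# ACTIVE=active. Assumes the column order UNIT MACHINE ACTIVE SUB.
check_timer() {
  awk '
    $1 == "docker-cleanup.timer" { seen = 1; if ($3 != "active") bad = 1 }
    END { exit (seen && !bad) ? 0 : 1 }
  '
}
```

Usage: fleetctl list-units | check_timer && echo "timer is active"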