News from the DevOps trenches


Regularly involved in consulting, I evangelise software engineering best practices and agile development methods. With iterative and shortened development cycles, one of these best practices is to have as many automated tests as possible and, of course, a continuous integration (CI) system. Things that, as such, are not really groundbreaking, but that oftentimes still lack full implementation at our clients. To push things a little further, one might dare to dream of continuous delivery.

I believe it is good to evangelise these practices, but as conscientious professionals we should also apply them ourselves – again and again – to provide feedback grounded in real-world experience. Here I describe one such experience, gained in a project for one of our clients.

The application

ekito was contracted to develop a supervision web application that also exposes a REST API for mobile clients used by a mobile workforce executing maintenance tasks at airports.

The technical stack comprises:

  • AngularJS for the web app,
  • Spring Boot for the application server and the REST API (yes we like bleeding edge stuff),
  • MongoDB as database.

In a classically tiered architecture, the application is deployed across two server machines: the application server and the database server.

Spring Boot follows the “no container” trend that appears to be very fashionable at the moment. Other Javaesque web stacks like Play! or Grails (ok, it’s really Scala or Groovy on top of the JVM) follow the same path, promising a simplified deployment model à la “java -jar myWebApp.jar”.
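
As a quick illustration of this deployment model (assuming a Maven-based build; the artefact name simply follows the example above):

# package the application as a self-contained, executable jar (assuming a Maven build)
mvn clean package

# start it directly on the JVM - no application server to install or configure
java -jar target/myWebApp.jar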

The mobile application is developed by another company. This is where continuous delivery comes into play: the idea is to continuously expose our latest built and tested REST API, so that the other company can develop their mobile application against it. Furthermore, our client can use the latest version of the web app for demo purposes and have their say during Sprint Review meetings. (Yes, we do a kind of Scrum.)

The software factory

When building a CI system, the idea is to have representative and ideally identical environments for development, integration testing and production. Again a best practice – often talked about and rarely fully applied. With virtualization technology at hand, we decided to fully apply this principle.

In our context we distinguish three environments:

  • Development Workstations – Allowing for local unit- and integration testing during development.
  • Integration Platform – The CI server (Jenkins) triggers the build and integration tests against this platform at every source code commit.
  • Pre-production Platform – This platform is available to the public. The CI server automatically deploys every successful build into this platform and restarts the application server.

Building on the work of Arnaud Giuliani, we initially decided to use Vagrant and Docker.

Architecture first try

The idea was to have one virtual machine, the “Appliance VM”, containing the application server and a Docker container with the MongoDB database. The MongoDB container would use an external mount point, allowing for highly available storage of the database files. This was to be simulated with a second virtual machine, the “NAS simulator”.

Development workstations have the VirtualBox hypervisor installed. Two guest machines (Appliance VM and NAS Simulator) were to be constructed and started by Vagrant. The provisioning inside these machines was to be done with Vagrant’s shell provisioner and Docker.

Integration and Pre-production platforms were to be created, provisioned and managed by Vagrant and Docker on our VMWare ESXi hypervisor. The VMWare provider plug-in was to be used for this purpose.

News from the DevOps trenches

It turned out that even for this rather simple setup, automating the environment definition with Vagrant and the provisioning with Docker was not that easy. On top of that, it was very time-consuming.

What happened? We found it hard to make Vagrant and Docker work together harmoniously. Vagrant’s shell provisioner installed Docker inside the Appliance VM; we then used Docker to instantiate a container with the MongoDB database:

#!/bin/bash
# copy hosts file
cat /vagrant/hosts >> /etc/hosts
apt-get update
apt-get install -y curl nfs-common
# mount dir
mkdir -p /data/mongodb
mount -o nolock,soft,sync,intr,rsize=8192,wsize=8192 urubu-nas:/data/mongodb /data/mongodb
echo "urubu-nas:/data/mongodb /data/mongodb nfs nolock,sync,soft,intr,rsize=8192,wsize=8192" >> /etc/fstab
# install Docker
curl -s https://get.docker.io/ubuntu/ | sudo sh
# build the MongoDB image and run the container
docker build -t ekito/mongodb /vagrant/web/
docker run -d -name mongodb -p 27017:27017 -p 28017:28017 -v /data/mongodb:/data/mongodb ekito/mongodb

Why was this time-consuming? Well, scripting, though relatively simple, is prone to errors like any other programming activity. The problem here is cycle time: in order to test that a provisioning script actually works, the VM must be completely reconstructed with Vagrant. In our case, this cycle took up to 4 minutes.

Most time was spent in apt-get update / install commands downloading the required software packages from remote repositories – again and again and again. The solution we found was to construct another VM with an apt-get proxy cache server and to configure all subsequent VM provisioning scripts to point to this cache.

Vagrant definition:

VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vm.define "apt" do |apt|
    apt.vm.box = "precise64"
    apt.vm.box_url = "http://files.vagrantup.com/precise64.box"
    apt.vm.network :private_network, ip: "192.168.55.9"
    apt.vm.network :forwarded_port, guest: 3142, host: 3142
    apt.vm.provision :shell do |shell|
      shell.inline = "apt-get update && apt-get install -y apt-cacher-ng"
    end
  end

  config.vm.define "nas" do |nas|
    nas.vm.box = "precise64"
    nas.vm.hostname = "urubu-nas"
    nas.vm.box_url = "http://files.vagrantup.com/precise64.box"
    nas.vm.network :private_network, ip: "192.168.55.11"
    nas.vm.provision "shell", path: "./nas/install.sh"
  end

  config.vm.define "web" do |web|
    web.vm.box = "precise64"
    web.vm.hostname = "urubu-web-server"
    web.vm.box_url = "http://files.vagrantup.com/precise64.box"
    web.vm.network :private_network, ip: "192.168.55.10"
    web.vm.network :forwarded_port, guest: 27017, host: 27017
    web.vm.network :forwarded_port, guest: 28017, host: 28017
    web.vm.provision "shell", path: "./web/install.sh"
  end
end

Modifications in shell provisioning script:

#!/bin/bash
# copy hosts file
cat /vagrant/hosts >> /etc/hosts
cp /vagrant/01proxy /etc/apt/apt.conf.d

01proxy:

Acquire::http::Proxy "http://apt_cache:3142";

Although we managed to shorten cycle time, it still took about 2 minutes per cycle.

As we were seriously losing time (and money) on this apparently ridiculously easy task, we decided to take drastic measures. Not only did Vagrant shell provisioning turn out to be time-consuming, we would also have had to buy a license in order to provision our VMWare ESXi platform. Without compromising on our initial objective, we chose to concentrate on the essence of our task – the reliable reproduction of a server setup across multiple platforms. The solution was to stop using Vagrant, to manually set up two guest machines (CI VM and Appliance VM) on our VMWare ESXi hypervisor, and to use Docker exclusively.

(Figure: the revised architecture – CI VM and Appliance VM on the VMWare ESXi hypervisor, provisioned with Docker only)

Docker turned out to be our friend. Docker’s incremental build steps allowed for considerable cycle-time speedups. When a Dockerfile is modified, all build steps prior to the modified line are simply “fast-forwarded”. This is possible because Docker snapshots every provisioning step during image construction. This can be illustrated with the following command:

$ sudo docker images -tree
└─511136ea3c5a Virtual Size: 0 B
  └─6170bb7b0ad1 Virtual Size: 0 B
    └─9cd978db300e Virtual Size: 204.4 MB Tags: ubuntu:latest
      ├─dcb6c8feb27d Virtual Size: 204.4 MB
      │ └─d1333102e659 Virtual Size: 204.4 MB
      │   └─f70cecb13e02 Virtual Size: 450.4 MB
      │     └─ea0021a0f084 Virtual Size: 450.4 MB
      │       ├─87fab448a75a Virtual Size: 450.4 MB
      │       │ └─4fe520fc4f9f Virtual Size: 450.4 MB
      │       │   └─e618ef5a117a Virtual Size: 450.4 MB
      │       │     └─e2ddf3776190 Virtual Size: 450.4 MB Tags: ekito/mongodb:latest, urubu-ci:5000/ekito/mongodb:latest
      │       ├─c7b0ef19f41f Virtual Size: 450.4 MB
      │       │ └─4e8735e43caa Virtual Size: 450.4 MB
      │       │   └─d46587c2b573 Virtual Size: 450.4 MB
      │       │     └─2f1ec453346f Virtual Size: 450.4 MB
      │       └─a4a31857e5c4 Virtual Size: 450.4 MB
      │         └─dd0bbf19fb5b Virtual Size: 450.4 MB
      │           └─1fdb8ba696e6 Virtual Size: 450.4 MB
      │             └─6a3cc26de2d0 Virtual Size: 450.4 MB
      └─dded3b8ce411 Virtual Size: 308.6 MB
        ├─6651a04eaaa7 Virtual Size: 322.6 MB
        │ └─01efeefa0d1e Virtual Size: 322.6 MB
        │   └─c10e53d4f78b Virtual Size: 322.6 MB Tags: ekito/webapp/:latest
        └─af6863234f50 Virtual Size: 322.6 MB
          └─3f2821b9a17e Virtual Size: 322.6 MB
            └─16fef295aea9 Virtual Size: 322.6 MB Tags: ekito/webapp:latest, urubu-ci:5000/ekito/webapp:latest, urubu-ci:5000:ekito/webapp:latest

And the corresponding Dockerfile for the MongoDB container:

# Dockerfile for mongodb
# Version 1.0
FROM ubuntu:latest

MAINTAINER Arnaud Giuliani <agiuliani@ekito.fr>

RUN apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10 && echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | tee /etc/apt/sources.list.d/mongodb.list

RUN apt-get update; apt-get install -y mongodb-10gen

VOLUME ["/data/mongodb"]

EXPOSE 27017
EXPOSE 28017

ENTRYPOINT ["/usr/bin/mongod"]
CMD ["--port", "27017", "--dbpath", "/data/mongodb", "--smallfiles","--rest"]

Some generic, higher-level life cycle scripts (build, start, stop, clean) were created for the web application container and the MongoDB container, allowing for easier integration with Jenkins. I would like to draw special attention to the clean script. When frequently building images with Docker, old image snapshots quickly pile up, consuming gigabytes of disk space. Especially during CI, it is necessary to clean up unused Docker images and stopped container instances at the end of each build job.

#!/bin/bash

E_NOTROOT=87
ROOT_UID=0

# Check if run as root.
if [ "$UID" -ne "$ROOT_UID" ]
then
  echo "You must be root to run this script. Try sudo ./clean.sh "
  exit $E_NOTROOT
fi

echo "remove stopped containers"
docker rm $(docker ps -a | grep Exit | awk '{print $1}')

echo "remove untagged images"
docker rmi $(docker images | grep "^" | awk '{print $3}')
exit 0
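
For completeness, a minimal, simplified sketch of what the build and start counterparts could look like (the image names match those used above; the directory layout, the webapp port and its configuration are assumptions for illustration, not the project's actual scripts):

#!/bin/bash
# build.sh - rebuild both images from their Dockerfiles (build context paths are assumed)
set -e
docker build -t ekito/mongodb ./mongodb/
docker build -t ekito/webapp ./webapp/

#!/bin/bash
# start.sh - start the database first, then the web application
set -e
docker run -d -name mongodb -p 27017:27017 -p 28017:28017 -v /data/mongodb:/data/mongodb ekito/mongodb
# the Spring Boot app is assumed to listen on port 8080 and to reach MongoDB via the published host port
docker run -d -name webapp -p 8080:8080 ekito/webapp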

About automated no-container remote deployment

No-Container remote deployment requires quite some DevOps craftsmanship. Runtime containers such as Java EE application servers or OSGi containers provide remote deployment facilities. It is relatively easy to automate remote deployment to these containers using build tools like Maven or Gradle. The no-container approach somewhat forces us to craft our own tools.

One way to do this is à la Heroku: one writes a git hook on the production server that is triggered when a new version of the source code is pushed to the server. The hook then builds the project and restarts the production server with the newly built artefact.

(Figure: deployment via a git hook)
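
A minimal sketch of what such a post-receive hook could look like (the paths, the Maven build and the crude restart are illustrative assumptions, not our actual hook):

#!/bin/bash
# hooks/post-receive in a bare repository on the production server (illustrative sketch)
set -e
export GIT_WORK_TREE=/opt/myWebApp/src   # assumed checkout location
export GIT_DIR=/opt/myWebApp/repo.git

# check out the freshly pushed sources and rebuild the artefact
git checkout -f master
cd "$GIT_WORK_TREE"
mvn clean package -DskipTests            # assumed build command

# crude restart of the application
pkill -f 'java -jar target/myWebApp.jar' || true
nohup java -jar target/myWebApp.jar > /var/log/myWebApp.log 2>&1 &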

In our case this approach turned out to be fragile and laborious, because the artefact had to be rebuilt from source on the server. Furthermore, our naive shell-scripted git hook caused development cycle time to rocket up again. With regard to this approach, I pay deep respect to the folks at Heroku, who made git push deployment a very reliable form of deployment for many different programming platforms.

For our project we chose a different approach. The constructed deliverable, a fully functional and integration-tested Docker image, is pushed to a local Docker registry. The production server is then instructed to pull the latest version from the registry and to start the container.

(Figure: deployment via a private Docker registry)
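
Roughly, the flow can be sketched as follows (urubu-ci:5000 is the registry host visible in the image listing earlier; the webapp port and the remote trigger are simplified assumptions):

# on the CI server, after a successful build and integration test run
docker tag ekito/webapp urubu-ci:5000/ekito/webapp
docker push urubu-ci:5000/ekito/webapp

# on the production server (triggered by the CI job, e.g. over SSH)
docker pull urubu-ci:5000/ekito/webapp
docker stop webapp || true
docker rm webapp || true
docker run -d -name webapp -p 8080:8080 urubu-ci:5000/ekito/webapp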

One smart and free takeaway of this approach is that there is no need to automate application start-up when the host machine boots (the famous System V init scripts). The Docker daemon starts automatically as part of the host machine's boot process and then restarts all containers that were running when the machine shut down.
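
With more recent Docker versions, the same behaviour can be requested explicitly through a restart policy rather than relying on the daemon's default (a hedged aside, not part of our original setup):

# ask the daemon to bring the container back up after a crash or a host reboot
docker run -d --name webapp --restart=always -p 8080:8080 urubu-ci:5000/ekito/webapp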

Lessons learnt

  1. Every halfway industrial application needs some kind of container for remote deployment and comfortable application start/stop/restart. In our case it’s a Docker container.
  2. Automated no-container deployment done right is harder than one might think.
  3. When using tools like Vagrant, don’t provision with shell scripting; use higher-level automation tools such as Ansible, Chef, Puppet or the like. We did not manage to make Vagrant work directly with Docker provisioning. Probably we are noobs.
  4. Cycle time can be a killer. When developing, use tools that allow for very short cycle times (< 10 s feels comfortable), thus providing frequent feedback to developers.
  5. Beware where you install Docker. You need a rather recent Linux kernel in order to benefit from the LXC sugar.

Now we can go out and evangelise again 😉


Author: Bert Poller

IT architect and software craftsman. I love programming and designing systems of all kinds.
German by origin, and a polyglot – not only in programming languages. A technophile, with a critical eye on the world.
#JAVA #SCALA #DOCKER #REACTIVE #FP #DEVOPS #MESSAGING #DIGITAL_FABRICATION #SUSTAINABLE #COFFEE
