Friday, July 11, 2014

Continuous Delivery with Docker: Part 1, creating a dockerized application and test suite

Introduction  

Docker is gaining popularity, and from a continuous delivery standpoint it is very exciting.  Docker allows us to easily package up an application in a container that can then be moved from one environment to another as a self-contained package.  This has many of the benefits of traditional virtual machines without the cost of very large files that are difficult to move and update.  From a developer perspective these containers are nice to work with because they are so light and they have nice features such as layering.  Layering means that if I am simply updating one portion of the stack in the container, only that layer (and the layers built on top of it) gets updated.

From an operations perspective they are very attractive because they are easy to consume, run and have a light footprint.  Where things get especially interesting is in the development of distributed applications or micro-services and I’ll talk about that at another time.    

For this post I simply wanted to explore how I could set up a basic pipeline that took an application, packaged that application in a Docker container and then moved it through a set of environments for automated test, integration testing, production and sharing with others.  This post is in two parts:
  • Part 1: creating a dockerized application and test suite 
  • Part 2: creating a pipeline for my dockerized application
Some motivations 
Within our organization we can reliably provision complicated deployments of our IBM Collaborative Lifecycle Management stack on WebSphere and DB2.  To do this we use IBM UrbanCode Deploy and cloud platforms such as IBM PureApplication System.  However, even with automation these topologies take approximately an hour to provision.  Our engineers want instant access to the latest builds.  We wanted to build a SimpleTopologyService which would simply receive notifications of new builds, pre-deploy a set of environments and then cache them for our engineers to 'check out'.  Over time we also want to break down our large build processes into a set of services based on twelve-factor app ideas.  This will allow us to rapidly innovate in certain areas without living in fear that we will 'break the build'.
So developing a basic application for a SimpleTopologyService offers an opportunity to experiment and learn about cool technologies like Docker and form an opinion on what role they should play in our evolving DevOps story.

Try it 
If you have not used Docker you should take 5 minutes and do so.  Even knowing nothing about it, you will have a good time with the online tutorial.


A few up front admissions and conclusions 
I am not a Node developer, nor am I an expert in Docker or even UrbanCode Deploy.  I wanted to spend a day looking at how these technologies work together.  Ultimately this wound up being 3 days of work due to typical interruptions and falling into a few rat-holes.  I’ll provide more information on the rat-holes later.

All of these technologies are a lot of fun and are reasonably easy to pick up thanks to the great communities building around them.  I believe strongly that Docker will play an increasing role in Continuous Delivery and DevOps efforts.  As we see efforts around Cloud with no boundaries, and Docker as a means to quickly package, move and deploy applications, we will see (and contribute) many tools and services that make DevOps processes simple.


It is also worth noting that most if not all of the content below is covered in depth in various blogs and sites.  I've included some references at the end of this post.  

My simple application 

I chose Node.js and MongoDB and leveraged Express, Jade templates and Twitter Bootstrap.  I used Mocha as a testing framework.  The application has a basic web UI and a REST(-like) API.

I won’t get into the details of developing the application itself as there are a lot of existing posts on using these technologies together.  One aspect that is worth noting is the application configuration.  I want the application to be able to use a local mongo database, or to use an external existing one.  I also need the application to be configurable in terms of what port to run on, whether or not to keep test data around, etc.  Initially I had these as properties in the node application itself (bad), then moved them into a config.json file, which worked OK.  However, based on http://12factor.net/config I needed to have the configuration elements that vary from one deployment to another set using environment variables.  With this in mind I used https://github.com/flatiron/nconf so that the application would default to a config file, but environment variables could be set for aspects that may vary.

In my node application I can configure this order like so:

var fs = require('fs');
var nconf = require('nconf');
// Precedence, highest first: command-line arguments, environment variables, config.json
nconf.argv().env().file({ file: './config.json'});

This allows me to have defaults on my local host that can then be overridden by the environment so that I can run something like:
$ WEB_PORT=3000 DB_HOSTNAME=mydbhost DB_PORT=27017 node app.js 
Connecting to mongo with:mongodb://mydbhost:27017/simpleTopologyService
Express server listening on port 3000

This approach became useful both for deploying the application and for targeting the test suite at a deployed instance.
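To make the "defaults from a file, overridden by the environment" idea concrete, here is a minimal sketch in plain Node. It is not how nconf is implemented; it only illustrates the precedence. The fileDefaults object stands in for config.json, and the DB_NAME key and the helper names resolveConfig and mongoUrl are assumptions made for this example.

```javascript
// Sketch of "config file defaults, overridden by environment variables".
// This is a plain-Node illustration, not nconf's implementation.

// Stand-in for the contents of config.json (values and DB_NAME are assumptions).
var fileDefaults = {
  WEB_PORT: '3000',
  DB_HOSTNAME: 'localhost',
  DB_PORT: '27017',
  DB_NAME: 'simpleTopologyService'
};

// Return a config object where any key present in env wins over the default.
function resolveConfig(env, defaults) {
  var config = {};
  Object.keys(defaults).forEach(function (key) {
    config[key] = (env[key] !== undefined) ? env[key] : defaults[key];
  });
  return config;
}

// Build the MongoDB connection string from the resolved settings.
function mongoUrl(config) {
  return 'mongodb://' + config.DB_HOSTNAME + ':' + config.DB_PORT + '/' + config.DB_NAME;
}

var config = resolveConfig(process.env, fileDefaults);
console.log('Connecting to mongo with:' + mongoUrl(config));
```

Run with no variables set, this falls back to the file defaults; setting DB_HOSTNAME or DB_PORT on the command line changes only those values.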

Setting up Docker 

There are great docs on setting up Docker.  I’ll quickly describe a few approaches I took. 

Setting up Docker on my personal machine
My development environment is a Mac.  To run docker locally there are a few options.  

The first is to use boot2docker, which provides a small virtual machine running on VirtualBox.  It also provides a local command (boot2docker) to start and ssh into that virtual machine, and lets you use docker commands directly from the command prompt.  In addition it sets up a host-only network that automatically allows you to access your deployed containers via mapped ports.  To access your applications you use the IP address of the boot2docker-vm, which can be found simply by typing:
$ boot2docker ip
The VM's Host only interface IP address is: 192.168.59.103
A second approach is to use Vagrant to set up an Ubuntu image with Docker. With this approach you can then set up a shared folder in VirtualBox so that you can share source on your local machine with the virtual machine. In order to access your application running in a container you then need to map ports from your local machine to ports on the running virtual machine (which in turn are mapped to the ports exposed by each container).
For the second approach I wrote a Vagrantfile, then simply ran vagrant up followed by vagrant ssh to connect to the Docker virtual machine and went to work. I also mapped ports from my local machine to the virtual machine so that I could reach the application. Both approaches worked well.
Setting up docker on my integration and production machines 
I deployed a Red Hat and an Ubuntu virtual machine in the lab. I had no problems following the instructions to install Docker on the Ubuntu image. On the Red Hat image a kernel upgrade was necessary, after which Docker would run but did not behave as expected (my containers could not reach the outside network even with iptables configured). This was a two-hour rat-hole, so I gave up and simply used Ubuntu.
Setting up a private registry for Docker Containers

Could not have been simpler. On a host with Docker installed I simply ran:
$ docker run -p 5000:5000 registry
To push an image to the private registry:
$ docker tag [local-image] myregistry.hostname:5000/myname
$ docker push myregistry.hostname:5000/myname
To pull an image from the private registry simply:
$ docker pull myregistry.hostname:5000/image:tag

A few gotchas: currently there is no means (that I am aware of) to easily remove images from a private registry. For authentication you can have the registry listen on localhost only and then use SSH tunneling to connect:
$ ssh -i mykey.key -L 5000:localhost:5000 user@myregistry.hostname
The registry is evolving, so check its documentation for the current state.
Setting up a public (or private) registry on Docker Hub
Create an account at https://hub.docker.com/ and set up your registries at https://registry.hub.docker.com/

Dockerizing the application

Why 
First off, why should I dockerize my application? It is a simple application. To run it I simply need to:
$ mongod
$ npm install
$ node app.js
Express server listening on port 3000
However, while this is simple for me as the application developer, it depends on me having mongo installed or access to a remote mongodb, having node installed, having internet access to run npm install, and being good about specifying versions of modules in my package.json, etc. Dockerizing my application allows me to capture my application and all of its dependencies in a container so that all anyone needs to do to run it is:
$ docker run -P rjminsha/simpletopologyservice
To run the container I don’t even need to know what the technology stack is. All I am doing is running the container. If I am consuming this application to test with, or to deploy into production this is really nice.

To get an updated version of the application I can simply pull the image
$ docker pull rjminsha/simpletopologyservice
and then run it. The pull will only download the layers of the image that have changed. Also really nice.

Getting a basic node application running in a docker container 
An article and example by Daniel Gasienica was pretty straightforward to follow and got me going within a few minutes.  Now I have a basic template to follow for taking my Node application and running it within a Docker container.

Options
My container needs some flexibility.  It needs to be able to run both node and mongodb in the same container, run mongodb in a separate container, or simply connect to an existing mongodb in some SaaS environment.

A common approach to running multiple processes in a container is to use supervisord. Supervisord provides a way to manage multiple processes (including restarting them). This approach is very popular as it also provides a way to run an ssh daemon in the container so that you can log in via SSH to the running container.

A second approach to running multiple processes is outlined in Alexander Beletsky's blog. Basically, have a startup script that starts mongo in the background if needed, then starts node:

/usr/bin/mongod --dbpath /data/db &
node /app/app.js
The Dockerfile then simply runs this script.
CMD ["/app/start.sh"]
I tried both; both worked easily. Currently I am going with the second approach because it makes viewing the logs of my process simple, and someone yelled at me that I should not treat containers like a virtual machine. There appear to be differing opinions on one process per container versus a lightweight VM.
$ docker logs simpletopologyservice
WEB_PORT:3001
DB_HOSTNAME:localhost
DB_PORT:27017
Starting Mongo DB
Starting node application
While I could (and probably should) have re-used some existing Docker images, I wrote my own Dockerfile. It is not very polished but good enough right now to continue experimenting.

Testing with Docker containers   

Mocha and BDD 
To test my application I used a framework called Mocha.  It is similar to Cucumber and allows you to write automated tests following Behavior Driven Development (BDD) concepts.  I really enjoyed using Mocha.  In BDD you use natural language to describe expected behavior, then you implement the test case, then you implement the code.  A very simple example is:

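var assert = require('assert');
var http = require('http');
// Location of the application under test (the fallback values are assumed defaults)
var topologyHostname = process.env.WEB_HOSTNAME || 'localhost';
var topologyPort = process.env.WEB_PORT || 3000;

describe('SimpleTopologyService Webui Tests', function() {
  describe('GET /topology/topologies', function() {
    it('should return a 200 response code', function(done) {
      http.get({ hostname: topologyHostname, path: '/topology/topologies', port: topologyPort }, function(res) {
        assert.equal(res.statusCode, 200,
           'Expected: 200 Actual: ' + res.statusCode);
        done();
      }).on('error', done); // fail the test if the request itself fails
    });
  });
});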
One very nice thing is that you can have a test suite that has pending tests. For example:
describe('Topology Pool responds to a notification that there is a new build', function() {
    it('Topology provides REST API to notify of a new build');
    it('Topology creates event when new build is received');
    it('Topology Pool receives event when a new build is created');
    it('When a new build event is received the Pool should purge old instances');
    it('When a new build event is received the Pool should create new instances of the build');
    it('When a new build event is received the Pool should notify existing users that a new instance is available');
});
These tests do not have an implementation. I can write a set of descriptions of the expected behavior up front, which helps us think about how we want the application to behave.  It is also a great way to collaborate with team members.  Pending tests are reported as pending rather than failed, so the general flow is to write your test cases up front, implement code and work your way through the functionality, making sure you have not introduced regressions.
$ mocha --reporter=spec  
… 
    POST /api/v1/topology/topologies

      ✓ should return at 200 response code on success and return json 
      ✓ should return 400 response code if I try to create a document with the same name 
      ✓ should be able to delete new topology record and recieve 200 response code in response 
      ✓ should no longer be able to locate record of the topology I just removed so should recieve a 400 response code 
      ✓ should return at 400 response code if there is a validation error 
    GET /api/v1/topology/topologies:id
      ✓ topology name should be correct 
      ✓ topology should list of URIs for pools of this topology 
      ✓ topology should list of providers for this topology which include type, username, password 
    PUT /api/v1/topology/topologies:id
      ✓ should return at 200 response code if a valid referenceURL is passed 
      ✓ should return at 400 response code if invalid data is passed 
      ✓ should return at 404 response code if the tasks does not exist 
      ✓ can update pools with new pool URL 
SimpleTopologyService Topology API after removed test data
  35 passing (2s)
  15 pending
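To show why a description without a callback is counted as pending rather than failed, here is a toy runner. It is not Mocha's implementation and handles only synchronous tests; the createSuite helper and the 'adds numbers' test are hypothetical and exist only for this illustration.

```javascript
// Toy illustration of how a spec with no callback ends up "pending".
// Not Mocha's real implementation; names here are hypothetical.
function createSuite() {
  var results = { passing: [], pending: [], failing: [] };

  function it(title, fn) {
    if (typeof fn !== 'function') {
      results.pending.push(title); // described but not implemented yet
      return;
    }
    try {
      fn();
      results.passing.push(title);
    } catch (err) {
      results.failing.push(title);
    }
  }

  return { it: it, results: results };
}

var suite = createSuite();
suite.it('Topology provides REST API to notify of a new build');
suite.it('adds numbers', function () {
  if (1 + 1 !== 2) { throw new Error('math is broken'); }
});

console.log(suite.results.passing.length + ' passing');
console.log(suite.results.pending.length + ' pending');
```

The point of the sketch is that a pending entry is just a recorded intention, so it never counts against the build while still showing up in the report.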
Testing with my dockerized application
Testing locally is fine but I needed to be able to test my application running in a container.  To do this I can pass in properties telling mocha about the location of my running container.  So if I run my container: 
 

$ docker run -P -d rjminsha/simpletopologyservice
And inspect the running container with docker ps:
CONTAINER ID        IMAGE                                   COMMAND             CREATED             STATUS              PORTS                                                                         NAMES
137a72b6aca1        rjminsha/simpletopologyservice:latest   /app/start.sh       32 seconds ago      Up 30 seconds       0.0.0.0:49153->27017/tcp, 0.0.0.0:49154->28017/tcp, 0.0.0.0:49155->3001/tcp   drunk_goodall 
I see that the application is running, and that port 49155 is mapped to my application running in the container on port 3001. By running
$ boot2docker ip
The VM's Host only interface IP address is: 192.168.59.103
I see that the VM's address on my host-only network is 192.168.59.103. So my application can be reached at http://192.168.59.103:49155/
$ curl http://192.168.59.103:49155
<!--   Licensed under the Apache License, Version 2.0 (the "License");-->
To run my test suite against the application in the container (using the local db in the container) I can pass in this information:
$ env WEB_PORT=49155 WEB_HOSTNAME=192.168.59.103 DB_HOSTNAME=192.168.59.103 DB_PORT=49153 mocha --reporter=spec
  35 passing (1s)
  15 pending
This helped me find some bugs in my test cases which in some places had assumed port information. The ability to develop code and write tests at the same time that are more than unit tests and don’t assume the location of the application is critical for having a continuous test process later in the pipeline.  
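Reading the mapped ports off the docker ps output by hand gets tedious. As a rough sketch, the PORTS column can be parsed to find the host port for a given container port. The hostPortFor helper is hypothetical, and the regular expression assumes the "host:port->port/protocol" format shown in the output above.

```javascript
// Find the host port that docker mapped to a given container port,
// given the PORTS column printed by "docker ps".
function hostPortFor(portsColumn, containerPort) {
  var mappings = portsColumn.split(',');
  for (var i = 0; i < mappings.length; i++) {
    // Each mapping looks like "0.0.0.0:49155->3001/tcp"
    var match = /:(\d+)->(\d+)\/(tcp|udp)/.exec(mappings[i]);
    if (match && Number(match[2]) === containerPort) {
      return Number(match[1]);
    }
  }
  return null; // container port is not published
}

var ports = '0.0.0.0:49153->27017/tcp, 0.0.0.0:49154->28017/tcp, 0.0.0.0:49155->3001/tcp';
console.log('WEB_PORT=' + hostPortFor(ports, 3001));
console.log('DB_PORT=' + hostPortFor(ports, 27017));
```

For a single mapping, the docker port command reports the same information directly, so this kind of parsing is mainly useful when the ps output is already at hand.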

Dockerizing my test cases
I would like others to also be able to run the tests for my application as a part of the deployment process. Dockerizing my tests within a container is a nice way for me to hand off the test suite with the application. This has the added benefit of removing the need to understand all the information the tests need to know about the location of the application. When you link containers, Docker will update /etc/hosts with the IP address of the linked container and set environment variables about the container such as the exposed ports and addresses. This allows me to leverage those environment variables to automatically run the test suite against the linked container. I created a simple Dockerfile, very similar to the application's, that installs node and runs npm install to get dependencies such as mocha. This time the Dockerfile runs a shell script to invoke mocha using the alias information Docker provides when linking containers. The last CMD in my Dockerfile executes the shell script
CMD ["/app/runtests.sh"]
which contains
env
cd /app
echo "Running full test suite"
env WEB_PORT=$STS_ALIAS_PORT_3001_TCP_PORT WEB_HOSTNAME=$STS_ALIAS_PORT_3001_TCP_ADDR DB_PORT=$STS_ALIAS_PORT_27017_TCP_PORT DB_HOSTNAME=$STS_ALIAS_PORT_27017_TCP_ADDR mocha --reporter spec
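The same lookup can be done from inside the Node test code instead of the shell script. The sketch below reads the link variables for an alias and port, following the ALIAS_PORT_port_TCP_ADDR and ALIAS_PORT_port_TCP_PORT naming used in the script above. The linkedEndpoint helper and the example addresses are assumptions for illustration only.

```javascript
// Read the address and port of a linked container from the environment
// variables Docker sets for the link alias.
function linkedEndpoint(env, alias, containerPort) {
  var prefix = alias.toUpperCase() + '_PORT_' + containerPort + '_TCP_';
  var addr = env[prefix + 'ADDR'];
  var port = env[prefix + 'PORT'];
  if (!addr || !port) {
    return null; // not linked, or the port is not exposed
  }
  return { hostname: addr, port: Number(port) };
}

// Example values only; inside a linked container you would pass process.env.
var exampleEnv = {
  STS_ALIAS_PORT_3001_TCP_ADDR: '172.17.0.2',
  STS_ALIAS_PORT_3001_TCP_PORT: '3001',
  STS_ALIAS_PORT_27017_TCP_ADDR: '172.17.0.2',
  STS_ALIAS_PORT_27017_TCP_PORT: '27017'
};

console.log(linkedEndpoint(exampleEnv, 'sts_alias', 3001));
```

Returning null when the variables are missing makes it easy to fall back to explicitly supplied WEB_HOSTNAME and WEB_PORT values when the tests run outside a linked container.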

Now to run my tests I first need to start the application
$ docker run -P -d --name simpletopologyservice rjminsha/simpletopologyservice
(note this time I named it simpletopologyservice)
$ docker ps
CONTAINER ID        IMAGE                                   COMMAND             CREATED             STATUS              PORTS                                                                         NAMES
d29ae62d874a        rjminsha/simpletopologyservice:latest   /app/start.sh       11 minutes ago      Up 11 minutes       0.0.0.0:49153->27017/tcp, 0.0.0.0:49154->28017/tcp, 0.0.0.0:49155->3001/tcp   silly_wilson/sts_alias,simpletopologyservice
And then run a linked tests container:
$ docker run --link simpletopologyservice:sts_alias rjminsha/simpletopologyservicetest
Connecting to mongo with:mongodb://192.168.59.103:49153/simpleTopologyService
topologyPort:49155
topologyHostname:192.168.59.103

  35 passing (2s)
  15 pending
This dockerized test suite is going to be useful later on if I integrate the application with a health manager.

At this point I have a basic application, I have dockerized that application and I can run tests locally against it. I have also dockerized my test suite and can run it using a linked container or by pointing the test suite at a remote instance of the application. It is time to set up a pipeline for the application so that I can check in code and have those changes tested, shared and deployed.

References
Things I read to learn a bit about Node
  • http://www.ibm.com/developerworks/library/wa-nodejs-polling-app/
  • http://pixelhandler.com/posts/develop-a-restful-api-using-nodejs-with-express-and-mongoose
  • http://www.andreagrandi.it/2013/02/24/using-twitter-bootstrap-with-node-js-express-and-jade/
  • http://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api
  • http://madhatted.com/2013/3/19/suggested-rest-api-practices
Posts on ways to run node and mongo in containers 
  • Multiple containers: http://luiselizondo.net/blogs/luis-elizondo/how-create-docker-nodejs-mongodb-varnish-environment 
  • Single container: http://beletsky.net/2013/12/run-several-processes-in-docker-container.html
  • Using supervisord: http://docs.docker.com/examples/using_supervisord/ 
 
