Devops musings: Running NetflixOSS and Cloud Fabric on Docker

Previously I set up a Docker environment to run the IBM Cloud Services Fabric prototype, which is powered in part by NetflixOSS. Andrew has a lot of great posts on these topics.

It is time, however, for me to refresh my environment. I'll use the public Git repository that was open sourced earlier this year. In this post I'll simply capture my experiences. All of this is well documented in Andrew's YouTube video, the Git repository README, and Andrew's blog.

Conclusions

This was really easy based on the instructions in the Git repository. Total time was approximately 4 hours, but the majority of that was waiting for the Docker images to build. It would be good to restructure the build layers so that rebuilding is faster when making changes, and it would also be good to publish the images on Docker Hub.

While the times to detect and recover were much longer than expected, working within this environment is a lot of fun, and it is very interesting to be able to test reasonably complex scenarios on distributed environments. This is very valuable to me as a producer of microservices packaged within Docker containers.

I look forward to scenarios that allow me to test locally like this, then pass an application container to a pipeline that tests and deploys new versions of the application into a production environment that supports these Cloud Fabric and NetflixOSS services.

Setup

Review topology

Setup docker virtual machine

I set up a clean Docker environment so that I can compare it to my previous one. To do this I'll simply use Vagrant to stand up a virtual machine.

$ vagrant up
$ vagrant halt

I updated my script to forward all the ports that Docker containers like to use to the virtual machine. In VirtualBox I changed the settings to increase the memory to 4096 MB and the CPU count to 4 cores, then restarted the virtual machine.
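
The same resize and port forwarding can also be scripted rather than clicked through. The sketch below only prints the VBoxManage commands so they can be reviewed first. The VM name and the port range are assumptions on my part (the range is a guess based on the host ports that show up in the docker ps output later in this post), so substitute your own values.

```shell
#!/bin/sh
# Sketch: print the VBoxManage commands that resize a VM and forward a
# range of TCP ports to it. VM name and port range are hypothetical.

# print_vm_tuning VM_NAME MEMORY_MB CPUS FIRST_PORT LAST_PORT
print_vm_tuning() {
  vm=$1 mem=$2 cpus=$3 first=$4 last=$5
  # modifyvm only works while the VM is powered off (hence the vagrant halt).
  echo "VBoxManage modifyvm \"$vm\" --memory $mem --cpus $cpus"
  port=$first
  while [ "$port" -le "$last" ]; do
    # natpf1 rule format: name,protocol,hostip,hostport,guestip,guestport
    echo "VBoxManage modifyvm \"$vm\" --natpf1 \"docker-$port,tcp,,$port,,$port\""
    port=$((port + 1))
  done
}
```

With the machine halted, something like `print_vm_tuning my-docker-vm 4096 4 49153 49200 | sh` would apply the printed commands.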

$ vagrant ssh

Pull git repository and build containers

The Git repository lives here. On my Docker virtual machine:

$ mkdir acme-air
$ cd acme-air
$ git clone https://github.com/EmergingTechnologyInstitute/acmeair-netflixoss-dockerlocal.git
$ cd acmeair-netflixoss-dockerlocal/bin

Review the env.sh to make sure it mapped to my environment
Update the id_rsa.pub and id_rsa with my SSH keys

$ ./acceptlicenses.sh
$ ./buildimages.sh

This took a long time ... like a couple of hours.

Starting the containers

There is a useful script for starting the containers which only took about a minute to run.

$ ./startminimum.sh

Basic testing

Open two terminals to the Docker virtual machine: one to monitor the instances and the other to run tests.
Monitor the health manager:

$ ./showhmlog.sh
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-27-generic x86_64)
D, [2014-07-17T15:56:11.156272 #16] DEBUG -- : HM ---> target=1, actual=1, stalled=0 asg=acmeair_webapp account=user01
D, [2014-07-17T15:56:11.157862 #16] DEBUG -- : HM ---> target=1, actual=1, stalled=0 asg=acmeair_auth_service account=user01

Execute some basic validation tests

vagrant@vagrant-ubuntu-trusty-64:~/acme-air/acmeair-netflixoss-dockerlocal/bin$ ./testauth.sh 
200 HTTP://172.17.0.30/rest/api/authtoken/byuserid/uid0@email.com
id=71600f08-c220-44d2-9004-51ce1d5ffa8a
200 HTTP://172.17.0.30/rest/api/authtoken/71600f08-c220-44d2-9004-51ce1d5ffa8a
vagrant@vagrant-ubuntu-trusty-64:~/acme-air/acmeair-netflixoss-dockerlocal/bin$ ./testwebapp.sh 
200 HTTP://172.17.0.31/rest/api/login
200 HTTP://172.17.0.31/rest/api/customer/byid/uid0@email.com
200 HTTP://172.17.0.31/rest/api/flights/queryflights
200 HTTP://172.17.0.31/rest/api/login/logout?login=uid0@email.com
200 HTTP://172.17.0.31/rest/api/login

By default this is testing directly against a single instance in the autoscaling group.

vagrant@vagrant-ubuntu-trusty-64:~/acme-air/acmeair-netflixoss-dockerlocal/bin$ ./showasginstances.sh 
Pseudo-terminal will not be allocated because stdin is not a terminal.
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-27-generic x86_64)
"logging in @http://localhost:56785/asgcc/ user=user01 key=***** ..."
"OK"
"listing instances for autoscaling group: acmeair_auth_service"
INSTANCE_ID  | STATUS  | AVAILABILITY_ZONE | PRIVATE_IP_ADDRESS | HOSTNAME                       
-------------|---------|-------------------|--------------------|--------------------------------
9625eaaa3028 | RUNNING | docker-local-1a   | 172.17.0.30        | acmeair-auth-service-731744efdd

"listing instances for autoscaling group: acmeair_webapp"
INSTANCE_ID  | STATUS  | AVAILABILITY_ZONE | PRIVATE_IP_ADDRESS | HOSTNAME                 
-------------|---------|-------------------|--------------------|--------------------------
e57c9d74c97c | RUNNING | docker-local-1a   | 172.17.0.31        | acmeair-webapp-9023aa3b81

It is also worth noting that the scripts SSH directly into the running containers to get information. So if you are using boot2docker, you need to run them from inside the boot2docker virtual machine, not from the local Mac command line.

Next, let's test the web application through the Zuul edge service to make sure that it has obtained the location of the web application from Eureka.
Find the IP address of the Zuul edge service, then pass it to the basic test script.

vagrant@vagrant-ubuntu-trusty-64:~/acme-air/acmeair-netflixoss-dockerlocal/bin$ ./showipaddrs.sh 
172.17.0.19 skydns
172.17.0.20 skydock
172.17.0.22 cassandra1
172.17.0.25 eureka
172.17.0.26 zuul
172.17.0.27 microscaler
172.17.0.28 microscaler-agent
172.17.0.29 asgard
172.17.0.30 acmeair-auth-service-731744efdd
172.17.0.31 acmeair-webapp-9023aa3b81
vagrant@vagrant-ubuntu-trusty-64:~/acme-air/acmeair-netflixoss-dockerlocal/bin$ ./testwebapp.sh 172.17.0.26
200 HTTP://172.17.0.26/rest/api/login
200 HTTP://172.17.0.26/rest/api/customer/byid/uid0@email.com
200 HTTP://172.17.0.26/rest/api/flights/queryflights
200 HTTP://172.17.0.26/rest/api/login/logout?login=uid0@email.com
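
Looking up the address by hand gets old quickly, since the IP addresses change whenever containers are restarted. A minimal sketch of a helper that picks one address out of the showipaddrs.sh output follows. The function name ip_of is my own and is not part of the repository.

```shell
#!/bin/sh
# ip_of NAME: read "IP NAME" pairs on stdin (the showipaddrs.sh format)
# and print the IP of the container whose name matches NAME exactly.
ip_of() {
  awk -v name="$1" '$2 == name { print $1; exit }'
}
```

The zuul test then becomes a one-liner: `./testwebapp.sh "$(./showipaddrs.sh | ip_of zuul)"`.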

Experimenting with the environment

Testing the health manager and service lookup

Next, let's test what happens with auto-scaling groups, failover, and recovery. Currently we have just a single instance in each auto-scaling group; we will experiment with more later, but working with a single instance should make cause and effect pretty straightforward. To do this we can start up the containers with auto-scaling groups, run some tests continuously, and then kill the application containers. We should expect the health manager to detect that a container is no longer responding and start another instance of the container.

I'll have 4 windows open:

Window1: running continuous tests directly against the instance
Window2: running continuous tests against zuul
Window3: tailing the health manager logs
Window4: command prompt from which I can monitor and kill containers

I copied testwebapp.sh to conttestwebapp.sh and made a quick adjustment so that it simply keeps running:

#!/bin/sh
# Run the web application smoke test in an endless loop.
# Usage: ./conttestwebapp.sh [webapp_addr]

. ./env.sh

webapp_addr=$1

# Default to the first running webapp container when no address is given.
if [ -z "$webapp_addr" ]; then
  webapp_addr=$($docker_cmd ps | grep 'acmeair/webapp' | head -n 1 | cut -d' ' -f1 | xargs $docker_cmd inspect --format '{{.NetworkSettings.IPAddress}}')
fi

while true; do
  sleep 2
  for i in $(seq 0 10); do
    # Log in (storing the session cookie), read a customer, query flights, log out.
    curl -sL -w "%{http_code} %{url_effective}\\n" -o /dev/null -c cookie.txt --data "login=uid0@email.com&password=password" "$webapp_addr/rest/api/login"
    curl -sL -w "%{http_code} %{url_effective}\\n" -o /dev/null -b cookie.txt "$webapp_addr/rest/api/customer/byid/uid0@email.com"
    curl -sL -w "%{http_code} %{url_effective}\\n" -o /dev/null -b cookie.txt --data "fromAirport=CDG&toAirport=JFK&fromDate=2014/03/31&returnDate=2014/03/31&oneWay=false" "$webapp_addr/rest/api/flights/queryflights"
    curl -sL -w "%{http_code} %{url_effective}\\n" -o /dev/null -b cookie.txt "$webapp_addr/rest/api/login/logout?login=uid0@email.com"
  done
done

Window1: get the IP address of the web application instance, and test continuously

$ ./showipaddrs.sh
172.17.0.19 skydns
172.17.0.20 skydock
172.17.0.22 cassandra1
172.17.0.25 eureka
172.17.0.26 zuul
172.17.0.27 microscaler
172.17.0.28 microscaler-agent
172.17.0.29 asgard
172.17.0.30 acmeair-auth-service-731744efdd
172.17.0.33 acmeair-webapp-dec27996fa
$ ./conttestwebapp.sh 172.17.0.33
200 HTTP://172.17.0.33/rest/api/login

Window2: test continuously against zuul

$ ./showipaddrs.sh | grep zuul 
172.17.0.26 zuul
$ ./conttestwebapp.sh 172.17.0.26
200 HTTP://172.17.0.26/rest/api/login

Window3: monitor health manager

$ ./showhmlog.sh 
Pseudo-terminal will not be allocated because stdin is not a terminal.
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-27-generic x86_64)

 * Documentation:  https://help.ubuntu.com/
stdin: is not a tty
D, [2014-07-17T16:26:54.349445 #16] DEBUG -- : HM ---> target=1, actual=1, stalled=0 asg=acmeair_webapp account=user01
D, [2014-07-17T16:26:54.351620 #16] DEBUG -- : HM ---> target=1, actual=1, stalled=0 asg=acmeair_auth_service account=user01
D, [2014-07-17T16:27:14.361714 #16] DEBUG -- : HM ---> target=1, actual=1, stalled=0 asg=acmeair_webapp account=user01

Window4: kill something

$ docker ps | grep webapp 
df371d9a62e9        acmeair/webapp-liberty:latest         /usr/bin/supervisord   7 minutes ago       Up 7 minutes        0.0.0.0:49180->22/tcp, 0.0.0.0:49181->9080/tcp, 0.0.0.0:49182->9443/tcp   acmeair-webapp-dec27996fa         

$ docker stop df371d9a62e9 
df371d9a62e9

$ docker rm df371d9a62e9
df371d9a62e9

Now we can see the instance become stale in the health manager logs, the tests fail, and a new instance start.

After a short while the new instance will be up and running and the tests will begin to work. The direct tests against the web application need to be pointed at the new instance because the IP address has changed, while the tests against zuul automatically pick up the new instance and begin to pass again.

Pretty neat. The NetflixOSS libraries picked up that the instance was no longer running and started another one using the micro-scaler for Docker.
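
When watching the health manager log during a test like this, only the lines where a group is off target are really interesting. Here is a small filter, assuming the "HM ---> target=..., actual=..., stalled=... asg=..." line format seen in the log output in this post. The function name hm_unhealthy is my own.

```shell
#!/bin/sh
# hm_unhealthy: read health manager log lines on stdin and print
# "ASG TARGET ACTUAL STALLED" for every status line where the group is
# off target (actual != target) or has stalled instances.
hm_unhealthy() {
  sed -n 's/.*target=\([0-9]*\), actual=\([0-9]*\), stalled=\([0-9]*\) asg=\([^ ]*\).*/\4 \1 \2 \3/p' |
    awk '$2 != $3 || $4 > 0 { print }'
}
```

It can be fed from the log script, as in `./showhmlog.sh | hm_unhealthy`. When following a live log, the output may arrive in bursts because of pipe buffering.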

Default detection and recovery times

Killing webapp

~15 seconds for health manager to detect stale instance
~29 seconds for health manager to successfully start a new instance
~30 seconds for direct tests to pass
~2 minutes 10 seconds for zuul tests to pass indicating it has a new reference to the new instance of the web application

Killing authentication service
This is a very similar test. In this case I killed the authentication service and ran tests against the authentication service directly, while at the same time running webapp tests against zuul. As expected, I see a new container started, and the web application tests (dependent upon auth) fail for some period of time until the webapp detects that a new instance of the authentication service is available.

~15 seconds for HM to detect the stale instance
~30 seconds for the HM to start a new auth instance
~55 seconds until the new auth service is successfully responding to tests
~1 minute 43 seconds for the web application (via zuul) to pass indicating that it had successfully picked up a new reference to the authentication service via eureka
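
I took these timings by eye. A rough way to measure them more consistently is to poll until a check succeeds and print the elapsed seconds. This is only a sketch: the helper names wait_until and http_ok are my own, and whether a single 200 from one endpoint really means "recovered" is an assumption.

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-run COMMAND until it succeeds, then print the elapsed whole seconds.
# Returns 1 if TIMEOUT_SECONDS pass without a success.
wait_until() {
  timeout=$1 interval=$2
  shift 2
  start=$(date +%s)
  while :; do
    if "$@" >/dev/null 2>&1; then
      echo $(( $(date +%s) - start ))
      return 0
    fi
    if [ $(( $(date +%s) - start )) -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
  done
}

# http_ok URL [CURL_ARGS...]: succeed only when URL answers with HTTP 200.
http_ok() {
  url=$1
  shift
  [ "$(curl -s -o /dev/null -w '%{http_code}' "$@" "$url")" = "200" ]
}
```

Started right after killing a container, `wait_until 300 1 http_ok http://172.17.0.26/rest/api/login --data 'login=uid0@email.com&password=password'` would report how long zuul took to serve a successful login again.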

Working with auto-scaling groups via asgard

In this post I won't explore too many scenarios, but I would like to make sure that the environment is set up and functioning correctly. Asgard provides the interface to manage auto-scaling groups (which are scaled using the micro-scaler). So let's have a look at the Asgard console and play around.
Find the IP address of Asgard, and quickly test that I can access it:

$ ./showipaddrs.sh | grep asgard 
172.17.0.12 asgard

$ curl -sL -w "%{http_code}" -o /dev/null http://172.17.0.12
200

I can now open up a web browser and navigate Asgard to see my auto-scaling groups. However, I do not appear to be able to modify the existing auto-scaling groups currently in the web-ui.

Working with auto-scaling groups via command line

Because I am currently unable to update the auto-scaling groups via Asgard, I'll borrow from the command line scripts and configure them directly on the micro-scaler. I created a simple script which logs in to the micro-scaler via SSH and updates the min, max, and desired size of an auto-scaling group.
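
I have not reproduced the full script here, but the option handling could look roughly like the sketch below. The default sizes are assumptions, the validation step is an addition of mine, and the actual call to the micro-scaler over SSH is deliberately left out.

```shell
#!/bin/sh
# parse_asg_args [-m MIN] [-x MAX] [-d DESIRED] [-a ASG_NAME]
# Sketch of the option handling only; it does not contact the micro-scaler.
parse_asg_args() {
  min=1 max=4 desired=1 asg=acmeair_webapp   # assumed defaults
  OPTIND=1                                   # allow repeated calls
  while getopts "m:x:d:a:" opt; do
    case $opt in
      m) min=$OPTARG ;;
      x) max=$OPTARG ;;
      d) desired=$OPTARG ;;
      a) asg=$OPTARG ;;
      *) echo "usage: updateasg.sh [-m min] [-x max] [-d desired] [-a asg]" >&2
         return 2 ;;
    esac
  done
  # Reject sizes that violate min <= desired <= max before changing anything.
  if [ "$min" -gt "$desired" ] || [ "$desired" -gt "$max" ]; then
    echo "need min <= desired <= max" >&2
    return 1
  fi
  echo "setting MIN=$min, MAX=$max and DESIRED=$desired"
  echo "asg=$asg"
}
```

Checking the sizes up front avoids sending the scaler a group definition it cannot satisfy.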

 
$ ./updateasg.sh -m 3 -x 6 -d 4 -a acmeair_auth_service 
setting MIN=3, MAX=6 and DESIRED=4
{"name"=>"acmeair_auth_service", "min_size"=>3, "max_size"=>6, "desired_capacity"=>4}
"updating autoscaling group {\"name\"=>\"acmeair_auth_service\", \"min_size\"=>3, \"max_size\"=>6, \"desired_capacity\"=>4} ..."
"OK"


$ ./updateasg.sh -m 1 -x 4 -d 2 
setting MIN=1, MAX=4 and DESIRED=2
{"name"=>"acmeair_webapp", "min_size"=>1, "max_size"=>4, "desired_capacity"=>2}
"updating autoscaling group {\"name\"=>\"acmeair_webapp\", \"min_size\"=>1, \"max_size\"=>4, \"desired_capacity\"=>2} ..."
"OK"


$ ./showasginstances.sh 
"listing instances for autoscaling group: acmeair_auth_service"
INSTANCE_ID  | STATUS  | AVAILABILITY_ZONE | PRIVATE_IP_ADDRESS | HOSTNAME                       
-------------|---------|-------------------|--------------------|--------------------------------
4fc7096f16db | RUNNING | docker-local-1a   | 172.17.0.13        | acmeair-auth-service-68810614db
07ec4db6ee74 | RUNNING | docker-local-1a   | 172.17.0.17        | acmeair-auth-service-c181cfb84d
df72cc31d9a7 | RUNNING | docker-local-1a   | 172.17.0.21        | acmeair-auth-service-783abcf2d8
cf7683e75d22 | RUNNING | docker-local-1a   | 172.17.0.22        | acmeair-auth-service-1575c11f44

"listing instances for autoscaling group: acmeair_webapp"
INSTANCE_ID  | STATUS   | AVAILABILITY_ZONE | PRIVATE_IP_ADDRESS | HOSTNAME                 
-------------|----------|-------------------|--------------------|--------------------------
3c9adb9e1c90 | RUNNING  | docker-local-1a   | 172.17.0.19        | acmeair-webapp-60b3d9c19c
ab3bbbb8dc58 | STOPPING | docker-local-1a   | 172.17.0.18        | acmeair-webapp-f4eb98e58b
425fd6f254c2 | STOPPING | docker-local-1a   | 172.17.0.16        | acmeair-webapp-3c41308a4c
7a86ce9cb5cb | RUNNING  | docker-local-1a   | 172.17.0.20        | acmeair-webapp-5e66cb688e

This is now reflected in the Asgard console.
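
While a group is converging it helps to see just the number of RUNNING instances per group. A small sketch that summarizes the showasginstances.sh output, assuming the table layout shown in this post (the function name running_per_asg is my own):

```shell
#!/bin/sh
# running_per_asg: read showasginstances.sh output on stdin and print
# "ASG_NAME RUNNING_COUNT" for each autoscaling group, sorted by name.
running_per_asg() {
  awk -F'|' '
    /listing instances for autoscaling group:/ {
      asg = $0
      sub(/.*group: */, "", asg)   # keep only the group name
      gsub(/"/, "", asg)           # drop the trailing quote
      sub(/ *$/, "", asg)          # drop trailing spaces
      count[asg] += 0              # list groups with no running instance too
      next
    }
    asg != "" && $2 ~ /RUNNING/ { count[asg]++ }
    END { for (a in count) print a, count[a] }
  ' | sort
}
```

Running `./showasginstances.sh | running_per_asg` a few times shows the counts moving toward the desired capacity.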

Various Problems

I'm including these because the information may help others debug and use the containers.

Problem: accessing asgard from the local machine

Typically, to access Docker containers I leverage port forwarding. In VirtualBox I set up my network to forward all ports that Docker maps onto the virtual machine, and then I have the Docker daemon map ports using the -P or -p options when running the container. This way I can access the exposed ports in my container from outside of the Docker host. However, since the Docker containers do not expose ports, I'll follow the instructions Andrew posted on setting up a host-only network. This allows me to directly access the IP addresses of any running containers from my local machine. This is pretty nice and behaves like a small private IaaS cloud. It is, however, not clear to me why we would not simply expose certain ports in the Docker containers and then leverage port forwarding out of the box. I'll explore this more later.

Problem (open): confusing ports on webapp

It is not clear to me why the webapp container exposes port 9080 when the liberty server is listening on port 80.

$ docker ps
CONTAINER ID        IMAGE                                 COMMAND                CREATED             STATUS              PORTS                                                                     NAMES
e9e9031f0632        acmeair/webapp-liberty:latest         /usr/bin/supervisord   11 minutes ago      Up 11 minutes       0.0.0.0:49168->22/tcp, 0.0.0.0:49169->9080/tcp, 0.0.0.0:49170->9443/tcp   acmeair-webapp-5b9d28bae2

It is also not clear to me why the Docker containers are not exposing ports. For example, the webapp-liberty container should perhaps have 'EXPOSE 80' in its Dockerfile.

Problem: Containers would not start after docker vm was shut down

After rebooting my machine no containers were running. Starting a new set of containers failed because the names were all already taken by the now stopped containers. I could have started each of the containers again, but it was simpler to remove them and start over. To remove all my stopped containers:

$ docker rm $(docker ps -a -q)

Now I can run

$ ./startminimum.sh

Problem: No asgs running and incorrect network configuration for my docker host

This happened because I did not read the instructions or watch the video fully; if I had, I would have saved myself time (classic). I'm including this in the post because the debug process may itself be interesting to new users of this environment.

After building and starting the containers I would expect to have a set of containers running for the NetflixOSS services, and to have two auto-scaling groups running (one for the webapp, and one for the authentication service). To validate this I'll look at the auto-scaling groups:

vagrant@vagrant-ubuntu-trusty-64:~/acme-air/acmeair-netflixoss-dockerlocal/bin$ ./showasgs.sh 
Pseudo-terminal will not be allocated because stdin is not a terminal.
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-27-generic x86_64)

 * Documentation:  https://help.ubuntu.com/
stdin: is not a tty
"logging in @http://localhost:56785/asgcc/ user=user01 key=***** ..."
"OK"
NAME                 | STATE   | AVAILABILITY_ZONES | URL | MIN_SIZE | MAX_SIZE | DESIRED_CAPACITY | LAUNCH_CONFIGURATION
---------------------|---------|--------------------|-----|----------|----------|------------------|---------------------
acmeair_webapp       | started | ["docker-local-... | N/A | 1        | 4        | 1                | acmeair_webapp      
acmeair_auth_service | started | ["docker-local-... | N/A | 1        | 4        | 1                | acmeair_auth_service

which did not seem to match the running containers

vagrant@vagrant-ubuntu-trusty-64:~/acme-air/acmeair-netflixoss-dockerlocal/bin$ docker ps 
CONTAINER ID        IMAGE                              COMMAND                CREATED             STATUS              PORTS                                                NAMES
a45cb9300bc0        acmeair/asgard:latest              /usr/bin/supervisord   8 minutes ago       Up 8 minutes        22/tcp, 80/tcp, 8009/tcp                             asgard              
f9844069ffa0        acmeair/microscaler-agent:latest   /usr/bin/supervisord   8 minutes ago       Up 8 minutes        22/tcp                                               microscaler-agent   
41cc3ce7c103        acmeair/microscaler:latest         /usr/bin/supervisord   9 minutes ago       Up 9 minutes        22/tcp                                               microscaler         
a9bb6e8b86de        acmeair/zuul:latest                /usr/bin/supervisord   9 minutes ago       Up 9 minutes        22/tcp, 80/tcp, 8009/tcp                             zuul                
1b43bfb7aa49        acmeair/eureka:latest              /usr/bin/supervisord   9 minutes ago       Up 9 minutes        22/tcp, 80/tcp, 8009/tcp                             eureka              
2a2b68aeae1a        acmeair/cassandra:latest           /usr/bin/supervisord   10 minutes ago      Up 10 minutes       22/tcp                                               cassandra1          
276039ec2255        crosbymichael/skydock:latest       /go/bin/skydock -ttl   10 minutes ago      Up 10 minutes                                                            skydock             
c26630a74fa5        crosbymichael/skydns:latest        skydns -http 0.0.0.0   10 minutes ago      Up 10 minutes       172.17.42.1:53->53/udp, 172.17.42.1:8080->8080/tcp   skydns

so I'll check the health manager to see if the asgs were started as expected

vagrant@vagrant-ubuntu-trusty-64:~/acme-air/acmeair-netflixoss-dockerlocal/bin$ ./showhmlog.sh 
Pseudo-terminal will not be allocated because stdin is not a terminal.
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-27-generic x86_64)

 * Documentation:  https://help.ubuntu.com/
stdin: is not a tty
D, [2014-07-16T19:25:42.138373 #14] DEBUG -- : HM ---> asg=acmeair_webapp account=user01 is in COOLDOWN state! no action taken until cooldown expires.
D, [2014-07-16T19:25:42.140839 #14] DEBUG -- : HM ---> target=1, actual=0, stalled=0 asg=acmeair_auth_service account=user01
D, [2014-07-16T19:25:42.141929 #14] DEBUG -- : HM ---> scale-up asg=acmeair_auth_service account=user01
D, [2014-07-16T19:25:42.146276 #14] DEBUG -- : HM ---> starting 1 instances for acmeair_auth_service account=user01
D, [2014-07-16T19:25:42.148137 #14] DEBUG -- : launching instance for asg=acmeair_auth_service and user=user01 with az=docker-local-1a
D, [2014-07-16T19:25:42.151127 #14] DEBUG -- : {"name"=>"acmeair_auth_service", "availability_zones"=>["docker-local-1a"], "launch_configuration"=>"acmeair_auth_service", "min_size"=>1, "max_size"=>4, "desired_capacity"=>1, "scale_out_cooldown"=>300, "scale_in_cooldown"=>60, "domain"=>"auth-service.local.flyacmeair.net", "state"=>"started", "url"=>"N/A", "last_scale_out_ts"=>1405538742}
D, [2014-07-16T19:25:42.156640 #14] DEBUG -- : cannot lease lock for account user01 and asg acmeair_auth_service
W, [2014-07-16T19:25:42.156809 #14]  WARN -- : could not acquire lock for updating n instances
D, [2014-07-16T19:26:02.192764 #14] DEBUG -- : HM ---> asg=acmeair_webapp account=user01 is in COOLDOWN state! no action taken until cooldown expires.

nope :-(
It looks as though I have set up my network incorrectly: when I look at the logs for Asgard I can see connection refused exceptions.

vagrant@vagrant-ubuntu-trusty-64:~/acme-air/acmeair-netflixoss-dockerlocal/bin$ ./showipaddrs.sh 
172.17.0.107 skydns
172.17.0.108 skydock
172.17.0.110 cassandra1
172.17.0.112 eureka
172.17.0.113 zuul
172.17.0.114 microscaler
172.17.0.115 microscaler-agent
172.17.0.116 asgard
vagrant@vagrant-ubuntu-trusty-64:~/acme-air/acmeair-netflixoss-dockerlocal/bin$ ssh -i id_rsa root@172.17.0.116
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-27-generic x86_64)

 * Documentation:  https://help.ubuntu.com/
Last login: Wed Jul 16 19:32:52 2014 from 172.17.42.1
root@asgard:~# more /opt/tomcat/logs/asgard.log 
[2014-07-16 19:31:05,268] [localhost-startStop-1] grails.web.context.GrailsContextLoader    Error initializing the application: Error creating bean with name 'com.netflix.asgard.LoadingFilters': Initialization of bean failed; nested exception is org.springframework.bean
s.factory.BeanCreationException: Error creating bean with name 'initService': Initialization of bean failed; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'dockerLocalService': Cannot create inner bean '(inner
 bean)' while setting bean property 'target'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name '(inner bean)#7': Invocation of init method failed; nested exception is org.apache.http.conn.HttpHostConnectException
: Connection to http://172.17.42.1:2375 refused
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'com.netflix.asgard.LoadingFilters': Initialization of bean failed; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'initSer
vice': Initialization of bean failed; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'dockerLocalService': Cannot create inner bean '(inner bean)' while setting bean property 'target'; nested exception is org.s
pringframework.beans.factory.BeanCreationException: Error creating bean with name '(inner bean)#7': Invocation of init method failed; nested exception is org.apache.http.conn.HttpHostConnectException: Connection to http://172.17.42.1:2375 refused
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)

The URL http://172.17.42.1:2375 is based on the base Docker URL that I set up in env.sh. From my container instances I cannot even ping the IP address of the Docker host (172.17.42.1). I need to enable remote API access to the Docker daemon via a TCP socket, which, by the way, was clearly called out in the instructions that I did not read.

vagrant@vagrant-ubuntu-trusty-64:~/acme-air/acmeair-netflixoss-dockerlocal/bin$ sudo vi /etc/default/docker

added:

DOCKER_OPTS="-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock"

restart docker and my containers

$ ./stopall.sh 
$ sudo service docker restart 
$ ./startminimum.sh
$ ./showasginstances.sh 
Pseudo-terminal will not be allocated because stdin is not a terminal.
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-27-generic x86_64)

 * Documentation:  https://help.ubuntu.com/
stdin: is not a tty
"logging in @http://localhost:56785/asgcc/ user=user01 key=***** ..."
"OK"
"listing instances for autoscaling group: acmeair_auth_service"
INSTANCE_ID  | STATUS  | AVAILABILITY_ZONE | PRIVATE_IP_ADDRESS | HOSTNAME                       
-------------|---------|-------------------|--------------------|--------------------------------
e255c316a75a | RUNNING | docker-local-1a   | 172.17.0.13        | acmeair-auth-service-886ff9d9ee

"listing instances for autoscaling group: acmeair_webapp"
INSTANCE_ID  | STATUS  | AVAILABILITY_ZONE | PRIVATE_IP_ADDRESS | HOSTNAME                 
-------------|---------|-------------------|--------------------|--------------------------
a02d0b2bdcd0 | RUNNING | docker-local-1a   | 172.17.0.14        | acmeair-webapp-a66656d356

$ ./showhmlog.sh 
Pseudo-terminal will not be allocated because stdin is not a terminal.
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-27-generic x86_64)

 * Documentation:  https://help.ubuntu.com/
stdin: is not a tty
D, [2014-07-16T20:00:14.447052 #14] DEBUG -- : HM ---> target=1, actual=1, stalled=0 asg=acmeair_webapp account=user01
D, [2014-07-16T20:00:14.449903 #14] DEBUG -- : HM ---> target=1, actual=1, stalled=0 asg=acmeair_auth_service account=user01

Looks good.
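
In hindsight, a quick check of the daemon configuration would have caught this before starting any containers. A minimal sketch, assuming the options live in a DOCKER_OPTS line as in /etc/default/docker above (the function name docker_tcp_endpoints is my own):

```shell
#!/bin/sh
# docker_tcp_endpoints FILE: print every tcp:// address found on the
# DOCKER_OPTS line of FILE; prints nothing when no TCP listener is configured.
docker_tcp_endpoints() {
  grep '^DOCKER_OPTS=' "$1" | grep -o 'tcp://[^" ]*'
}
```

`docker_tcp_endpoints /etc/default/docker` should list tcp://0.0.0.0:2375 after the change. Once the daemon has been restarted, `curl -s http://172.17.42.1:2375/version` should return a small JSON document if the remote API is reachable at the address configured in env.sh.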

Wednesday, July 16, 2014
