The TL;DR on Immutable Infrastructure

After working since September 2014 on Kolla, an immutable deployment tool that uses Ansible to deploy OpenStack in Docker containers, I have come to a clear conclusion: there is some confusion about what precisely immutability is and how best to achieve it.

What is this immutable infrastructure thing I keep hearing about?  Well, first, two definitions from Google:

  1. unchanging over time or unable to be changed


  2. the basic physical and organizational structures and facilities (e.g., buildings, roads, and power supplies) needed for the operation of a society or enterprise.

First I’ll dissect immutable.  At a high level, a running container consists of two things.  First and foremost, it consists of a complete filesystem and a running application.  More importantly, and this is where everyone gets hung up around immutability, it includes the application’s configuration options.  Some hard-core computer science nerds may think the only way to achieve immutability is to pass the configuration options through the environment, so that from container instantiation until container destruction the configuration options always remain consistent.  This is not the only solution.

Why is immutability important?  The reason immutability is desirable is that it turns any stateless imperative system into a declarative system.  In an imperative system, many steps are required to reach a successful (or failed) outcome.  By wrapping an imperative system in a container where the configuration never changes, that imperative (read: more complex) system has been turned into a declarative system.  A declarative system has two outcomes (success or failure) and is always deterministic, meaning it will always have the same outcome after instantiation.  I use “always” with a grain of salt.  A cosmic ray could blow up system RAM, the hard disk could fail in some way, a kernel bug could trigger an oops on the call path, or an asteroid could hit the Phoenix datacenters!  Let’s just assume for a moment we throw out these failure scenarios and look on the positive side of things, which is that the infrastructure on which our immutability engine runs will never fail!

Now I’ll dissect infrastructure.  Infrastructure is all of the software that goes into making up a system.  In the case of Kolla and OpenStack, very little of OpenStack is actually stateless, but immutability still comes to the rescue in many cases.  For one, no administrator can muck around with the config options of a running system, crater the environment, and have no idea what went wrong.  In a properly designed infrastructure, the administrator configures all options in one place, and that configuration is distributed through the system, causing all of the config-option-related infrastructure to fail, or all of it to succeed.  The container infrastructure of Kolla includes 89 containers, many of which require some form of state, and most of which depend on a database, which can result in non-declarative behavior.

Immutability sounds pretty hot, huh?  The problem is that configuring software via environment variables is a huge pain in the ass.  Just look at the reference implementation of Docker registry v2.0: significant complexity goes into reading the environment variables without actually altering the contents of the virtual disk.  This is really the gold standard for an immutable infrastructure component, but it is not the only way to solve the problem.
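To make the registry example concrete: its documented configuration pattern maps environment variables named REGISTRY_&lt;section&gt;_&lt;key&gt; onto keys in its YAML config file, so the whole service can be configured at docker run time without a single file changing inside the container:

```shell
# Start registry v2 configured entirely through the environment; the
# REGISTRY_HTTP_ADDR variable overrides the http.addr configuration key,
# so nothing on the container's filesystem is ever modified.
sudo docker run -d -p 5000:5000 \
  -e REGISTRY_HTTP_ADDR=0.0.0.0:5000 \
  registry:2
```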

Remember, we are after a pragmatic declarative system (the why of immutability), not some gold standard where absolutely nothing in the filesystem changes.  While completely unchanged filesystem contents meet the definition of immutability, the spirit of immutability can be met in different ways.

During Kolla development we have tried pretty much every method of solving this problem.  I will enumerate the solutions:

  1. Encode the configuration into the build of the container: This method delivers immutability similar to the Docker registry’s, meaning that nothing on the disk ever changes.  The problem with this approach is that any configuration change requires a pokey container rebuild, and it mixes deployment (the config options come from the deployment system) with image building, violating separation of concerns.
  2. Encode some environment variables with important information and use crudini to set the on-disk configuration in /etc/service.  This delivers near-immutability but trades off complete customization.  I say near-immutability because the crudini operation would have to be deterministic for immutability to be preserved, which is hard to guarantee.  Encoding the thousands of config options that make up the big tent is hard to manage, and if we did that we would want oslo.config to read the config from the environment, not the filesystem.  The result is that only *some* options, the critical ones, end up being added to the environment, limiting configurability.
  3. Create the configuration file that the OpenStack service runs against outside the container.  Originally I highly disliked this idea, but I think it was kfox1111 who came to the rescue and suggested “what if you just configure the container one time?”  It took me a few days to process that, but what it means is that after the container starts, it runs code which reads the host bind-mounted configuration file and configures the container one time.  After the container is configured, no further alterations of the configuration are permitted without a redeploy from a central location, meaning arbitrary administrator tinkering won’t damage the deployment.  Does this deliver immutability?  Absolutely.  From container instantiation (which finishes once the configuration options are locked into place) to container destruction, the contents of the disk never change.  Immutability preserved, with zero tradeoff: no pokey build on deploy (which can take several hours with the v2 registry), separation of concerns is still maintained, and most importantly operators retain complete customization of their environment.
  4. Encode the configuration file generated by the deployment tool into a JSON blob which sets the environment or configuration files appropriately, then use crudini to set the config options on each boot.  This would work, but it’s not very declarative because of the crudini interaction.  It was our first attempt, but we found other options to be more viable.
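Technique #3 can be sketched as a tiny entrypoint script.  This is a hypothetical illustration, not Kolla’s actual code; it simulates the bind-mounted host directory with a scratch directory so the idea is visible end to end:

```shell
#!/bin/sh
# Hypothetical sketch of technique #3: configure the container exactly once.
# Paths and file names are illustrative; a real deployment would bind-mount
# the rendered config file from the host.
set -e

SRC_DIR=$(mktemp -d)    # stands in for the bind-mounted host config directory
CONF_DIR=$(mktemp -d)   # stands in for /etc/nova inside the container
printf '[DEFAULT]\ndebug = False\n' > "$SRC_DIR/nova.conf"

configure_once() {
    # Copy the deployed configuration into place only on the first start.
    if [ ! -f "$CONF_DIR/.configured" ]; then
        cp "$SRC_DIR/nova.conf" "$CONF_DIR/nova.conf"
        chmod 0440 "$CONF_DIR/nova.conf"   # locked: no further edits permitted
        touch "$CONF_DIR/.configured"
    fi
}

configure_once                                   # first start: config locked in
echo 'tampered = True' >> "$SRC_DIR/nova.conf"   # later edits to the source...
configure_once                                   # ...are ignored on restarts
grep -q tampered "$CONF_DIR/nova.conf" && echo mutated || echo 'config unchanged'
```

From the end of the first `configure_once` until the container is destroyed, the on-disk configuration never changes; redeploying from the central tool is the only way to alter it.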

With Kolla we started with #2, briefly tried #4, and finished with technique #3.  I’m interested to hear in the comments of this blog post how other people have achieved immutability in their infrastructure components without the onerous use of environment variables to pass in hundreds of configuration options.

It would be interesting if Docker added some built-in type of immutable file loading that read (and installed) the external file(s) only the first time the container was run.  Alas, there is no such thing.

The next step in immutable infrastructure is ensuring that a security breakout of the process has limited ability to modify the filesystem.  For example, if external software were running as root and could somehow modify files at a whim, it could modify /etc/sudoers and easily escalate privileges to root inside the container.  Then there would be a real problem!  This same problem can happen on bare metal, but containers insert an extra layer to break through, so they actually increase security compared to bare metal.  We solved this problem in Kolla by running the containers as regular users and limiting the system files they can modify to those owned by the process’s UID/GID.  While it would be possible for some minimal damage to be done as a non-root user, at least the immutability of the files not controlled by the process would be preserved.
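The pattern looks roughly like this in a Dockerfile (the user name and path are illustrative, not Kolla’s exact image contents):

```shell
# Create an unprivileged service account, hand it only the files the
# process legitimately owns, then drop privileges for everything that
# follows, including any breakout from the service itself.
RUN groupadd -r nova && useradd -r -g nova -d /var/lib/nova nova \
    && mkdir -p /var/lib/nova && chown -R nova:nova /var/lib/nova
USER nova
```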

I’ve written about what immutability is, and a little about why you would want immutability.  Besides the warping of a process from imperative to declarative, there are benefits that trickle down from that reality.

  1. An operator has to try hard (using docker exec) to modify the contents of the filesystem.  Immutability protects vendors from cowboy coders like myself, since someone who goes around and mucks with the internals of a container is unlikely to call technical support.
  2. A technical support agent doesn’t have to guess what software has been installed on the system which may cause conflicts with that vendor’s software.  An immutable deployment target should only have immutable software on it.
  3. Upgrades and downgrades work flawlessly since the full state of the system including its configuration is recorded in the containers.
  4. Immutable infrastructure will change the world.  We are just in the beginning phase of the conversion to immutability.  Companies are kicking the tires on their favorite immutability engine (mine is Docker).  The immutability software is a bit green.  Still, I feel completely comfortable deploying OpenStack in n-way active H/A mode using Kolla with Docker as our immutability engine.  Kolla doesn’t use any complex features of Docker; everything we use has been in use for a year or more in the field.

I hope folks find this blog post helpful on their journey towards implementing an immutable datacenter.  That is the next big thing in computing, and it will take years to achieve.

Announcing the Release of Kolla Liberty

Hello OpenStackers!

The Kolla community is pleased to announce the release of Kolla Liberty.  This release fixes 432 bugs and implements 58 blueprints!

During Liberty, Kolla joined the big tent governance!  Our project can be found in the OpenStack Governance repository.

Kolla is an opinionated OpenStack deployment system, unless the operator has opinions!  Kolla is completely customizable but comes with consumable out-of-the-box defaults for use with Ansible deployment.  To understand the Kolla community’s philosophy towards deployment, please read our customize deployment documentation.

Kolla includes the following features:

  • AIO and multinode deployment using Ansible with n-way active high availability.
  • Vastly improved documentation.
  • Tools to build Docker images and deploy OpenStack via Ansible in Docker containers.
  • Build containers for CentOS, Oracle Linux, RHEL, and Ubuntu distributions.
  • Build containers from both binary packaging and directly from source.
  • Development environments using Heat, Vagrant, or bare-metal.
  • All “core” OpenStack services implemented as micro-services in Docker containers.
  • Minimal host deployment target dependencies requiring only docker-engine and docker-py.

The following services can be deployed via Ansible in 12 to 15 minutes with 3 node high availability:

  • ceph for glance, nova, cinder  
  • cinder (only ceph is implemented as a backend at this time)
  • glance
  • haproxy
  • heat
  • horizon
  • ironic (tech preview)
  • keystone
  • mariadb with galera replication
  • memcached
  • murano
  • neutron
  • nova
  • rabbitmq
  • swift

Kolla’s implementation is stable and the core reviewers feel Kolla is ready for evaluation by operators and third party projects.  We strongly encourage people to evaluate the included Ansible deployment tooling and are keen for additional feedback.

Preserving container properties via volume mounts

In the Kolla project, we heavily used host bind mounts to share filesystem data with different containers.  A host bind mount is an operation where a host directory, such as /var/lib/mysql, is mounted directly into the container at some specific location.

The docker syntax for this operation is:

sudo docker run -d -v /var/lib/mysql:/var/lib/mysql -e MARIADB_ROOT_PASSWORD=password kollaglue/centos-rdo-mariadb-app

This pulls and starts the kollaglue/centos-rdo-mariadb-app container and bind mounts /var/lib/mysql from the host into the container at the same location.  This allows all containers started with this bind mount to share the host’s /var/lib/mysql.

Through months of trial and error, we found bind mounting host directories to be highly suboptimal.

Containers exhibit three magic properties.

  • Containers are declarative in nature. A container either starts or fails to start, and should do so consistently. Even though containers typically run imperative code, the imperative nature is abstracted behind a declarative model. So it is possible that an imperative change in how the container starts could remove this spectacular property. If the service relies on a database, or data stored on the filesystem, the system becomes non-deterministic. Determinism is a major advantage of declarative programming.
  • Containers are immutable. The contents, once created, cannot be modified except by the container software itself. It is almost like composing an entire distribution, including compilers and library runtimes, as one binary to be run.
  • Containers should be idempotent. A container should be able to be re-run consistently without failing if it started correctly the first time.

Using a host bind mount weakens or destroys the three magic properties of containers.  Docker, Inc. was intuitively aware this was a problem, so they implemented docker data volume containers.  A docker data container is a container that is started once and creates a docker volume.  A docker volume is permanent persistent storage created by the VOLUME operation in a Dockerfile or the --volume command.  Once the data container is created, its docker volume is always available to other docker containers using the --volumes-from operation.

The following operation starts a data container based upon the centos image, creates a data volume called /var/lib/mysql, and finally runs /bin/true, which exits quickly:

docker run -d --name=mariadb_data --volume=/var/lib/mysql centos true

Next the container ID must be retrieved to start the application container:

sudo docker ps -a
CONTAINER ID   IMAGE           COMMAND  CREATED         STATUS                    PORTS  NAMES
56361937ac79    centos:latest  "true"   10 minutes ago  Exited (0) 10 minutes ago        mariadb_data

Next we run the mariadb-app container using the --volumes-from feature. Note docker allows short-hand specification of container IDs, so in this example 56 refers to the centos data container 56361937ac79:

sudo docker run -d --volumes-from=56 -e MARIADB_ROOT_PASSWORD=password kollaglue/centos-rdo-mariadb-app
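Since the data container was created with --name=mariadb_data, --volumes-from also accepts the container name, which avoids the docker ps lookup entirely:

```shell
sudo docker run -d --volumes-from=mariadb_data -e MARIADB_ROOT_PASSWORD=password kollaglue/centos-rdo-mariadb-app
```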

When using data volume containers, all the correct permissions are sorted out by docker automatically. Data is shared between containers. Most importantly it is more difficult to modify the container’s volume contents from outside the container. All of these benefits help preserve the declarative, immutable, and idempotent properties of containers.

We also use data containers for nova-compute in Kolla.  We still continue to use bind mounts in some circumstances.  For example, nova-api needs to run modprobe to load kernel modules.  To support that, we allow bind mounting of /var/lib/modules:/var/lib/modules with the :ro (read-only) flag.
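On the command line, the read-only flag is appended to the bind mount specification (the image name here is illustrative):

```shell
sudo docker run -d -v /var/lib/modules:/var/lib/modules:ro kollaglue/centos-rdo-nova-api
```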

We also continue to have some container-writeable bind mounts.  The nova-libvirt container requires /sys/fs/cgroups:/sys/fs/cgroups to be bind mounted.  Some types of super-privileged containers cannot get away from bind mounts, but most of the Kolla system now runs without them.