The Heat API – A template-based orchestration framework

Over the last year, Angus Salkeld and I have been developing an IaaS high-availability service called Pacemaker Cloud.  We learned that the problem we were really solving was orchestration.  Another dev group inside Red Hat was looking at the same problem from the launching side.  We decided to take two weeks off from our existing work and see if we could join together to create a proof-of-concept implementation, from scratch, of AWS CloudFormation for OpenStack.  The result of that work was a proof of concept that could launch a WordPress template, as we had done in our previous project.

The developers decided to take another couple of weeks to determine whether we could build a more functional system that could handle composite virtual machines.  Today, we released that version, the second iteration of the Heat API.  Since we now have many more developers, and a project that exceeds the functionality of Pacemaker Cloud, the Heat development community has decided to cease work on our previous orchestration projects and focus our efforts on Heat.

A bit about Heat:  The Heat API implements the AWS CloudFormation API.  This API provides a REST interface for creating composite groups of VMs, called Stacks, from template files.  The goal of the software is to accurately launch AWS CloudFormation Stacks on OpenStack.  We will also enable good-quality high availability based upon the technologies we created in Pacemaker Cloud, including escalation.
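To give a feel for that REST interface, here is a minimal sketch of a CreateStack request against the CloudFormation-style query API.  The endpoint URL and stack name are placeholders, the template is trimmed to a single parameter with no resources, and authentication/signing (which a real call would need) is omitted; Action, StackName, and TemplateBody are the standard CloudFormation request fields.

    import json
    import requests  # assumes the requests library is available

    # Trimmed CloudFormation-style template: one parameter, no resources shown.
    template = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Minimal WordPress-style stack (illustrative only)",
        "Parameters": {
            "InstanceType": {"Type": "String", "Default": "m1.small"}
        },
    }

    # Placeholder endpoint for a Heat deployment speaking the CloudFormation API.
    endpoint = "http://heat.example.com:8000/v1/"

    response = requests.post(endpoint, data={
        "Action": "CreateStack",          # CloudFormation action name
        "StackName": "wordpress-demo",    # arbitrary placeholder stack name
        "TemplateBody": json.dumps(template),
    })
    print(response.status_code, response.text)

The point of the example is simply that a Stack is described entirely by a template file and created with a single API call; everything else (ordering, dependencies, launching) is the orchestration engine's job.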

Given that C was a poor choice of implementation language for making REST-based cloud services, Heat is implemented in Python, which is fantastic for REST services.  The Heat API also follows OpenStack design principles.  Our initial design, written after our POC, shows the basics of our architecture, and our quickstart guide can be used with our second-iteration release.

A mailing list is available for developer and user discussion.  We track milestones and issues using GitHub’s issue tracker.  Things are moving fast – come join our project on GitHub or chat with the devs in #heat on freenode!

Corosync 2.0.0 released!

A few short weeks after Corosync 1.0.0 was released, the developers huddled to plan the future of Corosync 2.0.0.  The major focus of that meeting was “Corosync as implemented is too complicated”.  We had threads, semaphores, mutexes, an entire protocol, plugins, a bunch of unused services, a backwards compatibility layer, and multiple cryptographic engines.

Going for us, we did have a world-class group communication system implementation (if a little complicated), developed by a large community of developers, battle-hardened by thousands of field deployments, and tested by tens of thousands of community members.

As a result of that meeting, we decided to keep the good and throw out the bad, as we did in the transition from openais to corosync.  Gone are threads.  Gone are compatibility layers.  Gone are plugins.  Gone are unsupported encryption engines.  Gone is a bunch of other user-invisible junk that was crudding up the code base.

Shortly after Corosync 2.0.0 development started, Angus Salkeld had the great idea of taking the infrastructure in corosync (IPC, logging, timers, poll loop, shared memory, etc.) and putting it into a new project called libqb.  The objective of this work was obvious: to create a world-class infrastructure library specifically focused on the needs of cluster developers, with a great built-in make-check test suite.

This brought us even closer to our goal of simplification.  As we pushed the infrastructure out of the Corosync base, we could focus more on the protocols and APIs.  You might be surprised to learn that implementing the infrastructure took about as much effort as the rest of the system (the APIs and Totem).

All of this herculean effort wouldn’t have been possible without our developer and user community.  I’d especially like to acknowledge Jan Friesse for his leadership in coordinating the upstream release process and driving the upstream feature set to its 2.0.0 resolution.  Angus Salkeld was invaluable for his huge libqb effort, which was delivered on time and with great quality.  Finally, I want to thank Fabio Di Nitto for beating various parts of the Corosync code base into submission and for his special role in designing the votequorum API.  There are many other contributors, including developers and testers, whom I won’t mention individually but whom I’d also like to thank for their improvements to the code base.

Great job devs!!  Now it’s up to the users of Corosync to tell us if we delivered on the objective we set out with 18 months ago – making Corosync 2.0 faster, simpler, smaller, and, most importantly, higher quality.

The software can be downloaded from Corosync’s website.  Corosync 2.0, as well as the rest of the improved community-developed cluster stack, will show up in distros as they refresh their stacks.