Adding second monitoring method to Pacemaker Cloud – sshd

Recently, Angus Salkeld and I decided to start working on a second approach to Pacemaker Cloud monitoring. Today we monitor with Matahari. We would also like the ability to monitor with OpenSSH’s sshd. With this model, sshd becomes a second monitoring agent in addition to Matahari. Since sshd is everywhere, and everyone is comfortable with the SSH security model, we believe this makes a superb alternative monitoring solution.

To help kick that work off, I’ve started a new branch called topic-ssh in our git repository, where this code will live.

To summarize the work, we are taking the dped binary and building a second, libssh2-specific binary based on it. We will also integrate directly with libdeltacloud as part of this work. The output of this topic branch will be the major work in the 0.7.0 release.
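
As a rough illustration of the idea of sshd acting as a monitoring agent, here is a minimal sketch that runs a check command on a guest over SSH and maps the exit status to a health state. It is not the project's code: the real work is a C binary built on libssh2, while this sketch shells out to the ssh client for brevity. The function names, the health states, and the choice of check command are all assumptions made for illustration. The one fact it relies on is that ssh exits with the remote command's status, or with 255 if the connection itself fails.

```python
import subprocess
from typing import Callable, List


def build_ssh_command(host: str, remote_command: str, timeout_s: int = 5) -> List[str]:
    """Build an ssh invocation that never prompts and gives up quickly."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                 # fail instead of asking for a password
        "-o", f"ConnectTimeout={timeout_s}",   # bound the connection attempt
        host,
        remote_command,
    ]


def classify(exit_status: int) -> str:
    """Map an ssh exit status to a (hypothetical) health state."""
    if exit_status == 0:
        return "healthy"
    if exit_status == 255:
        # ssh itself failed; a remote command exiting 255 would look the same.
        return "unreachable"
    return "failed"


def check(host: str, remote_command: str,
          run: Callable[[List[str]], int] = subprocess.call) -> str:
    """Run the check over ssh and classify the result.

    `run` is injectable so the logic can be exercised without a real host.
    """
    return classify(run(build_ssh_command(host, remote_command)))
```

A caller would invoke something like check("guest1", "service httpd status") on a timer, where both the host name and the remote command are placeholders.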

We looked at Python as the language for dped, but testing showed that it was not feasible without drastically complicating our operating model. Our model runs thousands of dpe processes on one system, one dpe per deployable, so Python would need a small footprint. In our tests, Python consumed 15 times as much memory per dpe instance as a comparable C binary.

We think there are many opportunities for people without a strong C skillset, but with a strong Python skillset, to contribute tremendously to the project in the CPE component. We plan to rework the CPE process into a Python implementation.

If you want to get involved in the project today, working on the CPE C++ to Python rework would be a great place to start!

Release schedule for Corosync Needle (2.0)

Over the last 18 months, the Corosync development community has been hard at work making Corosync Needle (version 2.0.0) a reality. This release is an evolutionary step for Corosync: it adds several community-requested features, removes the troubling threads and plugins, and tidies up the quorum code base.

I would like to point out the diligent work of Jan Friesse (Honza) in tackling the 15 or so backlog items on our feature list. Angus Salkeld has taken the architectural step of moving the infrastructure of Corosync (ipc, logging, and other infrastructure components) into a separate project (http://www.libqb.org). Finally, I’d like to point out the excellent work of Fabio Di Nitto and his cabal in tackling the quorum code base to make it truly usable for bare-metal clusters.

The release schedule is as follows:

Milestone        Date                 Version
Alpha            January 17, 2012     1.99.0
Beta             January 31, 2012     1.99.1
RC1              February 7, 2012     1.99.2
RC2              February 14, 2012    1.99.3
RC3              February 20, 2012    1.99.4
RC4              February 27, 2012    1.99.5
RC5              March 6, 2012        1.99.6
Release 2.0.0    March 13, 2012       2.0.0