How we use CloudInit in OpenStack Heat

Many people over the past year have asked me how exactly to use CloudInit while the Heat developers have implemented OpenStack Heat.  Since CloudInit is the default virtual machine bootstrapping system on Debian, Fedora, Red Hat Enterprise Linux, Ubuntu and likely more distros, we decided to start with CloudInit as our base bootstrapping system.  I’ll present a code walk-through of how we use CloudInit inside OpenStack Heat.

Reading the CloudInit documentation is helpful, but it lacks programming examples of how to develop software to inject data into virtual machines using CloudInit.   The OpenStack Heat project implements injection in Python for CloudInit-enabled virtual machines.  Injection occurs by passing information to the virtual machine that is decoded by CloudInit.

IaaS paltforms require a method for users to pass data into the virtual machine.  OpenStack provides a metadata server which is co-located with the rest of the OpenStack infrastructure   When the virtual machine is booted, it can then make a HTTP request to a specific URI and return the user data passed to the instance during instance creation.

CloudInit’s job is to contact the metadata server and bootstrap the virtual machine with desired configurations.  In OpenStack Heat, we do this with three specific files.

The first file is our CloudInit configuration file:

runcmd:
- setenforce 0 > /dev/null 2>&1 || true
user: ec2-user

cloud_config_modules:
- locale
- set_hostname
- ssh
- timezone
- update_etc_hosts
- update_hostname
- runcmd

# Capture all subprocess output into a logfile
# Useful for troubleshooting cloud-init issues
output: {all: '| tee -a /var/log/cloud-init-output.log'}

This file directs CloudInit to turn off SELinux, install ssh keys for the user ec2-user, setup the locale, hostname, ssh, timezone, modify /etc/hosts with correct information and output the results of all cloud-init data to /var/log/cloud-init-output.log

There are many cloud config modules which provide different functionality.  Unfortunately they are not well documented, so the source must be read to understand their behavior.  For a list of cloud config modules, check the upstream repo.

Another file required by OpenStack Heat’s support for CloudInit is a part handler:

#part-handler
import os
import datetime
def list_types():
return(["text/x-cfninitdata"])

def handle_part(data, ctype, filename, payload):
if ctype == "__begin__":
try:
os.makedirs('/var/lib/heat-cfntools', 0700)
except OSError as e:
if e.errno != errno.EEXIST:
raise
return

if ctype == "__end__":
return

with open('/var/log/part-handler.log', 'a') as log:
timestamp = datetime.datetime.now()
log.write('%s filename:%s, ctype:%s\n' % (timestamp, filename, ctype))

if ctype == 'text/x-cfninitdata':
with open('/var/lib/heat-cfntools/%s' % filename, 'w') as f:
f.write(payload)

The part-handler.py file is executed by CloudInit to separate the UserData provided by the MetaData server in OpenStack.  CloudInit executes handle_part() for each part of a multi-part mime message which CloudInit doesn’t know how to decode.  This is how OpenStack Heat passes unique information for each virtual machine to assist in the orchestration process.  The first ctype is always set to __begin__. which triggers handle_part() to create the directory /var/lib/heat-cfntools.

The OpenStack Heat instance launch code uses the mime type of x-cfninitdata  In OpenStack Heat.  OpenStack Heat passes several files via this mime subtype each of which is decoded and stored in /var/lib/heat-cfntools.

The final file required is a script which runs at first boot:

#!/usr/bin/env python

path = '/var/lib/heat-cfntools'

def chk_ci_version():
    v = LooseVersion(pkg_resources.get_distribution('cloud-init').version)
    return v >= LooseVersion('0.6.0')

def create_log(path):
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0600)
    return os.fdopen(fd, 'w')

def call(args, log):
    log.write('%s\n' % ' '.join(args))
    log.flush()
    p = subprocess.Popen(args, stdout=log, stderr=log)
    p.wait()
    return p.returncode

def main(log):

    if not chk_ci_version():
        # pre 0.6.0 - user data executed via cloudinit, not this helper
        log.write('Unable to log provisioning, need a newer version of'
                  ' cloud-init\n')
        return -1

    userdata_path = os.path.join(path, 'cfn-userdata')
    os.chmod(userdata_path, 0700)

    log.write('Provision began: %s\n' % datetime.datetime.now())
    log.flush()
    returncode = call([userdata_path], log)
    log.write('Provision done: %s\n' % datetime.datetime.now())
    if returncode:
        return returncode

if __name__ == '__main__':
    with create_log('/var/log/heat-provision.log') as log:
        returncode = main(log)
        if returncode:
            log.write('Provision failed')
            sys.exit(returncode)

    userdata_path = os.path.join(path, 'provision-finished')
    with create_log(userdata_path) as log:
        log.write('%s\n' % datetime.datetime.now())

This script logs the output of the execution of /var/lib/heat-cfn/cfnuserdata.

These files are co-located with OpenStack Heat’s engine process which loads these files and combines them plus other OpenStack Heat specific configuration blobs into one multipart mime message.

OpenStack Heat’s UserData generator:

    def _build_userdata(self, userdata):
        if not self.mime_string:
            # Build mime multipart data blob for cloudinit userdata

            def make_subpart(content, filename, subtype=None):
                if subtype is None:
                    subtype = os.path.splitext(filename)[0]
                msg = MIMEText(content, _subtype=subtype)
                msg.add_header('Content-Disposition', 'attachment',
                               filename=filename)
                return msg

            def read_cloudinit_file(fn):
                return pkgutil.get_data('heat', 'cloudinit/%s' % fn)

            attachments = [(read_cloudinit_file('config'), 'cloud-config'),
                           (read_cloudinit_file('part-handler.py'),
                            'part-handler.py'),
                           (userdata, 'cfn-userdata', 'x-cfninitdata'),
                           (read_cloudinit_file('loguserdata.py'),
                            'loguserdata.py', 'x-shellscript')]

            if 'Metadata' in self.t:
                attachments.append((json.dumps(self.metadata),
                                    'cfn-init-data', 'x-cfninitdata'))

            attachments.append((cfg.CONF.heat_watch_server_url,
                                'cfn-watch-server', 'x-cfninitdata'))

            attachments.append((cfg.CONF.heat_metadata_server_url,
                                'cfn-metadata-server', 'x-cfninitdata'))

            # Create a boto config which the cfntools on the host use to know
            # where the cfn and cw API's are to be accessed
            cfn_url = urlparse(cfg.CONF.heat_metadata_server_url)
            cw_url = urlparse(cfg.CONF.heat_watch_server_url)
            is_secure = cfg.CONF.instance_connection_is_secure
            vcerts = cfg.CONF.instance_connection_https_validate_certificates
            boto_cfg = "\n".join(["[Boto]",
                                  "debug = 0",
                                  "is_secure = %s" % is_secure,
                                  "https_validate_certificates = %s" % vcerts,
                                  "cfn_region_name = heat",
                                  "cfn_region_endpoint = %s" %
                                  cfn_url.hostname,
                                  "cloudwatch_region_name = heat",
                                  "cloudwatch_region_endpoint = %s" %
                                  cw_url.hostname])

            attachments.append((boto_cfg,
                                'cfn-boto-cfg', 'x-cfninitdata'))

            subparts = [make_subpart(*args) for args in attachments]
            mime_blob = MIMEMultipart(_subparts=subparts)

            self.mime_string = mime_blob.as_string()

        return self.mime_string

This code provides two functions:

  • make_subpart: Takes a list of attachments and creates mime subparts out of them
  • read_cloudinit_file: Reads OpenStack Heat’s CloudInit three files above

The rest of the function generates the UserData OpenStack Heat needs based upon the attachments list.  These attachments are then turned into a mime message which is passed to the instance creation:

       server_userdata = self._build_userdata(userdata)
        server = None
        try:
            server = self.nova().servers.create(
                name=self.physical_resource_name(),
                image=image_id,
                flavor=flavor_id,
                key_name=key_name,
                security_groups=security_groups,
                userdata=server_userdata,
                meta=tags,
                scheduler_hints=scheduler_hints,
                nics=nics,
                availability_zone=availability_zone)
        finally:
            # Avoid a race condition where the thread could be cancelled
            # before the ID is stored
            if server is not None:
                self.resource_id_set(server.id)

This snippet of code creates the UserData and passes it to the nova server create operation.

The flow is then:

  1. Create user data
  2. Heat creates nova server instance with user data
  3. Nova creates the instance
  4. CloudInit distro initialization occurs
  5. CloudInit reads config from OpenStack metadata server UserData information
  6. CloudInit executes part-handler.py with __start__
  7. CloudInit executes part-handler.py for each x-cfninitdata mime type
  8. part-handler.py writes the contents of each x-cfninitdata mime subpart to /var/lib/heat-cfntools on the instance
  9. CloudInit executes part-handler.py with __end__
  10. CloudInit executes the configuration operations defined by the config file
  11. CloudInit runs the x-shellscript blob which in this case is loguserdata.py
  12. loguserdata.py logs the output of  /var/lib/heat-cfn/cfnuserdata which is the initialization script set in the OpenStack Heat templates

This code walk-through will help developers understand how OpenStack Heat integrates with CloudInit and provide a better understanding of how to use CloudInit in your own Python applications if you roll your own bootstrapping process.

5 thoughts on “How we use CloudInit in OpenStack Heat

  1. Is it enough to install heat-cfntools and cloud-init in the image I use (ubuntu 12.04) to use this with heat or does image have to be modified further for this to work? Do I have to further configure cloud-init or anything else?

    • Cloud-init is configured by heat automatically on instance start when heat-cfntools is in the instance and cfn-init is called in the template.

      • Thanks. I found out I only had to enable EC2 data source in cloud-init, which wasn’t the case by default.

  2. Let me know, by the way, if there’s anything I can do to polish the experience with using cloud-init in Fedora. Would it be helpful to have heat-cfntools in the image to start? (It adds about 550k. I’m trying to pare the image down as much as possible but we can probably spare half a meg. The only new dep added is python-psutil.) (Or do you install that as part of the cloud-init bootstrapping?)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s