Monday, August 10, 2015

Just What Is OpenStack Infra?

I work for HP doing two things. By day I work inside the HP firewall setting up and running a CI system for testing HP's OpenStack technology. We call this system Gozer. (By the way, we are hiring.) By night I work upstream (in the Open Source world) with the OpenStack Infrastructure Team setting up and running a CI system for OpenStack developers.

This blog post concerns my work upstream.

One of my chief initiatives since joining the team two years ago has been to make the Puppet codebase used by infra more in line with standards, more reusable, and generally better. I have never attempted to use infra as a testbed for experimental uses of Puppet; I've always tried to apply the best practices known in the community. Of course there are exceptions to this (see all the Ansible stuff). This initiative is codified in a few different specifications accepted by the team (you don't need to read these):

One mark of the success of this ongoing initiative is that I am now in a place where I am recommending parts of our code to other people in my community. Those are the people for whom I intend this blog post. Someone sees a neat part of the Puppet OpenStack 'stuff' and wants to use it, but it needs a patch or a use case covered. This blog post is supposed to provide a high-level overview of what we do, who 'we' are, and the bigger pieces and how they interact with each other. We'll start with a long series of names and definitions.


Naming things is hard

So what is OpenStack? OpenStack is an Open Source software collection for building clouds. The OpenStack Foundation is a nonprofit organization that provides centralized resources to support the effort, both technical (sysadmins) and otherwise (legal, conference organizing, etc.). OpenStack is made up of many components; the simplest example is 'nova', which provides the compute layer of the cloud, i.e. KVM or Xen management.

OpenStack can be installed with Puppet. The Puppet code that does this is called "OpenStack Puppet Modules." These modules install OpenStack services such as nova, glance, and cinder. Their source code is available by searching for openstack/puppet-*. The team that develops this code is called the OpenStack Puppet Module Team. This team uploads to the forge under the namespaces 'openstack' or 'stackforge.'

I do not work with these modules on a daily basis.

I work with the OpenStack Infrastructure Team. This team deploys and maintains the CI system used by OpenStack upstream developers. We have our own set of Puppet modules that are completely unrelated to the OpenStack Puppet Modules. Their source code can be found by searching for openstack-infra/puppet-*. These modules are uploaded under the forge namespaces 'openstackci' and 'openstackinfra.' We use these modules to deploy services like Gerrit, Jenkins, and Drupal. We also have a number of utility modules useful for generic Linux administration. We have Precise, Trusty, CentOS 6, and various Fedora flavors in our infrastructure, so our modules often have good cross-platform support.

Central Nexus


All the openstack-infra/puppet-* modules are consumed from master by our 'central nexus' repository: system-config. System-config uses a second repository for flat files: project-config. System-config contains node definitions, public hiera data (soon), a few utility scripts, a modules.env file, and a single module called 'openstack_project' to stick 'roles' in. The more 'core' roles in openstack_project call out to another repo: puppet-openstackci. The secrets are stored in a hiera directory that is not public.


Crude Drawing


The crude drawing above shows a typical flow. A node definition lives in site.pp, which includes a role class from openstack_project, which includes a role class from the openstackci module, which then uses resources and classes from the other modules, in this case puppet-iptables.
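
The chain in the drawing can be sketched in Puppet roughly as follows. This is a minimal illustration, not the actual system-config code: the node name and the classes 'openstack_project::example_role' and 'openstackci::example_service' are hypothetical names made up for this sketch. Only the '::iptables' class and its public_tcp_ports parameter come from the real example later in this post.

```puppet
# site.pp: the node definition only picks a role.
# (hypothetical node name and role class)
node 'ci.example.org' {
  class { '::openstack_project::example_role': }
}

# openstack_project/manifests/example_role.pp
# Site-specific wrapper: hard-codes the choices that only make sense for us.
class openstack_project::example_role {
  class { '::openstackci::example_service':
    public_tcp_ports => ['80', '443'],
  }
}

# openstackci/manifests/example_service.pp
# Reusable role (hypothetical): parameterized so other sites can consume it.
class openstackci::example_service (
  $public_tcp_ports = [],
) {
  # Delegates the actual work to a service-specific module.
  class { '::iptables':
    public_tcp_ports => $public_tcp_ports,
  }
}
```

The point of the layering is that everything specific to OpenStack Infra stays in the top two layers, so the lower layers remain usable by other people.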

There are other code paths too. Sometimes, often in fact, an openstack_project role will include openstack_project::server or openstack_project::template. These classes wrap up most of the 'basics' of Linux administration. Template or server will go on to include more resources.

There are multiple places to integrate here. At the most basic, a Puppet user could include our puppet-iptables module in their modulepath and start using it. An individual who wants a Jenkins server or another server like ours could use openstackci and its dependencies and write their own openstack_project-style wrapper classes to include openstackci classes.

We do not encourage site.pp or openstack_project classes to be extended at this time. Instead, we encourage features or compatibility extensions to be put into openstackci or the service-specific modules themselves. This is a work in progress: some important logic still lives in openstack_project and should be moved out. A stretch goal is to move to a place where all of OpenStack Infra runs out of openstackci, providing only a hiera yaml file to set parameters.
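
If you go the wrapper route described above, a minimal sketch might look like the following. Everything here is an assumption for illustration: 'mysite' is a stand-in for your own module, 'openstackci::example_jenkins' and its vhost_name parameter are placeholders rather than real openstackci names, and the hiera key is invented. Check the openstackci module for the classes and parameters it actually provides.

```puppet
# mysite/manifests/jenkins.pp
# A hypothetical wrapper in your own module. It plays the role that
# openstack_project plays for OpenStack Infra.
class mysite::jenkins {
  # hiera() looks up a site-specific value in your hiera data.
  # The second argument is the default used when the key is absent.
  $vhost_name = hiera('mysite::jenkins::vhost_name', $::fqdn)

  # Placeholder class and parameter names, not the real openstackci API.
  class { '::openstackci::example_jenkins':
    vhost_name => $vhost_name,
  }
}
```

Keeping your site-specific values in hiera and your wrapper this thin is the same direction we are trying to move in ourselves.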

Continuous Deployment

A note about modules.env: OpenStack-infra has a modules.env file instead of a Puppetfile. This file contains the location, name, and ref of git repositories to put inside the modulepath on the Puppetmaster. OpenStack infra deploys all of its own Puppet modules from master, so any change to any module can break the whole system. We counteract this danger by having lots of testing and code review before any change goes through.

A note about project-config: One of the patterns we use in OpenStack Infra is to push our configuration into flat files as much as possible. We have one repository, project-config, which holds files that control the behaviour of our services. Puppet's job is only to copy files out of the repo and into the correct location. This makes it easier for people to find these often-changed files, and means we can give more people access to merge code there than we would with our system-config repository.
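
As a rough sketch of the 'Puppet only copies files' pattern, assuming project-config is already checked out on the node: both paths below are hypothetical, and the real manifests may do this differently.

```puppet
# Copy one flat file from a local checkout of project-config into place.
# Both paths are hypothetical. There is no templating: the file is
# installed byte-for-byte as it exists in the repository.
file { '/etc/example-service/layout.yaml':
  ensure => file,
  owner  => 'root',
  group  => 'root',
  mode   => '0644',
  source => '/opt/project-config/example-service/layout.yaml',
}
```

Because the content lives entirely in the flat file, a change to the service's behaviour is a change to project-config and never touches Puppet code.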

A note about puppet agent: We run puppet agent, but it is fired from the Puppetmaster by an Ansible run. We hope to move to puppet apply triggered by Ansible soon.


The part where I give you things

There are two modules right now that you might be interested in using yourself. The first is our puppet-httpd module. This module was forked from puppetlabs-apache at version 0.0.4. It has seen some minor improvements from us but nothing major, other than a name change from 'apache' to 'httpd'. You can see why we forked in the README of the project, but the kicker is that this module allows you to use raw 'myhost.vhost.erb' templates with Apache. You no longer need to know how to translate the Apache syntax you want into puppetlabs-apache parameters. Let's see what this looks like:

openstack_project/templates/status.vhost.erb:
# ************************************
# Managed by Puppet
# ************************************

NameVirtualHost <%= @vhost_name %>:<%= @port %>
<VirtualHost <%= @vhost_name %>:<%= @port %>>
  ServerName <%= @srvname %>
<% if @serveraliases.is_a? Array -%>
<% @serveraliases.each do |name| -%><%= " ServerAlias #{name}\n" %><% end -%>
<% elsif @serveraliases != '' -%>
<%= " ServerAlias #{@serveraliases}" %>
<% end -%>
  DocumentRoot <%= @docroot %>

  Alias /bugday /srv/static/bugdaystats
  <Directory /srv/static/bugdaystats>
      AllowOverride None
      Order allow,deny
      allow from all
  </Directory>

  Alias /reviews /srv/static/reviewday
  <Directory /srv/static/reviewday>
      AllowOverride None
      Order allow,deny
      allow from all
  </Directory>

  Alias /release /srv/static/release

  <Directory <%= @docroot %>>
    Options <%= @options %>
    AllowOverride None
    Order allow,deny
    allow from all
  </Directory>

  # Sample elastic-recheck config file, adjust prefixes
  # per your local configuration. Because these are nested
  # we need the more specific one first.
  Alias /elastic-recheck/data /var/lib/elastic-recheck
  <Directory /var/lib/elastic-recheck>
      AllowOverride None
      Order allow,deny
      allow from all
  </Directory>

  RedirectMatch permanent ^/rechecks(.*) /elastic-recheck
  Alias /elastic-recheck /usr/local/share/elastic-recheck
  <Directory /usr/local/share/elastic-recheck>
      AllowOverride None
      Order allow,deny
      allow from all
  </Directory>


  ErrorLog /var/log/apache2/<%= @name %>_error.log
  LogLevel warn
  CustomLog /var/log/apache2/<%= @name %>_access.log combined
  ServerSignature Off
</VirtualHost>



::httpd::vhost { 'status.openstack.org':
  port     => 80,
  priority => '50',
  docroot  => '/srv/static/status',
  template => 'openstack_project/status.vhost.erb',
  require  => File['/srv/static/status'],

}


If you don't need a custom vhost template and just want to serve a directory, you can leave out the template parameter:

::httpd::vhost { 'tarballs.openstack.org':
  port     => 80,
  priority => '50',
  docroot  => '/srv/static/tarballs',
  require  => File['/srv/static/tarballs'],

}
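
Both vhost examples require a File resource for the docroot, which is not shown above. A minimal sketch of one, where the ownership and mode are assumptions for illustration and not necessarily what we use in production:

```puppet
# The docroot that the tarballs vhost depends on via
# require => File['/srv/static/tarballs'].
# Owner, group, and mode are illustrative assumptions.
# Parent directories such as /srv/static must be managed elsewhere.
file { '/srv/static/tarballs':
  ensure => directory,
  owner  => 'root',
  group  => 'root',
  mode   => '0755',
}
```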

The second is puppet-iptables, which lets you pass raw iptables rules to a Puppet class and have those rules applied. You can also specify the ports to open up. Again this is an example of weak modeling. Concat resources around specific rules are coming soon in this change. Let's see what using the iptables module looks like:

class { '::iptables':
  public_tcp_ports => ['80', '443', '8080'],
  public_udp_ports => ['2003'],
  rules4           => ['-m state --state NEW -m tcp -p tcp --dport 8888 -s somehost.openstack.org -j ACCEPT'],
  rules6           => ['-m state --state NEW -m tcp -p tcp --dport 8888 -s somehost.openstack.org -j ACCEPT'],
}


This enables you to manage iptables the way you view iptables. It is easy to debug, easy to reason about, and extensible. We think it provides a significant advantage over the puppetlabs-firewall module. Unfortunately, the puppet-iptables module is currently hardcoded to open up certain OpenStack hosts. That should be fixed very soon (possibly by you!). Both of these modules try to be as simple as possible.

Getting these modules right now is done through git. If you don't want to ride the 'master' train with us, you can hop in #openstack-infra on freenode and ask for a tag to be created at the revision you need. We're working on getting forge publishing into the pipeline. It's not a priority for us right now, but if you need it you can ask for it and we can see about increasing focus there.

These are two generic modules coming out of OpenStack Infra that advance the Puppet ecosystem, and we hope there will be more to come. If you'd like to help us develop these modules, we'd love the help. You can start learning how to contribute to OpenStack here.
