Playing with puppets on debian

This article is about puppet (or puppetmaster), a newish tool for “data center automation and configuration management”. If you have to administer more than one server, you probably end up repeating the same tasks over and over again on multiple machines, deploying the same configurations (maybe with minor changes) to your systems and spending an enormous amount of time keeping everything in sync. Well, this is not my definition of fun. But there is hope. I played around for some time with FAI, which is great for installing a large base of servers. Then i wrote my own customized automation to keep the configurations up to date. However, now i’ll give puppet a try. As always, i stick to debian, so all installation instructions are written as they can be performed on a debian server.

Introduction^

Puppet is written in ruby, but it also implements its own – quite mighty – configuration language. It knows classes and defines (kind of functions), is aware of inheritance, can load/include configurations, handles dependencies, can install (software) packages as well as take care of file contents and permissions, is able to manage system users and so on. You can extend it by writing “modules”, which allow you (and others!) to share configurations on an abstract level (you do not give away any information about your internal structure). Ah, yeah, and it works with nearly any linux distribution (which has some kind of package manager), BSD and Mac (i’ve read windows is planned).

When do i need puppet ?^

First off, you need a larger number of servers (at least three). Two are real servers, implementing some services (if you have only one, there is no need to automate..) and one is the puppetmaster. Now there are two usage scenarios: deployment / setup of a new server (with identical or similar configurations) and keeping the configs up2date on your existing machines. Puppet does both, in the same manner.

How does it work ?^

There is the puppetmaster (or multiple instances of it), which is kind of the “brain”. Here you deposit all the configurations and keep track of your administrated servers (there is a web-gui). Then there are your nodes, which have to run the “puppet” client. The client connects on a regular basis (default: 30min) to the master and checks for any changed files / packages. Of course, you can enforce an update at any given time. There is also a dry run mode, which lets you see in advance what would be done, and you can work with branches, like stable or testing, which allow you to test configurations beforehand.

And what is this tutorial about ?^

There is already great documentation, to which i cannot add very much. This text focuses on a practical implementation of a given scenario: maintain two apache web servers, each having an ssh server and some apache modules which share the same configuration, but also some custom virtual hosts. It will cover everything from installation to a working setup.

We will cover both apache servers and the puppetmaster. Let’s begin with the master. I used VirtualBox for my tests and suggest you do not experiment in your live environment. I assume you have a minimal installation on your test system – i started with the businesscard iso from current debian stable (lenny).

The Puppetmaster^

Installation^

On debian stable, i would suggest you use backports. That’s quite easy: append the following to your sources.list file:

deb http://www.backports.org/debian lenny-backports main contrib non-free

Then update and install the required master packages:

#root> aptitude update && aptitude install -t lenny-backports puppetmaster

That’s all. Your master is ready to be configured.

Configuration^

There are many configuration options which can be set, but let’s try it step by step. First off, we want our apache servers to have the OpenSSH server installed, some configurations changed and our authorized_keys file deployed.

One thing in advance: we will name our apache nodes (hostname) “puppet-node-1” and “puppet-node-2” later on, the master will be called “puppetmaster”.

Go to the directory /etc/puppet/manifests, create and edit the site.pp file like so:

#
# SSH CONFIGURATION
#

class ssh {

    package { 'openssh-server':
        ensure => installed
    }

    exec { 'mkdir /root/.ssh':
        path => '/usr/bin:/usr/sbin:/bin',
        onlyif => 'test ! -d /root/.ssh'
    }

    file { 'sshdconfig':
        name => '/etc/ssh/sshd_config',
        owner => root,
        group => root,
        mode  => 644,
        #content => template( 'ssh/sshd_config.erb' ),
        source => "puppet://$server/files/ssh/sshd_config",
        require => Package['openssh-server']
    }

    file { 'authorized_keys':
        name => '/root/.ssh/authorized_keys',
        owner => 'root',
        group => 'root',
        mode => 600,
        content => template( 'ssh/root_key.erb' ),
        require => Package['openssh-server']
    }

    service { 'ssh':
        subscribe => File[ 'sshdconfig' ],
        require => [ Package['openssh-server'], File['sshdconfig', 'authorized_keys'] ]
    }

}
#
# NODE CONFIGURATION
#

node /puppet-node/ {
    include ssh
}

Ok, let’s have a look at what we’ve done here, because it introduces a whole bunch of concepts.

class^

This section defines a new class and gives it the name “ssh”. This is important, because we will refer back to it later on. A class is basically a set of configurations in puppet, which can contain definitions such as files, services, packages and so on.

package^

Now it’s getting interesting. This is the first definition that actually does something. We define the “openssh-server” package, which refers to the debian package of the same name. Then we set “ensure” to “installed”, which basically says what it says: ensure that openssh-server is installed!

exec^

Another definition, which simply executes a command we provide. In this case, it makes a directory called “.ssh” in root’s home folder. We also have to set the path-env, this is important. With the “onlyif” statement, we ensure the command is only executed if the directory does not already exist. Because each node will evaluate this each time it checks whether it is up2date, it is important that the execution runs without any error.
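As an alternative to the exec above, the directory could probably be managed as a resource of its own. The following is only a sketch, not part of the setup described in this article; the mode value is an assumption, and it has not been tested on the puppet version used here:

```puppet
# Sketch: let puppet manage /root/.ssh directly instead of running mkdir.
# This would replace the exec { 'mkdir /root/.ssh': ... } statement.
file { '/root/.ssh':
    ensure => directory,   # create the directory if it is missing
    owner  => 'root',
    group  => 'root',
    mode   => '0700',      # assumed permissions, adjust to your needs
}
```

With this in place, the authorized_keys file could list File['/root/.ssh'] in its “require”, so the directory is always created first and no “onlyif” test is needed.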

file^

There are two file statements. Each defines the existence of a given file. One sets the content of its file via the template-method (a function, in puppet’s terms), the other via the source-statement.

Before i describe template and source, you should have a look at the “require => Package['openssh-server']” statement, which implements the following: the files are to be created only if and when the openssh-server package is installed. With the require statement, puppet will “wait” until the openssh-server package is installed and then execute the file-statements. You should keep this behavior in mind, because it will be omnipresent.

template^

The lookup path for templates is /etc/puppet/templates, therefore the following file has to exist:

  • /etc/puppet/templates/ssh/root_key.erb

(SSH keys could be handled differently and more elegantly, but for the sake of example, let’s do it this way)

I will go into more detail in the actual apache class, for now it is only important that you know: there are two methods to provide contents for a file.

source^

The source here shows two important things:

  • Usage of variables, in this case $server, which contains the server hostname as the client sees it!
  • Usage of the fileserver, puppet://

Let’s begin with the latter. The fileserver is the place where you keep your “static” files, which do not use any variables (as templates might). In our case, the sshd_config, i simply stripped all comments from the debian original file and then put it under /etc/puppet/files/ssh/sshd_config. However, this will not work until you have modified fileserver.conf in /etc/puppet to allow access to the “store” under “files”.

Here is the modified fileserver.conf:

# This file consists of arbitrarily named sections/modules
# defining where files are served from and to whom

# Define a section 'files'
# Adapt the allow/deny settings to your needs. Order
# for allow/deny does not matter, allow always takes precedence
# over deny
[files]
path /etc/puppet/files
# MODIFICATION START {
allow 192.168.0.0/16
# MODIFICATION END }

[plugins]
#  allow *.example.com
#  deny *.evil.example.com
#  allow 192.168.0.0/24

As you can see, i have modified the allow statement, so that it allows access from any LAN IP within 192.168.0.0/16. You could use “*” for all, or a certain IP, or whatever subnet you require.

service^

This is a more abstract concept. In short it says: if any of those files changes (in this case, sshdconfig), restart the ssh server.
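The service statement can also state explicitly that the daemon should be running and started at boot. A minimal sketch, assuming you want both; it would replace the service { 'ssh': ... } statement in the class above, not be added next to it:

```puppet
# Sketch: same subscribe/require logic as above, plus explicit state.
service { 'ssh':
    ensure    => running,                    # start the daemon if it is stopped
    enable    => true,                       # start it at boot as well
    subscribe => File['sshdconfig'],         # restart when sshd_config changes
    require   => Package['openssh-server'],  # only after the package is installed
}
```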

node^

Well, here we define our nodes. Because we used a regular expression, this will apply to any node name matching /puppet-node/, which will be true for our puppet-node-1 and puppet-node-2. Then we say: “include ssh”, which references and implements the class ssh. Easy, isn’t it ?

Instead of a regular expression, we could use the full node name. Assuming your local domain is domain.tld, this might look like this:

node 'puppet-node-1.domain.tld', 'puppet-node-2.domain.tld' {
    include ssh
}

Some basics about the config^

Before going on to setup our first node, let me highlight some of the techniques we used in this configuration and clarify some pitfalls in advance.

File, Package, ..^

Those are not accidentally written in uppercase. Whenever you use an uppercase first letter for a statement, you reference something that already exists (or, without a name, you set defaults for all statements of that type). Whenever it is lowercase, you declare it. Thus, when i write “File[ 'sshdconfig' ]” i reference the “file { 'sshdconfig': .. }” statement. The example in the original docs tries to show this like so:

Exec { path => '/usr/bin:/bin' }

.. will set the path (as default) to ‘/usr/bin:/bin’

exec { 'mkdir /root/.ssh': }

.. will perform the execution and, in this case, use the previously defined path.
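To make the three spellings a bit more tangible, here is a small sketch that puts them side by side. The file /etc/motd and the echo command are made-up examples and not part of the setup in this article:

```puppet
# Capitalized, without a name: defaults for every file statement in this scope.
File {
    owner => 'root',
    group => 'root',
}

# Lowercase: declares a file; owner and group come from the defaults above.
file { '/etc/motd':
    mode    => '0644',
    content => "managed by puppet\n",
}

# Capitalized, with a name in brackets: a reference to the file declared above.
exec { 'echo motd updated':
    path        => '/usr/bin:/bin',
    refreshonly => true,                  # only run when notified
    subscribe   => File['/etc/motd'],     # ... by a change of this file
}
```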

Format^

All statements are in the following form (all but “class” and “node”):

statement { 'name': param => 'value', param2 => "value" }

The Puppet Node^

Now we should set up our first client node. I will call it “puppet-node-1”. Here again, i have started with an empty businesscard-installation. Nothing additional is installed, only the debian-backports-sources are added.

Installation^

All you have to do is install the puppet package:

#root> aptitude install -t lenny-backports puppet

This should take about 50MB (depending on which dependent packages you’ve already installed).

Configuration^

None for now.

First contact^

Now let’s bring those two together. Before i show you how, let’s make a short excursion into puppet’s security.

Security^

There are two approaches to security in puppet. First off, there are the certificates. Each client has to have a signed certificate. Normally this has to be performed manually: the client connects to the server and thereby generates a certificate request, which is sent to the server. On the server, you have to manually sign the certificate. This could be automated (any connecting client’s certificate will be auto-signed), but this of course requires a very secure network. In this example, i will stick to the manual approach. So, it works like this:

#root@client> puppetd --test --waitforcert 60 --server server-hostname

This means: run a test connection to the server and, as long as there is no signed cert yet, ask for it again every 60 seconds. While the client waits, you connect to the puppet master and run the list command:

#root@master> puppetca --list

Which will show you the hostnames of the clients requesting a certificate. Then you can sign the cert like so:

#root@master> puppetca --sign client.domain.tld

That’s it for the certificates. The other security measures are based on firewall-like allows and denies, as you might know them from /etc/hosts.allow or /etc/hosts.deny. Those are all set in /etc/puppet/fileserver.conf. By default, this file is quite well commented and should be no problem.

For more on this topic read here.

The contact^

As described above, we first have to get the client’s cert request signed by the master. Having done this, we are ready to execute the first sync.

#root@puppet-node-1> puppetd --server puppetmaster.domain.tld

You can always revoke/disallow a certificate the same way:

#root@puppetmaster> puppetca --revoke puppet-node-1.domain.tld
puppet-node-2.fritz.box
notice: Revoked certificate with serial 3

Troubleshooting^

If you receive the error/warning “hostname was not match with the server certificate”, you might have to correct the hostname in /etc/hosts (eg puppetmaster.localdomain instead of puppetmaster).

Setup the apache class^

Before we set up the next class, let’s talk about structure. You will probably end up with a lot of classes and other configurations. You should start moving them into separate files. The site.pp file is kind of “the main config file”, which will be loaded. From there, you can include any other class.

Structure^

The structure i used in my first steps is very simple: i’ve added a classes directory below /etc/puppet/manifests.

#> ls -1R /etc/puppet/manifests/
/etc/puppet/manifests/:
classes
site.pp

/etc/puppet/manifests/classes:
ssh.pp

Depending on the complexity of your server structure, you can imagine these sub-directories growing more complex over time.

This is the new content of site.pp:

#
# IMPORT ALL CLASSES
#

import "classes/*"

#
# NODE CONFIGURATION
#

node /puppet-node/ {
    include ssh
}

I will not repeat the node definitions in the further examples, only the files with the actual class implementations.

apache.pp^

Let’s start with the result:

#
# VHOST CONFIGURATIONS
#

define vhost_default() {
    $vhost_name = $name

    file { "/etc/apache2/sites-available/${vhost_name}.conf":
        owner => 'www-data',
        group => 'www-data',
        mode  => 644,
        content => template( 'http/default_vhost.erb' ),
        require => [ Package['apache2'], Package['libapache2-mod-suphp'] ]
    }

    file { "/etc/apache2/sites-enabled/${vhost_name}.conf":
        owner => 'www-data',
        group => 'www-data',
        ensure => "/etc/apache2/sites-available/${vhost_name}.conf",
        require => File[ "/etc/apache2/sites-available/${vhost_name}.conf" ]
    }
}

#
# APACHE CONFIGURATION
#

class apache {

    package { [ 'apache2', 'apache2-mpm-worker', 'apache2-suexec', 'libapache2-mod-removeip', 'libapache2-mod-fcgid', 'libapache2-mod-suphp' ]:
        ensure => installed
    }

    exec {
        'a2enmod removeip':
            onlyif => 'test ! -e /etc/apache2/mods-enabled/removeip.load',
            require => Package[ 'libapache2-mod-removeip' ];
        'a2enmod fcgid':
            onlyif => 'test ! -e /etc/apache2/mods-enabled/fcgid.load',
            require => Package[ 'libapache2-mod-fcgid' ];
        'a2enmod mime_magic':
            onlyif => 'test ! -e /etc/apache2/mods-enabled/mime_magic.load',
            require => Package[ 'apache2' ];
        'a2enmod suphp':
            onlyif => 'test ! -e /etc/apache2/mods-enabled/suphp.load',
            require => Package[ 'libapache2-mod-suphp' ]
    }

    service { 'apache2':
        subscribe => Exec[ 'a2enmod removeip', 'a2enmod fcgid', 'a2enmod mime_magic', 'a2enmod suphp' ],
        require => [ Package['apache2'] ]
    }

}

Ok, this shows a lot of new stuff. Let’s not begin from the top, but with the apache class.

apache class^

package – multi-definition^

Have a look at the package-statement. Here you can see a very quick way to apply the same configuration to multiple packages in a single statement. Basically, you could write a single “package { 'name': ensure => 'installed' }” for each of those packages, but of course, this is much faster and cleaner (i think).
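For comparison, this is roughly what the long form would look like for the first two packages of the list; the remaining four would follow the same pattern. It is meant as an illustration of what the array form saves you, not as something to add to the class:

```puppet
# Long form: one package statement per package, same effect as the array form.
package { 'apache2':
    ensure => installed,
}

package { 'apache2-mpm-worker':
    ensure => installed,
}
```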

exec – compressed-definitions^

Here another technique is introduced. I would call it “compressed statement”, because you do not repeat the “exec” keyword for each statement, but put them all into a single one, separated by semicolons. Because we have different parameters for each, we cannot use the even faster style shown in “package” above.

service^

Here is another example of the subscribe parameter, which can be read as: “restart apache each time any module is newly enabled”. If you use puppet for the initial setup of a node, this ensures apache is restarted after the modules are installed and enabled. If you use puppet to re-configure a node or even add another apache module, this ensures your apache servers will be restarted on each node.
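The same relationship can also be written from the other side: instead of the service subscribing to the exec, the exec can notify the service. A sketch for one of the modules, assuming the rest of the apache class stays as shown above (the path is set explicitly here so the example does not depend on any default):

```puppet
# Sketch: "notify" is the counterpart of "subscribe".
# This would replace the 'a2enmod fcgid' entry in the exec statement above.
exec { 'a2enmod fcgid':
    path    => '/usr/bin:/usr/sbin:/bin',
    onlyif  => 'test ! -e /etc/apache2/mods-enabled/fcgid.load',
    require => Package['libapache2-mod-fcgid'],
    notify  => Service['apache2'],   # restart apache after enabling the module
}
```

Which side carries the relationship is mostly a matter of taste; with notify, a newly added module does not require touching the service statement.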

vhost_default definition / method^

This is completely new. You can think of a define as a kind of method or a user-defined statement (eg file and exec are built-in, not user defined). Before i explain what this define will do, have a look at the modified site.pp file:

#
# IMPORT ALL CLASSES
#

Exec { path => '/usr/bin:/usr/sbin:/bin' }

import "classes/*"

#
# NODE CONFIGURATION
#
node 'puppet-node-1' {
    include ssh, apache
    vhost_default { 'something.tld': }
    vhost_default { 'another.tld': }
}

Here you see the vhost_default statements in use. Each takes a single parameter (its name) and implements an apache virtual host. Back to our apache.pp file.

$vhost_name = $name^

This is a simple example of how you use variables. In this case, i assign the content of $name to $vhost_name ($name is available in most statements and contains whatever you’ve put before the “:” .. here “something.tld” or “another.tld”).

file { “/etc/apache2/sites-available/${vhost_name}.conf”:^

This uses the given $vhost_name variable and defines a file which uses this. The interesting part here is the template where we define the contents of the file. I’ll explain the contents of this template later on in depth, because it is a rather complex matter and i don’t want to jump around in the explanations.

file { “/etc/apache2/sites-enabled/${vhost_name}.conf”:^

Here we define a symlink at “/etc/apache2/sites-enabled/${vhost_name}.conf” which points to “/etc/apache2/sites-available/${vhost_name}.conf”. The parameter you should have a look at is “ensure” – it defines the target of the symlink.
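Besides the implicit $name, a define can also take explicit parameters. The following sketch is not the vhost_default define from this article: vhost_simple, $document_root and $port are made-up names, and the vhost content is written inline instead of coming from a template, only to keep the example self-contained:

```puppet
# Sketch: a define with one required parameter and one with a default value.
define vhost_simple($document_root, $port = 80) {
    file { "/etc/apache2/sites-available/${name}.conf":
        owner   => 'www-data',
        group   => 'www-data',
        mode    => '0644',
        content => "<VirtualHost *:${port}>\n    ServerName ${name}\n    DocumentRoot ${document_root}\n</VirtualHost>\n",
        require => Package['apache2'],
    }
}

# Usage: the name goes before the ":", parameters follow like in any statement.
vhost_simple { 'something.tld':
    document_root => '/var/www/something.tld',
}
```

For a handful of values this might be simpler than an external configuration file; for larger per-host configurations it gets unwieldy quickly.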

The template “http/default_vhost.erb”^

Before i wrote this tutorial, i’ve never encountered puppet, nor did i write any ruby code, so please excuse any stylistic or “worst practice” mistakes i might have made (corrections welcome). Here we go:

<%
yaml_file = '/etc/puppet/node-configs/http/' + name + '.yml'
if File.exist?( yaml_file )
    require "yaml";
    config = YAML::load_file( yaml_file )
%>
<VirtualHost *:80>
    ServerName <%= config[ 'server_name' ] %>
    <% if config.has_key?( 'aliases' ) %>
    ServerAlias <%= config[ 'aliases' ].join( " " ) %>
    <% end %>
    DocumentRoot <%= config[ 'document_root' ] %>
    <% if config.has_key?( 'enable_php' ) && config[ 'enable_php' ] == 1 %>
    suPHP_Engine      On
    suPHP_ConfigPath  /var/php-config/<%= config[ 'server_name' ] %>/ini
    suPHP_UserGroup   <%= config[ 'php_user' ] %> <%= config[ 'php_group' ] %>
    suPHP_AddHandler  application/x-httpd-php     .php .php4 .php5 .php3
    <% end %>
</VirtualHost>
<% else %>
# missing YAML file
<% end %>

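So, what does this do ? In short: it loads a YAML file and renders the configuration from it. If the YAML file does not exist, the content of the vhost file will be “# missing YAML file”. Templates are evaluated on the puppetmaster, so the YAML files have to be stored there, not on the nodes.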

I’ve asked on puppet’s IRC channel about the best practice for this kind of “huge configuration variable source”. Nobody seems to really have used this before (or probably rather: nobody who has used this before was there at the time). That’s why i decided to stick with YAML. Maybe there is a better way, but this will work.

Here is the content of an example YAML file (/etc/puppet/node-configs/http/something.tld.yml):

---

server_name: 'www.something.tld'
aliases: [ 'something.tld', 'blaa.tld' ]
document_root: '/var/www/xxx.something.tld'
enable_php: 1
php_user: webuser1
php_group: webgroup1

And here is the rendered result of the vhost file:

<VirtualHost *:80>
    ServerName www.something.tld

    ServerAlias something.tld blaa.tld

    DocumentRoot /var/www/xxx.something.tld

    suPHP_Engine      On
    suPHP_ConfigPath  /var/php-config/www.something.tld/ini
    suPHP_UserGroup   webuser1 webgroup1
    suPHP_AddHandler  application/x-httpd-php     .php .php4 .php5 .php3

</VirtualHost>

Implications^

So what can we take from this ? In my opinion: puppet is a very mighty and extendable server administration / configuration tool and there are not many scenarios i can think of that are not possible.

Adding the second node^

Well, this is quickly explained:

  • Modify the contents of site.pp on the master:
    #
    # IMPORT ALL CLASSES
    #
    
    Exec { path => '/usr/bin:/usr/sbin:/bin' }
    
    import "classes/*"
    
    #
    # NODE CONFIGURATION
    #
    node 'puppet-node-1' {
        include ssh, apache
        vhost_default { 'something.tld': }
        vhost_default { 'another.tld': }
    }
    
    node 'puppet-node-2' {
        include ssh, apache
        vhost_default { 'otherdomain.tld': }
    }
  • Login to the second node.
  • Install puppet packages
    #root@puppet-node-2> echo "deb http://www.backports.org/debian lenny-backports main contrib non-free" >> /etc/apt/sources.list
    #root@puppet-node-2> aptitude update
    #root@puppet-node-2> aptitude install -t lenny-backports puppet
  • Get certificate
    #root@puppet-node-2> puppetd --server puppetmaster.domain.tld --waitforcert 60 --test
    #root@master> puppetca --sign puppet-node-2.domain.tld
  • Get the config / setup the node
    #root@puppet-node-2> puppetd --server puppetmaster.domain.tld
  • Done

Advanced topics and Experiences^

This section contains some information about usage / configuration / “program behavior” as i experienced it and also handles some advanced topics, which i did not fit into the guided part, because they might confuse the reader. For the experiences, I do not claim that those are correct, but nevertheless they might be helpful to others.

Handling large file sets^

I’d like to deploy a large set of files, representing an apache icon set. It consists of 1000 PNG files and i did not want to build a class with 1000 file entries. I’ve checked with puppet’s IRC and my initial idea was confirmed: don’t do this, use some packaging. So, i see two possibilities (there are probably more, depending on the specific case):

  • Use tar to package them, deploy the archive as a file, then add an exec which depends on that file and untars it.
  • Use the package manager (dpkg or rpm or whatever), put this in your local repo and use the package-statement.
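The first possibility could be sketched roughly like this. All paths and the archive name are assumptions made up for the example, the target directory is expected to exist already, and the archive has to be placed in the fileserver store on the master:

```puppet
# Sketch: ship one archive instead of 1000 single files.
file { '/var/cache/apache-icons.tar.gz':
    owner  => 'root',
    group  => 'root',
    mode   => '0644',
    source => "puppet://$server/files/apache/icons.tar.gz",
}

# Unpack only when the archive was created or has changed.
exec { 'unpack-apache-icons':
    command     => 'tar -xzf /var/cache/apache-icons.tar.gz -C /var/www/icons',
    path        => '/usr/bin:/usr/sbin:/bin',
    refreshonly => true,                                    # do nothing unless notified
    subscribe   => File['/var/cache/apache-icons.tar.gz'],  # ... by a change of the archive
}
```

Note that this only adds and overwrites files; icons removed from the archive would stay on the nodes, which is one reason the package manager route might be the cleaner one.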

Live & staging – multiple environments^

This might be a very common scenario: you want to implement some new feature, and it has to be tested first. Puppet calls this “environments” and can handle it easily. To understand how this works, have a look at the puppet.conf file in /etc/puppet on the master server (example is from the debian default config):

[main]
logdir=/var/log/puppet
vardir=/var/lib/puppet
ssldir=/var/lib/puppet/ssl
rundir=/var/run/puppet
factpath=$vardir/lib/facter
pluginsync=true
templatedir=$confdir/templates

The [main] section holds the default settings, which apply unless another section overrides them. If you require another environment, let’s call it “development”, you can add it like so:

[development]
logdir=/var/log/puppet
vardir=/var/lib/puppet
ssldir=/var/lib/puppet/ssl
rundir=/var/run/puppet
factpath=$vardir/lib/facter
pluginsync=true
templatedir=$confdir/templates
manifest=/usr/share/puppet/development.pp

The only differences are the name (in []) and the new location of the manifest file. The rest is up to you. Because you can use “import” in puppet, you could just write your extensions in the development.pp file and import the site.pp, or write a completely new configuration ..

Further reading^

manifest   = /usr/share/puppet/site.pp
