Try this:
curl -L http://git.io/cqaaQQ | sh
Debian Squeeze:

# wget http://apt.puppetlabs.com/puppetlabs-release-squeeze.deb
# dpkg -i puppetlabs-release-squeeze.deb
# apt-get update

Ubuntu Precise:

# wget http://apt.puppetlabs.com/puppetlabs-release-precise.deb
# dpkg -i puppetlabs-release-precise.deb
# apt-get update
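With the repository in place, installing the agent itself is one more step (assuming the stock puppet package is what you want):

# apt-get install puppet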
I was asked a pretty reasonable question about puppet:
Can I get access to the time in Puppet without resorting to writing a fact?
This seemingly reasonable task is not so easy to do. Puppet does not define this for you. However, we can use the inline_template() function for great good.
For the uninitiated, inline_template() calls out to the ERB templating processor without requiring a template file. This is very useful for simple file resources, or in conjunction with the file_line resource from the Puppet Labs stdlib module.
file { '/etc/motd':
  ensure  => file,
  content => inline_template("Welcome to <%= @hostname %>"),
}
However, we can abuse this by doing anything in ERB that we can do in Ruby. A friend of mine, familiar with the Jinja templating system from Python, remarked: 'so it's like PHP, and I'm not even being derogatory.' This means we can access time using Ruby's built-in time classes.
$time = inline_template('<%= Time.now %>')

However, this is being evaluated on the Puppet master, not the node. So if the two are in different timezones, what then? The first way to improve this is to use UTC.
$time = inline_template('<%= Time.now.utc %>')

But we can actually go further and define two variables: one for the UTC time of catalog compilation, and one for the local time of the checking-in node. While we don't have a fact for the time on the node, we do have a fact for its timezone.
$time_utc   = inline_template('<%= Time.now.utc %>')
$time_local = inline_template("<%= (@time + Time.zone_offset(@timezone)).strftime('%c') %>")

We use strftime('%c') to strip the UTC timezone label off of the time.
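These land in ordinary variables, so they can be dropped into any resource; a small illustrative use (the motd wording here is mine, not from the post):

file { '/etc/motd':
  ensure  => file,
  content => inline_template("Catalog compiled at <%= @time_utc %> (node local time: <%= @time_local %>)\n"),
}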
Going further:
We can take this a step further by using the inline_template() function to do time comparisons:
$go_time = "2013-09-24 08:00:00 UTC" # We could seed this in hiera! $time = inline_template("<%= Time.now.utc %>") if str2bool(inline_template("<%= Time.now.utc > Time.parse(@go_time) %>")){ notify { "GO GO GO: Make the changes!": } }What the above code does is gate changes based on time. This allows us to 'light the fuses' and only make changes to production after a certain time, after a downtime window begins for instance. Note that the Puppet clients are still going to check in on their own schedule, but since we know what time our downtime is starting, we can use Puppet to make sure they kick off a Puppet run at the right time.
$go_time = "2013-09-24 08:00:00 UTC" # We could seed this in hiera! $done_time = "2013-09-24 12:00:00 UTC" # We could seed this in hiera, too! $time = inline_template("<%= Time.now.utc %>") if str2bool(inline_template("<%= Time.now.utc > Time.parse(@go_time) %>")){ notify { "GO GO GO: Make the changes!": } cron { 'fire off the puppet run': command => 'puppet agent --no-daemonize', day => '24', # we can seed the date here in hiera, albiet more verbosely hour => '8', minute => '1', user => 'root', ensure => 'absent', } } else { cron { 'fire off the puppet run': command => 'puppet agent --no-daemonize', day => '24', # we can seed the date here in hiera, albiet more verbosely hour => '8', minute => '1', user => 'root', ensure => 'present', } }What is this doing? We put this code out at 4pm. Get some dinner. Log in at 7:30 pm and wait for our 8:00 pm downtime. In Puppet runs before 8:00 pm a cronjob will be installed that will effecat a Puppet run precisely one minute after the downtime begins. In all Puppet runs before 8:00 pm, the resources that are potentially hazardous are passed over. But in all Puppet runs after 8:00 pm, the new state is ensured and the cronjob is removed. Then this code, which should define the new state of the system, can be hoisted into regular classes and defined types.
:hierarchy:
  - defaults
  - %{clientcert}
  - %{environment}
  - global

In the new way the hierarchy has been renamed categories, and each level of it is a category.
We can define category precedence in the system-wide hiera.yaml, the module-specific hiera.yaml, and the binder_config.yaml.
The following binder_config.yaml will effectively insert the species category into the category listing:
---
version: 1
layers: [{name: site, include: 'confdir-hiera:/'},
         {name: modules, include: ['module-hiera:/*/', 'module:/*::default'] }
        ]
categories: [['node', '${fqdn}'],
             ['environment', '${environment}'],
             ['species', '${species}'],
             ['osfamily', '${osfamily}'],
             ['common', 'true']
            ]

This means we can use the species category if one is defined in a module. An example hiera.yaml from such a module is:
---
version: 2
hierarchy: [['osfamily', '$osfamily', 'data/osfamily/$osfamily'],
            ['species', '$species', 'data/species/$species'],
            ['environment', '$environment', 'data/env/$environment'],
            ['common', 'true', 'data/common']
           ]

Which means when we run Puppet...
root@hiera-2:/etc/puppet# FACTER_species='human' puppet apply modules/startrek/tests/init.pp
Notice: Compiled catalog for hiera-2.green.gah in environment production in 1.07 seconds
Notice: janeway commands the voyager
Notice: /Stage[main]/Startrek/Notify[janeway commands the voyager]/message: defined 'message' as 'janeway commands the voyager'
Notice: janeway is always wary of the section 31
Notice: /Stage[main]/Startrek/Notify[janeway is always wary of the section 31]/message: defined 'message' as 'janeway is always wary of the section 31'
Notice: Finished catalog run in 0.11 seconds

You can see full example code in the startrek module.
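For context, a species data file consumed by that hierarchy would look something like this (the key names are illustrative, not copied from the startrek module):

# modules/startrek/data/species/human.yaml
startrek::captain: 'janeway'
startrek::ship: 'voyager'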
You can pre-order my book, Pro Puppet 2nd Ed, here.
...
case $::kernel {
  'linux': {
...

This caught my eye because I had been explicitly capitalizing the 'L' in the $::kernel fact for years. I thought to myself "Is the fact capitalized?"
zeratul:~# facter -p kernel
Linux

What's going on here? Is the case operator insensitive?
case $::kernel {
  'sunos': { notify { $::kernel: } }
}

notice: SunOS
notice: /Stage[main]//Notify[SunOS]/message: defined 'message' as 'SunOS'

Wow. Is the '==' operator in Puppet case-insensitive as well?
if $::kernel == 'sunos' {
  notify { 'lasers': }
}

notice: lasers
notice: /Stage[main]//Notify[lasers]/message: defined 'message' as 'lasers'

Is this a problem with facter or puppet?
if "YES" == "yes" { notify { "false is true": } } notice: false is true notice: /Stage[main]//Notify[false is true]/message: defined 'message' as 'false is true'Seriously? Yep. Turns out the '==' operator is case-insensitive. The '=~' is case-sensitive, but you have to use regular expression syntax in order to use it:
if "YES" =~ /^yes$/ { notify { "false is true": } } notice: Finished catalog run in 1.30 secondsNote that we should use '^$' to enclose the string so we don't accidentally get a substring match.
Tested on Puppet 2.7.x and 3.2.x
The command syntax is:
openssl x509 -in <certificate> -text

A full example:
nibz@host $ openssl x509 -in /etc/ssl/certs/Verisign_Class_1_Public_Primary_Certification_Authority.pem -text
Certificate:
    Data:
        Version: 1 (0x0)
        Serial Number:
            3f:69:1e:81:9c:f0:9a:4a:f3:73:ff:b9:48:a2:e4:dd
        Signature Algorithm: sha1WithRSAEncryption
        Issuer: C=US, O=VeriSign, Inc., OU=Class 1 Public Primary Certification Authority
        Validity
            Not Before: Jan 29 00:00:00 1996 GMT
            Not After : Aug  2 23:59:59 2028 GMT
        Subject: C=US, O=VeriSign, Inc., OU=Class 1 Public Primary Certification Authority
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (1024 bit)
                Modulus:
                    00:e5:19:bf:6d:a3:56:61:2d:99:48:71:f6:67:de:
                    b9:8d:eb:b7:9e:86:80:0a:91:0e:fa:38:25:af:46:
                    88:82:e5:73:a8:a0:9b:24:5d:0d:1f:cc:65:6e:0c:
                    b0:d0:56:84:18:87:9a:06:9b:10:a1:73:df:b4:58:
                    39:6b:6e:c1:f6:15:d5:a8:a8:3f:aa:12:06:8d:31:
                    ac:7f:b0:34:d7:8f:34:67:88:09:cd:14:11:e2:4e:
                    45:56:69:1f:78:02:80:da:dc:47:91:29:bb:36:c9:
                    63:5c:c5:e0:d7:2d:87:7b:a1:b7:32:b0:7b:30:ba:
                    2a:2f:31:aa:ee:a3:67:da:db
                Exponent: 65537 (0x10001)
    Signature Algorithm: sha1WithRSAEncryption
         58:15:29:39:3c:77:a3:da:5c:25:03:7c:60:fa:ee:09:99:3c:
         27:10:70:c8:0c:09:e6:b3:87:cf:0a:e2:18:96:35:62:cc:bf:
         9b:27:79:89:5f:c9:c4:09:f4:ce:b5:1d:df:2a:bd:e5:db:86:
         9c:68:25:e5:30:7c:b6:89:15:fe:67:d1:ad:e1:50:ac:3c:7c:
         62:4b:8f:ba:84:d7:12:15:1b:1f:ca:5d:0f:c1:52:94:2a:11:
         99:da:7b:cf:0c:36:13:d5:35:dc:10:19:59:ea:94:c1:00:bf:
         75:8f:d9:fa:fd:76:04:db:62:bb:90:6a:03:d9:46:35:d9:f8:
         7c:5b
-----BEGIN CERTIFICATE-----
MIICPDCCAaUCED9pHoGc8JpK83P/uUii5N0wDQYJKoZIhvcNAQEFBQAwXzELMAkG
A1UEBhMCVVMxFzAVBgNVBAoTDlZlcmlTaWduLCBJbmMuMTcwNQYDVQQLEy5DbGFz
cyAxIFB1YmxpYyBQcmltYXJ5IENlcnRpZmljYXRpb24gQXV0aG9yaXR5MB4XDTk2
MDEyOTAwMDAwMFoXDTI4MDgwMjIzNTk1OVowXzELMAkGA1UEBhMCVVMxFzAVBgNV
BAoTDlZlcmlTaWduLCBJbmMuMTcwNQYDVQQLEy5DbGFzcyAxIFB1YmxpYyBQcmlt
YXJ5IENlcnRpZmljYXRpb24gQXV0aG9yaXR5MIGfMA0GCSqGSIb3DQEBAQUAA4GN
ADCBiQKBgQDlGb9to1ZhLZlIcfZn3rmN67eehoAKkQ76OCWvRoiC5XOooJskXQ0f
zGVuDLDQVoQYh5oGmxChc9+0WDlrbsH2FdWoqD+qEgaNMax/sDTXjzRniAnNFBHi
TkVWaR94AoDa3EeRKbs2yWNcxeDXLYd7obcysHswuiovMaruo2fa2wIDAQABMA0G
CSqGSIb3DQEBBQUAA4GBAFgVKTk8d6PaXCUDfGD67gmZPCcQcMgMCeazh88K4hiW
NWLMv5sneYlfycQJ9M61Hd8qveXbhpxoJeUwfLaJFf5n0a3hUKw8fGJLj7qE1xIV
Gx/KXQ/BUpQqEZnae88MNhPVNdwQGVnqlMEAv3WP2fr9dgTbYruQagPZRjXZ+Hxb
-----END CERTIFICATE-----
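When you only care about one field, the -noout flag plus a selector keeps the output short; for example, pulling just the validity window of the same certificate:

nibz@host $ openssl x509 -in /etc/ssl/certs/Verisign_Class_1_Public_Primary_Certification_Authority.pem -noout -dates
notBefore=Jan 29 00:00:00 1996 GMT
notAfter=Aug  2 23:59:59 2028 GMT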
First create yourself a rather small key:
ssh-keygen -t rsa -b 1024

It will ask you some questions; hopefully you've seen this dialog before. If you need help please feel free to comment or privately message me.
After the key has been created, copy the public key string into your copy buffer.
> cat .ssh/nibz_cisco@shadow.cat.pdx.edu.pub
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQDMuKvC5ZVRuQw6YF5xnMZLopBVbQv5jxgHcR6BWfws3lTaqfSrKUlp3BulxA7P2snphcavf4TS+bNHFd9PKGRVpoQ8ERZtXn1+f008XUN3cxYMZXLB18ae7kfm8Sxk/bO4xWGaQAKc7jkIQY4OLIE0TsKTZGux241N6BNeLGmuLQ== nibz@shadow.cat.pdx.edu
Now add the key to cisco. This assumes the user has already been created properly. It also assumes you are running the following version of IOS:
* 1 52 WS-C3750G-48TS 15.0(2)SE C3750-IPBASEK9-M

I have tried this on a 15.0(1) and it didn't work. Configuration commands:
fab6017a#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
fab6017a(config)#ip ssh pubkey
fab6017a(config)#ip ssh pubkey-chain
fab6017a(conf-ssh-pubkey)#username nibz
fab6017a(conf-ssh-pubkey-user)#key-string
fab6017a(conf-ssh-pubkey-data)#$snphcavf4TS+bNHFd9PKGRVpoQ8ERZtXn1+f008XUN3cxYMZXLB18ae7kfm8Sxk/bO4xWGaQAKc7jkIQY4OLIE0TsKTZGux241N6BNeLGmuLQ== nibz@shadow.cat.pdx.edu
fab6017a(conf-ssh-pubkey-data)#exit
fab6017a(conf-ssh-pubkey-user)#end

Some notes on the above: paste the whole public key once you get the (conf-ssh-pubkey-data) prompt. This includes the 'ssh-rsa' header and the comment footer. Use the exit keyword on the (conf-ssh-pubkey-data) line; any other word will be sandwiched onto the end of the key. You can use this behavior to split your key into multiple lines and input it that way. After this, the cisco will hash your key and the configuration will look like:
username nibz
 key-hash ssh-rsa 2F33A5AE2F505B42203276F9B2313138 nibz@shadow.cat.pdx.edu

This configuration can be put in other cisco configs elsewhere in your infrastructure. Happy hacking. This was performed on a Cisco 3750G running IOS 15.0(2)SE.
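To confirm it worked, point ssh at the private half of the key generated earlier (hostnames as in the session above):

ssh -i ~/.ssh/nibz_cisco@shadow.cat.pdx.edu nibz@fab6017a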
The fast way:
# move primaries off node
gnt-node migrate $node

# move secondaries off node
gnt-node evacuate --secondary-only $node

Sometimes nodes fail to migrate. I don't know why. I just use the gnt-instance failover command to move them, which requires a reboot. Sometimes the secondary disks don't move either. This is more annoying because hail crashes before making a plan for you. I have written the following script to, stupidly, move all the secondaries off of one node onto other nodes so you can reboot a node in your cluster without fear.
#!/bin/bash

evac_node=$1

if [ -z $evac_node ]; then
    echo "Please specify a node to evac"
    exit 1
fi

echo "Evacuating secondaries from $evac_node"

HYPERVISORS=`wget --no-check-certificate -O- https://localhost:5080/2/nodes 2>/dev/null | grep id | awk '{print $NF}' | tr -d \",`

for instance in `gnt-instance list -o name,snodes | grep $evac_node | cut -d " " -f 1`
do
    current_primary=`gnt-instance list -o name,pnode | grep $instance | awk '{print $NF}'`
    target_node=`echo $HYPERVISORS | sed 's/ /\n/g' | grep -v $evac_node | grep -v $current_primary | sort -R | head -n 1`
    echo "gnt-instance replace-disks -n $target_node $instance"
    gnt-instance replace-disks -n $target_node $instance
done

Happy hacking!
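Saved as evac-secondaries.sh (my name for it, not from the original), you run it with the node you want to drain:

./evac-secondaries.sh node2.example.com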
I work at an organization that teaches people new to computers how to administer them. I've given a talk with blkperl, my boss, many times called Zero to Root. Teaching in our organization is very personal and doesn't really translate to other environments. We can't say 'Here, use our curriculum, it's baws.'
Ops School strives to be the curriculum for people who want to become operations people. It doesn't assume any previous knowledge, but it doesn't hide the details from the users either. It's kind of like a tldp.org for this generation of hackers. I have been slowing down in my blog postings because some of the energy I use to write things down for other people has been siphoned off into Ops School.
I encourage everyone out there to contribute; there are huge chunks of this document that still need to be written.
You can minimize the effect of this by using redundant mount information.
Before:
/usr/local -ro,hard,intr,suid bunny.cat.pdx.edu:/disk/forest/local

After:
/usr/local -ro,hard,intr,suid bunny.cat.pdx.edu:/disk/forest/local,caerbannog.cat.pdx.edu:/volumes/cave/misc/usr-local

It is best to maintain an idea of a primary and a secondary, at least for administration: modify only the primary, rsync to the secondary, and mount read-only. This mount appears in ``mount`` like this:
/usr/local on caerbannog.cat.pdx.edu:/volumes/cave/misc/usr-local,bunny.cat.pdx.edu:/disk/forest/local remote/read only/setuid/devices/rstchown/hard/intr/xattr/dev=5a0000e on Wed Mar 27 15:35:21 2013

Note that this is mounted from both servers. Packets get sent to both servers and the first to respond with valid information is reported to the system. This can make for some bizarre weirdness if you use read-write mounts.
It is totally possible to use something like DRBD between NFS servers (not on Solaris, obviously) to make this doable with read-write mounts. I have not done this personally.
This weekend we attended CasItConf13. I had a blast and met a lot of really cool people. I attended presentations on Logstash, IPv6, Chef and more. Jordan Sissel, in particular, did a great job of presenting Logstash. After his talk we met up and had a neat conversation. He showed me an app he had created called fingerpoken. It's a bit out of date and we had to do some hacks, but I was able to get it up and running in a half-hour lunch break and still have time to demolish some tasty lunch provided by the wonderful folks over at Puppet Labs. Fingerpoken is an app that lets you send mouse and keyboard events to a computer with a smartphone.
And that's really what it's all about. Is the tool simple and easy enough that you can get it going in a crunch? Are all the nonintuitive parts ripped out and replaced with sane defaults so the tool just 'goes'? In fingerpoken's case, not really. We had to do some:
sudo ln -s /usr/lib/libxdo.so.2 /usr/lib/libxdo.so

But what is the point of having the author of your tool nearby if not to tell you to do that? And yes, the ABI is evidently close enough to just work in that case.
I am very impressed that I was able to get such high-level functionality out of a tool in a short period of time and under pressure. If your tool passes the 'setup at lunch at a conference' test, you're doing pretty dang good. If it doesn't, look for places to streamline it. I'm happy to test your random tool, please let me know.
My talk, on the Computer Action Team/Braindump, is available on my github and you can download the pdf from here.
In other news, it seems that github no longer allows you to download the raw files out of repositories if they are above a certain size. Possibly more on that later.
Git-sync is a ruby script we use at work for managing git repos. It is covered in an earlier post. I got tired of ensuring it as a file in puppet and decided to make a debian package. Here is the summary of how to make a simple debian package containing just a single file. Note that the answer to this stack overflow question is the source of most of my knowledge, so this will just be annotations and extensions to that.
Debian/Ubuntu packaging (on an ubuntu system) required me to install a single package: devscripts.
At a high level, debian packaging involves creating a 'debian' folder in your source tree and putting several metadata files in it. Figuring out the precise contents of these files is the challenge of packaging. I recommend you use the 'apt-get source git' command to get the source of a working package (git in this case) to compare to your own metadata files.
Debian/Ubuntu packaging using debuild creates files one level above your current working directory (wtf, debian). So the first step is to make a build directory:
cd ~/devel
mkdir git-sync-build

Procure the source:
nibz@darktemplar:~/devel/git-sync-build$ git clone git@github.com:pdxcat/git-sync
nibz@darktemplar:~/devel/git-sync-build$ ls
git-sync
nibz@darktemplar:~/devel/git-sync-build$ cd git-sync
nibz@darktemplar:~/devel/git-sync-build/git-sync$ mkdir debian

All of the metadata files needed by debuild, the utility that will actually build the .deb, are going to live in the debian directory.
The first file to create is the debian/changelog file. This file is created with the dch utility. Run it from the git-sync directory. It will open vim and it will look like this. Many fields here need to be changed.
dch --create

PACKAGE (VERSION) UNRELEASED; urgency=low

  * Initial release. (Closes: #XXXXXX)

 -- Spencer Krum <email>  Wed, 06 Mar 2013 16:46:14 -0800

PACKAGE refers to the name of the package. Replace the word PACKAGE with the name you want your package to register itself as; in my git-sync case I will use 'git-sync'. The package name must be lower case. The VERSION must be replaced with a version number. I'm using 1.0.1 for this, since it is the second release of git-sync, but the changes are very minor. There are long articles on the internet about version numbering; it's not my place to comment here. The RELEASE variable needs to be replaced with a debian or ubuntu codename such as 'precise' or 'wheezy'. I have no idea what urgency is, but setting it to low doesn't seem to hurt anything. Maybe this is how you tell apt/dpkg about security updates. The initial release stuff is fine. The name is a bit tricky: later on we will gpg sign the package, so make sure the name and email in the changelog match exactly the name and email on your gpg key, else the debuild utility won't attempt to have you gpg sign it at all. My changelog looks like this:

git-sync (1.0.0) precise; urgency=low

  * Initial release.

 -- Spencer Krum <email>  Thu, 07 Mar 2013 01:40:18 -0800

Next create a debian/copyright file:

Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: myScript
Upstream-Contact: Name, <email>
Files: *
Copyright: 2011, Name, <email>
License: (GPL-2+ | LGPL-2 | GPL-3 | whatever)
 Full text of licence.
 .
 Unless there is a it can be found in /usr/share/common-licenses

I elected for the apache2 license and to use the two-paragraph version of that license. I also gave credit where it was due here. Fill out this file as you see fit. Next create a debian/compat file:
nibz@darktemplar:~/devel/git-sync-build/git-sync/debian$ echo 7 > compat

Next create the rules file. This file seems to be the work-doer in debian packaging. It is evaluated by make, which is picky, so make sure that indented line is a real tab (copying from my blog will probably fail). The --with python is... well, I have no idea. I traced it to a python.pm (pm is a perlism) deep within /usr. Since I am packaging a ruby script I just removed it. Example from stackoverflow:
#!/usr/bin/make -f
%:
	dh $@ --with python2

The git-sync version:

#!/usr/bin/make -f
%:
	dh $@

Next make the control file. Make the natural substitutions here. I guessed on section and it just sorta worked.

nibz@darktemplar:~/devel/git-sync-build/git-sync/debian$ cat control
Source: git-sync
Section: ruby
Priority: optional
Maintainer: Spencer Krum, <email>
Build-Depends: debhelper (>= 7), ruby (>= 1.8.7)
Standards-Version: 3.9.2
X-Ruby-Version: >= 1.8.7

Package: git-sync
Architecture: all
Section: ruby
Depends: ruby, ${misc:Depends}, ${python:Depends}
Description: Git syncing script, pull based
 Git-sync allows git repositories to be kept in sync via git hooks
 or other means. Pull based, able to handle force pushes and submodules

Next make the install file. I went with the default in the stackoverflow post. I attempted to make some simple modifications to it (moving the file to /usr/local/bin) and that made it fail, so this file is evidently pretty finicky.

nibz@darktemplar:~/devel/git-sync-build/git-sync$ cat debian/install
git-sync usr/bin

Now you can build the debian package.
nibz@darktemplar:~/devel/git-sync-build/git-sync$ debuild --no-tgz-check

If all went well, it should ask you to decrypt your gpg key twice and build a package in the directory one level up.

nibz@darktemplar:~/devel/git-sync-build/git-sync$ ls ..
git-sync                git-sync_1.0.0_amd64.build    git-sync_1.0.0.dsc
git-sync_1.0.0_all.deb  git-sync_1.0.0_amd64.changes  git-sync_1.0.0.tar.gz

You now have a shiny .deb file that can be installed with dpkg -i git-sync_1.0.0_all.deb. It is easy to put this in a launchpad PPA if you have a launchpad account. From your launchpad homepage (a shortcut is https://launchpad.net/~ if you are signed in), press the "Create new PPA" button and fill out the form.
Next build a source package. Launchpad PPAs take source packages and build binary packages on launchpad servers. Build it with:
nibz@darktemplar:~/devel/git-sync-build/git-sync$ debuild -S

It should go through the gpg motions again and build a source package. Then you should be able to run something like (with your launchpad username and the name of your PPA):

dput ppa:krum-spencer/git-sync-ppa git-sync_1.0.0_source.changes

Happy Packaging!
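Once launchpad finishes building the source package, anyone can consume the PPA with the stock ubuntu tooling (PPA name from the dput line above):

sudo add-apt-repository ppa:krum-spencer/git-sync-ppa
sudo apt-get update
sudo apt-get install git-sync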
Where I work (read as: play) we use a lot of git. As operators we often have a service running with its configs in git. A common pattern we use is to set up a post-receive hook on the git repository that updates the git checkout on a remote server. We accomplish this through a post-receive hook that sshes into the remote server and calls a script called git-sync with some options. The git-sync github project is forked from the puppet-sync project, which we use specifically for puppet dynamic git environments. Hunner <3 More dynamic git environments with puppet. Finch <3.
A hook for a project goes in the hooks/post-receive file of the git server's bare repo. Let's look at one now:
Example git post-receive hook
#!/bin/sh
## File: awkwardly incorrect

PWD=`pwd`
REPONAME=`basename $PWD | sed 's/.git$//'`
REPO="git@gitserver.cat.pdx.edu:$REPONAME.git"
DEPLOY="/etc/nagios3/conf.d/flat"
SSH_ARGS="-i /shadow/home/git/.ssh/git@gitserver.cat.pdx.edu"
GITSYNC="nagios@nagios.cat.pdx.edu"
SYNC_COMMAND="/usr/local/bin/git-sync"

while read oldrev newrev refname
do
  BRANCH=`echo $refname | sed -n 's/^refs\/heads\///p'`
  if [ $BRANCH != "master" ]; then
    echo "Branch is not master, therefore not pushing to nagios"
    exit 0
  fi

  # an all-zero newrev means the branch was deleted
  [ "$newrev" -eq 0 ] 2> /dev/null && DELETE='--delete' || DELETE=''

  ssh $SSH_ARGS "$GITSYNC" "$SYNC_COMMAND" \
    --branch "$BRANCH" \
    --repository "$REPO" \
    --deploy "$DEPLOY" \
    $DELETE
done

ssh nagios@gitserver.cat.pdx.edu '/etc/init.d/nagios3 reload'
The hook will exit before doing anything if the branch is not master; if it is, it will run the git-sync script remotely on the nagios host, then go back in to bounce the nagios service.
The git-sync script essentially performs a git fetch; git checkout HEAD. It doesn't worry itself with merging, and it is submodule aware.
A file, .git-sync-stamp, must be created by the administrator of the system; this is how git-sync knows it is in charge of managing the repository. It is definitely not recommended that you add this file to git, although that would more or less work if you never want to think about it. I also wrote this puppet defined type to manage the stamp, the initial vcsrepo, and the public key for you.
A puppet defined type to initialize git-sync managed folders
define gitsync::gitsync(
  $deploy,
  $repo,
  $user,
  $source,
  $public_key,
  $public_key_type
) {

  # let the git server's deploy key ssh in as this user
  ssh_authorized_key { "${user}-${name}-gitsync":
    ensure => present,
    user   => $user,
    key    => $public_key,
    type   => $public_key_type,
  }

  # initial clone of the repository to be managed
  vcsrepo { $deploy:
    ensure   => present,
    provider => git,
    user     => $user,
    source   => $source,
    require  => Ssh_authorized_key["${user}-${name}-gitsync"],
  }

  # the stamp file that tells git-sync it owns this checkout
  file { "${deploy}/.git-sync-stamp":
    ensure  => present,
    owner   => $user,
    mode    => '0644',
    require => Vcsrepo[$deploy],
  }
}
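An instantiation of the defined type, wired up for the nagios example earlier; every value here is illustrative:

gitsync::gitsync { 'nagios':
  deploy          => '/etc/nagios3/conf.d/flat',
  repo            => 'git@gitserver.cat.pdx.edu:nagios.git',
  user            => 'nagios',
  source          => 'git@gitserver.cat.pdx.edu:nagios.git',
  public_key      => 'AAAAB3Nza...',
  public_key_type => 'ssh-rsa',
}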
The last thing to note is that I didn't write git-sync. I've modified it but it was mostly written by Reid Vandewielle and others. Marut <3
Today (well, yesterday) our primary router ran out of memory. We haven't fixed the problem yet (I hope that will be the subject of a follow-up post), but for right now I want to take you through detection, characterization, and mitigation.
Detection. The way I found out about the problem was via ssh.
Attempting to ssh into the router running out of memory.
(nibz@observer:~) > ssh multiplexor.seas
nibz@multiplexor.seas's password:
Permission denied, please try again.
nibz@multiplexor.seas's password:
Connection closed by 2610:10:0:2::210

For anyone familiar with sshing into ciscos, this is not how it normally looks. Usually you get three attempts with just 'Password' and one with your user visible.
Attempting to ssh into a router not running out of memory.
(nibz@observer:~) > ssh nibz@wopr.seas
Password:
Password:
Password:
nibz@wopr.seas's password:
Connection closed by 131.252.211.3

I verified that it wasn't a knowing-the-password problem by using another account on the router. I connected a serial port to the router and immediately found out-of-memory logs.
Console logs on the router.
10w0d: %AAA-3-ACCT_LOW_MEM_UID_FAIL: AAA unable to create UID for incoming calls due to insufficient processor memory
Logs sent to syslog.
Mar  2 00:43:52 multiplexor 4309463: 10w1d: %SYS-2-MALLOCFAIL: Memory allocation of 128768 bytes failed from 0x1A8C110, alignment 0
Mar  2 00:44:24 multiplexor 4309499: 10w1d: %SYS-2-MALLOCFAIL: Memory allocation of 128768 bytes failed from 0x1A8C110, alignment 0
Mar  2 00:47:37 multiplexor 4309643: 10w1d: %SYS-2-MALLOCFAIL: Memory allocation of 395648 bytes failed from 0x1AA03FC, alignment 0
Mar  2 02:18:33 multiplexor 4313756: 10w1d: %SYS-2-MALLOCFAIL: Memory allocation of 395648 bytes failed from 0x1AA03FC, alignment

I ran the 'show proc mem' command on the router to get a picture of the memory use of the router.
Show proc mem.
multiplexor#show proc mem
Processor Pool Total:  177300444 Used:  174845504 Free:    2454940
      I/O Pool Total:   16777216 Used:   13261296 Free:    3515920
Driver te Pool Total:    4194304 Used:         40 Free:    4194264

 PID TTY   Allocated      Freed    Holding   Getbufs   Retbufs Process
   0   0   108150192   43169524   58720200         0         0 *Init*
   0   0       12492    2712616      12492         0         0 *Sched*
   0   0   399177972  389135628    8911036  14228691   1490354 *Dead*
   0   0           0          0  102305848         0         0 *MallocLite*
   1   0   973921416  973821100     224768         0         0 Chunk Manager
   2   0         232        232       4160         0         0 Load Meter
   3   0           0          0       7076         0         0 DHCPD Timer
   4   0        4712       6732      11692         0         0 Check heaps
   5   0     7862444   49770056      13540   6270020  28190703 Pool Manager
   6   0           0          0       7160         0         0 DiscardQ Backgro
   7   0         232        232       7160         0         0 Timers
   8   0           0          0       4160         0         0 WATCH_AFS
   9   0         284        728       7160         0         0 License Client N
  10   0  2332421068 2332422156       7168         0         0 Licensing Auto U
  11   0     1482732    1483016       7160         0         0 Image License br
  12   0  2344349400 3601318192     169876    157356         0 ARP Input
  13   0   550320160  550382256       7160         0         0 ARP Background
  14   0           0          0       7168         0         0 CEF MIB API
  15   0           0          0       7160         0         0 AAA_SERVER_DEADT

This shows that the router is indeed running very low on memory. How did we get here? Monitoring + SNMP + RRDtool to the rescue!
Doing some quick estimation, it looks like the router loses about a MB of free RAM every 18 hours. RRDtool isn't the best, and getting the big-picture graph is hard to do, but basically it has been losing free RAM at this rate for a couple of weeks.
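Putting that rate against the 'Free' number from show proc mem above gives a rough deadline (a quick ruby sketch; the rate is the one eyeballed from the graphs):

free_bytes  = 2_454_940       # "Free" from show proc mem above
mb_per_hour = 1.0 / 18        # roughly 1 MB of free RAM lost every 18 hours
hours_left  = (free_bytes / 1_048_576.0) / mb_per_hour
puts hours_left.round         # => 42, i.e. under two days of headroom left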
Finally we get a show tech-support off of this thing.
multiplexor# show tech-support | redirect tftp://10.0.21.2/cisco/mux-tech-support-mar1

The redirect to tftp is a really cool pattern for getting information off of a cisco device. The tech-support run was about 50000 lines.
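The same output modifier works for other show commands too; for example, grabbing the config (filename here is illustrative):

multiplexor# show running-config | redirect tftp://10.0.21.2/cisco/mux-running-config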
I will do a follow-up post when I figure out what's going on.
Update 3-7-13:
The router ran completely out of memory. Even on console all I got was:
%% Low on memory; try again later
%% Low on memory; try again later
%% Low on memory; try again later

It was happily switching and routing at this point, however. We rebooted it because it was Saturday evening, and better to have it happen at a time of our choosing than to break iscsi unexpectedly later in the week. Upon reboot, the system returned to full functionality, but we can tell from the zenoss graphs that it is still leaking memory at a rate of 1 MB every 18 hours. At this rate it will need to be rebooted again in 10 weeks. We have opened a case with TAC. I will update again if anything comes of this.
Update 5-17-13:
We still have not fixed the problem. The router can go about 10 weeks before it needs a reboot. This is in an educational setting with 12-week terms, meaning we need to reboot our core router at least once a term. Wheeee. We've been on the horn with Cisco, who has had numerous techs look at it and has even replaced the hardware, but the problem remains. Anyone with some ideas is welcome to contact me privately.
# the following is for enabling write access to the web gui
if $readonly_web == false {

  file_line {
    '/etc/nagios3/nagios.cfg-external-commands-yes':
      line    => 'check_external_commands=1',
      path    => '/etc/nagios3/nagios.cfg',
      notify  => Service['nagios3'],
      require => Package['nagios3'];
    '/etc/nagios3/cgi.cfg-all_host_commands':
      line    => 'authorized_for_all_host_commands=nagiosadmin',
      path    => '/etc/nagios3/cgi.cfg',
      notify  => Service['nagios3'],
      require => Package['nagios3'];
    '/etc/nagios3/cgi.cfg-all_service_commands':
      line    => 'authorized_for_all_service_commands=nagiosadmin',
      path    => '/etc/nagios3/cgi.cfg',
      notify  => Service['nagios3'],
      require => Package['nagios3'];
  }

  user { 'nagios':
    groups     => ['nagios', 'www-data'],
    membership => minimum,
    require    => Package['nagios3'],
  }

  file { '/var/lib/nagios3/rw':
    ensure  => directory,
    owner   => 'nagios',
    group   => 'www-data',
    mode    => '2710',
    require => Package['nagios3'],
  }

  file { '/var/lib/nagios3':
    ensure  => directory,
    owner   => 'nagios',
    group   => 'nagios',
    mode    => '0751',
    require => Package['nagios3'],
  }
}
network={ ssid="NANOG-secure" scan_ssid=1 key_mgmt=WPA-EAP pairwise=CCMP TKIP group=CCMP TKIP eap=TTLS PEAP TLS identity="nanog" password="nanog" phase1="peaplabel=0" }
For the duration of the meeting conference, NANOG provides a dual-stack IPv4/v6 meeting network. IP address allocation is available by DHCP for IPv4 and neighbor discovery for IPv6. No NAT or translation protocols are utilized; in addition, local NANOG DNS servers offer DNSSEC capability.

Woot!
network={ ssid="NANOG-a-secure" scan_ssid=1 key_mgmt=WPA-EAP pairwise=CCMP TKIP group=CCMP TKIP eap=TLS identity="nanog" identity="nanog" }I will report on the wireless at NANOG tomorrow. I'll also see if I can find the certificate used by the access points.
...
Info: Loading facts in /var/lib/puppet/lib/facter/nvidia_graphics_device.rb
Could not retrieve dns_ip_6: undefined method `each' for "yermom.cat.pdx.edu has address 131.252.222.10\n":String
Could not retrieve dns_ip_6: undefined method `each' for "yermom.cat.pdx.edu has address 131.252.222.10\n":String
...

The unfortunate cause of this (other than that I didn't write 1.9-compatible code) is that String#each has been removed from ruby as of ruby 1.9. :(
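The fix is to iterate with each_line, which exists on both 1.8 and 1.9 (a sketch; the real fact code is more involved):

output = "yermom.cat.pdx.edu has address 131.252.222.10\n"

# output.each { |line| ... }   # ruby 1.8 only: String#each is gone in 1.9
output.each_line do |line|     # works on ruby 1.8 and ruby 1.9
  puts line
end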
puppet config print all | grep vardir
vardir = /var/lib/puppet

Change directory to the puppet vardir and modify the facts in place. Then run the facter utility with the '-p' argument. The '-p' argument tells facter to run all its normal facts as well as facts loaded in from puppet.
root@yermom:/var/lib/puppet/lib/facter# facter -p | grep dns
dns_ip_4 => ["131.252.222.10"]
dns_ip_6 => []

Fantastic. All is well again.
With

storeconfigs = true

in puppet.conf you can do exported/collected resource magic. When doing this with nagios resources I've been able to export and collect resources flawlessly. The problems came up when I tried to modify a resource. Since we make heavy use of dynamic git environments with puppet, I was running something like
puppet agent --test --environment=nagios

on a host at random and
puppet agent --test --environment=nagios

on the nagios server, hoping to collect exported resources. The problem was they were not changing. As it turns out, puppetdb can cache old exported resources for up to an hour. My advice for others having problems getting nagios or other exported resources to change or purge is to give it time. Run a big ssh for loop or use mcollective to hit all your boxes, and hit the coffee cart for a quick pick-me-up. Chances are good you just need to give it time.
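For reference, the export/collect pattern in play looks roughly like this (a minimal sketch; titles and parameters are illustrative, not our production manifests):

# on every monitored node: export a nagios_host resource
@@nagios_host { $::fqdn:
  ensure  => present,
  address => $::ipaddress,
  use     => 'generic-host',
}

# on the nagios server: collect everything the nodes exported
Nagios_host <<| |>>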
I'm kind of obsessed with writing irc bots in python using twisted.words.protocols. A longer example of how to do that may come later, but for now I want to show you one of the best ways to debug your twisted irc bot, and a vector to get really cool behavior not intended by the twisted developers. On an IRC server I frequent, channels are secured by forcing users to first log in with NickServ and then ask ChanServ for an invite. The problem is you must join the channel after you receive the invitation from ChanServ. My solution to this problem is below, using irc_unknown and the ircLogBot.py example in the twisted.words documentation:
class BeerBot(irc.IRCClient):
    """A logging IRC bot."""
    """That also does beer things."""

    ...

    def signedOn(self):
        """Called when bot has successfully signed on to server."""
        self.logger.log('identifying to nickserv')
        self.msg("NickServ", "identify %s" % config.password)
        self.logger.log('requesting channel invite')
        self.msg("ChanServ", "invite %s" % config.channel)
        self.join(self.factory.channel)
        self.logger.log('channel joined')

    ...

    def irc_unknown(self, prefix, command, params):
        self.logger.log("{0}, {1}, {2}".format(prefix, command, params))
        if command == "INVITE":
            self.join(params[1])
The irc_unknown is great because it simulates 'window 1' on most irc clients (well, most command line irc clients [and by that I mean weechat and irssi {and by that I mean irssi-for-life!}]). You can add if statements to grab the other 'named' irc control messages. The others are numbered and you can split those out as well. One of the bad things about irc is that different irc servers behave differently. It must be a frustrating and thankless task for the maintainers of irssi/weechat/t.p.w to provide such a universal interface to all irc servers. (lol jk, irssi hasn't had an update since like 2010.) [but no really, thank you irssi devs *internet hug*]
The source code for beerbot can be found at my github.
For our ircd needs we use patched versions of Charybdis and Atheme. I discovered the other day that one of our users had been trying to use the nickname 'help'; it turned out he was just a beginner trying to find the help for the /nick command. The interesting thing was that this was tripping alarms for another user: NickServ will warn you when someone tries to use your nick, and another user had messaged me that someone was attempting to take theirs. After doing some digging I realized that the second user had registered the nick 'help' with NickServ.
Allowing users to use nicks like 'help' and 'support' opens the door to social engineering attacks, so I set out to block them at a services/ircd level. To my surprise, this was done at the ircd level, not at the services level. Big shoutout to 'grawity' on #atheme on irc.atheme.org.
Make sure you are logged in as an oper and that you have OperServ enabled. Get help on the command:
/msg OperServ SQLINE help

Add a sqline with:
/msg OperServ SQLINE add help !P abuse

The !P means permanent (you can use !T with a duration for a temporary one instead).
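To review what is in place afterwards, the usual Atheme subcommand pattern applies (assuming a stock Atheme command set):

/msg OperServ SQLINE list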
fab20a#test cable-diagnostics tdr interface Gi1/0/25
TDR test started on interface Gi1/0/25
A TDR test can take a few seconds to run on an interface
Use 'show cable-diagnostics tdr' to read the TDR results.
fab20a#show cable-diagnostics tdr int Gi1/0/25
TDR test last run on: January 22 20:41:52
Interface Speed Local pair Pair length        Remote pair Pair status
--------- ----- ---------- ------------------ ----------- --------------------
Gi1/0/25  1000M Pair A     49   +/- 4  meters Pair A      Normal
                Pair B     45   +/- 4  meters Pair B      Normal
                Pair C     48   +/- 4  meters Pair C      Normal
                Pair D     45   +/- 4  meters Pair D      Normal

Another example: this shows normal use and an open circuit (most likely meaning no host is on the other side).
fab60a#test cable-diagnostics tdr interface GigabitEthernet 1/0/31
TDR test started on interface Gi1/0/31
A TDR test can take a few seconds to run on an interface
Use 'show cable-diagnostics tdr' to read the TDR results.
fab60a#show cable-diagnostics tdr int GigabitEthernet 1/0/31
TDR test last run on: January 08 14:45:40
Interface Speed Local pair Pair length        Remote pair Pair status
--------- ----- ---------- ------------------ ----------- --------------------
Gi1/0/31  auto  Pair A     3    +/- 4  meters N/A         Open
                Pair B     2    +/- 4  meters N/A         Open
                Pair C     0    +/- 4  meters N/A         Open
                Pair D     3    +/- 4  meters N/A         Open

Note that you must specify 'int' between tdr and the interface identifier, presumably so you could shoot electrons at something that isn't an interface, like the door or something. An example of a broken pair:
fab20a#test cable-diagnostics tdr interface Gi1/0/25
TDR test started on interface Gi1/0/25
A TDR test can take a few seconds to run on an interface
Use 'show cable-diagnostics tdr' to read the TDR results.
fab20a#show cable-diagnostics tdr int Gi1/0/25
TDR test last run on: January 22 18:33:07
Interface Speed Local pair Pair length        Remote pair Pair status
--------- ----- ---------- ------------------ ----------- --------------------
Gi1/0/25  100M  Pair A     49   +/- 4  meters Pair A      Normal
                Pair B     45   +/- 4  meters Pair B      Normal
                Pair C     48   +/- 4  meters Pair C      Normal
                Pair D     0    +/- 4  meters Pair D      Open

Here the 'D' pair is broken. You can see from the 'Pair length' column that it is broken at the beginning of the cable. This means we got lucky: we were able to replace the patch cable instead of having a contractor rewire the run in the conduit. Notice that with only three working pairs, the link is still up, but at 100Mbit.