Tuesday, January 29, 2013

Puppet Fact Fix after Ruby 1.9 upgrade

Stardate: 90683.2
We recently upgraded our entire infrastructure to Puppet 3. As if that wasn't ambitious enough (though I suppose Puppet 3.1 has an rc), we are slowly bringing ruby to 1.9 from 1.8.7. Surprisingly, puppet actually works under ruby 1.9. This blows my mind since I remember having to do some seriously ugly hacks to get puppet working under ruby 1.9. Unfortunately, and embarrassingly, one of my custom facts was not forward compatible.
...
Info: Loading facts in /var/lib/puppet/lib/facter/nvidia_graphics_device.rb
Could not retrieve dns_ip_6: undefined method `each' for "yermom.cat.pdx.edu has address 131.252.222.10\n":String
Could not retrieve dns_ip_6: undefined method `each' for "yermom.cat.pdx.edu has address 131.252.222.10\n":String
...
The unfortunate cause of this (other than that I didn't write 1.9 compatible code) is that String#each was removed in ruby 1.9. :( The usual fix is to iterate with each_line instead, which exists in both 1.8.7 and 1.9.
For those of you unfamiliar with puppet hacking, most testing should be done through git dynamic environments. Unfortunately, there is a bug in puppet that prevents facts from being tested on any branch other than production. You can add the facts to git and push them to the environments directory on the puppet master, but unless they are in the production branch you won't see them get filebucketted or run.
So what to do? The answer is to develop on the box itself, usually as root, though sudo is an idea. (Hopefully I can get a friend of mine to guest post on why sudo is the correct way to attain root privileges for administration, and I can counterpost on why su - is the correct way.) On Ubuntu (as of 12.04, anyways) the puppet configuration dir is /etc/puppet, but the puppet var dir is /var/lib/puppet. Facts live in /var/lib/puppet/lib/facter. The best way to get information on a puppet installation's current configuration is through:
puppet config print all | grep vardir
vardir = /var/lib/puppet
Change directory to the puppet vardir and modify the facts in place. Then run the facter utility with the '-p' argument. The '-p' argument tells facter to run all its normal facts as well as facts loaded in from puppet.

root@yermom:/var/lib/puppet/lib/facter# facter -p | grep dns
dns_ip_4 => ["131.252.222.10"]
dns_ip_6 => []

Fantastic. All is well again.

Monday, January 28, 2013

PuppetDB/Storeconfigs Cache expiry

Stardate: 90682.98
After a couple of weeks of getting frustrated with puppet's Storeconfigs/puppetdb features, I have emerged victorious. PuppetDB is the newer, better, postgressier backend for puppet Storeconfigs. PuppetDB sports some really nice features including a fancy status/metrics web dashboard:
As you can see this is some interesting and potentially beneficial feedback. It is updated live and is mobile browser compatible. Personally, I'm happy to get graphs of this data any way I can, but I would prefer not to be locked into their dashboard. I would rather be able to get these data out of an often updated file or udp port so that I could send it to graphite for real time graphing and correlation with other metrics. I also don't see the point of having it be mobile friendly, since most everyone will have their puppetmaster/puppetdb server firewalled heavily and mobile devices have no business on the internal network. Some of the metrics can guide actual tuning and performance boosts, mostly by increasing the number of threads and the max JVM heap size.
The punchline here is that with
storeconfigs = true
in puppet.conf you can do exported/collected resource magics. When doing this with nagios resources I've been able to export and collect resources flawlessly. The problems came up when I tried to modify a resource. Since we make heavy use of dynamic git environments with puppet I was running something like
 puppet agent --test --environment=nagios 
on a host at random and
 puppet agent --test --environment=nagios 
on the nagios server, hoping to collect exported resources. The problem was they were not changing. As it turns out, puppetdb can cache old exported resources for up to an hour. My advice for others having problems getting nagios or other exported resources to change or purge is to give it time. Run a big ssh for loop or use mcollective to hit all your boxes, then hit the coffee cart for a quick pick me up. Chances are good you just need to wait.
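The "big ssh for loop" can be sketched in Python. This is only a rough sketch, not what we actually run: the host names are made up, it assumes key-based ssh access with enough privileges to run the agent, and the runner parameter is there purely so the loop can be dry-run without touching the network.

```python
import subprocess


def agent_command(host, environment):
    """Build the ssh command that triggers one puppet run on `host`."""
    return ["ssh", host, "puppet", "agent", "--test",
            "--environment=%s" % environment]


def run_on_hosts(hosts, environment, runner=subprocess.call):
    """Run the agent on each host in turn and return {host: exit code}.

    `runner` defaults to subprocess.call; pass a stub to do a dry run.
    """
    results = {}
    for host in hosts:
        results[host] = runner(agent_command(host, environment))
    return results


def print_only(command):
    """Stub runner: print the command instead of executing it."""
    print(" ".join(command))
    return 0


# Hypothetical host names, for illustration only.
run_on_hosts(["web1.example.com", "nagios.example.com"], "nagios",
             runner=print_only)
```

Keep in mind that a nonzero exit code is not always a failure here: --test turns on detailed exit codes, where 2 means the run applied changes.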

Thursday, January 24, 2013

Irc Bots in Twisted with Invite-only Channels

Stardate: 90672.08

I'm kind of obsessed with writing irc bots in python using twisted.words.protocols. A longer example of how to do that may come later but for now I want to show you one of the best ways to debug your twisted irc bot and a vector to get really cool behavior not intended by the twisted developers. On an IRC server I frequent, there are channels that are secured by forcing users to first login with NickServ and then ask ChanServ for an invite. The problem is you must join the channel after you receive the invitation from ChanServ. My solution to this problem is below, using irc_unknown and the ircLogBot.py example in the twisted.words documentation:

class BeerBot(irc.IRCClient):
    """A logging IRC bot that also does beer things."""

...


    def signedOn(self):
        """Called when bot has successfully signed on to server."""
        self.logger.log('identifying to nickserv')
        self.msg("NickServ", "identify %s" % config.password)
        self.logger.log('requesting channel invite')
        self.msg("ChanServ", "invite %s" % config.channel)
        # On an invite-only channel this join is refused until ChanServ's
        # invite arrives; irc_unknown then joins for real.
        self.join(self.factory.channel)
        self.logger.log('channel join requested')

...


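    def irc_unknown(self, prefix, command, params):
        """Log every message without its own irc_* handler; join on INVITE."""
        self.logger.log("{0}, {1}, {2}".format(prefix, command, params))
        # INVITE params are [invited nick, channel].
        if command == "INVITE":
            self.join(params[1])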

The irc_unknown method is great because it simulates 'window 1' on most irc clients (well, most command line irc clients [and by that I mean weechat and irssi {and by that I mean irssi-for-life!}]). You can add if statements to grab the other 'named' irc control messages. The others are numbered and you can split those out as well. One of the bad things about irc is that different irc servers behave differently. It must be a frustrating and thankless task for the maintainers of irssi/weechat/t.p.w to provide such a universal interface to all irc servers. (lol jk, irssi hasn't had an update since like 2010.) [but no really, thank you irssi devs *internet hug*].
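One thing the handler above does is join any channel anyone invites the bot to. If that bothers you, the INVITE check can be pulled out into a small helper with an allow-list. This is a sketch only: channel_to_join and the example channel names are hypothetical and are not part of beerbot or twisted.

```python
def channel_to_join(command, params, allowed_channels):
    """Return the channel to join for an acceptable INVITE, else None.

    `command` and `params` are the values irc_unknown receives.
    INVITE params are [invited nick, channel].
    """
    if command != "INVITE" or len(params) < 2:
        return None
    channel = params[1]
    # Channel names are case-insensitive on IRC. lower() is only an
    # approximation of the server's real case mapping.
    allowed = set(name.lower() for name in allowed_channels)
    if channel.lower() in allowed:
        return channel
    return None


# Hypothetical channel names, for illustration only.
print(channel_to_join("INVITE", ["beerbot", "#beer"], ["#beer"]))
print(channel_to_join("INVITE", ["beerbot", "#spam"], ["#beer"]))
```

Inside irc_unknown you would then call self.join() only when the helper returns a channel.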

The source code for beerbot can be found at my github.

Blacklisting Usernames in Charybdis

Stardate: 90671.98

For our ircd needs we use patched versions of Charybdis and Atheme. I discovered the other day that one of our users had been trying to use the nickname 'help'. It turned out he was just a beginner trying to find the help for the /nick command. The interesting thing was that this was tripping alarms for another user. NickServ will warn you when someone tries to use your nick. Another user had messaged me that someone was attempting to take their nick. After doing some digging I realized that the second user had registered the nick 'help' with NickServ.

Allowing users to use nicks like 'help' and 'support' opens the door to social engineering attacks. I set out to block them at a services/ircd level. To my surprise this was done at the ircd level, not at the services level. Big shoutout to 'grawity' on #atheme on irc.atheme.org.

Make sure you are logged in as an oper and that you have OperServ enabled. Get help on the command:

/msg OperServ SQLINE help
Add a sqline with:
/msg OperServ SQLINE add help !P abuse
The !P means permanent(you can use !T

Tuesday, January 22, 2013

Cisco diagnostics

Stardate: 90666.54
Cisco switches and routers (Catalyst 3750 series in this example) support some really cool diagnostics. These diagnostics come in handy when trying to determine if a Layer 1 fault may be involved and where it is. This technology, known as time-domain reflectometry (TDR), is available on all Cisco 3750 models including the new Catalyst 3750X.
An initial example: this shows normal use and a no-fault result.

fab20a#test cable-diagnostics tdr interface Gi1/0/25
TDR test started on interface Gi1/0/25
A TDR test can take a few seconds to run on an interface
Use 'show cable-diagnostics tdr' to read the TDR results.
fab20a#show cable-diagnostics tdr int Gi1/0/25
TDR test last run on: January 22 20:41:52

Interface Speed Local pair Pair length        Remote pair Pair status
--------- ----- ---------- ------------------ ----------- --------------------
Gi1/0/25  1000M Pair A     49   +/- 4  meters Pair A      Normal
                Pair B     45   +/- 4  meters Pair B      Normal
                Pair C     48   +/- 4  meters Pair C      Normal
                Pair D     45   +/- 4  meters Pair D      Normal
Another example: this shows normal use and an open circuit (most likely meaning no host is on the other side).
fab60a#test cable-diagnostics tdr interface GigabitEthernet 1/0/31
TDR test started on interface Gi1/0/31
A TDR test can take a few seconds to run on an interface
Use 'show cable-diagnostics tdr' to read the TDR results.

fab60a#show cable-diagnostics tdr int GigabitEthernet 1/0/31
TDR test last run on: January 08 14:45:40

Interface Speed Local pair Pair length        Remote pair Pair status
--------- ----- ---------- ------------------ ----------- --------------------
Gi1/0/31  auto  Pair A     3    +/- 4  meters N/A         Open
                Pair B     2    +/- 4  meters N/A         Open
                Pair C     0    +/- 4  meters N/A         Open
                Pair D     3    +/- 4  meters N/A         Open
Note that you must specify 'int' between tdr and the interface identifier. Presumably so you could shoot electrons at something that isn't an interface, like the door or something. An example of a broken pair:

fab20a#test cable-diagnostics tdr interface Gi1/0/25
TDR test started on interface Gi1/0/25
A TDR test can take a few seconds to run on an interface
Use 'show cable-diagnostics tdr' to read the TDR results.
fab20a#show cable-diagnostics tdr int Gi1/0/25
TDR test last run on: January 22 18:33:07

Interface Speed Local pair Pair length        Remote pair Pair status
--------- ----- ---------- ------------------ ----------- --------------------
Gi1/0/25   100M Pair A     49   +/- 4  meters Pair A      Normal
                Pair B     45   +/- 4  meters Pair B      Normal
                Pair C     48   +/- 4  meters Pair C      Normal
                Pair D     0    +/- 4  meters Pair D      Open
Here the "D" pair is broken. You can see from the 'pair length' column that the break is at the very beginning of the cable. This means we got lucky: we were able to replace the patch cable instead of having a contractor re-pull the cable in the conduit. Notice that with only 3 working pairs the link is still up, but at 100Mbit. Gigabit over copper needs all four pairs, while 100Mbit only needs two.
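If you collect this output from a lot of ports, picking out the bad pairs by eye gets old. Here is a rough Python sketch that parses the table and reports any pair that isn't Normal. It assumes the column layout shown in the examples above; parse_tdr and faulty_pairs are names I made up, and other IOS versions may format things differently.

```python
import re

# Matches one pair row of 'show cable-diagnostics tdr' output, e.g.
#   "Gi1/0/25   100M Pair A     49   +/- 4  meters Pair A      Normal"
#   "                Pair D     0    +/- 4  meters Pair D      Open"
ROW = re.compile(
    r"Pair (?P<pair>[A-D])\s+"
    r"(?P<length>\d+)\s+\+/-\s+(?P<error>\d+)\s+meters\s+"
    r"(?P<remote>Pair [A-D]|N/A)\s+"
    r"(?P<status>.+?)\s*$"
)


def parse_tdr(output):
    """Return one dict per pair row found in the TDR output."""
    rows = []
    for line in output.splitlines():
        match = ROW.search(line)
        if match:
            rows.append({
                "pair": match.group("pair"),
                "length_m": int(match.group("length")),
                "error_m": int(match.group("error")),
                "remote": match.group("remote"),
                "status": match.group("status"),
            })
    return rows


def faulty_pairs(rows):
    """Return the rows whose status is anything other than Normal."""
    return [row for row in rows if row["status"] != "Normal"]


# The broken-pair output from above.
SAMPLE = """\
Interface Speed Local pair Pair length        Remote pair Pair status
--------- ----- ---------- ------------------ ----------- --------------------
Gi1/0/25   100M Pair A     49   +/- 4  meters Pair A      Normal
                Pair B     45   +/- 4  meters Pair B      Normal
                Pair C     48   +/- 4  meters Pair C      Normal
                Pair D     0    +/- 4  meters Pair D      Open
"""

for row in faulty_pairs(parse_tdr(SAMPLE)):
    print("Pair %(pair)s: %(status)s at %(length_m)d +/- %(error_m)d meters" % row)
# prints: Pair D: Open at 0 +/- 4 meters
```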

Sunday, January 20, 2013

Stardate: Unknown

Stardates are how I would like to keep time for this blog, but Star Trek doesn't have a consistent sense of how to translate real time into stardates. During and after The Next Generation a certain amount of sanity appeared, but nothing that can be rolled backwards to figure out what 2013 would have been. The most consistent and repeatable way to represent current time in stardates is by using the stardate calendar for Star Trek Online, which has a direct mapping from the current age into stardates of the future (which actually takes place after the events of ST:Nemesis). Current stardate: 90660.85.

Calculator is here.