Roman Sandals

February 12, 2009

What does Pee Wee Herman have to do with YAML?

Filed under: Uncategorized — rchanter @ 11:14 am

Indeed, I often feel that XML documents, when compared with equivalent YAML files, demonstrate all the grace and calm reserve of a Pee-wee Herman chase scene (complete with rope swing, speedboat, sleigh, and man in a Godzilla costume).

That’s a quote from this book by André Ben Hamou, and it’s pretty much exactly how I feel about data serialisation. YAML has become my go-to format for just about everything. Why I love YAML:

  • It’s really easy to map out and visualise complex data structures, especially in languages like Perl where this can be a bit of a pain.
  • It’s completely cross-platform, so I can transport stuff between all the languages I write (yeah, OK, both of them if you don’t count 37 dialects of shell).
  • It’s safe — no eval required
  • Once your code is built to marshal/unmarshal using YAML, adding support for more formats (JSON, XML, language-native formats) is a piece of piss.
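
To make that last point concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not code from a real project: the `FORMATS` table and the `marshal`/`unmarshal` helpers are hypothetical names, and only standard-library formats are shown. YAML would slot into the same table through a third-party library.

```python
import ast
import json

# Hypothetical registry: format name -> (dump function, load function).
FORMATS = {
    "json": (json.dumps, json.loads),
}

def marshal(data, fmt):
    """Serialise data using the named format."""
    dump, _ = FORMATS[fmt]
    return dump(data)

def unmarshal(text, fmt):
    """Parse text in the named format back into data."""
    _, load = FORMATS[fmt]
    return load(text)

# Adding another format is one line. This one is a language-native
# literal format, parsed with ast.literal_eval, which only accepts
# literals and does not execute code.
FORMATS["pyliteral"] = (repr, ast.literal_eval)

# Round-trip the same structure through every registered format.
config = {"hosts": ["alpha", "beta"], "retries": 3}
for fmt in FORMATS:
    assert unmarshal(marshal(config, fmt), fmt) == config
```

The calling code only ever names a format, so the rest of the program doesn't change when a new one is registered.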

I will never write a config file parser again, nor hinky semi-structured report formats.

December 23, 2008

Net filter

Filed under: political, technology — Craig Lawton @ 7:40 am

Ars technica says it best:

“So, in summary, it appears that the government is trying to make up for the failure of an earlier PC-based filtering program by rolling out an alternative, ISP-level filtering program that they know won’t fully prevent access to illegal material. They promise not to state what sites are being blocked, even as they promise only illegal content will be. To prepare for the roll out, they’re doing live testing of equipment and protocols they haven’t used in the lab, and not telling the ISPs when the program will be ready. It sounds like all of the worst clichés about government incarnated in a single program.”

August 8, 2008

Lean and Mean

Filed under: business, management, technology — Craig Lawton @ 3:00 pm

Lean manufacturing principles originated in Japan.

People are now applying them in IT: Lean software management, and also other aspects of IT such as Service Management.

In July 2007 it all looked so promising.

A year later what went wrong?

July 29, 2008

Super-user excuses

Filed under: musing, sysadmin, technology — Craig Lawton @ 1:12 pm

System administrators always get super-user access. Third parties, increasingly located wherever, are often granted super-user access as well, usually to smooth project implementations. Super-user access is thrown around willy-nilly because it’s a hell of a lot easier than documenting privileges, which is really, really boring work.

This leads to poor outcomes: downtime, systems in “undocumentable” states, security holes etc.

The horrible truth is that somebody somewhere must be able to gain super-user access when required. It can’t be avoided.

The other horrible truth is that when you allow super-user access only because properly defining a particular role is hard, you are, in effect, giving up control of your environment. This is amplified when more than one team shares super-user access. It only takes one cowboy, or an innocent slip-up, to undermine confidence in an environment.

In this increasingly abstracted IT world, where architecture mandates shared, re-usable applications, where global resourcing mandates virtual remotely-located teams, where IT use and server numbers exponentially increase and where businesses increasingly interact through gateways, security increasingly looks like a feature tacked on at the last minute.

Security costs a lot and adds nothing to the bottom line – though lack of it can and will lead to some big bottom line subtractions.

The mainframe guys had this licked ages ago. The super-user excuse is looking rather thin.

The Age of Authorization is upon us…

Update: An amazing story from San Francisco, which outlines how lack of IT knowledge at the top of an organisation, and too much power devolved to too few IT staff, can cause much grief.

July 27, 2008

It’s really different this time…

Filed under: business, technology — Craig Lawton @ 11:28 am

There seems to be a common thread in the media that the IT industry is in for a downturn because the economy in general is struggling. I think it is different this time.

The last time IT struggled was at the end of the dot-com boom. The US dollar was really high, tech companies had massive inventories to clear and Cisco had been the biggest company in the world. The IT world had been set for a golden age which never arrived.

This time, the US dollar is low, tech companies are lean and in good shape having learnt their lessons, and surprise, surprise, the earnings of the big players are impressive and growing.

Intel, VMware, EMC, Apple, Microsoft, Google, all increased profits impressively. Some didn’t increase earnings enough and were “punished”, but this is clearly market sentiment. For example, VMware increased earnings by 40-ish% instead of 50-ish%, and their share price dropped. Strong international revenues especially are boosting results. SUN still struggles, but they were hit hardest by the dot-com era ending, and they still pull $4 billion in revenue each year.

Now big Australian corporate oligopolies, run by cosy, tech-ignorant boomers, have woken to the fact that they have under-invested in IT for the last decade, and have expensive legacy environments which are due for a big clear out. They have to spend money to make their environments lean; to make their businesses internationally competitive. And it’s a good time for CapEX in US dollars. Not only is IT gear very, very cheap compared to 8 years ago, each aussie dollar goes twice as far in US purchases as it once did.

May 23, 2008

A language for sysadmin testing

Filed under: sysadmin, test-driven sysadmin — rchanter @ 4:13 pm

This article is part of a series on Test-Driven Systems Administration.

Having decided that we’re going to test the hell out of everything, we need to settle on a language for both the tests we want to run, and the data we want to collect. In trying to build a systems test tool, there is a series of design decisions that flow from having a systems admin perspective.

Basic Design Principles

  • Low friction: it should be simple to write tests.
  • Didactic: Examining existing tests should yield something meaningful. They should serve as a knowledge-sharing channel.
  • Language independence: it should be easy to use your language of choice to write tests.
  • Ubiquity: run-time dependencies should not be an obstacle to deploying and using a tool.
  • Safety: it will be common for tests to run with elevated privileges.

This post is about all of these things, but principally about the testing language. At the highest level, we can divide language into vocabulary and grammar. It’s worth considering these separately when we look at systems testing.

It occurs to me that most sysadmins already have a perfectly good vocabulary for systems testing: shell one-liners and scripts. There are a few … ah, let’s say … improvement opportunities there. I’ll start by confessing that I love one-liners. I would reckon that 80% of what I want to do to assess a Unix system can be done with one-liners. Preliminary health checks on most servers are done with a handful of standard commands. Putting together a 6-command pipeline usually gets me cackling with glee at my own ingenuity. So one-liners are probably going to be a big part of my testing toolkit.

As for shell scripts, there are a few classic shell scripting patterns. In general, they violate my “Low Friction” principle.

  • One-offs, ranging from a simple for-loop at the shell prompt to single-purpose throwaway scripts (say, up to a couple of dozen lines).
  • Write a 20-line shell script that basically just builds and executes one command. By the time you get to the bit that executes the command, it looks something like “$DO_CMD $CMD_OPTS $ARGS”, so you need to trace the script logic, include debug code, or insert “set -x” all over the place to figure out what it’s doing. My personal favourite is a 65-line shell script that does an rsync in a for-loop and subversion commit. This violates my “Didactic” principle.
  • Write a 1000-line shell script that is effectively 100 little scripts in one, and sends you blind if you try and maintain it. Because it’s a shell script interpreted top-to-bottom, you have to put all your functions at the top and no one can find where the main loop starts. This violates the “Sanity” principle.
  • Come to your senses, abandon the shell script, and rewrite it in Perl (Python, Ruby, whatever your systems scripting language of choice is).

In practice, shell scripting tends to involve reusing a relatively small set of common idioms over and over. If you’re lucky, you’ll have a set of common libraries. But shell libraries have a tendency to be a bit opaque and non-portable in their own right (for example, useful as some of the things in Red Hat’s /etc/init.d/functions might be, nobody in their right mind is going to use them for portable shell scripts). If you’re less lucky, you’ll have some skeleton scripts that you can plug your specifics into. If you’re less lucky still, you do it from scratch every time, so no two scripts work quite alike (or you take more of a productivity hit than you should need to to automate something). There are a few well-known problems with shell scripts in general, but I’m not necessarily going to attack them head-on just yet.

  • Portability is awful. You have GNU and POSIX variants of utilities, varying directory locations, and no guarantees about the output format (see, for example, what Red Hat aliases “ls” to by default). GNU systems tend to let you get away with Bash-isms even when called as /bin/sh. Linux doesn’t have a “real” Korn shell. Solaris paths can be ugly. The list goes on.
  • Shell quoting rules can get ugly.

There is another useful idiom for shell scripts, and that’s the “foo.d/” directory that gets executed by a “run-parts” or similar calling mechanism. That’s good, and goes a long way to solving the 1000-line script problem, but doesn’t solve the problem of writing the same 20-line script with minor variations over and over.

The general idea of putting a bunch of scripts in a directory and pointing a test-runner at it is goodness. Which leads me to another design decision: use the file system as a database. This satisfies the “Ubiquity” principle, and besides, I’m generally in the Databases-Are-Evil camp. That’s not to say that storing tests in a database couldn’t come later, but it’s certainly not necessary.
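
A minimal sketch of that idea in Python, in the spirit of “run-parts”. The function name `run_test_dir` is hypothetical, and treating every executable file in the directory as a test is an assumption made for illustration.

```python
import os
import subprocess

def run_test_dir(test_dir):
    """Run every executable file in test_dir, in sorted name order.

    Returns a list of (filename, exit_code) tuples. Files that are
    not executable are skipped.
    """
    results = []
    for name in sorted(os.listdir(test_dir)):
        path = os.path.join(test_dir, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            # Each test is its own process; the file system is the
            # only "database" the runner needs.
            proc = subprocess.run([path], capture_output=True, text=True)
            results.append((name, proc.returncode))
    return results
```

Pointing it at a directory such as a hypothetical `tests.d/` gives one result per script, and adding a test is just a matter of dropping another executable file in.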

The second aspect of this vocabulary is how to interpret the results of running some command. The most obvious is exit codes. This is not without its problems (quick, what exit code does /usr/bin/host return for NXDOMAIN replies on your system?), but in a controlled environment it’s a good place to start. It’s also as applicable to more serious systems glue languages as it is to shell. There’s also the presence or absence of output, the contents of the output, whether anything gets written to STDERR, or the reply codes for application protocols. We should be able to deal with all of these.
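
As a rough sketch, most of those signals can be collected in one place. The `Observation` class and `observe` function are hypothetical names, and the 30-second timeout is an arbitrary assumption. Protocol reply codes are left out, since they depend on the application being tested.

```python
import subprocess
from dataclasses import dataclass

@dataclass
class Observation:
    """What we learned from running one command."""
    exit_code: int
    stdout: str
    stderr: str

    @property
    def has_output(self):
        # Presence or absence of output on STDOUT.
        return bool(self.stdout.strip())

    @property
    def wrote_stderr(self):
        # Whether anything was written to STDERR.
        return bool(self.stderr.strip())

def observe(argv, timeout=30):
    """Run argv (a list of strings) and capture everything observable."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return Observation(proc.returncode, proc.stdout, proc.stderr)
```

A test can then make whatever assertion fits: the exit code, a pattern in `stdout`, or simply that `wrote_stderr` is false.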

So to summarize the “vocabulary”, we have a pretty standard set of building blocks: One-liners, shell/perl/python/whatever scripts, exit codes, command output, and protocol reply codes.

By “grammar” in a systems testing language, I really mean the file formats and system APIs we expect to deal with.

If we’re going to move our most common logic idioms back from the test cases into the harness, we need to settle on some sort of format to describe the specifics of a test. An example might be:

command: /bin/grep 'my\.dns\.server' /etc/resolv.conf
  0: OK I have the right nameserver
  1: FAIL my.dns.server missing from resolv.conf
  2: FAIL something went wrong with grep

This example encapsulates pretty much everything we want to check, with no program logic or environment variables getting in the way.
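
To show how little the harness needs in order to act on a description like that, here is a hedged sketch in Python. The dictionary layout, the `results` key and the function name `run_described_test` are all assumptions for illustration, as is the grep path. The actual test format is a topic for a later post.

```python
import shlex
import subprocess

# Hypothetical in-memory form of a test description: one command,
# plus a message for each exit code we know how to interpret.
TEST = {
    "command": "/bin/grep 'my\\.dns\\.server' /etc/resolv.conf",
    "results": {
        0: "OK I have the right nameserver",
        1: "FAIL my.dns.server missing from resolv.conf",
        2: "FAIL something went wrong with grep",
    },
}

def run_described_test(test):
    """Run the described command and map its exit code to a message."""
    # shlex.split applies shell-style quoting rules without invoking
    # a shell, so the command is never passed through eval.
    argv = shlex.split(test["command"])
    proc = subprocess.run(argv, capture_output=True, text=True)
    default = "FAIL unexpected exit code %d" % proc.returncode
    return test["results"].get(proc.returncode, default)
```

All the logic lives in `run_described_test`, so the description stays pure data. Calling `run_described_test(TEST)` returns one of the three messages, or the fallback if grep exits with a code the description doesn't list.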

The most likely candidate for the test description format is a data-serialisation format, like YAML, JSON, Perl’s Storable or Data::Dumper, or, God Forbid, XML. Remember, since we’re abstracting all the logic out into the harness, all we need in the test description is the thing to be tested (usually a command to run) and a little bit of metadata, such as how to interpret the results. For concise representation of not-too-deeply-nested data structures, YAML seems like the best fit to me as a starting point:

  • It’s better for human consumption than XML, so it’s a good choice for human-editable inputs.
  • It’s not executable (unlike JSON or the native Perl serialisation formats), which is in line with the “Safety” principle.
  • It’s relatively ubiquitous; there are good-quality YAML libraries for all the popular systems glue languages.

The other thing a test harness needs to do is produce output you can use. Pass/fail counters on STDOUT are fine, but not so useful for examining the test output in more detail, for audit trails, for capturing trends, or for tarting up into web pages. So I want something that can produce different styles of output for the different presentation contexts. The same set of data serialisation formats I mentioned above would be a good start, along with syslog and pretty(-ish) STDOUT output. There are also specific testing protocols like TAP, TET, or others which would be useful to implement.
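
As a small illustration of one result set feeding several presentation contexts, here is a sketch that renders the same results as TAP and as JSON. The function names and the `(description, passed)` tuple layout are assumptions. Only the basic TAP plan line and `ok` / `not ok` lines are produced, with none of the protocol’s optional directives or diagnostics.

```python
import json

def to_tap(results):
    """Render (description, passed) pairs as basic TAP lines."""
    lines = ["1..%d" % len(results)]  # the plan: how many tests ran
    for number, (description, passed) in enumerate(results, start=1):
        status = "ok" if passed else "not ok"
        lines.append("%s %d - %s" % (status, number, description))
    return "\n".join(lines)

def to_json(results):
    """Render the same pairs as JSON, e.g. for a web page or audit trail."""
    records = [{"description": d, "passed": p} for d, p in results]
    return json.dumps(records, indent=2)
```

Because both functions take the same input, adding syslog or pretty-printed STDOUT output later means writing one more renderer rather than touching the tests.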

Next post: test formats.

St. George and the IT dragon

Filed under: business, musing — Craig Lawton @ 3:08 pm

Over the last few months I’ve read a couple articles in the AFR relating to IT spend in M&A activity.

It’s amazing to consider that about half of the business integration costs for the proposed merger between Westpac and St. George will be in IT (0.5 * $451,000,000).

Consider that the Commonwealth Bank is planning on spending $580,000,000 to re-engineer its aging platforms (to me this means cleaning out all the legacy crap), and NAB is looking at doing the same.

A merged Westpac/St. George would be $225,500,000 behind the eight-ball, before it could even contemplate a project of this scale.

Also, to make the merger more attractive, or because of uncertainty, either side could be tempted to put off required upgrades, lay off staff (possibly key staff), and run-down maintenance.

Accenture recently concluded a survey of 150 CIOs and found that poor IT integration was the leading cause of failure to meet the stated objectives of a merger or acquisition (38%).

It makes you wonder if this whole “IT thing” is going to collapse under the weight, and expense, of its own complexity!

Frustrating in-house systems

Filed under: musing, technology — Craig Lawton @ 1:25 pm

I’m constantly amazed at the crappy performance of in-house applications at the places I’ve worked. Customer-facing applications must perform, or business is lost. In-house applications are never tuned for performance it seems, and this makes work that much harder.

This difficulty is related to the level of brain-memory you are using for your current task. Very short term memory is great, and necessary, when you are flying through a well understood task. But short system interruptions (usually involving the hour glass) force you to use more extended memory times, making the effort that much larger, and less enjoyable.

There are other types of interruptions of course, which have a similar effect, such as people-interruptions (“What are you doing this weekend?”) and self-inflicted-interruptions (such as twitter alerts).

If your system hangs for long enough you may start a new task altogether (so as not to look stoned at your desk) and therefore lose track completely of where you were at.

This forces unnecessary re-work and brain exhaustion!

I see lots of people with “notepad” or “vi” open constantly so they can continually record their work states. This is a good idea but takes practice and is an overhead.

It comes down to this. I want a system which can keep up with me! :-)

And is that unreasonable, with gazillions of hertz and giga-mega-bits of bandwidth available?

May 1, 2008

Going with the cloud

Filed under: management, musing, technology — Craig Lawton @ 5:01 pm

Really interesting article on the Reg’ which should put data centre fretters’ feet firmly back on the ground. It seems the “thought leaders” don’t see data centres disappearing anytime soon because:

  • Security – “… there are data that belongs in the public cloud and data that needs to go behind a firewall. … data that will never be put out there. Period. Not going to happen. Because no matter how you encrypt it, no matter how you secure it, there will be concerns.”
  • Interoperability- “…figure out ways for systems that are … behind the firewall … to interoperate with systems that are in the public cloud”
  • Application licensing complexity.
  • Wrangling code to work over the grid – getting any code written that exploits parallel infrastructure seems to be very difficult.
  • Compliance – “What happens when government auditors come knocking to check the regulatory complicity of an application living in the cloud?”

Also, they didn’t cover jurisdictional issues, such as who you take to court, and in which country, when there is an issue with data misuse “in the cloud”.

It makes you wonder about why cloud computing will be any different to grid computing, or thin desktop clients. A great idea, but not enough inertia to overcome ingrained corporate behaviour.

April 7, 2008


Filed under: technology — rchanter @ 8:26 pm

Despite having been more or less web-native since the mid-90s, I’ve never really done much hands-on web design or javascript programming. Still, I read (and listen to) enough tech stuff to get the general idea. Today I decided I needed to do a little javascript to flip between alternative presentations of some data. So I figured, generate all 3 up-front, put them inside divs, and set the CSS display property for the one I wanted. That much I knew before I started.

Off to look for sample code. I realise that for people who actually do web work that this is the equivalent of “Hello World”, but I still needed a little help. A handful of JS and CSS tutorials later, and I found myself on the Yahoo developer site looking at YUI.

15 minutes later, a fully functioning tabbed widget containing my 3 bits of data, completely integrated with the existing stuff (different display options for diff output, for what it’s worth). I am seriously impressed at how good YUI is for grab-and-go code samples. Would have taken me at least an hour from scratch (yeah, I know, I’m a sysadmin, not a proper programmer).
