Saturday, December 09, 2006

fighting fires

I think I've figured out what it is that I most enjoy about sysadmin work: fighting fires. No, not literal fires. If I ever have to fight a literal fire while working as a Unix system administrator, things will have gone very, very wrong. The fires I'm talking about are the system outages that have to be resolved as soon as possible, and even then it will be too late. The kind of outages that have customers asking for tens of thousands of dollars in refunds, the kind that have CEOs screaming for status reports every 30 minutes, the kind that leave dozens of people unable to do their jobs until I fix the computers they depend on.

I'm exaggerating a bit--clearly, any job that features this kind of activity on a daily basis is a job that won't last long, if only because any company that has so little stability in its computing infrastructure is doomed to fail. Alternatively, any sysadmin who sees so many outages happening on his watch is a sysadmin who really needs to go back to answering phones for the help desk. The whole point of employing sysadmins is to prevent these sorts of outages from ever occurring, after all.

But any IT shop that is frequently changing its hardware or software infrastructure is going to experience unplanned system outages or face projects with urgent deadlines that absolutely must be met. These situations can be stressful, but it's a kind of stress that keeps me interested in what I'm doing. Much better to be working under pressure than sitting around waiting for something to happen, doing hardware inventory, reading or writing documentation, or looking for ways to tweak software to gain incremental efficiency improvements. I *like* knowing that people are depending on me to get a job done.

We had another outage on our LDAP infrastructure for about 30 minutes yesterday morning. It could have been much worse, but a coworker and I were on the problem immediately: we diagnosed it, developed a procedure for fixing each LDAP client, and had over 300 servers repaired in 30 minutes. It was *fun*. I want that kind of fun more often.

This job has been a bit too boring for my tastes. Fortunately, that should be changing. We just kicked off the planning for a massive overhaul of our core database infrastructure. We'll be migrating our storage platform to a new Hitachi SAN, our database servers to new Sun v890s, and our network gear to new Cisco switches.

This wouldn't be that big a deal, except that our customers want no downtime whatsoever (sad for them, because downtime will be unavoidable), and our management wants us to upgrade all of our customers to new versions of our proprietary software platform at the same time. And they want all of this done by the end of the first quarter. So the entire project constitutes one big fire. I have a feeling I'll be working a *lot* of Saturdays for the next three months. I'm glad. I'm tired of boredom.
