May 26, 2004

SysAdmin to SysAdmin: Did you ever have one of those weeks?

Author: Jim Westbrook

This week's little computer-based irritations have reached the
overload state for me. I need to vent a little to folks who'll understand
the frustrations.

The network at the office has been in transition hell for almost two weeks. We're adding a new server and several administrative applications, and moving rack gear from a small rack to a new cabinet. Of course, the hundred-plus users
all tried to find new ways to drive me nuts, too. Sometimes being a
department of one is a good thing, but not in the last couple of weeks -- too
many schedule-dependent things going on at the same time. The bean counters were screaming almost hourly about the collective cost of the new
hardware that wasn't yet producing tangible results.

User workstations with burned-up motherboards were among the top
contributors to my declining sanity. Two of them died within the span of
four hours for no apparent reason. On one, the IDE drive controller literally
melted a hole in the motherboard before it quit. The other had a
fan die on the video card, which then soldered itself to the AGP slot by
dripping what used to be circuit traces into the socket. Need I mention
that these two machines actually handled about 35 percent of the transactions
that produce revenue? I literally had to drop everything else to get
replacement boxes ordered and substitutes online. The best thing in
this particular mess was that the replacement machines took only two days to
arrive.

"I think I may have a virus" is not the first voice-mail you want to pick
up on a Monday morning. Not only did the caller have a virus, but so did the
rest of the systems on that subnet before I could disable the nodes at the
router. Naturally, it was a Sasser variant for which our anti-virus
software did not yet have a signature, much less a clean-up utility. Those
items would not be available for another five hours. This was the
accounting department's subnet, and it was the department manager who started this snowball down the hill. At least they knew why nothing else was getting
done. Some of them even began to understand why I hate Windows.

Finally, it was Tuesday, the scheduled day (night really) to do the
hardware shuffle on the rack gear. After pulling the bolts out of the
existing rack, I started the move into the new locking cabinet. That's when
I discovered that the bolts from the old rack were too large for the
cagenuts in the cabinet. At 9 p.m. there were no suppliers open, so I had to
remount all of the gear in the old rack and call it a night.

The next available time that I could take the network offline was Friday night after
the close of business. It was well into Saturday before I got out
of there. Somewhere in the shuffle I managed to kill one of the patch rack
connections, only I was not aware of that detail until the following Monday
morning when that department's manager called, screaming that they were
unable to accept inbound merchandise because the barcode machine could not
connect to the network. Re-punching the patch rack did not correct the
problem, so it's still down, waiting for the cable vendor to replace the in-wall
Cat-5 cable -- more not-so-happy folks to add their voices to the bean counters'
chant of "cost overrun" on every facet of the works in progress.

I could hardly wait to get home to my Linux network, where viruses are not
rampant, the network hardware is not in flux, and I could just relax at the
keyboard. I should have known better, huh? What I found when I got home
was that there had been a brief power outage at about 9:15 a.m. Two of the three
UPS boxes had failed, and only one system had come up when the power was
restored. Only now, four hours later, are all of the systems finally
back online. Honestly, two of them just needed to be rebooted to ensure
that all of the desired services were restarted. The third, however,
failed fsck of the ext2 boot partition. I was finally able to boot a Knoppix disk and run fsck manually, which corrected the corruption, but it took three downloads to get a working
ISO file that the CD-ROM drive in the crashed machine could read.

I was handling all of this without coming completely unglued until I went to
print a short document. The danged inkjet printer is out of the
liquid gold Lexmark calls ink. At least I can send the file to work and
print it tomorrow -- I hope!
