Tuesday, December 2, 2008

Uptime: It's not rocket science

I've always been stumped by how some people have so much trouble keeping a server up. Worse yet are those who don't have a proper plan for when it goes down, which does happen.

In the past 2 days, I've witnessed two Ultima Online servers go down for hours. :O

There is no reason for a server to be down that long. If an HDD fails and it's not in a RAID, there should be a very good plan to recover as fast as possible: 1-2 hours MAX. With RAID, there is no reason it should take more than an hour if something else goes wrong.

I will admit, my online production server has NO RAID. However, I have everything documented in case of a failure, so I can be up and running ASAP.

Also, backups are your friends. Automatic backups. No excuse.

That said, this is my uptime.



I'm aiming to pass 1 year. This server has been rebooted once, the day after it was deployed. I did that reboot to test that everything comes back up properly, so I don't get any surprises in the future if I do have to reboot (such as a service not set to start automatically when the server reboots while I'm not around).

This server is hosted at The Planet and their service and support have been great.

2 comments:

Bryan C. Fleming said...

You would think they would be using an uptime monitoring service (I use http://www.internetuptimemonitor.com). That way they could fix the problem within a few minutes.

BTW - Nice Uptime on your server. 230 days is pretty amazing!

Red Squirrel said...

Good point, especially if they have a cell phone or other mobile device they could have alerts sent to.

In fact I don't use such a system and probably should (will code one eventually), but then again I only have one server, and if it goes down I'll know fairly quickly. I go to one of my sites at least once every hour. The issue I could run into, though, is if it goes down while I'm at work.

But even then I could quickly put something up so there's at least a status page of some sort to tell everyone I'm aware and will be working on it ASAP. A lot of people tend to leave you in the dark when their service goes down.
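
For anyone wanting to code a simple uptime check like the one talked about above, here is a rough sketch in Python. It is only an illustration, not anything that is actually running on the server: the names is_up and monitor, the 60 second interval, and the example.com URL are all made up for the example, and alert just prints unless you swap in something that sends an email or text message.

```python
import time
import urllib.error
import urllib.request


def is_up(url, timeout=10, opener=urllib.request.urlopen):
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, timeouts and HTTP errors.
        return False


def monitor(url, interval=60, alert=print, checks=None,
            sleep=time.sleep, probe=is_up):
    """Poll the URL and call alert() only when the up/down state changes.

    checks=None polls forever; a number stops after that many checks.
    All parameter names here are hypothetical, chosen for this sketch.
    """
    was_up = True  # assume the site is up when monitoring starts
    count = 0
    while checks is None or count < checks:
        up = probe(url)
        if up != was_up:
            alert(f"{url} is {'back up' if up else 'DOWN'}")
            was_up = up
        count += 1
        if checks is None or count < checks:
            sleep(interval)
```

Calling monitor("http://www.example.com/") would poll once a minute until stopped. Alerting only on a change of state keeps it from sending a message every minute during a long outage. It should run from a different machine than the server being watched, otherwise it goes down along with the site.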