Not the Best Day: Server Problems at fxguide

Thanks for your patience for the last 29 hours…and thanks to those of you who let us know the site was down and you missed it. Jeff, Mike and I really appreciate it. Over the years, we’ve been fortunate at fxguide to avoid major server problems.

In fact, until Wednesday, we hadn’t had more than 3 hours of continuous downtime since our first day in 1999. We’ve been with our dedicated hosting provider, Server Beach, since 2001 and have been incredibly happy with their services and support. In fact, I was just bragging about this on Tuesday in Chicago with some fantastic new web programmers we recently hired at fxphd/fxguide.

So I guess you could say Wednesday was my fault.

So what actually happened with the server? If you want to know…click through the read more link.

Wednesday was a serious amount of downtime here at the site, adding up to almost exactly 29 hours. It’s all good now and the site is back in full force. The good news is that there were no attacks and our backups were perfect. No conspiracy. Just some hardware failing. The kind of thing you probably hate happening on your own computer.

In the end it came down to a RAID card failure (we use RAID1 for protection). The controller failed. One of the two drives failed. And, worst of all, the card wasn’t recognizing the RAID configuration, which meant a total OS reload (and a rebuild of the server). But the OS reload itself became problematic. Server Beach spent about 10 hours trying multiple times to reload the OS, and it failed each time.

After a ton of troubleshooting, they discovered the problem was my email login. I had used a “+” in my email address, since Gmail delivers any “+” variation of an address to the same account (as in “[email protected]”, “[email protected]”). I used this for easy filtering and forwarding from my end. But I guess their scripts took it as a grep pattern or something else. And it choked.
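A minimal sketch of how that kind of failure can happen. This is only an illustration of the guess above, not Server Beach's actual script: the address, the config line, and both helper functions are hypothetical. It assumes the address is dropped unescaped into a regular expression where "+" means "one or more of the previous character", as it does in Python's `re` module and in extended grep patterns.

```python
import re

# Hypothetical address and config line, made up for illustration.
email = "user+filter@example.com"
config_line = f"admin_email={email}"


def naive_match(address: str, text: str) -> bool:
    """Use the address directly as a regex pattern (the suspected mistake)."""
    # Here "r+" means "one or more r", so the pattern never looks for a
    # literal "+" character in the text.
    return re.search(address, text) is not None


def literal_match(address: str, text: str) -> bool:
    """Escape the address first so every character is matched literally."""
    return re.search(re.escape(address), text) is not None


print(naive_match(email, config_line))    # False: the real address is not found
print(literal_match(email, config_line))  # True: escaping fixes the lookup
```

The unescaped pattern also matches an address without the plus sign at all, such as `userfilter@example.com`, which is another way a script like this could quietly go wrong.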

But instead of simply asking me to change my email address, they tried multiple workarounds to get it to work. This is really the only fault I have with their support in the last 24 hours. A simple update to me could have easily straightened it out, as I just would have changed the email address. But instead, they tried to fix their loading scripts and such to accommodate my email address. I’ve been there myself at times in troubleshooting…you lose sight of what you really should be doing, which is finding the simple answers to problems.

By this time it’s about 20 hours into the day, so I bag it and head off to bed.

The next morning I’m up at 6am, but there’s no change in the server and, worse yet, no update from Server Beach. Finally, after multiple phone calls from me throughout the morning, they just change the email address and the OS reload works. After we got the server back in our hands, it took me about 90 minutes to configure the server and reload from the database backups. Copying the files took a bit longer with all the hi-res fxguidetv eps we’ve done over the years…but all in all that part was minimal. And I was relieved the backups worked.

So around 4pm here in Chicago we were back up today — and I could start working on the October fxphd term.

Again — apologies for the downtime. But I think there’s a good lesson in troubleshooting and fixing problems to be learned from this experience…which is why I share the story.

4 thoughts on “Not the Best Day: Server Problems at fxguide”

  1. phew !!!
    i was having withdrawal symptoms 😛
    Thanks for writing out your experience John. It’s something of a learning experience and might come in handy some day.

    look forward to a great new term @ fxphd.

    b

  2. i was wondering if you had some how p’d off 4chan and were being LOIC’d

    glad you’re back

    on the last fxdod you wondered why someone would listen but not join fxphd

    btw the reason I listen to fxdod but don’t join fxphd is that I am severely disabled and cant rely on having enough spoons http://is.gd/fEudS- in any week/month to do the course
