Just In Case 1and1 Takes Us Offline Again…

I thought I’d give you a quick update on our hosting status while I have an operational window. As some of you may have noticed last night and again around midday today, we’re continuing to experience sporadic outages of our MySQL databases. If you try to comment and get a weird error, that’s probably it. If you want to be sure, hit our wiki; if that’s offline, it’s a MySQL problem. Or, to place the blame more accurately, it’s a 1and1 problem; as far as I’m aware, this experience has nothing to do with MySQL as a product. It’s rather 1and1’s inability to manage connectivity to that product.
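For the scripting-inclined, the wiki check described above can be sketched roughly as follows. This is a minimal illustration, not something we actually run: `WIKI_URL` is a placeholder rather than our real address, `fetch_status` and `diagnose` are hypothetical names, and the sketch assumes an unreachable or erroring wiki means the database is down. A page that loads but shows a database error in its body would slip past it.

```python
from typing import Callable
from urllib.request import urlopen

# Placeholder address (an assumption for illustration, not the real wiki URL).
WIKI_URL = "https://example.com/wiki/"


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url.

    urlopen raises an OSError subclass (URLError, HTTPError, timeout)
    when the request fails or the server answers with a 4xx/5xx code.
    """
    with urlopen(url, timeout=timeout) as response:
        return response.status


def diagnose(url: str = WIKI_URL,
             fetch: Callable[[str], int] = fetch_status) -> str:
    """Apply the rule of thumb: wiki offline means a MySQL problem."""
    try:
        fetch(url)
    except OSError:
        return "wiki offline: probably the MySQL outage"
    return "wiki online: MySQL is probably reachable"


if __name__ == "__main__":
    print(diagnose())
```

The `fetch` parameter exists only so the check can be exercised without touching the network, for example by passing in a function that always raises `OSError`.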

Despite repeated emails to a 1and1 supervisor requesting information on what, precisely, the problem is, I’m still in the dark. Here, in fact, is one of the email threads from an interchange with a 1and1 supervisor (whom I will anonymize):

Dear Steve,

I already escalated your problem to the admins. I will monitor the case
and if there’s any feedback from them I’ll let you know.

Sorry for the inconvenience. Hopefully the admins can permanently fix the
problem.

kind regards,

> relating to my message from last night, the mysql outage is *again*
> occuring.
>
> why has this not been permanently fixed?
>
> On Tue, 2006-05-23 at 18:30 -0400, wrote:
>> Dear Steve,
>>
>> As I check the records here, somebody already fixed the problem and
>> emailed you. Can you please check it in your end?
>>
>> Once again, sorry for the inconvenience.
>>
>> If you have further questions, please do not hesitate to contact us.
>>
>> kind regards,
>>
>>
>> > ,
>> >
>> > i appreciate the apology, but we’re about to enter day five without
>> > access to our MySQL databases. when should we anticipate them being
>> > back online? every day that our databases are down, 1and1 is costing
>> > us money. we pay for a dedicated server to avoid these sorts of
>> > problems.
>> >
>> > more importantly, why has the problem persisted this long? the
>> > previous occurence of this problem was addressed relatively quickly,
>> > and yet this has dragged on for nearly a week. even if this was a
>> > hardware related problem, surely you could have reimaged and reloaded
>> > from backups by now.
>> >
>> > i would appreciate the following:
>> >
>> > 1. a detailed explanation of what – precisely – the problem is
>> > 2. some expectation of when the problem might be resolved, or failing
>> > that, dumps of our MySQL databases that can be restored on alternative
>> > hardware
>> > 3. a service credit for the complete outage and lack of communication
>> > regarding the problem
>> >
>> > – steve
>> >
>> >
>> >
>> > On Mon, 2006-05-22 at 15:33 -0400, wrote:
>> >> Dear Mr. Stephen O’Grady,
>> >>
>> >> I sincerely apologize for the inconvenience it has caused you and for
>> >> the delayed reply. As the issue with the outage of our MySQL database
>> >> has not yet been completely fixed, your case which was previously
>> >> being worked on by our second level of support is now escalated to
>> >> our technical guys in Germany. Currently, we do not have an estimated
>> >> time frame for its resolution. Be assured though that we are aware of
>> >> this issue and that we are taking extra measures to resolve this as
>> >> soon as possible. Thank you for your patience in this matter.
>> >>
>> >> Sincerely,
>> >>
>> >> 1&1 Internet
>> >>
>> >>
>> >> > ,
>> >> >
>> >> > i’m requesting a status on our account (#7121550), which experienced
>> >> > an outage of our MySQL databases last thursday – an outage which has
>> >> > not been fixed some four days later.
>> >> >
>> >> > when will this be addressed?
>> >> >
>> >> > – steve

But you’ve heard enough bitching from our end; what matters now is our plan going forward. The first step is recovering our V20Z; that will happen tomorrow afternoon. Next up is shipping it to the chosen hosting provider (I’m almost decided on this). From there, I’ll get Apache, MySQL, etc. up and running and drop in our base site along with the blogs. Once those are in place, I’ll cut over DNS, and 1and1 should be more or less out of the picture from your perspective. It may take me a bit longer to get the wiki back up and running, because we’re still running MediaWiki 1.3 (the current version is 1.6), but we’ll restore it as soon as humanly possible. Our subscriber library, now having been phased out contractually, will not be ported: we will instead be making that content freely available via a means as yet undetermined (that’s Cote’s baby ;).

Will we have better uptime than 1and1? Who knows – anything can happen, and it’s entirely possible that we’ll have hardware problems, or that I’ll screw up our Apache config, etc. But at least we’ll have some control, and won’t be at the mercy of a provider who keeps us on hold for a half hour to open a ticket about a problem that’s been fixed three times now. If/when something does go wrong, I should at least be able to give you better information than I can today.

I feel like a broken record saying this, but again, let me apologize personally and on behalf of my colleagues for the inconvenience that’s a direct result of 1and1’s incompetence. We’re doing everything in our power to rectify the situation, and hope to have some news for you on that front soon. If you have problems or questions, please direct them to me as the hosting situation is my responsibility. Thanks, and sorry.

tecosystems
