We are just now recovering from a massive power outage at Rackspace’s Dallas (DFW) data center, where all of our Laughing Squid Web Hosting servers are located, including the one that hosts this blog. A truck hit one of their transformers, and the resulting damage compromised power to the data center, specifically the cooling system. Our servers were proactively shut down to prevent overheating until power was restored. Total downtime was approximately two hours.
While we were offline, I was posting updates over on our hosting status blog, which is hosted on wordpress.com. Overall Rackspace handled the situation well, but one of my main complaints is their lack of a public status page or blog, which would have been very useful when conveying information to our customers. Rackspace posted an update on their customer portal an hour into the outage, but up until that point I had to rely on Twitter updates and blog posts for information on the downtime.
Special thanks go out to the Laughing Squid support team for going the extra mile during this unfortunate situation. This is the first time we have had a power outage like this in the 9 years that we have been in hosting (8 of them at Rackspace).
UPDATE 1: I’ve been tracking coverage of this issue over on our status blog and there is a Techmeme thread currently developing.
UPDATE 2: 37signals is hosted at the DFW data center and was down as well.
UPDATE 3: Rackspace has posted an update on their two recent power outages.
We cannot promise that hardware won’t break, that software won’t fail or that we will always be perfect. What we can promise is that if something goes wrong we will rise to the occasion, take action, resolve the issue and accept responsibility.
UPDATE 4: I just had a great conversation with Rackspace chairman Graham Weston and co-founder Pat Condon about the power outage, discussing details of how it happened and what they are doing to try to prevent it in the future. They agree with me about the need to provide information on data center issues much faster, and I think I’ve even talked them into setting up a status blog.
UPDATE 5: We have some great neighbors at Rackspace: Threadless was down too.
UPDATE 6: The Technology section of The New York Times is currently linking to our status blog coverage on the power outage.
UPDATE 7: Rackspace has posted several updates on what caused the power outage and resulting downtime, as well as how they are working to prevent it in the future. Lanham Napier, President & CEO of Rackspace, has posted a video statement as well.