Six back-to-back power outages hit the SOMA neighborhood of San Francisco Tuesday afternoon, causing major havoc with popular web services. The 365 Main data center is down, along with craigslist, Technorati, Yelp, AdBrite and SixApart (including TypePad, LiveJournal and Vox). The outages also caused some problems with servers used by Current TV, RedEnvelope and Second Life.
Pacific Gas and Electric (PG&E) is currently working on the issue. It is now estimated that over 30,000 PG&E customers are without power.
So the big question is, where is the backup power at the data centers used by these services? UPS units and diesel generators should normally help in a situation like this.
UPDATE 1: Digg is still up.
UPDATE 2: Some services, like Technorati, are starting to come back online.
UPDATE 4: We’re seeing several reports of a chaotic scene at 365 Main, with a line of sys admins forming outside the door waiting to get in to work on their servers.
UPDATE 6: The funny thing is that The Onion predicted all of this a couple of weeks ago.
UPDATE 8: The Associated Press is reporting that the Netflix downtime is not related to the power outage in San Francisco. I’ve updated the post accordingly.
UPDATE 10: Automattic sys admin (and former Laughing Squid sys admin) Barry Abrahamson has a great write-up on why data centers should have better power redundancy and what they have done with WordPress.com to help it survive possible outages like this.
UPDATE 11: What’s amazing about this is that there has still been no public response from 365 Main on their website. It seems like they should at least acknowledge the issue.
* At 1:49 p.m. on Tuesday, July 24, 365 Main’s San Francisco data center was affected by a power surge caused when a PG&E transformer failed in a manhole under 560 Mission St. While back-up electrical infrastructure is installed in the facility to defend against power surges, an initial investigation has revealed that certain 365 Main back-up generators did not start when the initial power surge hit the building. On-site facility engineers responded and manually started affected generators allowing stable power to be restored at approximately 2:34 pm across the entire facility.
* As a result of the incident, continuous power was interrupted for up to 45 minutes for certain customers. We’re certain 3 of the 8 colocation rooms were directly affected, and impact on other colocation rooms is still being investigated. Due to the complexity and specialization of data center electrical systems, we are currently working with Hitec, Valley Power Systems, Cupertino Electric and PG&E to further investigate the incident and determine the root cause of why certain generators did not start. All generators will continue to operate on diesel fuel until the root cause of the event has been identified and corrected. Generators are currently filled with over 4 days of fuel and additional fuel has already been ordered.
* We will apply knowledge gained in this investigation to all 365 Main facilities to help prevent this type of incident from happening again.