DreamHost is having serious problems with one of its fileservers this weekend. The problems started early Saturday morning and are still not completely solved now, 48 hours later.
DreamHost has been pretty good at updating its DreamHost Status blog this time (Network fileserver troubles, File Server Issues, File Server Finale and More Files Issues!), but I just got the latest news directly from DreamHost support:
As you may know, we recently had a network filing system crash causing severe downtime. When a filer like this crashes, it basically has to go through each of its disks to verify that they are ok, and can accept, read or change data. This takes an extremely long time, due to the sheer size of the machine (two terabytes!).
Once we got the machine up, fixed and serving files, everything seemed like it was ok, so we went back to making sure all content, data and websites were working normally.
Right about then, it crashed again! This time, however, it came back up correctly, so it didn’t take as long as it had previously.
That was 6am PST this morning. Since then, we have been constructing a new filer machine (we had to cannibalize two just to get this current one back up and running) to offload everyone. In the meantime, it has crashed again; however, it seems to come back up in the correct state. If possible, you may want to get any sensitive or important data off of your account, just to be safe. We are working on getting everyone off the faulty machine; however, as you can imagine, it will take some time.
We are terribly sorry about the problems related to this disaster and hope to have everything stabilized and working ASAP. Please understand that we are doing everything humanly possible (including working 24-hour shifts, and sleeping in our data center!) to get every site back up and running. If your domain is down, or showing an error, we are working on reconfiguring all of the affected services, and should have them fixed soon.
Since the discussion forum also seems to have been hit by the outage, it’s hard to tell how many customers are affected, but judging from the support queue (which has averaged 400 customers, with a maximum of 600), the incident might not be as severe as it sounds. Hopefully DreamHost’s staff will soon have everything back under control.