Wikipedia suffers outage

February 13, 2006

Wikipedia, the well-known free-to-edit encyclopedia, suffered an extended period of downtime starting at approximately 10:00 UTC, after the failure of the primary DNS/NFS server caused numerous cascading failures on other servers within the Wikimedia cluster.

The failure rendered all Wikimedia projects and web sites inaccessible for several hours. Technical staff worked to restore services using backup equipment and set up temporary hosts to fulfill some of the roles of Zwinger, the failed host.

Following restoration of the backbone of the cluster, several problems arose within the database, causing pages to appear missing. At this time, it was noticed that normal editing by users was inadvertently damaging pages, although no histories were lost. The exact cause of the database issues is not known; while there was heavy replication lag at the time, it has been asserted that this would not have caused such widespread problems.

To prevent future problems and to protect the integrity of data, editing was disabled for 2 hours and 47 minutes. Meanwhile, technical staff worked to restore several Squid cache servers and to reduce dependencies on NFS mounts.

Editing returned to near-normal levels from about 21:45 UTC.

As of 22:00 UTC, no official statement on the downtime has been released.