Rackspace Outage Nov 12th

On November 12th at 13:51 CST, Rackspace experienced an isolated issue in their core network. A small number of their customers were affected, including REW. The outage lasted about 90 minutes. In simple terms, a core network switch died, and when the traffic failed over to the secondary switch, it also died. Rackspace is investigating the incident to find ways to improve their network and processes so that this event is not repeated. REW sysadmins were immediately notified of the outage by our monitoring tools and stayed in constant contact with Rackspace throughout, working to resolve it as quickly as possible.

REW apologizes for this outage. We promise that we are holding Rackspace's feet to the fire to ensure maximum uptime for our customers!

Here is the incident report from Rackspace, if you want the techy details:


INCIDENT OVERVIEW

On 12 November at 13:51 CST, an issue occurred with an ExNet aggregation router in our DFW data center. As a result, a portion of customers with devices provisioned to the router experienced an interruption in service for approximately 53 minutes due to a failed module on the device. Our engineers replaced the module at 15:21 CST to restore service.

REMEDIATION AND CURRENT STATUS

Engineers were alerted to failures on a switch within the affected aggregation router VSS pair. During remediation efforts being performed on the secondary affected switch, the primary switch became unstable and rebooted into recovery mode. The problem on the secondary switch was caused by a faulty module that was replaced to restore service. We apologize for any inconvenience you experienced and appreciate your patience as we worked to resolve the issue. We will be performing a root cause investigation to determine the cause of the issue as well as actions to ensure a stable and reliable network environment.

Comments

mikey

Aha! I saw that both of my websites were down and when I went to the REW website it was also down. I'm glad REW was able to recover quickly.
