First, let me start by apologising to you for last night's outage. We pride ourselves on having built a resilient, independent network, but a small part of it let us down in a big way yesterday.
What happened?
Last night's issue (28 August 2019) was the result of our upstream vendor's switches going into a constant flapping state (service repeatedly going up and down) between 20:30 and 00:45. The vendor is still investigating the cause.
Both our protected route between JHB and Cape Town and our international transit from this vendor are provided via a redundant link at our Cape Town node that is served from the affected switches. This meant that some transit traffic bound for our network couldn't make it to where it needed to go, both in Cape Town and, in some cases, nationally. In addition, services such as AWS, which we pick up primarily in Cape Town, were affected for a short while until we forced them to JHB.
What are we doing about it?
We are bringing up new national routes between Cape Town and Johannesburg that we will manage entirely ourselves. Web sQuad are one of the most diversely connected ISPs locally, with multiple transit providers, and we are restructuring this design so that a single provider can't adversely affect a user's experience to the extent it did yesterday. We have also terminated all services with the affected vendor.
What's the way forward?
Our upgrade path to ensure you're not affected again is as follows:
- We are establishing a new, fully protected JHB-CPT circuit with immediate effect. We are awaiting the installation of a few nuts and bolts today and tomorrow, after which we will immediately migrate the L2 service away from our current vendor. This will completely eliminate any risk associated with the switches in question. We are working to have this process completed by early next week.
- Our largest international transit provider will switch their handoff to Cape Town instead of Johannesburg later next week.
- A new transit provider will be brought up in both Cape Town and Johannesburg during September.
In closing, I want to assure you that optimal connectivity and the highest standards of uptime and support are core principles at Web sQuad. We are confident that the above upgrades will provide you with the best possible experience going forward, and we look forward to remaining your preferred service provider in the future!