MTN Business Gallo Manor datacenter outage - 03 May 2015

Deckert
Well-Known Member | Joined: Jan 13, 2004 | Messages: 425 | Reaction score: 38 | Location: Centurion, South Africa
Hi,

Just spoke to MTN Business support personnel who confirmed that the Gallo Manor datacenter is experiencing a power outage. Their backup generators did not kick in. Seems their DC staff are currently onsite trying to sort things out.

Power problems started at 18:02 when one of the power feeds went down, quickly followed by the secondary feed going down a few minutes later, which is when the whole DC went dark.

--deckert
 
Hi,

Power has been restored, but their core routing infrastructure is still down. They are working to resolve this last issue and then hopefully things will be back online.

Amateurish is not the right word. Unprepared comes to mind - poor planning and even worse execution of their mitigation plan (e.g. they only sent system engineers to site *after* power was restored).

--deckert
 
This is now the nth time this has happened. Really, MTN? Do you not test this stuff?
 
Hi,

Yes, they are really bad at giving information. Bit of a stupid fault of mine that both our primary and secondary DNS are hosted with a service provider that hosts inside Gallo Manor. Will move the secondary to Hetzner... so at least my stuff will stay up.
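For anyone wanting to check for the same mistake, here is a rough sketch of a nameserver diversity check. It assumes you already know the IP address of each nameserver. The hostnames and addresses below are hypothetical (documentation ranges), and sharing a /24 is only a crude stand-in for "same provider or site": servers in different networks can still sit in the same building.

```python
import ipaddress
from collections import defaultdict


def group_by_network(nameservers, prefix_len=24):
    """Group nameserver hostnames by the IPv4 network their address falls in."""
    groups = defaultdict(list)
    for host, addr in nameservers.items():
        # strict=False accepts a host address and returns its enclosing network.
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[str(net)].append(host)
    return dict(groups)


def shares_single_network(nameservers, prefix_len=24):
    """True when every nameserver lands in one network (a single point of failure)."""
    return len(group_by_network(nameservers, prefix_len)) <= 1


# Hypothetical examples, not real records.
same_site = {"ns1.example.net": "192.0.2.10", "ns2.example.net": "192.0.2.53"}
split_site = {"ns1.example.net": "192.0.2.10", "ns2.example.org": "198.51.100.53"}

for label, servers in (("same_site", same_site), ("split_site", split_site)):
    status = "AT RISK" if shares_single_network(servers) else "ok"
    print(label, status, group_by_network(servers))
```

A result of "AT RISK" only says the addresses are close together; confirming where a provider actually hosts still means asking them.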
 
Thank you for the regular updates, much appreciated.

I understand that sometimes downtime is inevitable, but continuity is quite important in this game. I'm beginning to consider migrating to AWS EC2...
 
Feedback from MTN Business is that some core router infrastructure has been restored, but they are now working on the hosting routers. The hosting routers distribute traffic to the individual server cabinets and these routers are still down. MTN has assured us that they are working on the issue at the highest levels.

Quoting MTN:

"Please note that the supervisor cards on hr11.jnb6.za and hr12.jnb6.za are faulty.
We have reseated the cards and rebooted the devices and they still remain offline.
Our Engineers are now working on bringing up hr16.jnb6.za."

--deckert
 
Great... I wonder how long this is going to take!
 
The total lack of communication from MTN is what gets to us. We have to extract this stuff (with the help of our account manager) with a hammer and chisel. And even then only the bare minimum is shared.

Still down at this stage. It's 5 hours of downtime and counting.
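To put 5 hours in perspective, here is a quick back-of-the-envelope calculation of what it does to a month's availability. The 99.9% target is purely a hypothetical figure for comparison; no SLA has been quoted in this thread.

```python
def availability_percent(downtime_hours, period_hours):
    """Availability over a period, as a percentage."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime_hours must be between 0 and period_hours")
    return 100.0 * (1 - downtime_hours / period_hours)


def allowed_downtime_hours(target_percent, period_hours):
    """Downtime budget that a given availability target leaves for the period."""
    return period_hours * (1 - target_percent / 100.0)


HOURS_IN_MAY = 31 * 24  # 744 hours

print(round(availability_percent(5, HOURS_IN_MAY), 2))        # 99.33
print(round(allowed_downtime_hours(99.9, HOURS_IN_MAY), 3))   # 0.744
```

So a single 5-hour outage uses up several times the monthly budget of a three-nines target, and the outage is not over yet.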

Feedback from MTN:

MTN Business has informed us that they do not have spare supervisor cards on hand at the same location and they are currently checking their other sites for stock of these devices. This just shows a total lack of planning and preparedness for this type of hardware failure.

--deckert
 
Lack of supervisor cards for what? To get into the datacenter itself?
 
Nah, these are like the main control cards in a core switch or router. They are just called supervisor cards and are not actually access cards to the premises, although they use those too :P
 