Admins' worst FEARS
Hi Guys,
An admin can be the most skilled person in the world and have the network and servers running as crisp as a waterfall ... but WITHOUT properly functioning aircon units you are STUFFED.
Hence my dilemma… I have two 30 000 BTU Daikin units, and one of them has been giving us issues for the past few weeks. More money has been spent trying to fix the damn thing than it would cost to replace it.
The main issue I am having is that these units aren't server-room-class aircons, but more office(ish) units.
If you are happy and confident with the cooling solution in your server room, can you guys tell me what makes & models you are using? Or do you have suggestions as to what I can use? I have about R100k MAX that I can spend.
Regards,
Akkie
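For anyone sizing replacement units, here is a rough sketch of the capacity arithmetic. It assumes the "30 000 BTU" rating means BTU per hour, which is how aircon capacity is normally quoted. The function names and the 6 kW heat load in the usage note are made-up placeholders, not figures from this thread.

```python
# 1 BTU/h is roughly 0.29307107 W (standard conversion factor).
WATTS_PER_BTU_H = 0.29307107


def btu_h_to_kw(btu_per_hour):
    """Convert a cooling rating in BTU/h to kilowatts."""
    return btu_per_hour * WATTS_PER_BTU_H / 1000.0


def cooling_headroom_kw(unit_ratings_btu_h, heat_load_kw, failed_units=0):
    """Spare cooling capacity (kW) if the largest `failed_units` units fail.

    A negative result means the surviving units cannot carry the load.
    `heat_load_kw` has to come from your own measurements or nameplates.
    """
    keep = max(0, len(unit_ratings_btu_h) - failed_units)
    surviving = sorted(unit_ratings_btu_h)[:keep]  # drops the largest units first
    return sum(btu_h_to_kw(r) for r in surviving) - heat_load_kw
```

One 30 000 BTU/h unit works out to about 8.79 kW of cooling. Calling cooling_headroom_kw([30000, 30000], 6.0, failed_units=1) with a hypothetical 6 kW load tells you whether a single surviving unit could cope on its own.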
Rhino
30-12-2011, 04:11 PM
We use Rittal liquid-cooled racks. They are awesome and save us money big time, BUT the initial outlay is very, very expensive...
Messugga
30-12-2011, 04:26 PM
For the love of all that is holy, just make sure you can actually grab the unit's status by some or other means of communication, be it SNMP or MODBUS.
I work on SCADA systems and we recently worked on a system where the designer forgot about that little requirement for the rather large data centre. The result? A month or so ago, two of the units failed and nobody noticed for quite some time. Bit of an issue.
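A minimal sketch of the "nobody noticed" problem, assuming the readings have already been collected by some SNMP or Modbus poller (the polling itself is not shown). The UnitReading type, the unit names and the 300-second staleness limit are all hypothetical. The idea is that a unit that goes silent is treated as a problem, not just a unit that reports a fault.

```python
import time
from dataclasses import dataclass


@dataclass
class UnitReading:
    name: str         # hypothetical unit identifier, e.g. "ac1"
    running: bool     # status as reported by the unit
    timestamp: float  # seconds since the epoch when the reading was taken


def find_problem_units(readings, expected_units, now=None, max_age_s=300):
    """Return {unit_name: reason} for units that are faulted, stale or silent."""
    now = time.time() if now is None else now

    # Keep only the newest reading for each unit.
    latest = {}
    for r in readings:
        if r.name not in latest or r.timestamp > latest[r.name].timestamp:
            latest[r.name] = r

    problems = {}
    for name in expected_units:
        r = latest.get(name)
        if r is None:
            problems[name] = "no reading received"
        elif now - r.timestamp > max_age_s:
            problems[name] = "last reading is stale"
        elif not r.running:
            problems[name] = "unit reports it is not running"
    return problems
```

Checking against a list of expected units matters here: a unit that was never wired into the monitoring shows up as "no reading received" instead of silently being absent.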
The_Librarian
30-12-2011, 04:39 PM
Smoothwall + performance graphs - on supported hardware it allows me to keep a daily/weekly/monthly/yearly graph of the server room temperature.
http://dl.dropbox.com/u/1188550/hddtemp.jpg
http://dl.dropbox.com/u/1188550/hddtemp2.jpg
Great tool to check remotely (manually, of course) for temperature graphs, and also great to see what the temperature was doing the past 24hrs/week/month etc.
We also had an issue with our air conditioner recently: it iced up. I had to run it at 26 deg to prevent it from icing up until the aircon techies could check it out, and it needed a regas.
That issue prompted me to install the performance graphs MOD on Smoothwall. It is a great help in determining what the temperature in your server room is doing.
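Graphs only help if somebody looks at them, so the same logged readings could also drive a simple alert. This is a sketch only: it does not use any Smoothwall API, and the 28 degree limit and the three-in-a-row rule are placeholder assumptions to tune for your own room.

```python
def should_alert(temps_c, limit_c=28.0, consecutive=3):
    """True if the most recent `consecutive` readings are all above limit_c.

    Requiring several readings in a row avoids alerting on one odd sample.
    """
    if consecutive <= 0 or len(temps_c) < consecutive:
        return False
    return all(t > limit_c for t in temps_c[-consecutive:])
```

Feed it the list of temperatures in the order they were logged, oldest first.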
AcidRaZor
30-12-2011, 05:03 PM
*monitoring* the heat is one thing, talking cooling units is another.
I'd definitely move my servers to a decent data center and connect remotely over 2 bonded 10 Mbps lines. But if all else fails, make sure the room you're hosting in is well ventilated. I once had to install a server in a vault on the basement/parking floor of an office building, where their normal PABXs are hosted. The room's temperature quickly shot up, and all we did was get some extractor fans and make sure it was ventilated enough.
The_Librarian
30-12-2011, 05:12 PM
On the one hand, if there's a fire and it's well-ventilated, it'll only make the fire spread quicker...
On the other hand, if it's not well-ventilated, and the aircon goes bonk, then you're in a bit of a bothersome spot...
MEH
I'd prefer the non-ventilated option...
ponder
30-12-2011, 09:39 PM
http://www.aiac.co.za/
bubbatentoe
02-01-2012, 07:50 AM
I've used MANY brands for server room cooling and they all fail at some point (if not maintained as per the service interval).
I've lately settled on Samsung units purely because we get maintenance agreements thrown in.
Ducting is also VERY important.
The primary AC cools the server room, but ducting from the secondary unit can be opened (to the server room) if unit 1 fails.
A server room (20-odd servers) goes from 18 degrees to 45 degrees in about 5 minutes.
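To put that figure in perspective, here is a quick back-of-the-envelope sketch of how much reaction time it leaves. It assumes the temperature climbs in a straight line, which is a simplification, and the 35 degree limit in the usage note is a hypothetical threshold, not something from this thread.

```python
def rise_rate_c_per_min(start_c, end_c, minutes):
    """Average rate of temperature rise in degrees C per minute."""
    return (end_c - start_c) / minutes


def minutes_until(limit_c, current_c, rate_c_per_min):
    """Minutes left before limit_c is reached, assuming a linear rise."""
    if current_c >= limit_c:
        return 0.0
    if rate_c_per_min <= 0:
        return float("inf")  # not rising, so the limit is never reached
    return (limit_c - current_c) / rate_c_per_min
```

Going from 18 to 45 degrees in 5 minutes is an average of 5.4 degrees per minute. With a hypothetical limit of 35 degrees, minutes_until(35, 18, 5.4) comes to a little over 3 minutes, which is why automatic failover or automatic shutdown beats waiting for a human.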
Shot!!!
Thx guys for the feedback!! It is quite helpful. As bubba said... Yeah man, I know. I have about 31 HPs and it gets nasty in there very quickly :( I was switching servers off and on like flippen crazy on the public holiday while everyone was on holiday.
avert
10-01-2012, 09:43 AM
I manage a server room with about 300 servers. When one aircon fails, I go home. In short: secondary ducting, secondary aircons, secondary power.
Also get HP to come and do their environmental analysis stuff. You might not even need that second aircon, and they will definitely save you money.
hungrybeaver
10-01-2012, 10:00 AM
Well, one of my fears is being realised today. All of my company's branches (10 of them) connect to our Head Office through a WAN implementation to receive email and use software on our servers at HQ. The freaking WAN has gone down due to a cable fault in the area, and now none of the branches can connect or receive email. I hate having things break that are out of my control :mad: And when this happens, all the work that I had planned to do in the morning takes a back seat :(
Never thought I'd say this, but thank fsck for BB and BES! Like 80% of our employees have BBs on our BES so at least they can still answer emails.
The_Librarian
10-01-2012, 08:19 PM
I know that feeling. And everybody constantly asks you when the problem will be fixed.
Like we have magical staffs or some such that let us fix problems remotely :rolleyes:
hungrybeaver
06-03-2012, 09:46 AM
I sit right against the wall to our server room and I can hear the hum of the servers. Last week I walked into the office and the hum was a bit louder than normal. I had no alerts on my phone, so the servers were all still running, and I sort of told myself that it had always sounded that way (:o). I needed to have a look, so I opened the huge fire door to the server room and man, was it loud! Way louder than normal, and warmer too. Just about every server had its fans going full blast. The aircon was on but it was blowing much warmer than normal, which was obviously a huge shock to me. Turns out the aircon needed to be regassed, and luckily I managed to persuade the aircon company to come around that same day to fix the problem.
Man, it is not a good feeling coming in on a Monday morning and hearing the servers screaming like that! I've now set up an aircon maintenance check every 2 months.
thisgeek
06-03-2012, 12:17 PM
You need a Business Continuity Plan and a Disaster Recovery Plan for when the fecal matter connects with the fan.
The_Librarian
08-03-2012, 04:57 PM
Interesting one.
There is a 3-phase feed from the transformer into the building, and it is distributed all over the building in a seemingly random manner.
Transformer went cuckoo yesterday evening at around 18:00 (according to the temp graphs on my Smoothies). This meant that the feed (or phase) feeding the aircons went... dead.
This morning I was busy assisting with a software installation, and didn't have any time to check on the servers etc. But I noticed that all the building lights were off, and some lights were flickering strangely.
Then one set of lights completely browned out. They still glowed, but were very dim. :wtf:
Around that time something told me I needed to check my servers. NAO. Which I did, and I was sorely distressed to find that Servers No 3 and 5 were down. Went down to the server room, to find a nice sauna. :eek:
Powered down the last two remaining servers, then faffed around a bit, not knowing what to do.
One of the managers then suggested that we get an extension cable and lay it from a working plug to the server room. Luckily there were two working points close to the server room, and I ran an extension cable between one of them and the servers. All went well, powered 'em all up one by one, no damage. Whew.
Noticed that the cable was getting a bit warm to the touch :eek:
Laid another extension cable from the second point to the server room, and moved some of the servers over to that cable.
Now all is well - except for the aircon. A big desk fan is providing cooling for now.
While the iron was hot, I strongly punted the case for a generator, and it is on the table now.
froot
08-03-2012, 05:04 PM
Hahaha indeed.
A friend of mine was in charge of 200 large servers (not saying where).
Short story: aircons went down for some reason. The CPU temps went to 85+ and the boss said they couldn't shut down as the servers are of a critical nature (note, air temp went to 40-something as a result). :D
avert
11-03-2012, 08:54 PM
phone call at 6am this morning, generators failed ( god damn maintenance people do your job :||| )
This is what the failure caused:
- Lost power from both A and B DB boards in our network racks; we lost a 6500, a 4500, a Riverbed, a Broadlink router, 2 Diginet lines, Cisco ACS, Cisco WCS, our 2x 48-port Brocade switches, 1x 16-bay fully populated blade enclosure, 3x 2960s, 1x 1821 and a Cisco WAAS.
- Due to the Brocades going down, all servers lost their storage. Getting an old GFS cluster up again: fun, fun. The Brocades also ended up shutting their ports due to excessive errors.
- 1 of our 2 XP storage units decided it wasn't going to turn on again (about 60TB worth of data).
- 2 of 4 aircons lost power, with the third one having had fanbelt issues during the previous week.
- An IBM P595 decided to turn itself off :) (due to the multitude of failures, I guess).
Great Sunday, got everything back up by 4pm.
phone call at 6am this morning, generators failed ( god damn maintenance people do your job :||| )
He he. I see quite a few genset failures in the course of a month. ~90% of them fail due to insufficient maintenance! They need to be tested with a proper load on a bi-weekly basis, if not waaaay more often. The most common cause of failure is dirty diesel. Diesel does not like standing in a container/tank. It likes a half-full tank even less, because diesel fungi love that half-full tank. Dirty fuel leads to clogged injectors, leading to excessive heat, leading to a failing generator, leading to dead servers.
phone call at 6am this morning, generators failed ( god damn maintenance people do your job :||| )
Where was this?
The_Librarian
20-03-2012, 07:16 PM
We got a genny at last. Gonna draw up a maintenance plan.