Replacing EoL Seagate 2TB HDD in RAID 5 array

Pada · Mar 24, 2014

Hi guys,

I'm looking for advice on replacing a faulty (and End-of-Life) Seagate Barracuda Green 2TB (ST2000DL003), which was part of my RAID 5 (4x 2TB) setup. Luckily I had another identical 2TB HDD as a portable drive, but it only has a few bad sectors. At least my drive that broke ran for just over 2 years in my HP Microserver N36L.

Now can I simply replace the faulty drive with another brand & model 2TB drive or is that ill advised?

I don't really care about performance, but I do care about my data, which is why I went for a RAID setup. I just wish that I had gone for a RAID 1 (mirror) setup instead.

My HP Microserver has 8GB of RAM, so I am caching data, which I suppose should compensate somewhat for drive speed differences?

DrJohnZoidberg · Mar 24, 2014

Pada said:
Hi guys,

I'm looking for advice on replacing a faulty (and End-of-Life) Seagate Barracuda Green 2TB (ST2000DL003), which was part of my RAID 5 (4x 2TB) setup. Luckily I had another identical 2TB HDD as a portable drive, but it only has a few bad sectors. At least my drive that broke ran for just over 2 years in my HP Microserver N36L.

Now can I simply replace the faulty drive with another brand & model 2TB drive or is that ill advised?

I don't really care about performance, but I do care about my data, which is why I went for a RAID setup. I just wish that I had gone for a RAID 1 (mirror) setup instead.
My HP Microserver has 8GB of RAM, so I am caching data, which I suppose should compensate somewhat for drive speed differences?

If it's just software RAID, which it is from the sounds of it, then you can replace it with any 2TB drive. It doesn't matter what brand it is.

I have to do the same, getting errors on one of my disks too.

EDIT: Also, RAID5 is almost the worst setup to go for if you're looking for best redundancy/stability.

Pada · Mar 24, 2014

Thanks DrJohnZoidberg!

I am doing software RAID-5 in Ubuntu on my HP Microserver N36L.

I realised how terrible software RAID-5 is on my HP Microserver after I set everything up and then did MySQL stuff! At least after some tweaking I got my performance drastically better.

Initially I thought I would need lots of disk space and I didn't want to fork out tonnes on hard drives, so I opted for some redundancy and maximum disk space. Of course in hindsight I would've opted for a RAID 0+1 setup instead, just because I don't need 6TB of space.
... if only I had known hard drive prices would increase, I would've bought at least 1 or 2 more HDDs back then, when I paid R850 per 2TB.
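
The capacity trade-off mentioned above can be checked with simple arithmetic. This is only a minimal sketch, assuming four equal 2TB drives; the function names are made up for illustration.

```python
def raid5_usable_tb(drives: int, size_tb: float) -> float:
    """RAID 5 spends one drive's worth of space on parity."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (drives - 1) * size_tb


def mirrored_usable_tb(drives: int, size_tb: float) -> float:
    """RAID 0+1 mirrors everything, so half the raw space is usable."""
    if drives < 4 or drives % 2:
        raise ValueError("expects an even number of drives, at least 4")
    return drives / 2 * size_tb


print(raid5_usable_tb(4, 2.0))     # 6.0
print(mirrored_usable_tb(4, 2.0))  # 4.0
```

So the same four drives give 6TB in RAID 5 and 4TB in RAID 0+1.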

Pada · Mar 24, 2014

I see now in HD Tune that it's only the "Reallocated Sector Count" S.M.A.R.T. reading that is very high: 44288

As far as I understand that is an indication that the drive may be about to fail, so I suppose I can still use the drive as a portable drive for non-critical data.

DrJohnZoidberg · Mar 24, 2014

Pada said:
I see now in HD Tune that it's only the "Reallocated Sector Count" S.M.A.R.T. reading that is very high: 44288

As far as I understand that is an indication that the drive may be about to fail, so I suppose I can still use the drive as a portable drive for non-critical data.

That's exactly what happened with one of our disks at the office. It eventually ran out of reallocation space and caused some headaches.

What OS are you using? If you're running Linux, check out the system logs (e.g. /var/log/messages).

Pada · Mar 24, 2014

I'm using Ubuntu, so I found those kinds of things in /var/log/syslog.1:

Code:

Mar 23 22:09:52 hpserver kernel: [415382.702152] ata2.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x0
Mar 23 22:09:52 hpserver kernel: [415382.702164] ata2.00: irq_stat 0x40000008
Mar 23 22:09:52 hpserver kernel: [415382.702173] ata2.00: failed command: READ FPDMA QUEUED
Mar 23 22:09:52 hpserver kernel: [415382.702190] ata2.00: cmd 60/00:00:b4:18:2c/04:00:00:00:00/40 tag 0 ncq 524288 in
Mar 23 22:09:52 hpserver kernel: [415382.702193]          res 41/40:00:a0:1b:2c/00:04:00:00:00/00 Emask 0x409 (media error) <F>
Mar 23 22:09:52 hpserver kernel: [415382.702202] ata2.00: status: { DRDY ERR }
Mar 23 22:09:52 hpserver kernel: [415382.702207] ata2.00: error: { UNC }
Mar 23 22:09:52 hpserver kernel: [415382.734513] ata2.00: configured for UDMA/133
Mar 23 22:09:52 hpserver kernel: [415382.734580] ata2: EH complete
Mar 23 22:09:53 hpserver kernel: [415382.843485] ata2.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x0
Mar 23 22:09:53 hpserver kernel: [415382.843498] ata2.00: irq_stat 0x40000008
Mar 23 22:09:53 hpserver kernel: [415382.843509] ata2.00: failed command: READ FPDMA QUEUED
Mar 23 22:09:53 hpserver kernel: [415382.843525] ata2.00: cmd 60/00:f0:b4:18:2c/04:00:00:00:00/40 tag 30 ncq 524288 in
Mar 23 22:09:53 hpserver kernel: [415382.843529]          res 41/40:00:a0:1b:2c/00:04:00:00:00/00 Emask 0x409 (media error) <F>
Mar 23 22:09:53 hpserver kernel: [415382.843538] ata2.00: status: { DRDY ERR }
Mar 23 22:09:53 hpserver kernel: [415382.843543] ata2.00: error: { UNC }
Mar 23 22:09:53 hpserver kernel: [415382.876978] ata2.00: configured for UDMA/133
Mar 23 22:09:53 hpserver kernel: [415382.877059] ata2: EH complete
Mar 23 22:09:53 hpserver kernel: [415382.984907] ata2.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x0
Mar 23 22:09:53 hpserver kernel: [415382.984919] ata2.00: irq_stat 0x40000008
Mar 23 22:09:53 hpserver kernel: [415382.984929] ata2.00: failed command: READ FPDMA QUEUED
Mar 23 22:09:53 hpserver kernel: [415382.984945] ata2.00: cmd 60/00:00:b4:18:2c/04:00:00:00:00/40 tag 0 ncq 524288 in
Mar 23 22:09:53 hpserver kernel: [415382.984949]          res 41/40:00:a0:1b:2c/00:04:00:00:00/00 Emask 0x409 (media error) <F>
Mar 23 22:09:53 hpserver kernel: [415382.984958] ata2.00: status: { DRDY ERR }
Mar 23 22:09:53 hpserver kernel: [415382.984963] ata2.00: error: { UNC }
Mar 23 22:09:53 hpserver kernel: [415383.017456] ata2.00: configured for UDMA/133
Mar 23 22:09:53 hpserver kernel: [415383.017540] ata2: EH complete
Mar 23 22:09:53 hpserver kernel: [415383.126214] ata2.00: exception Emask 0x0 SAct 0x7ffffffe SErr 0x0 action 0x0
Mar 23 22:09:53 hpserver kernel: [415383.126223] ata2.00: irq_stat 0x40000008
Mar 23 22:09:53 hpserver kernel: [415383.126230] ata2.00: failed command: READ FPDMA QUEUED
Mar 23 22:09:53 hpserver kernel: [415383.126240] ata2.00: cmd 60/00:f0:b4:18:2c/04:00:00:00:00/40 tag 30 ncq 524288 in
Mar 23 22:09:53 hpserver kernel: [415383.126243]          res 41/40:00:a0:1b:2c/00:04:00:00:00/00 Emask 0x409 (media error) <F>
Mar 23 22:09:53 hpserver kernel: [415383.126248] ata2.00: status: { DRDY ERR }
Mar 23 22:09:53 hpserver kernel: [415383.126252] ata2.00: error: { UNC }
Mar 23 22:09:53 hpserver kernel: [415383.160032] ata2.00: configured for UDMA/133
Mar 23 22:09:53 hpserver kernel: [415383.160113] ata2: EH complete
Mar 23 22:09:53 hpserver kernel: [415383.267679] ata2.00: exception Emask 0x0 SAct 0x3fffffff SErr 0x0 action 0x0
Mar 23 22:09:53 hpserver kernel: [415383.267691] ata2.00: irq_stat 0x40000008
Mar 23 22:09:53 hpserver kernel: [415383.267701] ata2.00: failed command: READ FPDMA QUEUED
Mar 23 22:09:53 hpserver kernel: [415383.267717] ata2.00: cmd 60/00:00:b4:18:2c/04:00:00:00:00/40 tag 0 ncq 524288 in
Mar 23 22:09:53 hpserver kernel: [415383.267721]          res 41/40:00:a0:1b:2c/00:04:00:00:00/00 Emask 0x409 (media error) <F>
Mar 23 22:09:53 hpserver kernel: [415383.267729] ata2.00: status: { DRDY ERR }
Mar 23 22:09:53 hpserver kernel: [415383.267735] ata2.00: error: { UNC }
Mar 23 22:09:53 hpserver kernel: [415383.300115] ata2.00: configured for UDMA/133
Mar 23 22:09:53 hpserver kernel: [415383.300174] ata2: EH complete
Mar 23 22:09:53 hpserver kernel: [415383.419220] ata2.00: exception Emask 0x0 SAct 0x3ffffffe SErr 0x0 action 0x0
Mar 23 22:09:53 hpserver kernel: [415383.419232] ata2.00: irq_stat 0x40000008
Mar 23 22:09:53 hpserver kernel: [415383.419244] ata2.00: failed command: READ FPDMA QUEUED
Mar 23 22:09:53 hpserver kernel: [415383.419261] ata2.00: cmd 60/00:e8:b4:18:2c/04:00:00:00:00/40 tag 29 ncq 524288 in
Mar 23 22:09:53 hpserver kernel: [415383.419264]          res 41/40:00:a0:1b:2c/00:04:00:00:00/00 Emask 0x409 (media error) <F>
Mar 23 22:09:53 hpserver kernel: [415383.419296] ata2.00: status: { DRDY ERR }
Mar 23 22:09:53 hpserver kernel: [415383.419302] ata2.00: error: { UNC }
Mar 23 22:09:53 hpserver kernel: [415383.451759] ata2.00: configured for UDMA/133
Mar 23 22:09:53 hpserver kernel: [415383.451964] sd 1:0:0:0: [sdb] Unhandled sense code
Mar 23 22:09:53 hpserver kernel: [415383.451970] sd 1:0:0:0: [sdb]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Mar 23 22:09:53 hpserver kernel: [415383.451980] sd 1:0:0:0: [sdb]  Sense Key : Medium Error [current] [descriptor]
Mar 23 22:09:53 hpserver kernel: [415383.451991] Descriptor sense data with sense descriptors (in hex):
Mar 23 22:09:53 hpserver kernel: [415383.451996]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Mar 23 22:09:53 hpserver kernel: [415383.452063]         00 2c 1b a0
Mar 23 22:09:53 hpserver kernel: [415383.452072] sd 1:0:0:0: [sdb]  Add. Sense: Unrecovered read error - auto reallocate failed
Mar 23 22:09:53 hpserver kernel: [415383.452083] sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 2c 18 b4 00 04 00 00
Mar 23 22:09:53 hpserver kernel: [415383.452103] end_request: I/O error, dev sdb, sector 2890656
Mar 23 22:09:53 hpserver kernel: [415383.452189] ata2: EH complete
Mar 23 22:09:53 hpserver kernel: [415383.766571] md/raid:md1: read error corrected (8 sectors at 2888664 on sdb2)
Mar 23 22:09:53 hpserver kernel: [415383.766590] md/raid:md1: read error corrected (8 sectors at 2888672 on sdb2)
Mar 23 22:09:53 hpserver kernel: [415383.766595] md/raid:md1: read error corrected (8 sectors at 2888680 on sdb2)
Mar 23 22:09:53 hpserver kernel: [415383.766600] md/raid:md1: read error corrected (8 sectors at 2888688 on sdb2)
Mar 23 22:09:53 hpserver kernel: [415383.766604] md/raid:md1: read error corrected (8 sectors at 2888696 on sdb2)
Mar 23 22:09:53 hpserver kernel: [415383.766609] md/raid:md1: read error corrected (8 sectors at 2888704 on sdb2)
Mar 23 22:09:53 hpserver kernel: [415383.766613] md/raid:md1: read error corrected (8 sectors at 2888712 on sdb2)
Mar 23 22:09:53 hpserver kernel: [415383.766617] md/raid:md1: read error corrected (8 sectors at 2888720 on sdb2)
Mar 23 22:09:53 hpserver kernel: [415383.766622] md/raid:md1: read error corrected (8 sectors at 2888728 on sdb2)
Mar 23 22:09:53 hpserver kernel: [415383.766626] md/raid:md1: read error corrected (8 sectors at 2888736 on sdb2)
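
To get a quick total of how much md had to repair, the "read error corrected" lines can be counted per array member. The following is a rough sketch, assuming the log lines look exactly like the ones above; the regular expression and function name are illustrative, not part of any tool.

```python
import re
from collections import Counter

# Matches lines such as:
# md/raid:md1: read error corrected (8 sectors at 2888664 on sdb2)
CORRECTED = re.compile(
    r"md/raid:(?P<array>\w+): read error corrected "
    r"\((?P<count>\d+) sectors at (?P<sector>\d+) on (?P<member>\w+)\)"
)


def corrected_sectors(lines):
    """Return the total number of corrected sectors per (array, member)."""
    totals = Counter()
    for line in lines:
        m = CORRECTED.search(line)
        if m:
            totals[(m["array"], m["member"])] += int(m["count"])
    return totals


sample = [
    "Mar 23 22:09:53 hpserver kernel: [415383.766571] md/raid:md1: read error corrected (8 sectors at 2888664 on sdb2)",
    "Mar 23 22:09:53 hpserver kernel: [415383.766590] md/raid:md1: read error corrected (8 sectors at 2888672 on sdb2)",
    "Mar 23 22:09:53 hpserver kernel: [415383.452189] ata2: EH complete",
]
for (array, member), total in corrected_sectors(sample).items():
    print(f"{array}/{member}: {total} sectors corrected")
```

To run it over the real log, pass an open file such as open("/var/log/syslog.1") instead of the sample list.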

Tinuva · Mar 24, 2014

Pada, you can swap the drive for one of the same size, but double-check that its size in bytes is exactly the same or slightly bigger. If it is even slightly smaller, it won't work. Different brands, or different lines of the same brand, don't always have exactly the same byte size.

That said, your drive is on its way out. You should get it replaced a.s.a.p., or at least get the critical data backed up.
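
The size check can be sketched as below. It assumes a Linux system, where /sys/block/<name>/size reports the device size in 512-byte units. The byte counts in the example are hypothetical, and when the array is built on partitions (such as sdb2) it is the partition size that has to fit, not the whole disk.

```python
from pathlib import Path


def device_bytes(name: str) -> int:
    """Size in bytes of a block device such as 'sdb', read from Linux sysfs."""
    return int(Path(f"/sys/block/{name}/size").read_text()) * 512


def can_replace(old_bytes: int, new_bytes: int) -> bool:
    """A replacement must be at least as large as the member it replaces."""
    return new_bytes >= old_bytes


# Hypothetical sizes for "2TB" drives from different product lines.
old = 2_000_398_934_016
print(can_replace(old, 2_000_398_934_016))  # True: identical size
print(can_replace(old, 2_000_397_852_160))  # False: slightly smaller
print(can_replace(old, 2_000_400_000_000))  # True: slightly bigger
```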

The_Unbeliever · Mar 24, 2014

Same sized drive or bigger one.

But before you add the new drive, try to backup most of your data to a single, external HDD.

Reason being if the rebuild fails, you'll have a backup.

And preferably replace the failed drive with a brand new one; otherwise the rebuild may fail.

Rickster · Mar 24, 2014

Pada said:
Hi guys,

I'm looking for advice on replacing a faulty (and End-of-Life) Seagate Barracuda Green 2TB (ST2000DL003), which was part of my RAID 5 (4x 2TB) setup. Luckily I had another identical 2TB HDD as a portable drive, but it only has a few bad sectors. At least my drive that broke ran for just over 2 years in my HP Microserver N36L.

Now can I simply replace the faulty drive with another brand & model 2TB drive or is that ill advised?

I don't really care about performance, but I do care about my data, which is why I went for a RAID setup. I just wish that I had gone for a RAID 1 (mirror) setup instead.
My HP Microserver has 8GB of RAM, so I am caching data, which I suppose should compensate somewhat for drive speed differences?

How many hours did you get out of your 2TB? My ST2000DL003 is at 20034 hours :scared:

DrJohnZoidberg · Mar 24, 2014

Rickster said:
How many hours did you get out of your 2TB? My ST2000DL003 is at 20034 hours :scared:

My failing 2TB Samsung drive is sitting at 27061 power on hours. I should really replace it this week.

Rickster · Mar 24, 2014

DrJohnZoidberg said:
My failing 2TB Samsung drive is sitting at 27061 power on hours. I should really replace it this week.

Was it dropped?
Did you take care of it?

SouthBit · Mar 24, 2014

Good advice here about backing up before you rebuild the array. We've lost count of the RAID recovery jobs we've received because the client replaced a failed drive with another dodgy drive and the rebuild failed.

DrJohnZoidberg · Mar 24, 2014

Rickster said:
Was it dropped?
Did you take care of it?

It's been inside my NAS box since I bought it. I have another identical drive in the array which is still fine, though, so it was just that drive's time.

DrJohnZoidberg · Mar 24, 2014

SouthBit said:
Good advice here about backing up before you rebuild the array. We've lost count of the RAID recovery jobs we've received because the client replaced a failed drive with another dodgy drive and the rebuild failed.

I feel so much more at ease now that I've backed up 5TB of my stuff to tape. I was always on edge when the power went out or there was a kernel panic or something. Now at least I know that if the RAID dies I can easily restore my files.

Rickster · Mar 24, 2014

DrJohnZoidberg said:
It's been inside my NAS box since I bought it. I have another identical drive in the array which is still fine, though, so it was just that drive's time.

What's the power on count?

DrJohnZoidberg · Mar 24, 2014

Rickster said:
What's the power on count?

DrJohnZoidberg said:
My failing 2TB Samsung drive is sitting at 27061 power on hours. I should really replace it this week.

Around 3 years, that's the raw value but sounds about right if it's in hours.
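
As a quick sanity check on that estimate, the hour counts quoted in this thread can be converted to years. This is a small sketch that assumes the raw value really is in hours and ignores leap days.

```python
HOURS_PER_YEAR = 24 * 365


def power_on_years(hours: int) -> float:
    """Convert a Power On Hours raw value to years of continuous running."""
    return hours / HOURS_PER_YEAR


for label, hours in [("Samsung 2TB", 27061), ("ST2000DL003", 20034)]:
    print(f"{label}: {hours} h = {power_on_years(hours):.2f} years")
# Samsung 2TB: 27061 h = 3.09 years
# ST2000DL003: 20034 h = 2.29 years
```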

Pada · Mar 24, 2014

Thanks for all the advice! It's much appreciated!

I've already backed up what needed to be backed up, and after almost 1 day of recovery it's at 50%.

I reckon that I'll replace the 2TB drive with the Western Digital Red 2TB: http://www.takealot.com/computers/wd-red-2tb-sata-6-gb-s-nas-drive,29924902
... or do you guys perhaps know of a better (and semi-affordable) 2TB+ drive for long term use?

Rickster · Mar 24, 2014

DrJohnZoidberg said:
Around 3 years, that's the raw value but sounds about right if it's in hours.

Power on count (how many times it's been spun up), not power on hours.

DrJohnZoidberg · Mar 24, 2014

Rickster said:
Power on count (how many times it's been spun up), not power on hours.

Ah, sorry.

Here are the values:

Code:

Start Stop Count 	260
Reallocated Sector Ct 	0
Seek Error Rate 	0
Seek Time Performance 	0
Power On Hours 	27067
Spin Retry Count 	0
Calibration Retry Count 	0
Power Cycle Count 	295
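
A table like this can be checked mechanically for the counters that usually point to a dying disk. The sketch below is only an illustration: the list of attributes to watch and the "anything above zero" rule are assumptions, not an official S.M.A.R.T. threshold.

```python
# Attributes where a non-zero raw value is treated as a warning (assumed list).
WATCH = ("Reallocated Sector Ct", "Spin Retry Count", "Calibration Retry Count")

TABLE = """
Start Stop Count        260
Reallocated Sector Ct   0
Seek Error Rate         0
Seek Time Performance   0
Power On Hours          27067
Spin Retry Count        0
Calibration Retry Count 0
Power Cycle Count       295
"""


def parse_smart(text: str) -> dict:
    """Turn 'Attribute Name <whitespace> value' lines into a name -> int dict."""
    values = {}
    for line in text.strip().splitlines():
        name, raw = line.rsplit(None, 1)
        values[name.strip()] = int(raw)
    return values


def warnings(values: dict) -> dict:
    """Return the watched attributes whose raw value is above zero."""
    return {k: v for k, v in values.items() if k in WATCH and v > 0}


values = parse_smart(TABLE)
print(warnings(values))                             # {}
print(warnings({"Reallocated Sector Ct": 44288}))   # {'Reallocated Sector Ct': 44288}
```

The values above raise no warnings, while a reallocated count of 44288, as reported earlier in the thread, would be flagged.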

Rickster · Mar 24, 2014

DrJohnZoidberg said:

Ah, sorry.

Here are the values:

Code:

Start Stop Count 	260
Reallocated Sector Ct 	0
Seek Error Rate 	0
Seek Time Performance 	0
Power On Hours 	27067
Spin Retry Count 	0
Calibration Retry Count 	0
Power Cycle Count 	295

Not bad, I don't see why it failed.

EDIT: Will Eskom's load shedding degrade my hard drive(s)? The unclean shutdowns really worry me.
