The FreeBSD Diary (TM) | Providing practical examples since 1998
Restoring an INOPERABLE 3Ware unit
12 February 2012
I've been using a 3Ware 9550SX-8LP since 2006. Over the weekend, I encountered the first problem with it. It became inoperable. That's an overstatement, though, and the problem was easily fixed.

After a reboot to upgrade the kernel, Nagios alerted me to a problem. I checked via the command line and found this situation:

    # tw_cli info

    Ctl   Model        (V)Ports  Drives   Units   NotOpt  RRate   VRate  BBU
    ------------------------------------------------------------------------
    c0    9550SX-8LP   8         8        3       1       4       1      OK

    # tw_cli info c0

    Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
    ------------------------------------------------------------------------------
    u0    RAID-10   OK             -       -       64K     195.548   ON     ON
    u1    SPARE     OK             -       -       -       69.2404   -      ON
    u2    RAID-10   INOPERABLE     -       -       64K     195.548   OFF    ON

    Port   Status           Unit   Size        Blocks        Serial
    ---------------------------------------------------------------
    p0     OK               u2     69.25 GB    145226112     WD-WMAKE2379003
    p1     OK               u1     69.25 GB    145226112     WD-WMAKE2379069
    p2     OK               u0     69.25 GB    145226112     WD-WMAKE2379066
    p3     OK               u0     69.25 GB    145226112     WD-WMAKE2379012
    p4     OK               u0     69.25 GB    145226112     WD-WMAKE2379286
    p5     OK               u0     69.25 GB    145226112     WD-WMAKE2379019
    p6     OK               u0     69.25 GB    145226112     WD-WMAKE2394339
    p7     OK               u0     69.25 GB    145226112     WD-WMAKE2378696

    Name  OnlineState  BBUReady  Status    Volt     Temp     Hours  LastCapTest
    ---------------------------------------------------------------------------
    bbu   On           Yes       OK        OK       OK       255    02-Sep-2010

Here, you can see that u2 has a problem. Looking at the port table, we can also see that u2 contains a single HDD, the one connected to port 0 (p0). That means it is one of the two spares that have existed in this array since I set it up. I will remove that unit and add it back into the array. See below.
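The check I did by eye above can be sketched as a small filter. This is only a rough sketch: `list_problem_units` is a name I made up, not part of tw_cli, and it assumes the unit table keeps the column layout shown above (unit name in the first column, Status in the third).

```shell
#!/bin/sh
# Sketch: list units whose Status column is not OK.
# Reads the text of `tw_cli info c0` on standard input.
# list_problem_units is a hypothetical helper name, not part of tw_cli.
list_problem_units() {
    # Unit rows start with "u<digits>"; the third column is Status.
    # Header rows, port rows (p0, p1, ...) and the bbu row are skipped.
    awk '$1 ~ /^u[0-9]+$/ && $3 != "OK" { print $1, $3 }'
}

# Example, on the host with the controller:
#   tw_cli info c0 | list_problem_units
```

Note that a unit that is rebuilding or verifying would also be listed, since the filter reports anything that is not plain OK.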
Fixing it
I found help via Google and used that as an example. I also posted to FreeBSD Forums before I proceeded. But today, before I received a reply, I went ahead...

First, I removed the defective u2 unit:

    # tw_cli maint deleteunit c0 u2
    Deleting unit c0/u2 ...Done.

    # tw_cli info

    Ctl   Model        (V)Ports  Drives   Units   NotOpt  RRate   VRate  BBU
    ------------------------------------------------------------------------
    c0    9550SX-8LP   8         8        2       0       4       1      OK

    # tw_cli info c0

    Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
    ------------------------------------------------------------------------------
    u0    RAID-10   OK             -       -       64K     195.548   ON     ON
    u1    SPARE     OK             -       -       -       69.2404   -      ON

    Port   Status           Unit   Size        Blocks        Serial
    ---------------------------------------------------------------
    p0     OK               -      69.25 GB    145226112     WD-WMAKE2379003
    p1     OK               u1     69.25 GB    145226112     WD-WMAKE2379069
    p2     OK               u0     69.25 GB    145226112     WD-WMAKE2379066
    p3     OK               u0     69.25 GB    145226112     WD-WMAKE2379012
    p4     OK               u0     69.25 GB    145226112     WD-WMAKE2379286
    p5     OK               u0     69.25 GB    145226112     WD-WMAKE2379019
    p6     OK               u0     69.25 GB    145226112     WD-WMAKE2394339
    p7     OK               u0     69.25 GB    145226112     WD-WMAKE2378696

    Name  OnlineState  BBUReady  Status    Volt     Temp     Hours  LastCapTest
    ---------------------------------------------------------------------------
    bbu   On           Yes       OK        OK       OK       255    02-Sep-2010

This has removed the unit from the array. Now I add it back into the array. I knew the drive was on p0 because it was listed as such in the above output.

    # tw_cli maint createunit c0 p0 rspare
    Creating new unit on controller /c0 ... Done. The new unit is /c0/u2.
    WARNING: This Spare unit may replace failed drive of same interface type only.
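Working out which port belongs to the unit, and then typing the two maint commands, is the step where a typo would hurt the most. Here is a rough sketch that only prints the commands so they can be checked before anything runs. The helper names `ports_for_unit` and `plan_respare` are my own invention, the sketch assumes the port table layout shown above (Unit in the third column), and the printed commands simply mirror the two I ran by hand. It refuses to proceed unless the unit sits on exactly one port, because deleting a multi-disk unit is a different matter entirely.

```shell
#!/bin/sh
# ports_for_unit and plan_respare are hypothetical helper names,
# not part of tw_cli. Both read `tw_cli info c0` output on stdin.

# Print the ports (p0, p1, ...) that the port table assigns to unit $1.
ports_for_unit() {
    awk -v unit="$1" '$1 ~ /^p[0-9]+$/ && $3 == unit { print $1 }'
}

# Print, without executing, the delete and recreate commands.
# $1 = controller (e.g. c0), $2 = unit (e.g. u2).
plan_respare() {
    ctl=$1
    unit=$2
    ports=$(ports_for_unit "$unit")
    # Word-split the port list into positional parameters to count it.
    set -- $ports
    if [ "$#" -ne 1 ]; then
        echo "expected exactly one port for $unit, found $#" >&2
        return 1
    fi
    echo "tw_cli maint deleteunit $ctl $unit"
    echo "tw_cli maint createunit $ctl $1 rspare"
}

# Example, on the host with the controller:
#   tw_cli info c0 | plan_respare c0 u2
```

Printing rather than executing is deliberate: deleteunit is destructive, so the commands should be read by a human first.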
After recreating the spare, the status looks like this:

    # tw_cli info c0

    Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
    ------------------------------------------------------------------------------
    u0    RAID-10   OK             -       -       64K     195.548   ON     ON
    u1    SPARE     OK             -       -       -       69.2404   -      ON
    u2    SPARE     OK             -       -       -       69.2404   -      OFF

    Port   Status           Unit   Size        Blocks        Serial
    ---------------------------------------------------------------
    p0     OK               u2     69.25 GB    145226112     WD-WMAKE2379003
    p1     OK               u1     69.25 GB    145226112     WD-WMAKE2379069
    p2     OK               u0     69.25 GB    145226112     WD-WMAKE2379066
    p3     OK               u0     69.25 GB    145226112     WD-WMAKE2379012
    p4     OK               u0     69.25 GB    145226112     WD-WMAKE2379286
    p5     OK               u0     69.25 GB    145226112     WD-WMAKE2379019
    p6     OK               u0     69.25 GB    145226112     WD-WMAKE2394339
    p7     OK               u0     69.25 GB    145226112     WD-WMAKE2378696

    Name  OnlineState  BBUReady  Status    Volt     Temp     Hours  LastCapTest
    ---------------------------------------------------------------------------
    bbu   On           Yes       OK        OK       OK       255    02-Sep-2010

The next step is to verify that new unit, this time from the interactive tw_cli prompt:

    # tw_cli
    //supernews> maint verify c0 u2
    Sending start verify message to /c0/u2 ... Done.

    //supernews>

Now that verify has started, you can see that in the output of info:

    # tw_cli info c0

    Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
    ------------------------------------------------------------------------------
    u0    RAID-10   OK             -       -       64K     195.548   ON     ON
    u1    SPARE     OK             -       -       -       69.2404   -      ON
    u2    SPARE     VERIFYING      -       23%     -       69.2404   -      OFF

    Port   Status           Unit   Size        Blocks        Serial
    ---------------------------------------------------------------
    p0     VERIFYING        u2     69.25 GB    145226112     WD-WMAKE2379003
    p1     OK               u1     69.25 GB    145226112     WD-WMAKE2379069
    p2     OK               u0     69.25 GB    145226112     WD-WMAKE2379066
    p3     OK               u0     69.25 GB    145226112     WD-WMAKE2379012
    p4     OK               u0     69.25 GB    145226112     WD-WMAKE2379286
    p5     OK               u0     69.25 GB    145226112     WD-WMAKE2379019
    p6     OK               u0     69.25 GB    145226112     WD-WMAKE2394339
    p7     OK               u0     69.25 GB    145226112     WD-WMAKE2378696

    Name  OnlineState  BBUReady  Status    Volt     Temp     Hours  LastCapTest
    ---------------------------------------------------------------------------
    bbu   On           Yes       OK        OK       OK       255    02-Sep-2010

This was all much easier than I thought it was going to be...
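If you would rather not read the whole table each time you check on the verify, the progress figure can be pulled out with a one-line filter. Again, this is just a sketch: `verify_progress` is a made-up helper name, and it assumes the %V/I/M value is the fifth column of the unit row, as in the output above.

```shell
#!/bin/sh
# verify_progress is a hypothetical helper name, not part of tw_cli.
# Print the %V/I/M value (fifth column) for unit $1 while its Status
# is VERIFYING. Reads `tw_cli info c0` output on standard input and
# prints nothing if the unit is absent or not verifying.
verify_progress() {
    awk -v unit="$1" '$1 == unit && $3 == "VERIFYING" { print $5 }'
}

# Example, on the host with the controller:
#   tw_cli info c0 | verify_progress u2
```

An empty result means the unit is no longer in the VERIFYING state, at which point it is worth looking at the full `tw_cli info c0` output to confirm the status is OK.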