The FreeBSD Diary (TM): Providing practical examples since 1998
CLI for 3Ware 9550SX-8LP
14 August 2006
This article shows you how I accessed my 3Ware 9550SX-8LP controller from a command line interface (CLI). You can perform RAID maintenance from within the 3Ware BIOS, but the CLI allows you to do this while the system is up and running. This is useful for monitoring the system. I also plan to create a NetSaint plug-in for this RAID card, much like I did for another RAID product. Given that we have a CLI, creating a plug-in should be a rather simple procedure.
CLI? Why bother?
I took an interest in the CLI when I noticed this in /var/log/messages:

    Aug 11 13:40:40 opti kernel: twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9A3F9E

This was worrying. Sector repair? That can't be good. That's when I decided to run with the CLI to find out what I could about the RAID array.

There is a FreeBSD port for the 3Ware CLI. You want to install sysutils/tw_cli. I found it worked right out of the box, with no configuration required. You can run it as an interactive shell (just type tw_cli and press ENTER) or you can pass it commands as arguments.
CLI - the shell version
Here is what the shell version of the CLI looks like:

    # tw_cli
    //opti> help

    Copyright(c) 2004, 2005 Applied Micro Circuits Corporation(AMCC). All rights reserved.

    AMCC/3ware CLI (version 2.00.03.013)

    Commands  Description
    -------------------------------------------------------------------
    info      Displays information about controller(s), unit(s) and port(s).
    maint     Performs maintenance operations on controller(s), unit(s) and ports.
    alarms    Displays current AENs.
    set       Displays or modifies controller and unit settings.
    sched     Schedules bachground tasks on controller(s) (9000 series)
    quit      Exits the CLI.

    ---- New Command Syntax ----

    focus     Changes from one object to another. For Interactive Mode Only!
    show      Displays information about controller(s), unit(s) and port(s).
    flush     Flush write cache data to units in the system.
    rescan    Rescan all empty ports for new unit(s) and disk(s).
    commit    Commit dirty DCB to storage on controller(s). (Windows only)
    /cx       Controller specific commands.
    /cx/ux    Unit specific commands.
    /cx/px    Port specific commands.
    /cx/bbu   BBU specific commands. (9000 only)

    Type help <command> to get more details about a particular command.
    For more detail information see tw_cli's documentation.

    //opti> info

    Ctl   Model        Ports   Drives   Units   NotOpt   RRate   VRate   BBU
    ------------------------------------------------------------------------
    c0    9550SX-8LP   8       8        3       1        4       4       OK

    //opti>

Controller zero (c0) is listed correctly as a 9550SX-8LP, with 8 ports, 8 drives, three units (one of which was not optimal), and a BBU (battery backup unit). A good start. Now what's going on inside this controller?

    //opti> info c0

    Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
    ------------------------------------------------------------------------------
    u0    SPARE     OK             -      -       69.2404   -      OFF      -
    u1    SPARE     OK             -      -       69.2404   -      OFF      -
    u2    RAID-10   INITIALIZING   77     64K     195.548   ON     OFF      OFF

    Port   Status   Unit   Size       Blocks      Serial
    ---------------------------------------------------------------
    p0     OK       u2     69.25 GB   145226112   WD-WMAKE23790
    p1     OK       u2     69.25 GB   145226112   WD-WMAKE23790
    p2     OK       u2     69.25 GB   145226112   WD-WMAKE23943
    p3     OK       u2     69.25 GB   145226112   WD-WMAKE23790
    p4     OK       u2     69.25 GB   145226112   WD-WMAKE23790
    p5     OK       u2     69.25 GB   145226112   WD-WMAKE23792
    p6     OK       u0     69.25 GB   145226112   WD-WMAKE23790
    p7     OK       u1     69.25 GB   145226112   WD-WMAKE23786

    Name  OnlineState  BBUReady  Status  Volt  Temp  Hours  LastCapTest
    ---------------------------------------------------------------------------
    bbu   On           Yes       OK      OK    OK    0      xx-xxx-xxxx

    //opti>

There are three units: two spares (u0 and u1) and one RAID-10 array (u2) which is initializing and is 77% completed. I watched progress for a while, and progress seemed to go from 77% to 100% without any intermediate steps.

The last section of the output relates to the battery backup unit (BBU). The LastCapTest refers to the Last Capacity Test, which has never been run. A battery test takes at least 24 hours. I'll run that one night when the rest of the family is away. I don't think they'll tolerate the server running when they are home.

I found the following output very useful.
It shows the RAID arrays within the main RAID10 array:

    # tw_cli info c0 u2

    Unit     UnitType  Status  %Cmpl  Port  Stripe  Size(GB)  Blocks
    -----------------------------------------------------------------------
    u2       RAID-10   OK      -      -     64K     195.548   410093568
    u2-0     RAID-1    OK      -      -     -       -         -
    u2-0-0   DISK      OK      -      p0    -       65.1826   136697856
    u2-0-1   DISK      OK      -      p1    -       65.1826   136697856
    u2-1     RAID-1    OK      -      -     -       -         -
    u2-1-0   DISK      OK      -      p2    -       65.1826   136697856
    u2-1-1   DISK      OK      -      p3    -       65.1826   136697856
    u2-2     RAID-1    OK      -      -     -       -         -
    u2-2-0   DISK      OK      -      p4    -       65.1826   136697856
    u2-2-1   DISK      OK      -      p5    -       65.1826   136697856

The CLI documentation has this to say about this command:

    This command presents detailed information on the specified unit. If the unit consists of sub-units as is the case in RAID 1, RAID 5, RAID 10, and RAID 50 arrays (applicable for 9000 controllers), then details about each sub-unit are also presented. One application of this command is to see which sub-unit of a degraded unit has caused the unit to degrade and which disk within that sub-unit is the source of degradation.

You can also get very concise status reports:

    [root@opti:~] # tw_cli info c0 u0 status
    /c0/u0 status = OK

    [root@opti:~] # tw_cli info c0 u1 status
    /c0/u1 status = OK

    [root@opti:~] # tw_cli info c0 u2 status
    /c0/u2 status = OK

    [root@opti:~] #

I will make use of that command when building the NetSaint plug-in.
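As a sanity check on the `info c0 u2` output above, the sizes and block counts can be reproduced with a little arithmetic. This is only a sketch under two assumptions that the article does not state: that a block is 512 bytes, and that the Size(GB) column counts 2**30 bytes per GB. With those assumptions the numbers line up: the RAID-10 unit is three mirrored pairs, so its usable capacity is three times that of a single disk.

```python
# Assumptions (not stated in the article): 512-byte blocks, and
# "Size(GB)" meaning 2**30 bytes per GB.
BLOCK_BYTES = 512
BYTES_PER_GB = 2 ** 30


def blocks_to_gb(blocks: int) -> float:
    """Convert a block count to GB under the assumptions above."""
    return blocks * BLOCK_BYTES / BYTES_PER_GB


disk_blocks = 136697856   # Blocks value for each DISK row (u2-x-y)
mirror_pairs = 3          # RAID-1 sub-units: u2-0, u2-1, u2-2

# Each mirror contributes one disk's worth of usable space.
unit_blocks = disk_blocks * mirror_pairs

print("unit blocks:", unit_blocks)
print("disk size  :", round(blocks_to_gb(disk_blocks), 4))
print("unit size  :", round(blocks_to_gb(unit_blocks), 3))
```

The computed unit block count and sizes can be compared against the u2 row (410093568 blocks, 195.548) and the DISK rows (65.1826) in the listing.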
CLI - the argument version
Then I tried passing arguments on the command line:

    # tw_cli /c0 show unitstatus

    Unit  UnitType  Status  %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
    ------------------------------------------------------------------------------
    u0    SPARE     OK      -      -       69.2404   -      OFF      -
    u1    SPARE     OK      -      -       69.2404   -      OFF      -
    u2    RAID-10   OK      -      64K     195.548   ON     OFF      OFF

As shown above, the status of the RAID10 array has changed to OK. The initialization had completed. When I noticed the RAID array had settled, I checked /var/log/messages again and found the following (the messages shown here have been trimmed):

    twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9DCBA3
    last message repeated 2 times
    twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9DCEBE
    last message repeated 2 times
    twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F23A9
    twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F23A9
    twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F23A9
    twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F2A48
    twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F2D63
    twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F2D63
    twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F2D63
    twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F307E
    last message repeated 2 times
    twa0: INFO: (0x04: 0x0007): Initialize completed: unit=2
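When there are many of these messages, it can help to tally them by port and LBA rather than read them one by one. The following is a rough sketch, not a polished tool. The function name `summarize_repairs` is my own invention, and I am assuming that a "last message repeated N times" line refers to the sector repair line immediately before it, which is how syslog normally behaves but cannot be confirmed from the trimmed excerpt.

```python
import re
from collections import Counter

REPAIR = re.compile(r"Sector repair completed: port=(\d+), LBA=(0x[0-9A-Fa-f]+)")
REPEAT = re.compile(r"last message repeated (\d+) times?")


def summarize_repairs(lines):
    """Count sector repairs per (port, LBA).

    Assumption: a 'last message repeated N times' line applies to the
    repair message directly above it.
    """
    counts = Counter()
    last_key = None
    for line in lines:
        repair = REPAIR.search(line)
        if repair:
            last_key = (int(repair.group(1)), int(repair.group(2), 16))
            counts[last_key] += 1
            continue
        repeat = REPEAT.search(line)
        if repeat and last_key is not None:
            counts[last_key] += int(repeat.group(1))
        else:
            # Some other message: a later "repeated" line is not ours.
            last_key = None
    return counts


# The trimmed excerpt from /var/log/messages shown above.
SAMPLE = """\
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9DCBA3
last message repeated 2 times
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9DCEBE
last message repeated 2 times
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F23A9
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F23A9
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F23A9
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F2A48
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F2D63
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F2D63
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F2D63
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F307E
last message repeated 2 times
twa0: INFO: (0x04: 0x0007): Initialize completed: unit=2
""".splitlines()

counts = summarize_repairs(SAMPLE)
for (port, lba), n in sorted(counts.items()):
    print(f"port {port}  LBA {lba:#x}  repairs {n}")
print("total repairs:", sum(counts.values()))
```

Every repair in the excerpt is on port 4, which the earlier listings show as a member of unit u2 (sub-unit u2-2-0).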
There were no actual HDD errors
Although I was initially concerned by the above messages, they do not appear to be hard errors (that is, actual errors with the drive). When I was grepping for all of the messages, I found this:

    twa0: WARNING: (0x04: 0x0008): Unclean shutdown detected: unit=2

Looking at the full log, I found these messages:

    acd0: CDROM

And sure enough, something did happen, but I don't recall what. I think I was playing with IPMI and it caused a panic or something. I am not sure.
BBU charging
I also found this interesting message:

    kernel: twa0: INFO: (0x04: 0x0056): Battery charging completed:

At least now I know I can run the BBU test.
The NetSaint plugin
Although I haven't written it, I'm quite sure it will be straightforward. It's a matter of grepping out the right information. Stay tuned.
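In the meantime, here is a minimal sketch of what such a check might look like. It is not the finished plug-in. It runs the `tw_cli info c0 u2 status` command shown earlier and maps the one-line answer to an exit code. The function names, the 30 second timeout, and the exit code values (0, 1, 2, 3 for OK, WARNING, CRITICAL, UNKNOWN) are my assumptions, based on the usual monitoring plug-in convention. The only unit states seen in this article are OK and INITIALIZING, so anything else is treated as critical rather than guessing at other state names.

```python
import re
import subprocess

# Exit codes: an assumption, following the common plug-in convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}

# Matches lines such as "/c0/u2 status = OK".
STATUS_LINE = re.compile(r"^/c\d+/u\d+ status = (\S+)", re.MULTILINE)


def run_tw_cli(controller, unit):
    """Run 'tw_cli info cX uY status' and return its standard output."""
    result = subprocess.run(
        ["tw_cli", "info", f"c{controller}", f"u{unit}", "status"],
        capture_output=True, text=True, timeout=30,
    )
    return result.stdout


def parse_status(text):
    """Return the status word from tw_cli output, or None if not found."""
    match = STATUS_LINE.search(text)
    return match.group(1) if match else None


def classify(status):
    """Map a unit status to an exit code."""
    if status is None:
        return UNKNOWN
    if status == "OK":
        return OK
    if status == "INITIALIZING":
        return WARNING
    return CRITICAL  # any state not seen in the article


def check_unit(controller=0, unit=2, runner=run_tw_cli):
    """Return (exit_code, message) for one unit."""
    try:
        output = runner(controller, unit)
    except (OSError, subprocess.SubprocessError) as exc:
        return UNKNOWN, f"UNKNOWN - could not run tw_cli: {exc}"
    status = parse_status(output)
    code = classify(status)
    return code, f"{LABELS[code]} - /c{controller}/u{unit} status = {status}"


# Demonstration with canned output, so this runs without a 3Ware card.
# A real plug-in would instead print the message and exit with the code.
print(check_unit(0, 2, runner=lambda c, u: "/c0/u2 status = OK\n"))
```

Passing the runner in as a parameter keeps the parsing testable on a machine without the controller; on the real server the default `run_tw_cli` would be used.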