Brocade SAN switch CLI Commands for troubleshooting minor issues

We have already discussed Brocade SAN switch zoning steps via CLI and Cisco MDS zoning steps via CLI.

This write-up focuses on the basic troubleshooting commands used on a Brocade SAN switch. To understand the commands better, let us first look at the day-to-day operational challenges faced in a SAN fabric. Listed below are a few of the common operational errors:

  1. Alias/port went offline
  2. Bottlenecks
  3. Port error
  4. Hanging zones
  5. Rx Tx Voltage/Power Issue

Let's look briefly at how to identify these errors and how to troubleshoot them.

Alias/port went offline

This error is recorded due to the following reasons:

  1. Reboot/ Shutdown of the host
  2. Faulty cable
  3. Issue in the HBA card.

When 'WWN/Alias went offline' is recorded, use the commands below to identify which port went offline and when.

#fabriclog -s                                                                              Shows the ports which went offline recently.

#fabriclog -s | grep -E "Port Index |GMT"                               Shows the ports which went offline earlier, along with the timestamps. Note: This command will not return earlier events if a FOS upgrade or switch reboot was performed, as both activities clear the fabriclog.
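If the fabriclog output has been saved to a text file (for example from a terminal session log), the same filter can be sketched off-switch in Python. This is only a minimal sketch: the function name is our own, and the sample lines are made-up placeholders, not real fabriclog output.

```python
import re

# Same pattern as the grep -E filter above: keep lines that mention
# a port index or carry a GMT timestamp.
PATTERN = re.compile(r"Port Index |GMT")


def filter_fabriclog(text: str) -> list[str]:
    """Return only the lines of captured fabriclog text that match PATTERN."""
    return [line for line in text.splitlines() if PATTERN.search(line)]


if __name__ == "__main__":
    # Hypothetical placeholder text; real fabriclog output looks different.
    sample = "\n".join([
        "placeholder header GMT",
        "unrelated line",
        "Port Index 5 placeholder",
    ])
    for line in filter_fabriclog(sample):
        print(line)
```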

To find the zoning details from the WWN of the device, use the command below (replace wwn with the WWN of the device):

#alishow |grep wwn -b2                                                              This lists the alias.

Then use the command below:

#zoneshow --alias Alias_Name                                                    This lists the zone name and component aliases.
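The two-step lookup above (WWN to alias, then alias to zone) can be sketched in Python as follows. This is an illustration of the reasoning only: the dictionaries, alias names, zone names and WWNs are hypothetical, and in practice they would have to be built from the alishow and zoneshow output.

```python
def aliases_for_wwn(aliases: dict[str, list[str]], wwn: str) -> list[str]:
    """Step 1: find every alias whose member list contains the WWN."""
    wanted = wwn.lower()
    return sorted(
        name
        for name, members in aliases.items()
        if wanted in (m.lower() for m in members)
    )


def zones_for_wwn(
    aliases: dict[str, list[str]],
    zones: dict[str, list[str]],
    wwn: str,
) -> list[str]:
    """Step 2: find every zone that contains one of those aliases."""
    matched = set(aliases_for_wwn(aliases, wwn))
    return sorted(
        zone for zone, members in zones.items() if matched.intersection(members)
    )


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    aliases = {
        "host1_hba0": ["10:00:00:00:c9:aa:bb:01"],
        "array_p1": ["50:06:0e:80:aa:bb:cc:01"],
    }
    zones = {
        "z_host1_array": ["host1_hba0", "array_p1"],
        "z_other": ["array_p1"],
    }
    print(zones_for_wwn(aliases, zones, "10:00:00:00:C9:AA:BB:01"))
```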

Bottlenecks

There are many kinds of bottlenecks, but the ones prominent in a SAN fabric are the latency bottleneck and the congestion bottleneck.

A latency bottleneck occurs when a slow-drain device is connected to the port. Both initiator and target ports can report latency: whatever the kind of port, if a slow-drain device is attached, there will be a bottleneck on that port.

A slow-drain device is a device which has one or more of the issues mentioned below:

  1. Unsupported firmware.
  2. Hardware issues.
  3. SFP which has a voltage or power issue.

A congestion bottleneck, on the other hand, occurs due to a high rate of data transfer on the port. In the next write-up we will discuss the causes of a congestion bottleneck in detail.

The commands used to identify latency as well as congestion bottlenecks are:

#errdump

#mapsdb --show

If there is a latency or congestion bottleneck, it should be fixed by logging a support case with the server/storage hardware vendor.

Port errors

There are many kinds of port errors. Most of the time, they are due to a bottleneck issue or a physical-layer issue. Bottleneck issues have already been addressed above. A physical-layer issue is either a cable issue or an SFP issue.

Below are the commands to identify the port errors:

#porterrshow                                                       This lists the error counters of all ports.

#porterrshow port_number                       

#porterrshow -i Port_Index                              Both these commands list the errors on a particular port.

If an error is listed, clear the counters using the commands below before troubleshooting, and then observe the port again.

#statsclear

#slotstatsclear

#portstatsclear port_number
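The point of clearing the counters and observing again is to separate old, historical errors from errors that are still incrementing. That comparison can be sketched in Python as below. This is a minimal sketch under our own assumptions: the counter names are illustrative labels, and the snapshots would have to be filled in by hand or by a parser from the porterrshow output.

```python
def rising_counters(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Return the counters that increased between two snapshots,
    mapped to the amount by which they increased."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] > before.get(name, 0)
    }


if __name__ == "__main__":
    # Hypothetical snapshots of one port: right after clearing the
    # counters, and again after an observation period.
    just_cleared = {"crc_err": 0, "enc_out": 0, "link_fail": 0}
    observed_later = {"crc_err": 12, "enc_out": 0, "link_fail": 1}
    print(rising_counters(just_cleared, observed_later))
```

Counters that still rise after a clear point to a live problem, while counters that stay at zero were most likely left over from an earlier event.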

Apart from this, there are other commands to display the current data transfer rate of a port or all ports, such as:

#portperfshow

#portperfshow port_number

Hanging Zone

Hanging zones are purposeless zones residing in the zoning configuration. A zone in which all initiators or all targets are inactive is considered a hanging zone.

There is no specific command to list the hanging zones in the fabric; we have to use SAN Health to identify them. To check whether all the aliases of a zone are active, use the command mentioned below:
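The definition above can be written down as a small check. This is only a sketch of the rule as stated here: the Member class, the role labels and the sample members are hypothetical, and the role and active state of each member would have to come from your own inventory.

```python
from dataclasses import dataclass


@dataclass
class Member:
    alias: str
    role: str     # "initiator" or "target" (labels assumed for this sketch)
    active: bool  # True if the member is currently logged in to the fabric


def is_hanging(members: list[Member]) -> bool:
    """A zone is hanging if it has no active initiator or no active target.

    A zone with no initiators at all (or no targets at all) is also
    reported as hanging, since it cannot serve any traffic.
    """
    active_initiator = any(m.active for m in members if m.role == "initiator")
    active_target = any(m.active for m in members if m.role == "target")
    return not (active_initiator and active_target)


if __name__ == "__main__":
    # Hypothetical zone: the host is decommissioned, the array port is up.
    zone = [
        Member("host1_hba0", "initiator", active=False),
        Member("array_p1", "target", active=True),
    ]
    print(is_hanging(zone))
```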

#zone --validate "zonename"

In the output of the above command, a '*' mark at the end of a zone member means that the member does not exist in the fabric, that is, it is not currently active. Members without the mark are active.

Rx Tx Voltage/Power Issue

The Rx and Tx voltage and power of an SFP can be validated only if the SFP has connectivity and its port is in online state.

The command below will display the voltage, power and all the details related to the SFP.

#sfpshow port_number -f

__________________________________________________________________________________________________

Please feel free to connect with us in case of any queries. Also, please give us your feedback; it will help us to improve our skill sets.

Hitachi VSP Auto Dump Collection

We previously posted on log collection from Hitachi Unified Storage and EMC VNX/Celerra storage arrays. In the same way, let us see how we can gather an Auto Dump from Hitachi VSP storage arrays.

An Auto Dump is a support log which is mandatory for analysing any kind of issue that occurs on a Hitachi VSP storage system. The Auto Dump must be collected during or shortly after the occurrence of the issue.

A normal Auto Dump can be collected by the customer; for a detailed Auto Dump you can take assistance from an HDS engineer.

To collect an Auto Dump we need access to the SVP, which we can reach through an RDP session. The SVP is basically a Windows Vista system. Log in to the SVP using your credentials.

Once you are logged in to the SVP, you will see either the SVP console or the Storage Navigator web console. If it is the Storage Navigator web console, go to the “Maintenance” tab and select “Maintenance component General”; this will open the SVP console for you.

In the console, navigate to Auto Dump and click on the Auto Dump option to collect the log. You will be taken to a new window and prompted to enter the target file location.

Once the Auto Dump is started, it may take 30-45 minutes to complete. Once it is completed, navigate to C:\DKCxxx\TMP and you will find a file named hdcp.tgz last modified today (or on the date you ran the Auto Dump).
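Before copying the file, it is worth confirming that hdcp.tgz really is from today's run and not a leftover from an older dump. A minimal Python sketch of that check is shown below, assuming Python is available on the machine where you run it. The function name is our own, and the directory is passed in as a parameter because the DKCxxx part of the path differs per system.

```python
from datetime import date, datetime
from pathlib import Path
from typing import Optional


def find_todays_dump(tmp_dir: Path, name: str = "hdcp.tgz") -> Optional[Path]:
    """Return the dump file if it exists in tmp_dir and was last
    modified today (local time); otherwise return None."""
    candidate = tmp_dir / name
    if not candidate.is_file():
        return None
    modified = datetime.fromtimestamp(candidate.stat().st_mtime).date()
    return candidate if modified == date.today() else None


if __name__ == "__main__":
    import sys

    # Usage: python check_dump.py <path to the TMP folder>
    found = find_todays_dump(Path(sys.argv[1]))
    print(found if found else "No hdcp.tgz modified today was found")
```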

You can copy the file to your local PC or any server where internet connection is available.

Once we have the files on our local system, we can upload them to the Hitachi Technical Upload Facility (TUF). This requires a valid Hitachi support case ID.

Hope this helped you. Feel free to provide your feedback in the comments section.

Way to gather simple trace from Hitachi Unified Storage

You may refer to our previous post on VNX/Celerra log collection. Here we will see how to gather support logs from Hitachi Unified Storage (HUS 110, HUS 130 & HUS 150).

A “Simple Trace” is needed for analysis of all issues relating to the Hitachi storage systems. The trace can be obtained by the customer. It is critical to gather a trace as soon as possible after a problem is detected. This is to prevent log data wrapping and loss of critical information.

If a performance problem is being experienced, take one simple trace as soon as possible, while performance is affected. This can assist greatly in finding the root cause.

 

Hitachi modular storage (HUS/AMS) log collection is a very simple task.

We can log in to the web GUI using the controller IP address (type the controller IP address in a web browser and press Enter). Both HUS controllers have an IP address. Then, in the left panel of the web GUI, we can find “Simple Trace” under Trace.

HUS Log Collection

Click on Simple Trace and a pop-up screen appears; the trace will take some time to generate, and we can monitor the percentage in the pop-up screen. Once the fetching is completed, we can download the trace to our local system. There may be multiple files to download from the same pop-up for complete information.

The file name will be as follows: smpl_trc1_systemserialnumber_yyyymmdd.dat

Once we have the files on our local system, we can upload them to the Hitachi Technical Upload Facility (TUF). This requires a valid support case ID from Hitachi.
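When several trace files have been downloaded, the naming pattern above can be used to check that they belong to the right system and date. The Python sketch below shows one way to do that. It assumes that the digit after smpl_trc is a file index, that the serial number contains no underscores, and that the date is a valid yyyymmdd value; the serial number in the example is made up.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional

# smpl_trc<index>_<serial>_<yyyymmdd>.dat (layout assumed from the pattern above)
TRACE_NAME = re.compile(r"^smpl_trc(\d+)_([^_]+)_(\d{8})\.dat$")


class TraceFile(NamedTuple):
    index: int
    serial: str
    taken_on: date


def parse_trace_name(filename: str) -> Optional[TraceFile]:
    """Split a simple trace file name into its parts, or return None
    if the name does not follow the expected pattern."""
    match = TRACE_NAME.match(filename)
    if match is None:
        return None
    try:
        taken_on = datetime.strptime(match.group(3), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a real calendar date.
        return None
    return TraceFile(int(match.group(1)), match.group(2), taken_on)


if __name__ == "__main__":
    # Hypothetical file name, for illustration only.
    print(parse_trace_name("smpl_trc1_12345678_20240131.dat"))
```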