VMware vSAN – Understanding Fault Domains

VMware vSAN is VMware's enterprise-class software-defined storage product. It lets you use server-based storage for enterprise applications. The advantages, as you might already know, include cost reduction, ease of administration and more.

In this post we discuss one characteristic of vSAN: Fault Domains.

What ?

Fault Domains help an administrator plan for the failure scenarios that may occur in a vSAN cluster. If a customer wants to avoid data inaccessibility during a chassis failure, a power failure in a rack and so on, they can do so by setting the right fault domains.

A cluster needs a minimum of 3 fault domains to have this enabled.

How ?

In a vSAN cluster, writes are sent to multiple hosts/drives depending on the storage policy and the Failures To Tolerate (FTT) setting. If FTT=1, the write is sent to 2 hosts at the same time. Even if one of the hosts fails, the data is still accessible because the replica is available on the other host, and thus IO operations continue. We will discuss the IO operation in vSAN in a separate post.

With Fault Domains configured, the replicas are saved in different fault domains. We can define all the hosts in the same rack as part of one fault domain, so that the data and its replica will never be on hosts in the same rack. Thus the administrator can plan maintenance activities at the rack level without any disruption to the services running on vSAN.

The same applies to chassis-level or any other level of protection. We can define the fault domains at the chassis level, so that replicas will not reside in the same chassis.
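To make the placement rule concrete, here is a minimal Python sketch of it. This is not how vSAN itself places components; the function name, the rack names and the host names are made up for illustration, and the 2 x FTT + 1 domain check is an assumption that matches the 3-domain minimum mentioned above for FTT=1.

```python
def place_replicas(fault_domains, ftt=1):
    """Return one (domain, host) pair per replica, each in a different fault domain.

    fault_domains: dict mapping a fault-domain name to its list of hosts.
    """
    replicas_needed = ftt + 1      # FTT=1 -> the write goes to 2 hosts
    domains_needed = 2 * ftt + 1   # assumption: gives the 3-domain minimum for FTT=1
    usable = [name for name in sorted(fault_domains) if fault_domains[name]]
    if len(usable) < domains_needed:
        raise ValueError(
            f"need at least {domains_needed} fault domains, found {len(usable)}"
        )
    # Take the first host of each chosen domain, so no two replicas share a domain.
    return [(name, fault_domains[name][0]) for name in usable[:replicas_needed]]


# Hypothetical cluster: one fault domain per rack.
racks = {
    "rack-A": ["esx01", "esx02"],
    "rack-B": ["esx03", "esx04"],
    "rack-C": ["esx05", "esx06"],
}
placement = place_replicas(racks, ftt=1)
print(placement)
```

Because each replica comes from a different key of the dictionary, taking a whole rack down for maintenance can remove at most one copy of the data.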

Hope you enjoyed reading this post and that it was helpful for you. Please share your thoughts in the comments section.

EMC ISILON Interview questions

Adding one more post to our interview questions post category, this time for ISILON. We are trying to cover some of the frequently asked questions from the ISILON architecture and configuration areas.

  •  Node and drive types supported : ISILON supported 3 different types of nodes: S-Series, X-Series and NL-Series. S-Series (S210) is a high-performance node type that supports SSD drives. X-Series nodes (X210 and X410) support up to 6 SSDs and can have HDDs in the remaining slots. NL-Series (NL-410) nodes support only one SSD in the system, with SATA drives in the remaining slots. This node type is intended for archiving requirements.

Systems with recent OneFS versions also support All-Flash nodes, Hybrid nodes, Archive nodes and IsilonSD nodes. ISILON All-Flash nodes (F800) can have up to 924 TB in a single 4U node and can grow up to 33 PB in one cluster. One node can house up to 60 drives. Hybrid nodes (H400, H500 and H600) support a mix of SSDs and HDDs: H400 and H500 can have SATA drives and SSDs, and H600 supports SSDs and SAS drives. Archive nodes (A200 and A2000) are intended for archiving solutions. A2000 nodes have 80 slots, with only 10 TB drives supported; this node is for high-density archiving. A200 is for near-primary archiving storage solutions and supports 2 TB, 4 TB or 8 TB SATA HDDs, with a maximum of 60 drives.

IsilonSD is the software-only node type, which can be installed on customer hardware.

  •  Scale-Out and Scale-Up architecture : The first thing that comes to mind with ISILON is its Scale-Out architecture. With Scale-Out architecture, processing power and capacity increase in parallel: as we add a node, both the capacity and the processing power of the system increase. Let’s take VNX as an example of Scale-Up architecture. Here, the processing power (i.e. the Storage Processors) cannot be increased, as the system limit is 2 SPs, but we can grow the overall system capacity by adding more DAEs (and disks) up to the system's supported limit.

  •  Infiniband Switches and types : ISILON makes use of IB switches for the internal communication between the nodes. With the Gen-6 hardware, ISILON now also supports 40 GbE switches for this, in addition to the IB switches.
  •  SmartConnect and SSIP : [definition from ISILON SmartConnect whitepaper] SmartConnect is a licensable software module of the EMC ISILON OneFS Operating System that optimizes performance and availability by enabling intelligent client connection load balancing and failover support. Through a single host name, SmartConnect enables client connection load balancing and dynamic NFS failover  and failback of client connections across storage nodes to provide optimal utilization of the cluster resources. SmartConnect eliminates the need to install client side drivers, enabling the IT administrator to easily manage large numbers of client with confidence. And in the event of a system failure, file system stability and availability are maintained.

For every SmartConnect zone there will be one SSIP (SmartConnect Service IP), which will be used for the client connections. The SSIP and its associated hostname will have a DNS entry, and client requests come to the cluster/zone via the SSIP. The zone redirects each request to the nodes for completion.

  •  SmartPool : SmartPool enables effective tiering of storage nodes within the filesystem. Data – based on the utilization – will be moved across the tiers within the filesystem automatically with seamless application and end user access. Customers can define policies for the data movement for different workflows and node types.

  •  Protection types in ISILON : An ISILON cluster can have protection type N+M (where N is the number of data blocks and M is the number of node/drive failures the system can tolerate) or N+M:B (where N is the number of data blocks, M is the number of drive failures the system can tolerate and B is the number of node failures it can tolerate), where N>M. A 3-node system can have the +1 (i.e. 2+1) protection type. Here the system can tolerate 1 drive/node failure without any data loss.
  •  Steps to create an NFS export : Here we have listed the commands to create and list/view the NFS export.

To create the NFS export :
isi nfs exports create --clients=10.20.10.31,10.20.10.32 --root-clients=10.20.10.33,10.20.10.34 --description="Beginnersforum Test NFS Share" --paths=/ifs/BForum/Test --security-flavors=unix

To list the NFS exports :
isi nfs exports list

To view the NFS export :
isi nfs exports view <export_number>

You can also create an alias and quotas for the NFS export.

Hope this helped you in learning some ISILON stuff. We will have more questions and details in upcoming posts. For more interview questions posts please click here. Please add any queries and suggestions in comments.

Understanding the VMAX FA/RDF port numbering

The logical numbering of the VMAX ports (FE, RF etc. ports on the VMAX-3 and VMAX AFAs) is quite confusing, especially for a storage administrator who is not very familiar with it. It is important because we need it for zoning and host integration on these arrays.

This post is an attempt to explain the mapping of physical to logical numbering of the FA/RDF ports on a VMAX system.

The physical numbering of the SLICs (IO Modules) is as in the above snip and is quite straightforward. The modules run from Slot 0 to Slot 10, with the Management modules (MMCS on the first engine, MM on the remaining engines) on the leftmost side. A few slots will be used for Vault to Flash modules, which have no ports on them. The slot number and the type of module in it may vary slightly with the addition of compression modules in AFA arrays. SLICs 2, 3, 8 and 9 are important from an admin point of view, as these will hold either FA or RDF modules. Even though the Back-end modules have physical ports and connectivity, they are not of much concern for an admin, as they are configured during the array initialization.

Here comes the logical numbering. For numbering a port, we should be aware of the director we are referring to and the slot number for the specific module. Considering the above snip as a single engine scenario, the odd director will be director 1 and even director will be director 2. This will be similar for the remaining engines (e.g., For engine-4, the odd director will be director 7 and even will be director 8 ).

Now, let us assume SLIC 2 is configured with FA emulation. The ports on the director 1 SLIC will be numbered starting with 1d (d for FA emulation). The port numbers will be 1d4, 1d5, 1d6 and 1d7. You will have to keep this image in mind, or make a note of it, to have the logical numbering for each SLIC. For RDF emulation, the numbering will have an e in it (e for RDF). Let us assume SLIC 8 has RDF emulation. Port 3 (the last port, with numbering starting from 0 as in the first pic) on SLIC 8 on the even director will be 2e27: 2 for the 2nd director, e for RDF emulation and 27 for the logical number.
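The numbering rule can also be written down as a small lookup. The Python sketch below is only a memory aid built from the two examples in this post; the function and table names are made up, and it covers just SLIC 2 and SLIC 8, since those are the only slots whose numbers are spelled out here.

```python
# Logical number of port 0 on each SLIC, taken from the two examples in the
# post (SLIC 2 -> 1d4..1d7, SLIC 8 port 3 -> 2e27). Other slots are left out
# on purpose; add them only after checking them against your own array.
SLIC_FIRST_PORT = {2: 4, 8: 24}
EMULATION_LETTER = {"FA": "d", "RDF": "e"}
PORTS_PER_SLIC = 4  # ports 0..3, as in the post's examples


def director_number(engine, odd):
    """Engine 1 -> directors 1 and 2, engine 4 -> directors 7 and 8."""
    return 2 * engine - 1 if odd else 2 * engine


def logical_port(engine, odd, emulation, slic, port):
    """Build a logical port name such as '1d4' or '2e27'."""
    if slic not in SLIC_FIRST_PORT:
        raise ValueError(f"no mapping recorded for SLIC {slic}")
    if not 0 <= port < PORTS_PER_SLIC:
        raise ValueError(f"port must be 0..{PORTS_PER_SLIC - 1}")
    director = director_number(engine, odd)
    number = SLIC_FIRST_PORT[slic] + port
    return f"{director}{EMULATION_LETTER[emulation]}{number}"


print(logical_port(1, True, "FA", 2, 0))    # 1d4
print(logical_port(1, False, "RDF", 8, 3))  # 2e27
```

You can call logical_port with other director and port combinations to check your own answers while practising.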

Hopefully that was not that tough and it helped you. You may try various combinations for your practice. To start with, what will be the logical number for an FA port odd director SLIC 8 and port 2 ?

You may find more EMC VMAX posts here. Please use the comments section for any queries/suggestions.

Hitachi VSP Auto Dump Collection

We had posted previously on log collection from Hitachi Unified Storage and EMC VNX/Celerra storage arrays. In the same way, let us see how we can gather an Auto Dump from Hitachi VSP storage arrays.

An Auto Dump is a support log which is mandatory for analysing any kind of issue that has occurred on a Hitachi VSP storage system. The Auto Dump needs to be collected during or shortly after an issue occurs.

Normal Auto Dump can be collected by the customer and for detailed Auto dump you can take assistance from an HDS engineer.

For collecting an Auto Dump we need SVP access; we can take an RDP session into the SVP. Basically, the SVP is a Windows Vista system. Log in to the SVP using the credentials.

Once you are logged in to the SVP, you will see either the SVP console or the Storage Navigator web console. If it is the Storage Navigator web console, go to the “Maintenance” tab and select “Maintenance component General”. It will open the SVP console for you.

In the console, you can navigate to Auto Dump and you can click on the auto dump option to collect the log. You will be taken to a new window and you will be prompted to enter the target file location.

Once the Auto Dump is started, it may take 30-45 minutes to complete. Once it is completed, you can navigate to C:\DKCxxx\TMP, where you will find a file named hdcp.tgz last modified today (or on the date you ran the Auto Dump).

You can copy the file to your local PC or any server where internet connection is available.

Once we have files in our local system, we can upload the same to Hitachi Technical Upload Facility (TUF). This will require a valid Hitachi support case ID.

Hope this helped you. Feel free to provide your feedback in the comments section.

Expanding a (EMC Celerra/VNX) NAS Pool

In this post let’s discuss expanding a (EMC Celerra/VNX-File) NAS pool by adding new LUNs from the backend storage. A NAS pool, on which we create filesystems for NFS/CIFS (SMB), should have sufficient space to cater for the NAS requests. Here our pool is running out of space, with only a few MBs left.

[nasadmin@beginnersNAS ~]$ nas_pool -size Bforum_Pool
id = 48
name = Bforum_Pool
used_mb = 491127
avail_mb = 123
total_mb = 491250
potential_mb = 0
[nasadmin@beginnersNAS ~]$

Let’s see how we can get this pool extended.

Let’s have a look first at the existing disks (LUNs from backend). Here we already have 9 disks assigned. We should have the 10th one in place, which will add up space to the pool.

[nasadmin@beginnersNAS ~]$ nas_disk -l
id inuse sizeMB storageID-devID type name servers
1 y 11263 CKxxxxxxxxxxx-0000 CLSTD root_disk 1,2
2 y 11263 CKxxxxxxxxxxx-0001 CLSTD root_ldisk 1,2
3 y 2047 CKxxxxxxxxxxx-0002 CLSTD d3 1,2
4 y 2047 CKxxxxxxxxxxx-0003 CLSTD d4 1,2
5 y 2047 CKxxxxxxxxxxx-0004 CLSTD d5 1,2
6 y 32767 CKxxxxxxxxxxx-0005 CLSTD d6 1,2
7 y 178473 CKxxxxxxxxxxx-0010 CLSTD d7 1,2
8 n 178473 CKxxxxxxxxxxx-0011 CLSTD d8 1,2
9 y 547418 CKxxxxxxxxxxx-0007 CLSTD d9 1,2
[nasadmin@beginnersNAS ~]$

As per the requirement, we have to assign the LUNs from the backend storage. For best performance, it is recommended to add new LUNs of the same size as the existing LUNs in the pool.

Now to the most important part: rescanning for the new disks. We have to use the server_devconfig command for the rescan. We can also run the command against individual data movers. The recommended way to do this is to start with the standby DMs first and then move on to the primary ones. Listing the nas_disks will show the servers on which the disks are scanned.

[nasadmin@beginnersNAS ~]$ server_devconfig ALL -create -scsi -all

Discovering storage (may take several minutes)
server_2 : done
server_3 : done
[nasadmin@beginnersNAS ~]$

Yes, that is done successfully. Now let’s check the disks list. We can see the 10th disk with inuse=n which is scanned on both servers (data movers).

[nasadmin@beginnersNAS ~]$ nas_disk -l
id inuse sizeMB storageID-devID type name servers
1 y 11263 CKxxxxxxxxxxx-0000 CLSTD root_disk 1,2
2 y 11263 CKxxxxxxxxxxx-0001 CLSTD root_ldisk 1,2
3 y 2047 CKxxxxxxxxxxx-0002 CLSTD d3 1,2
4 y 2047 CKxxxxxxxxxxx-0003 CLSTD d4 1,2
5 y 2047 CKxxxxxxxxxxx-0004 CLSTD d5 1,2
6 y 32767 CKxxxxxxxxxxx-0005 CLSTD d6 1,2
7 y 178473 CKxxxxxxxxxxx-0010 CLSTD d7 1,2
8 n 178473 CKxxxxxxxxxxx-0011 CLSTD d8 1,2
9 y 547418 CKxxxxxxxxxxx-0007 CLSTD d9 1,2
10 n 547418 CKxxxxxxxxxxx-0006 CLSTD d10 1,2
[nasadmin@beginnersNAS ~]$

Let’s check the pool again to see the available and potential storage capacity.

[nasadmin@beginnersNAS ~]$ nas_pool -size Bforum_Pool
id = 48
name = Bforum_Pool
used_mb = 491127
avail_mb = 123
total_mb = 491250
potential_mb = 547418
[nasadmin@beginnersNAS ~]$

Now, as you can see, the expanded capacity is available to the pool (refer to potential_mb).
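If you check pools like this regularly, the key = value output of nas_pool -size is easy to parse. Below is a small Python sketch that reads the output shown above and works out how full the pool is; the function name is made up, and the sample text is simply pasted from this post rather than fetched from a Control Station.

```python
def parse_nas_pool_size(text):
    """Turn 'key = value' lines from nas_pool -size into a dictionary."""
    fields = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip prompts and blank lines
        key, value = (part.strip() for part in line.split("=", 1))
        fields[key] = int(value) if value.isdigit() else value
    return fields


# Output copied from the post, after the new disk was scanned.
sample = """\
id = 48
name = Bforum_Pool
used_mb = 491127
avail_mb = 123
total_mb = 491250
potential_mb = 547418
"""

pool = parse_nas_pool_size(sample)
used_pct = 100 * pool["used_mb"] / pool["total_mb"]
print(f"{pool['name']}: {used_pct:.2f}% used, {pool['potential_mb']} MB potential")
```

A non-zero potential_mb alongside a nearly full pool is exactly the state shown above: the new disk is visible to the pool but not yet consumed by it.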

You may refer to our previous post on scanning new LUNs on VNX File/Celerra Data movers. Click here for more Celerra/VNX posts.

ISILON basic commands

In this post, we discuss a few basic Isilon commands, some of which help with our daily administration tasks for managing and monitoring the Isilon array. You may also refer to the Celerra/VNX health check steps we discussed in one of our previous posts.

Here are some of the isilon commands.

isi status : Displays the status of the cluster, nodes, events etc. You can use various options including -r (for displaying raw size), -D (for detailed info) and -n (for info on a specific node: -n <node id>).

isi_for_array : For running various commands against specific nodes. -n for a specific node and

isi events : There are many options with the ‘events’ command including isi events list (to list all the events) isi events cancel (to cancel the events) isi events quiet (to quiet the events). You can setup event notifications also using the isi events command.

isi devices : To view and change the cluster devices status. There are plenty of options with the devices command.

isi devices --device <Device> : where Device can be a drive or an entire node. The --action option is used to perform a specific action on the device (--action <action>), including smartfail, stopfail, status, add, format etc.

isi firmware status : is used to list the isilon firmware type and versions.

isi nfs exports : The NFS exports command is used for various Isilon NFS operations including export creation, listing/viewing, modifying etc. Below is a list of sub-commands.

1. isi nfs exports create --zone <zone name> --root-clients=host1,host2 --read-write-clients=host2,host3 --path=<path>

2. isi nfs exports view <export ID> --zone=<zone name>

3. isi nfs exports modify <export ID> --zone=<zone name> --add-read-write-clients host4

isi smb shares : This command is used to create, list/view, modify etc. SMB shares. Sample sub-commands:

1. isi smb shares create <share name> --path=/ifs/data/SMBpath --create-path --browsable=true --zone <zone name> --description="Test share"

2. isi smb shares delete <share name> --zone=<zone name>

isi quota quotas : this command is used for various quota operations including creation/deletion/modification etc…

Hope this post helped you. Please feel free to comment if you have any questions. We will discuss more detailed commands in a later post.

 

Way to gather simple trace from Hitachi Unified Storage

You may refer to our previous post on VNX/Celerra log collection. Here we will see how to gather support logs from Hitachi Unified Storage (HUS 110, HUS 130 & HUS 150).

A “Simple Trace” is needed for analysis of all issues relating to the Hitachi storage systems. The trace can be obtained by the customer. It is critical to gather a trace as soon as possible after a problem is detected. This is to prevent log data wrapping and loss of critical information.

If a performance problem is being experienced, take one simple trace as soon as possible, while performance is affected. This can assist greatly in finding the root cause.

 

Hitachi modular storage (HUS/AMS) log collection is a very simple task.

We can log in to the web GUI using the controller IP address (type the controller IP address in a web browser and press Enter); both HUS controllers have an IP address. Then in the left panel of the web GUI we can find “Simple Trace” under Trace.

Click on Simple Trace and a pop-up screen will appear; it will take some time for the trace to be generated. We can monitor the percentage from the pop-up screen. Once the fetching is completed, we can download the trace to our local system. There may be multiple files to download from the same pop-up for complete information.

The file name will be as follows  smpl_trc1_systemserialnumber_yyyymmdd.dat 

Once we have the files on our local system, we can upload them to the Hitachi Technical Upload Facility (TUF). This requires a valid support case ID from Hitachi.

 

 

Simple LUN allocation steps – VNX

In one of our earlier posts, we saw the allocation steps in VMAX. Now let’s see the case with the mid-range product, EMC VNX. LUN allocation in VNX is quite simple with the Unisphere Manager GUI. Let’s see the steps here.

Creating a LUN : You need to have information such as the size of the LUN required, the disk type and the RAID type (if there is any specific requirement). Based on the disk type and RAID type used in the different pools, you have to select the correct pool. From Unisphere, under Storage > LUNs, click the Create button.

You have to furnish data including the Size and Pool (video below from EMC on Pool creation) in the screen. Select the checkbox depending on whether the LUN needs to be created as Thin or Thick. Once all the fields are filled in, note the LUN ID and submit the form. Done! You have created the LUN; you can find the new LUN in the list and verify the details.

https://www.youtube.com/watch?v=rr7Sndqz_cA

Adding a new host : Your requirement may be to allocate the new LUN to a new host. Once the host is connected via the fabric and you are done with the zoning, the host connectivity should be visible in the Initiators list (Unisphere > Hosts > Initiators). If you have the Unisphere Host Agent installed on the host, or if it is an ESXi host, the host gets auto-registered and you will see the host details in the Initiators list.

Otherwise you will see only the new host WWNs in the list. You have to select the WWNs and register them, filling in the host details (Name and IP) and the ArrayCommpath and failover mode settings. Once the host is registered, you will see the host in the hosts list (Unisphere > Hosts > Hosts).

 

Storage Group : You now have to make the LUN visible to the hosts. A Storage Group is the way to do this in VNX/Clariion. You will have to create a new storage group for the hosts (Unisphere > Hosts > Storage Groups). You can name the new storage group to match the host/cluster name for easy identification and then add the hosts to the group.

If multiple hosts will be sharing the LUNs, you have to add all of those hosts to the storage group. You also have to add the LUNs to the Storage Group. You have to set the HLU for the LUNs in the SG, and you should be careful when assigning it: changing the HLU requires downtime, as it cannot be modified on the fly.

Once the LUNs and hosts are added to the Storage Group, you are done with the allocation! You can now request the host team to do a rescan to see the new LUNs.

Hope this post helped you. For more Celerra/VNX posts click here

 

EMC VMAX3 in 3D

EMC introduced the VMAX3 series in mid-2014 as an Enterprise Data Service Platform rather than a storage system. A YouTube video from EMC showing a 3D view of the components and features caught my eye.

I wanted to share it with you, which resulted in this post. Here it is for you…

 

https://www.youtube.com/watch?v=2-RbWm_Gcgo

VMAX3 introduced 3 new models of VMAX – VMAX 100K, 200K and 400K – with the new HYPERMAX Operating System and Dynamic Virtual Matrix architecture. You may read more from the below links, from experts and EMC Elects..

 

New EMC VMAX³ – Industry’s First Enterprise Data Service Platform – Official Press Release

The VMAX3: Why Enterprise Class is Still Very Relevant – By Jason Nash

EMC Announces Next-Generation VMAX Storage Array – By Dave Henry

Symmetrix offers a new kind of MAXimum Virtualisation (VMAX) -By Rob Koper

VMAX 3 – The all-new Enterprise Data Service Platform..! Part-I – By Vipin V.K

EMC ANNOUNCE VMAX3 – By Roy Mikes

EMC announces its next Generation VMAX Array – By Mark May

EMC Announces VMAX Family Re-architected for Enterprise Data Services and Hybrid Clouds -By StorageReview.com

Just a few posts which I found interesting. You may suggest links to more value-adding posts on VMAX3 in the comments section. We will try to add the links here for others’ reference.

You may find more EMC VMAX posts here. Thank You.

 

Brocade SAN switch zoning via CLI

We recently discussed zoning on a Cisco switch in one of our posts. Now we will discuss the same on a Brocade switch via CLI. As we already discussed, the 3 components (zones, aliases and zoneset) remain the core here also. To read a bit more on this, you may read the previous post.

Now let’s get straight into the commands for the various steps.

We have the new HBA connected to the switch. We can confirm successful connectivity by running the switchshow command. This will show all the ports and the connected device WWNs; you can check the port number if you know it, or find the port by its WWN (you may do a grep for the WWN).

If you do not know the switch and port on the fabric to which the HBA is attached, you may run nodefind. nodefind 10:xx:xx:xx:xx:xx:xx:01 will list the port details.

 

 

Now we can create the alias for the HBA (BForum_HBA1) and the storage port (VNX_SPA3). Below are the commands,

alicreate "BForum_HBA1","10:xx:xx:xx:xx:xx:xx:01"
alicreate "VNX_SPA3","50:06:xx:xx:xx:xx:xx:02"

For adding a WWN to an existing alias (adding a WWN – 10:xx:xx:xx:xx:xx:xx:02 to the alias BForum_HBA2 for example) you may run,

aliadd "BForum_HBA2","10:xx:xx:xx:xx:xx:xx:02"

Now we will be creating the zone for the HBA and storage port,

zonecreate "BForum_HBA1_VNX_SPA3","BForum_HBA1;VNX_SPA3"

We can add an alias to an existing zone by running the zoneadd command, in the same way as we used the aliadd command.

We can create the zone config with the command below, which also adds the zone to the cfg.

cfgcreate "BForum_SAN1_CFG","BForum_HBA1_VNX_SPA3"

 

 

We should use the cfgadd command to add a new zone to an existing cfg, as shown below:

cfgadd "BForum_SAN1_CFG","BForum_HBA1_VNX_SPB2"

Thus we have the zones created and added to the (existing/new) config. Now we should save the config to memory to ensure this will be loaded in the next reboot of the switch also. The cfgsave command will do it for us.

We can now enable the zone config to make it in effect.

cfgenable BForum_SAN1_CFG
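If you zone many hosts, it can help to generate the command list instead of typing it. Here is a minimal Python sketch that only builds the command strings used in this post, in the same order; it does not connect to a switch, and the function name and the alias-based zone naming are our own convention, not anything Brocade requires.

```python
def zoning_commands(host_alias, host_wwn, target_alias, target_wwn,
                    cfg, cfg_exists=False):
    """Return the Brocade CLI commands from this post as a list of strings."""
    zone = f"{host_alias}_{target_alias}"
    # cfgcreate for a brand-new config, cfgadd when the config already exists.
    cfg_command = "cfgadd" if cfg_exists else "cfgcreate"
    return [
        f'alicreate "{host_alias}","{host_wwn}"',
        f'alicreate "{target_alias}","{target_wwn}"',
        f'zonecreate "{zone}","{host_alias};{target_alias}"',
        f'{cfg_command} "{cfg}","{zone}"',
        "cfgsave",
        f"cfgenable {cfg}",
    ]


commands = zoning_commands(
    "BForum_HBA1", "10:xx:xx:xx:xx:xx:xx:01",
    "VNX_SPA3", "50:06:xx:xx:xx:xx:xx:02",
    cfg="BForum_SAN1_CFG",
)
for command in commands:
    print(command)
```

Review the printed commands before pasting them into the switch session, as enabling a config affects the whole fabric.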

Yes we are all set. The server and storage now should be able to communicate. Some other useful commands are,

cfgshow BForum_SAN1_CFG           #Shows the config BForum_SAN1_CFG in detail

cfgdisable BForum_SAN1_CFG           #Disables the config BForum_SAN1_CFG

cfgremove "BForum_SAN1_CFG","BForum_HBA1_VNX_SPB2"           #Removes the zone BForum_HBA1_VNX_SPB2 from config BForum_SAN1_CFG

cfgactvshow            #Shows the current active config

alishow BForum_HBA1    #Shows the alias BForum_HBA1

zoneshow BForum_HBA1_VNX_SPA3   #Shows the zone BForum_HBA1_VNX_SPA3 details

More in coming posts. You may click here for SAN switch related posts. Thanks for reading..

 
