
Home made SAN Migration

The title sounds more elaborate than the job really is - an alternative title could be "Hackjob SAN volume backup and restore".

The Setup

In a legacy-style SAN and compute setup, I have an EMC Unity 450F box connected over Fibre Channel (FC) to a Cisco UCS (Unified Computing System). I am booting the UCS blades off the SAN; they run VMware with block/LUN datastores, except for one blade running Windows.
I also have a Dell R740 server in the mix, with a QLogic HBA as well as onboard storage.

The Situation

We were not in production yet, but all our blades had been installed, ESXi and vCenter were running along with a few VMs, and both the physical Windows blade and the R740 were in place.

Then we discovered that the Unity had no SFP+ ports in it, and I need to do replication. Swearing my vendor up and down, I called EMC, and they are sending out NIC modules and a guy to install them. BUT, because we have to remove a module to insert the new one, the whole SAN box has to be reset to factory settings (!!!). My response was WTF? But apparently this is how the new Unity platform works: if you have to remove a module and replace it with a different kind, the whole system has to be factory reset.
On the other hand, if you only add a module to an empty slot, or replace an existing one with the same type, you are okay.

The Remedy

So at this point I am trying to think of ways to avoid having to install and configure all the hosts and everything else again. My saving grace is that I have that R740 there, and it has enough onboard storage for me to keep a copy of all the SAN data on local disk.

Here is an overview of the plan forming in my head at this time:

  1. Boot the R740 with a live Linux of some sort
  2. Copy all the LUN data to local storage on the R740 (dd)
  3. Make sure I have the exact LUN configuration details
  4. Factory reset the SAN
  5. Re-create the LUNs
  6. Copy all the LUN data back in from the R740
  7. Spin up and re-mount the datastores
Sounds easy enough. Below is the process I went through to get it done - some of the steps are likely not always required, but I would rather be safe than sorry.
If you don't have something with enough storage outside your SAN, this process won't be helpful. On the other hand, it could be applied to other types of systems as well, using iSCSI etc. - the idea is very simple at a high level: dd the volumes.
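
If you want to rehearse the idea before touching a real LUN, the following is a minimal sketch of the dd round trip using ordinary temp files as stand-ins for the block devices. The function name and file names are made up for illustration; nothing here talks to the SAN.

```shell
#!/usr/bin/env bash
# Rehearse the dd round trip with regular files standing in for LUNs.
# All paths are throwaway temp files, not real devices.

roundtrip_demo() {
    local work lun backup newlun
    work=$(mktemp -d) || return 1
    lun="$work/fake-lun.img"      # stand-in for the original LUN (e.g. /dev/sdd)
    backup="$work/fake-lun.lun"   # the backup copy kept on local storage
    newlun="$work/new-lun.img"    # stand-in for the re-created LUN

    # Fill the fake LUN with 4 MiB of random data.
    dd if=/dev/urandom of="$lun" bs=1M count=4 status=none || return 1

    # "Backup": device -> file, then "restore": file -> new device.
    dd if="$lun" of="$backup" bs=1M status=none || return 1
    dd if="$backup" of="$newlun" bs=1M status=none || return 1

    # cmp exits 0 only when both files are byte-for-byte identical.
    if cmp -s "$lun" "$newlun"; then
        echo "round trip OK"
    else
        echo "round trip FAILED"
    fi
    rm -rf "$work"
}

roundtrip_demo
```

The point of the sketch is only that dd copies raw bytes in both directions, so whatever sits on the volume (partition table, boot loader, VMFS) comes along untouched.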

The Process

Boot the R740 with Live Linux - Ubuntu 19.04 Desktop

  • Thinking I just wanted the newest drivers, I downloaded the latest (April 2019) vanilla Ubuntu 19.04 desktop.
  • Using the iDRAC interface, mount the Ubuntu ISO and boot it, then run the live option (evaluate without installing, or similar) so it does not write anything over the system OS.
  • Configure networking - if you are like me and miss ifconfig, then google the use of ip instead. If you have a simple interface, you can just use the Ubuntu settings to configure a NIC.
  • I needed to configure link aggregation for my NICs. I found my NICs with
    • ip a
    • then edited the file
      • /etc/netplan/01-network-manager-all.yaml
    • Adjust to fit your setup, of course:
      network:
        version: 2
        ethernets:
          eno1:
            dhcp4: false
          eno2:
            dhcp4: false
        bonds:
          bond0:
            interfaces: [eno1, eno2]
            addresses: [172.16.0.16/24]
            gateway4: 172.16.0.1
            nameservers:
              addresses: [172.16.5.9]
            parameters:
              mode: 802.3ad
              lacp-rate: fast
              mii-monitor-interval: 100
      
    • Test with: netplan try
    • Enable with: netplan apply
  • For my own purposes I installed an SSH server so I could work on it remotely. I also like to have full vim, as that is my editor of choice:

    apt update ; apt install openssh-server vim
  • I may have installed another utility or two that I don't recall now in the aftermath
  • I prepped my storage for the backup copies and mounted it. In my case it showed up as sda6 - make sure you don't overwrite something you want to keep, of course. You could also just mount something existing, or even use SMB or NFS to reach a remote share or similar.
    • mkfs.ext4 /dev/sda6
    • mount /dev/sda6 /mnt
    • mkdir /mnt/backup
    • cd /mnt/backup
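
Before copying anything, it is worth checking that the backup location can actually hold all the LUN images. Here is a small sketch of such a check; `has_room` is a hypothetical helper name, and it assumes GNU df (as shipped with Ubuntu).

```shell
#!/usr/bin/env bash
# has_room DIR NEEDED_BYTES
# Succeeds when the filesystem holding DIR has at least NEEDED_BYTES available.
has_room() {
    local dir=$1 needed=$2 avail
    # GNU df: --output=avail prints a header line plus the available space,
    # and -B1 reports that space in bytes.
    avail=$(df --output=avail -B1 "$dir" 2>/dev/null | tail -n 1 | tr -d ' ')
    [ -n "$avail" ] || return 2      # df failed (e.g. DIR does not exist)
    [ "$avail" -ge "$needed" ]
}

# Example (numbers are illustrative): room for five 20 GiB LUN images?
#   has_room /mnt/backup $((5 * 20 * 1024**3)) && echo "enough space"
```

Keep in mind that dd writes the full size of each LUN, not just the used part, so a thin-provisioned LUN needs its full provisioned size on the backup disk.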

Get Storage details

I already had my FC switches configured with pWWN pairs for all hosts in a zone set, so I did not have to do anything here. Depending on how you allow access, you may need to adjust your zones.

  • Check that your kernel supports the HBA - in my case I had a QLogic QLE2692 card, and its ports showed up automatically for me. Exactly how you look for them will vary depending on what you have.
  • You may need a kernel module, a driver, etc. In my case I was not sure whether I needed one, so I went to the QLogic site http://driverdownloads.qlogic.com and looked up my card; it confirmed that Ubuntu 16 and 18 had the drivers.
  • I did however download the "QConvergeConsole CLI for Ubuntu" tool. I did not really need it, but it allowed me to verify that the HBA was working properly and could see the LUNs.
  • A few more tools are now needed for Fibre Channel and SCSI work on the Ubuntu system:
    • apt install scsitools lsscsi
  • On the SAN I assigned a LUN to the host, then ran
    • rescan-scsi-bus.sh
    • partprobe
    • lsscsi
    • fdisk -l /dev/sdd
  • So in my case the volume showed up as /dev/sdd, and I verified with the fdisk command that the partition table matched what I expected for that volume. The extra devices listed are just additional paths (multipath) to the same volume.
  • I found it cumbersome to identify and separate multiple volumes that had the same size, so I ended up configuring access for only one at a time on the SAN, then running rescan/partprobe again on the R740 - this way I know for sure that I am working with the correct volume.
  • Now that I had access, I needed to get some details for the volumes so I could later re-create the exact same ones. I found this post about using the Unity API. From that and the API info, I adjusted the script to this:
    • #!/usr/bin/perl -w
      
      #################################################################################
      # This sample code is purely an illustration of REST API techniques for the
      # Unity array. It is not complete for production use.
      #################################################################################
      
      use strict;
      use warnings;
      use v5.10;
      
      # update this with your perl libs dir, or put the following libraries on your lib path
      use lib "./perllibs";
      use REST::Client;
      use JSON;
      use HTTP::Cookies;
      use Data::Dumper qw(Dumper);
      
      
      # connection information - customize for your array
      my $host = "172.16.4.180";
      my $user = "badassuser";
      my $pass = "password";
      
      
      ###############################################################################
      #
      # REST:Client initialization
      #
      ###############################################################################
      
      sub initializeREST {
          my($client, $username, $password, $host) = @_;
      
          $client->setHost("https://$host:443");
          $client->addHeader('Accept', 'application/json'); # this is default, so optional, but here for clarity
          $client->addHeader('Content-Type', 'application/json'); # this is default, so optional, but here for clarity
          $client->addHeader('Accept_Language', 'en_US'); # this is default, so optional, but here for clarity
          $client->addHeader('X-EMC-REST-CLIENT', 'true');  # needed to avoid logging in repeatedly
          $client->setFollow(1);  # follow the redirects you get, needed when you first log in
      }
      
      ###############################################################################
      #
      #  Authentication
      #
      ###############################################################################
      
      sub authenticate {
          my($client, $username, $password, $host) = @_;
          my $realm = "Security Realm";
      
          print("\nLogging in to $host as $username\n");
      
          $client->getUseragent->cookie_jar(HTTP::Cookies->new());
          $client->getUseragent->credentials("$host:443", $realm, $username, $password);
      
          # do a simple GET to force the login
          $client->GET("/api/types/system/instances");
      
          # uncomment this to see the response
          # my $response = $client->responseContent();
          # print("Login GET response is:\n$response\n");
      
          # if the login failed, report the failure to the caller
          if($client->responseCode > 299) {
             return "false";
          }
      
          # to protect against CSRF vulnerabilities, future POST/DELETE requests must have
          # EMC-CSRF-TOKEN set with value returned by successful GET response
          my $CSRF_TOKEN = $client->responseHeader("EMC-CSRF-TOKEN");
          $client->addHeader("EMC-CSRF-TOKEN", $CSRF_TOKEN);
      
          return "true";
      }
      
      ###############################################################################
      #
      #  Main
      #
      ###############################################################################
      
      
      # initialize the REST client library.
      # Also disables SSL verification, but REST::Client supports a cert store if you prefer.
      $ENV{PERL_LWP_SSL_VERIFY_HOSTNAME}=0; # older implementations of LWP check this to disable server verification
      my $client = REST::Client->new();
      $client->getUseragent()->ssl_opts( SSL_verify_mode => 0 ); # newer implementations of LWP use this to disable server verification
      initializeREST($client, $user, $pass, $host);
      
      # initialize the JSON library
      my $json = JSON->new->allow_nonref();
      
      # log in to the array
      my $result = authenticate($client, $user, $pass, $host);
      print("login success=$result\n");
      
      # dump each pool, then dump the storage resources with their type and size (LunName is filled in only where type is 8)
      $client->GET("/api/types/pool/instances?fields=id,name,sizeTotal");
      my $pools = $json->decode($client->responseContent());
      my @entries = @{$pools->{'entries'}};
      foreach my $e (@entries) {
          my $instance = $e->{'content'};
          say Dumper \$instance;
      
          $client->GET("/api/types/storageResource/instances?fields=type,sizeTotal,LunName::type eq 8 ? name : \"\" &compact=true");
          my $lun = $json->decode($client->responseContent());
          say Dumper $lun
      }
      
      
  • Before I could run it, I needed a couple of things (hopefully I didn't forget anything in this list)
    • cpan install REST::Client
    • cpan install JSON
  • Then I ran the script
    • perl get_luns.pl
  • That gave me a JSON-formatted list of volumes. In my case I only had a handful, so I did not care to spend the time automating everything with the API, though I am sure you could - all I wanted was the exact name and size.
  • You probably also need to gather which LUN ID each volume is presented as to the hosts you have mapped. I did that manually, but I am sure that could be automated as well.
Now we have all the information needed.
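
Since the name, size and host LUN ID are all you need later, it can help to write them down in one place next to the backups. This is only a sketch of one way to do that; the manifest file name and the two helper functions are hypothetical, and the values still have to be collected by hand or from the JSON output.

```shell
#!/usr/bin/env bash
# Tab-separated manifest: LUN name, size in bytes, host LUN ID.
MANIFEST=${MANIFEST:-./lun-manifest.tsv}   # hypothetical file name

# manifest_add NAME SIZE_BYTES HOST_LUN_ID
manifest_add() {
    printf '%s\t%s\t%s\n' "$1" "$2" "$3" >> "$MANIFEST"
}

# manifest_get NAME FIELD   (FIELD: 2 = size in bytes, 3 = host LUN ID)
manifest_get() {
    awk -F '\t' -v name="$1" -v col="$2" \
        '$1 == name { print $col; exit }' "$MANIFEST"
}

# Example:
#   manifest_add bootVHPR5-esx 21474836480 0
#   manifest_get bootVHPR5-esx 2     # prints the recorded size in bytes
```

Recording the size in bytes rather than "20 GB" avoids any rounding ambiguity when the LUNs are re-created.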

Create backup copies

  • On the R740:  cd /mnt/backup/
    (Or to wherever you are going to keep your copies)
  • For each of the LUNs to be copied
    • SAN: Map LUN to the R740 Host (As the only LUN)
    • R740: rescan-scsi-bus.sh -r ; partprobe ; lsscsi
    • Double check that you have the right LUN  (Adjust device of course)
      R740: fdisk -l /dev/sdd
    • Copy the device to a file - I like to keep the filename the same as my LUN name
      R740:  dd bs=1G if=/dev/sdd of=bootVHPR5-esx.lun
    • Depending on size and speed, it may take some time
    • SAN: Remove the mapped LUN from the R740
  • When done, you should have as many files as you have LUNs
You probably want to test the restore process to an alternate LUN on your current SAN, re-map it to that host, and boot it - make sure you understand and can do all of this before you go ahead and wipe the SAN.
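
The per-LUN copy step can be wrapped in a small helper that also records a checksum, so the image can be verified before the SAN is wiped. This is a sketch rather than what I actually ran: `backup_lun` and `verify_backup` are made-up names, and the 1M block size, `conv=fsync` and `status=progress` (GNU dd options) are my own choices here.

```shell
#!/usr/bin/env bash
# backup_lun DEVICE NAME [DIR]
# Copies DEVICE to DIR/NAME.lun and stores a SHA-256 checksum beside it.
backup_lun() {
    local dev=$1 name=$2 dir=${3:-.}
    local out="$dir/$name.lun"
    # Refuse to clobber an earlier copy of the same LUN.
    if [ -e "$out" ]; then
        echo "refusing to overwrite $out" >&2
        return 1
    fi
    # conv=fsync flushes the image to disk before dd exits.
    dd if="$dev" of="$out" bs=1M conv=fsync status=progress || return 1
    # Run sha256sum inside DIR so the checksum file holds a relative name.
    ( cd "$dir" && sha256sum "$name.lun" > "$name.lun.sha256" )
}

# verify_backup NAME [DIR]
# Re-reads DIR/NAME.lun and compares it with the stored checksum.
verify_backup() {
    local name=$1 dir=${2:-.}
    ( cd "$dir" && sha256sum --quiet -c "$name.lun.sha256" )
}

# Example:
#   backup_lun /dev/sdd bootVHPR5-esx /mnt/backup
#   verify_backup bootVHPR5-esx /mnt/backup
```

The checksum only proves the file has not changed since it was written; the restore rehearsal described above is still the real test.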

Restore and Recover

  • After you have wiped your SAN - or are starting on a new one, or whatever you plan on doing - the first order of business is creating all the hosts and LUNs. When it came to creating the LUNs, I decided to script it, based on the same script as before.
    One thing to be aware of here: I only had one pool on the target SAN, and I was lazy, so I used the "biggest pool found" logic from the original script to pick where the LUNs are added. You will need to change that if you have multiple pools.
    I also just copied, pasted, and changed the block for the few LUNs that I had. If you have a bunch, you will probably want to automate this from the JSON data captured earlier - my sample script has been limited to two.
    • #!/usr/bin/perl -w
      
      #################################################################################
      # This sample code is purely an illustration of REST API techniques for the
      # Unity array. It is not complete for production use.
      #################################################################################
      
      use strict;
      use warnings;
      use v5.10;
      
      # update this with your perl libs dir, or put the following libraries on your lib path
      use lib "./perllibs";
      use REST::Client;
      use JSON;
      use HTTP::Cookies;
      use Data::Dumper qw(Dumper);
       
      # connection information - customize for your array
      my $host = "172.16.3.22";
      my $user = "badassuser";
      my $pass = "password";
      
      
      ###############################################################################
      #
      # REST:Client initialization
      #
      ###############################################################################
      
      sub initializeREST {
          my($client, $username, $password, $host) = @_;
      
          $client->setHost("https://$host:443");
          $client->addHeader('Accept', 'application/json'); # this is default, so optional, but here for clarity
          $client->addHeader('Content-Type', 'application/json'); # this is default, so optional, but here for clarity
          $client->addHeader('Accept_Language', 'en_US'); # this is default, so optional, but here for clarity
          $client->addHeader('X-EMC-REST-CLIENT', 'true');  # needed to avoid logging in repeatedly
          $client->setFollow(1);  # follow the redirects you get, needed when you first log in
      }
      
      ###############################################################################
      #
      #  Authentication
      #
      ###############################################################################
      
      sub authenticate {
          my($client, $username, $password, $host) = @_;
          my $realm = "Security Realm";
      
          print("\nLogging in to $host as $username\n");
      
          $client->getUseragent->cookie_jar(HTTP::Cookies->new());
          $client->getUseragent->credentials("$host:443", $realm, $username, $password);
      
          # do a simple GET to force the login
          $client->GET("/api/types/system/instances");
      
          # uncomment this to see the response
          # my $response = $client->responseContent();
          # print("Login GET response is:\n$response\n");
      
          # if the login failed, report the failure to the caller
          if($client->responseCode > 299) {
             return "false";
          }
      
          # to protect against CSRF vulnerabilities, future POST/DELETE requests must have
          # EMC-CSRF-TOKEN set with value returned by successful GET response
          my $CSRF_TOKEN = $client->responseHeader("EMC-CSRF-TOKEN");
          $client->addHeader("EMC-CSRF-TOKEN", $CSRF_TOKEN);
      
          return "true";
      }
      
      ###############################################################################
      #
      #  Main
      #
      ###############################################################################
      
      
      # initialize the REST client library. 
      # Also disables SSL verification, but REST::Client supports a cert store if you prefer.
      $ENV{PERL_LWP_SSL_VERIFY_HOSTNAME}=0; # older implementations of LWP check this to disable server verification
      my $client = REST::Client->new();
      $client->getUseragent()->ssl_opts( SSL_verify_mode => 0 ); # newer implementations of LWP use this to disable server verification
      initializeREST($client, $user, $pass, $host);
      
      # initialize the JSON library
      my $json = JSON->new->allow_nonref();
      
      # log in to the array
      my $result = authenticate($client, $user, $pass, $host);
      print("login success=$result\n");
      
      # look at each pool, find the one with the most space
      $client->GET("/api/types/pool/instances?fields=id,name,sizeTotal");
      my $pools = $json->decode($client->responseContent());
      my @entries = @{$pools->{'entries'}};
      my $biggest;
      foreach my $e (@entries) {
          my $instance = $e->{'content'};
          # say Dumper \$instance;
          if (!defined($biggest)) { $biggest = $instance; }
          elsif ($biggest->{'sizeTotal'} < $instance->{'sizeTotal'}) { $biggest = $instance; } # numeric compare; lt would compare the sizes as strings
      }
      # DEBUG say Dumper \$biggest;
      my $poolId = $biggest->{'id'};
      
      my $body = '';
      my $size = '100000000';
      my $lunname = '';
      
      my $lunprefix = '';
      
      
      #------------------------
      $size = '21474836480';
      $lunname = 'bootVHPR5-esx';
      
      print "\nCreating $lunprefix$lunname of size $size\n";
      $body = '{"name":"'.$lunprefix.$lunname.'","lunParameters":{"pool":{"id":"'.$poolId.'"},"size":'.$size.',"isThinEnabled":1,"isDataReductionEnabled":1}}';
      $client->POST("/api/types/storageResource/action/createLun", $body);
      say $client->responseCode();
      say Dumper $client->responseContent();
      
      
      #------------------------
      $size = '21474836480';
      $lunname = 'bootVHPR6-esx';
      
      print "\nCreating $lunprefix$lunname of size $size\n";
      $body = '{"name":"'.$lunprefix.$lunname.'","lunParameters":{"pool":{"id":"'.$poolId.'"},"size":'.$size.',"isThinEnabled":1,"isDataReductionEnabled":1}}';
      $client->POST("/api/types/storageResource/action/createLun", $body);
      say $client->responseCode();
      say Dumper $client->responseContent();
      
      
  • Verify on your SAN that they look the way you want them (Thin, Reduction, Add snapshots etc)
  • On the R740:  cd /mnt/backup/
    (Or to wherever you are keeping your backup copies)
  • Now we start a restore process
  • For each of the LUNs to be restored
    • SAN: Map LUN to the R740 Host (As the only LUN)
    • R740: rescan-scsi-bus.sh -r ; partprobe ; lsscsi
    • Double check that you have the right LUN  (Adjust device of course)
      R740: fdisk -l /dev/sdd
      The partition table should be empty - also check that the size matches the original LUN
    • Copy the backup file onto the device:
      R740:  dd bs=1G if=bootVHPR5-esx.lun of=/dev/sdd
    • Depending on size and speed, it may take some time
    • SAN: Remove the mapped LUN from the R740
  • The next thing is to map the LUNs to hosts. Make sure you change the LUN ID order to match the original - in my case, when doing SAN boot I always keep LUN ID 0 as my boot volume, but additional mappings, say VMFS datastore volumes, come and go and may not always be in order.
  • Boot your hosts; they should now work just fine
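
The restore step is the one where a mistake hurts most, so here is a hedged sketch of a wrapper that checks sizes before writing and reads the data back afterwards. `size_of` and `restore_lun` are hypothetical helper names; `blockdev --getsize64`, `stat -c %s`, `cmp -n` and the dd `conv` flags are the GNU/util-linux tools as found on Ubuntu.

```shell
#!/usr/bin/env bash
# size_of PATH: size in bytes of a block device or a regular file.
size_of() {
    if [ -b "$1" ]; then
        blockdev --getsize64 "$1"
    else
        stat -c %s "$1"
    fi
}

# restore_lun IMAGE DEVICE
# Writes IMAGE onto DEVICE after checking that DEVICE is large enough,
# then compares the written range against the image.
restore_lun() {
    local img=$1 dev=$2 img_size dev_size
    img_size=$(size_of "$img") || return 1
    dev_size=$(size_of "$dev") || return 1
    if [ "$dev_size" -lt "$img_size" ]; then
        echo "target $dev ($dev_size bytes) is smaller than $img ($img_size bytes)" >&2
        return 1
    fi
    # conv=notrunc only matters when the target is a regular file (rehearsal);
    # conv=fsync flushes the data to the target before dd exits.
    dd if="$img" of="$dev" bs=1M conv=notrunc,fsync status=progress || return 1
    # Read back the first img_size bytes and compare them with the image.
    cmp -n "$img_size" "$img" "$dev"
}

# Example:
#   restore_lun /mnt/backup/bootVHPR5-esx.lun /dev/sdd
```

Note that a read-back right after the write may be served partly from the page cache, so treat the comparison as a sanity check rather than proof that the array stored every block.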

Recovering VMware datastores

If you are running VMFS datastores on block devices like me, chances are you will need to recover the datastores. vSphere did not mount them automatically, because the signature stored in each VMFS volume no longer matches the re-created LUN it lives on. It is an easy fix (which can be very tricky to find by googling, as there is so much out there about datastores that is not relevant to this scenario).

First make sure all your hosts have access to the LUNs, and that you didn't forget something.

Simply do this on each vSphere host:
  • Enable SSH - connect to shell
  • List all the VMFS items found - your "missing" ones should all be listed here
    • esxcfg-volume -l
  • For each Datastore, run this command to permanently mount the datastore
    • esxcfg-volume -M DatastoreName
  • You should now be all set, and your VMs will be back in order
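
With more than a few datastores, typing each name gets tedious. The sketch below turns the listing into the matching mount commands and only prints them, so you can review before running anything. It assumes each volume in the `esxcfg-volume -l` output starts with a line of the form `VMFS UUID/label: <uuid>/<label>` (possibly `VMFS3` or similar instead of `VMFS`) and that sed is available in the host shell; check the output on your own hosts first. The function names are made up.

```shell
#!/bin/sh
# list_vmfs_labels: reads "esxcfg-volume -l" output on stdin and prints
# one datastore label per line.
# Assumed line format: "VMFS UUID/label: <uuid>/<label>"
list_vmfs_labels() {
    sed -n 's|^VMFS[0-9]* UUID/label: [^/]*/||p'
}

# print_mount_commands: reads the same listing on stdin and prints one
# persistent-mount command per datastore, without executing anything.
print_mount_commands() {
    list_vmfs_labels | while IFS= read -r label; do
        printf 'esxcfg-volume -M "%s"\n' "$label"
    done
}

# Example, on the host:
#   esxcfg-volume -l | print_mount_commands
```

Printing instead of executing is deliberate: it leaves a moment to spot a datastore that should not be mounted on that host.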
