Over the weekend I had to run a failover test for an application within SRM. As SRM can only replicate down to the datastore level and not the VM level, this meant doing a full test failover of all VMs, but first ensuring that every protected VM in the Protection Group was set to Isolated Network on the recovery site. This ensures that even though all VMs are started in the recovery site, they are not accessible on the network and therefore cannot cause any conflicts. The main concern, apart from a VM not connecting to the isolated network, was that the VM being tested and the application that sits on it run on Windows 2000. No, that’s not a typo: the server is running Windows 2000. The application dates from around that period as well, so if it drops and can’t be recovered it’s a massive headache.
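The pre-check described above (every protected VM mapped to the isolated network before the test starts) can be sketched as a small script. This is only an illustration: the VM names, the network labels and the hand-typed inventory are all hypothetical, and nothing here talks to SRM or vCenter.

```python
from typing import Dict, List

# Assumed label; use whatever name the isolated network has in your recovery site.
ISOLATED_NETWORK = "Isolated Network"


def vms_not_isolated(test_networks: Dict[str, str],
                     isolated: str = ISOLATED_NETWORK) -> List[str]:
    """Return the names of VMs whose test network is not the isolated one."""
    return sorted(vm for vm, net in test_networks.items() if net != isolated)


if __name__ == "__main__":
    # Hypothetical inventory: VM name -> network it will use during the test.
    mappings = {
        "APP-W2K-01": "Isolated Network",
        "FILE-01": "Isolated Network",
        "WEB-01": "Prod-VLAN-20",
    }
    offenders = vms_not_isolated(mappings)
    if offenders:
        print("Do not start the test; fix these first:", ", ".join(offenders))
    else:
        print("All protected VMs are mapped to the isolated network.")
```

Anything the function returns is a VM that would come up on a live network during the test, so the list should be empty before you go any further.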

Failover Test:

Step 1: Power down the production VM

SRM steps shutdown server

Step 2: Perform Test Recovery

Go to Recovery Plans -> Protection Groups and select Test

SRM Protection Group Test

When prompted to begin the test, verify the direction of the recovery: from the protected site to the recovery site. Enable the Replicate recent changes to recovery site option. In most cases you will already be running synchronous writes between the sites, so the data will be very nearly up to date. It is still recommended to replicate recent changes to make sure that all data is current.

SRM Test Recover Plan

 

Click Next and then click Start to confirm the test recovery

SRM Test Recovery Plan Complete

 

Step 3: Monitor the failover

In the tasks console within vCenter you will see the VMs being reconfigured and powering on.

SRM Monitor Failover

Take a look at the Recovery Steps within SRM and you can see the list of tasks as they occur. The Priority 1 VMs will power on first and each VM will power on in order.

SRM Recovery Steps

 

Once all the VMs have completed, the Recovery Steps will show as successful. If any VMs have problems powering on, they will also be listed here.
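If you keep your own record of the run, the ordering and the pass/fail summary can be modelled as below. This is a sketch with hypothetical VM names. Ordering by name within a priority group is an assumption made only to give a stable listing; it is not a statement about how SRM orders VMs inside a group.

```python
from typing import Dict, List


def power_on_order(priorities: Dict[str, int]) -> List[str]:
    """List VM names by priority group (1 first), then by name within a group."""
    ordered = sorted(priorities.items(), key=lambda item: (item[1], item[0]))
    return [vm for vm, _ in ordered]


def failed_vms(results: Dict[str, bool]) -> List[str]:
    """Return the VMs recorded as not having powered on successfully."""
    return sorted(vm for vm, ok in results.items() if not ok)


if __name__ == "__main__":
    # Hypothetical priorities and results, noted down from the Recovery Steps view.
    priorities = {"WEB-01": 3, "DC-01": 1, "APP-W2K-01": 1}
    results = {"WEB-01": True, "DC-01": True, "APP-W2K-01": False}
    print("Expected order:", power_on_order(priorities))
    print("Needs attention:", failed_vms(results))
```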

Step 4: Change the DNS record of the server to the new IP address defined in the computer configuration within SRM. To do this, go to a Domain Controller, open DNS and change the IP address on the record for the required server.

SRM DNS Settings
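After editing the record, it is worth confirming from a client machine that the name really resolves to the new address. A minimal sketch using Python's standard `socket` module follows. The hostname and IP in the usage comment are placeholders, and the `resolver` parameter is something I have added so the check can be exercised without a live DNS server.

```python
import socket
from typing import Callable


def resolves_to(hostname: str, expected_ip: str,
                resolver: Callable[[str], str] = socket.gethostbyname) -> bool:
    """Return True if hostname currently resolves to expected_ip."""
    try:
        return resolver(hostname) == expected_ip
    except OSError:
        # socket.gaierror is a subclass of OSError: the name did not resolve at all.
        return False


# Example call (placeholder values, not from the original environment):
#   resolves_to("app-w2k-01.example.local", "10.20.30.40")
```

A client may still hold the old record in its local cache, so a False result straight after the change does not necessarily mean the DNS edit failed.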

Step 5: Change the network from Isolated Network to an active domain network. Open Edit Settings for the test VM in the recovery site and select a different network for the Network Connection so that it is no longer on the Isolated Network.

SRM Configure Network Adapter

 

Step 6: Send a ping request to the new IP address to ensure it is active. If the VM still does not respond to pings, log onto the server and check that the auto-IP configuration has taken effect on the vNIC. Because the network was changed, some of the settings may have been lost. Re-enter the IP address if required.

SRM Configure IP address
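The ping check can be scripted if you want to repeat it during the test and again after cleanup. The sketch below shells out to the operating system's `ping` command. The function names and the injectable `run` parameter are my own additions for illustration, and the IP address in the usage comment is a placeholder.

```python
import platform
import subprocess
from typing import Callable, List, Optional


def ping_command(ip: str, count: int = 2, system: str = "") -> List[str]:
    """Build a ping command: Windows takes -n for the count, most other systems -c."""
    system = system or platform.system()
    flag = "-n" if system == "Windows" else "-c"
    return ["ping", flag, str(count), ip]


def _run_quietly(cmd: List[str]) -> int:
    """Run a command, discard its output and return the exit status."""
    completed = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                               stderr=subprocess.DEVNULL)
    return completed.returncode


def is_pingable(ip: str, run: Optional[Callable[[List[str]], int]] = None) -> bool:
    """Return True if the ping command exits with status 0."""
    runner = run if run is not None else _run_quietly
    return runner(ping_command(ip)) == 0


# Example call (placeholder address):
#   is_pingable("10.20.30.40")
```

Treat a True result as a first check only. The exit status says that ping received replies, not that the application on the server is healthy, which is what the hand-over to the applications team is for.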

 

Step 7: Next you can hand the VM over to your Applications team to perform testing.

 

Cleanup Test:

Step 8: Once the applications team has completed testing, and hopefully signed off every test case as successful, you can begin the cleanup. The first task is to shut down the VM in the recovery site.


Step 9: Edit the settings of the VM and put it back on the Isolated Network. Click OK to save the changes.

SRM Cleanup Test

Step 10: Go to Recovery Plans -> Protection Groups and select Cleanup

SRM Cleanup Isolate Network

Once prompted, verify the direction of the cleanup once again and click Next.

SRM Verify Direction

You can then click Start to begin the cleanup process.

SRM Complete cleanup process

 

Step 11: You can monitor the progress of the cleanup in the vCenter task window or from the Recovery Steps in SRM. You will see all the VMs in the recovery site that are managed by SRM being reconfigured and then shut down.

SRM Monitor cleanup task

SRM Monitor Cleanup Task 2

The cleanup should only take a few minutes.

Step 12: From the domain controller, change the DNS record for the test server back to its original production IP address.

SRM Modify DNS

 

Step 13: Power on the production VM at its source site once again and verify that it responds to pings on its original IP address.

SRM Power On VM

 

And that’s it, you will have successfully tested the application on just one server in your recovery site.

 

 

 

 
