Cisco has finally decided to bring the vodka to spike the punch at the Hyper-Converged Infrastructure party. And it tastes pretty damn good. There have been rumours for a while now that Cisco was working with Springpath and as a major third round investor it’s not surprising to hear about their entrance into the HCI arena. The Register’s Chris Mellor reported about Something bubbling up at Springpath back in early December. So what is the offspring of Cisco and Springpath called? Cisco HyperFlex!!
Hyper-converged systems so far have delivered on simplicity and scale but there’s been a massive gap in the lack of network integration in existing solutions. Yes you can use top of rack fast switches. In some cases customers use Cumulus on whitebox top-of-rack switches for software defined networking but networking is not a built in feature of the two leading hyper-converged solutions, Nutanix and Simplivity.
HyperFlex joins the comprehensive DC portfolio along with UCS, MDS and Nexus. It means that Cisco now has a play in traditional component based infrastructure, converge infrastructure and now hyper-converged infrastructure. Cisco is adding HyperFlex to provide it with another string to its software defined infrastructure. It will now have:
- UCS – compute (service profiles, APIs etc.)
- ACI – for software defined network
- HyperFlex- software defined storage, compute and network
On the initial release Cisco HyperFlex will support file storage and VMware. There are a number of other storage types, such as block and object, and hypervisors on the roadmap. There’s also going to be container support. Given that Springpath was hypervisor agnostic I’d expect a quick ramp up from Cisco and fast feature release cycle.
Like pretty much every other hyper-converged solution Cisco sees its expected use-cases to be:
- Server virtualisation
- Test and development
- Large remote branch offices
UCS Manager is already familiar to multiple thousands of customers worldwide and the server and network deployment settings in HyperFlex come from pre-configured Service Profiles. Service Profiles are well and truly familiar to anyone that has worked with Cisco UCS. Given that customer base and the familiarity with existing management tools there’s massive potential for Cisco HyperFlex here. There are some well developed existing incumbents in the hyper-converged market with Nutanix leading the way and HyperFlex will allow Cisco to gain a foothold in that rapidly growing market.
The Deep-dive: Read More
After a recent upgrade to UCS Director 5.4 I noticed that my storage connections were showing a status of failed on the dashboard. I went to Administration -> Physical Accounts -> Physical Accounts. All of my NetApp controllers were offline.
I went to edit settings and re-entered my password to make sure that it had been picked up correctly.
All the settings were fine so I saved them and tested the connection to the controllers again.
The connection failed with the following error:
500 Connection has been shutdown: javax.net.ssl.SSLHandsakeException:
Server chose SSLv3, but that protocol version is not enabled or not supported by the client.
After the recent upgrade to 5.4 I decided to bite the bullet and upgrade to 5.4.1. Go to the download software portal for Cisco. Download the 5.4.1.zip patch file. I had a number of issue with the download as the checksum didn’t match. I had to take a number of attempts to get the file in-tact. I believe the issue was the ISA that acts as our internet proxy. Death to the ISA!!!!
Once the file has been downloaded copy it to your FTP server. Now it’s time to apply the patch. log onto UCS Director via either the console or SSH using the shelladmin account. Select option 3 to stop all the services.
Cisco announced their release of UCS Director 5.4 back in November. As I’m currently running 5.3 and ran into an issue with a workflow Cisco support recommended upgrading to 5.4. I had a look over the Cisco UCS Director 5.4 Release Notes and there’s a new version of Java and the CentOS operating system are newer in the latest version. Due to this the upgrade procedure for 5.4 is different from previous version. In earlier versions it was possible to upload a patch via shelladmin and it would upgrade the software and database schema in place. 5.4 however requires new appliances to be deployed and a migration of database files etc. to be done between the 5.3 and 5.4 versions.
I really think that Cisco needs to look at using a HTML 5 console in the future as this upgrade path is overly complicated. Considering a lot of companies want you to be on the latest version when opening support calls, including Cisco, it would make sense for them to make it easier to perform the required upgrades.
The primary changes that have caused the modification to the upgrade path are:
- CentOS version 5.4 to version 6.6
- Java version 1.6 to version 1.8
Another thing to note is that version 5.54 requires 12GB RAM.
Cisco recommend standing up the new appliances beside your current UCS Director and Bare-Metal Appliances and performing a migration. In my case there’s a few firewall rule etc already been created for the existing environment so I wanted to keep the same IP addresses and machine names. I changed the IP addresses of the current appliances to be something else within the same subnet and gave the new appliances temporary names but the existing IP addresses. Once everything had been migrated and the changes confirmed I was able to rename the appliances to be the existing ones and removed the older appliances from the infrastructure. Before commencing the upgrade I also had a sold read over the UCS Director Upgrade 5.4 Guide and the UCS Director Bare-Metal Agent 5.4 Upgrade Guide
Last Saturday I awoke to find an email from Cisco Champions Program welcoming me into the Cisco Champions community for 2016. I feel humbled, honoured and excited to be selected to be part of this community. This is my first time being nominated as a Cisco Champion and for me personally it shows that I’m progressing in the direction I wished in my career.
When I began this blog a couple of years ago mainly as a drop zone for documenting technical issues I ran into I couldn’t have dreamed that I would have ended up making a contribution to the greater IT community.
For 2016 I want to continue my level of participation in the community via this blog and hopefully expand to participating in podcasts. On a local level I want to contribute more in the virtualization, data center and automation communities. And from a personal level I want to interact with the other Cisco Champions and expand my knowledge of Cisco solutions and services.
Well done to all the other Cisco Champions, particularly the other novices. It’s going to be a blast. I’m looking forward to attend CLMEL later this year as a Cisco Champion.
I was recently deploying new blades within the UCS chassis but found that I was unable to. In one UCS domain there were no issues but in the second UCS domain if failed with the error [FSM:Failed] Configuring service profile xxx (FSM:sam:dme:LsServerConfigure) and there were a number of other minor warnings as well. The service profile would appear in the list but it was highlighted as having an issue and it could not be assigned to any blades. After a bit of searching around I found an answer on the Cisco Communities forum.
I also created a tech support file and downloaded it to my desktop, extracted the compressed files and opened sam_techsupportinfo in notepad. I did a search for errors and found that there was an issue resolving default identified from UCS Central.
The solution was to unregister the server from UCS Central and then to deploy the Service Profile again. To unregister from UCS Central go to Admin Tab -> Communication Management -> UCS Central and select Unregister from UCS Central. Before unregistering make sure that the Policy resolution controls are how you want them to be. In my case they were all set to local so unregistering from UCS Central had to real impact. Many users will have UCS Central integration configured to work as it was designed and will use Global policies. Unregistering from UCS Central can have a knock on impact on how those policies are managed.
Once the unregister had completed I ran the service profile deployment from template and it worked this time. I believe the issue is down to a time sync issue between UCS and UCS Central. I’m currently working on a permanent work around
During a recent Cisco UCS upgrade I noticed an error for ethlanflowmon which was a critical alert. I hadn’t seen the problem before and it occurred right after I had upgraded UCS Manager firmware as per the steps listed in a previous post I wrote about UCS Firmware Upgrade. Before proceeding to upgrade the Fabric Interconnects I wanted to clear all alerts where possible. The alert for “FSM:FAILED: Ethernet traffic flow monitoring configuration error on” both switches was a cause for concern.
On further investigation I found that this is a known bug when upgrading to versions 2.2(2) and above. I was upgrading from version 2.2(1d) to 2.2(3d). Despite being a critical alert the issue does not impact any services. The new UCSM software is looking for new features on the FI that do not exist yet as it has not been upgraded. As soon as you upgrade the FIs this critical alert will go. More information about the bug can be found Cisco’s support page for the bug CSCul11595
During a recent UCS firmware upgrade I had quite a few blades show up with the error “CIMC did not detect storage”. Within UCSM I could see that the blade had a critical alert. It initially started after I upgraded UCS Manager firmware as documents in a previous post I wrote about UCS Firmware Upgrades. I did some searching around to find what may be causing the issue and the best answer I could find was to from the Cisco community forums to disassociate the blade, decommission and reseat within the chassis. I later spoke to a Cisco engineer and he advised of the same steps but that it was also possible to do without reseating the blade. This also looks like its a problem when upgrading from 2.2(1d) to other versions of UCSM but I haven’t been able to validate if it’s only that version or if it also affects others.
The full error I saw was for code F1004 and for Controller 1 on server 2/1 is inoperable. Reason: CIMC did not detect storage
Within UCSM I could see there was an issue with the Blade
Before proceeding with the upgrade of the FIs, IOMs and Blades themselves I wanted to clear any alerts within UCSM, particularly critical alerts. The steps I followed to bring the blade back online were to go to the blade and select Server Maintenance
I had a problem a while ago where UCS Director crashed during a Metrocluster failover test. It was caused by the delay in the transfer of writable disks on the storage which in turn caused the VM kernel to panic and set the disk to read only. After that problem, and due to other restore issues within the infrastructure as well as not having a backup prior to the failover test I was left with a dead UCS Director appliance. It was essentially completely buggered as the Postgres database had become corrupt. Cisco support were unable to resolve the problem and it took a lot of playing around with NetApp snapshots to pull back a somewhat clean copy of the appliance from before the failover test. Really messy and I wouldn’t recommend it.
Since then I’ve been capturing weekly backups of the UCS Director database to a FTP server so I have a copy of the DB to restore should there be any problems with the appliance again. This script is not supported by Cisco so please be aware of that before implementing it. To set up the backup create a DB_BACKUP file in /usr/local/etc with the following:
# server login password localfile remote-dir
echo "open $1"
echo "user $2 $3"
upload_script $1 $2 $3 put $4 $5 | /usr/bin/ftp -i -n -p
if [ ! -f $BKFILE ]
echo "Backup failed. "
export NEWFILE="cuic_backup_`date '+%m-%d-%Y-%H-%M-%S'`.tar.gz"
export FTPLOGIN=< ftp user name >
export FTPPASS=<ftp password>
doftpput $FTPSERVER $FTPLOGIN $FTPPASS $BKFILE $NEWFILE
nohup /opt/infra/startInfraAll.sh &
Next you’ll need to edit your cron jobs on the appliance. You can use the crontab -e command to edit the schedule settings and enter:
1 2 * * 0 /usr/local/etc/DB_BACKUP > /dev/null 2>&1
And there you go, you now have a weekly scheduled backup of your UCS Director database.
UCS Director Baremetal Agent Installation:
Before commencing the Installation of the Baremetal Agent appliance I would recommend that UCS Director has been fully installed and is available before proceeding. If you need to install UCS Director as an initial installation there’s some great documentation on the Cisco site but you can also check out the blog post by Jeremy Waldrop. It’s for an older version of UCS Director but the installation steps still count for the current version. If you are upgrading from a previous version of UCS Director then you can check out a previous post I did on upgrading UCS Director from 5.1 to 5.3.
Cisco UCS Director Baremetal Agent Installation and Configuration Guide, Release 5.2
Cisco UCS Director Baremetal Agent Installation and Configuration Guide, Release 5.3
Go to Cisco Download for UCS Director and select first UCS Director 5.3. Download the Cisco UCS Director Baremetal Agent Patch 184.108.40.206
Accept the license agreement
The download will begin
Next, go back to the main UCS Director download page and select UCS Director 5.2.
Accept the license agreement
The download will begin