A while ago I had a problem where UCS Director crashed during a MetroCluster failover test. It was caused by a delay in the transfer of writable disks on the storage, which in turn caused the VM's kernel to panic and set the disk to read-only. After that, thanks to other restore issues within the infrastructure and the fact that I hadn't taken a backup before the failover test, I was left with a dead UCS Director appliance. It was essentially completely buggered, as the Postgres database had become corrupt. Cisco support were unable to resolve the problem, and it took a lot of playing around with NetApp snapshots to pull back a somewhat clean copy of the appliance from before the failover test. Really messy, and I wouldn't recommend it.
Since then I've been capturing weekly backups of the UCS Director database to an FTP server, so I have a copy of the DB to restore should there be any problems with the appliance again. This script is not supported by Cisco, so please be aware of that before implementing it. To set up the backup, create a DB_BACKUP file in /usr/local/etc with the following:
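#!/bin/sh
# Arguments used by the FTP helpers: server login password localfile remote-dir

# Print the FTP commands for the session; the sleeps give the server time to respond.
upload_script(){
    echo "verbose"
    echo "open $1"
    sleep 2
    echo "user $2 $3"
    sleep 3
    shift 3
    echo "bin"
    echo $*
    sleep 10
    echo quit
}

# Pipe the generated commands into the ftp client.
doftpput(){
    upload_script $1 $2 $3 put $4 $5 | /usr/bin/ftp -i -n -p
}

# Stop the UCS Director services, then take the database backup.
/opt/infra/stopInfraAll.sh
/opt/infra/dbBackupRestore.sh backup

BKFILE=/tmp/database_backup.tar.gz
if [ ! -f $BKFILE ]
then
    echo "Backup failed. "
    # Bring the services back up before bailing out, so the appliance is not left stopped.
    nohup /opt/infra/startInfraAll.sh &
    exit 1
fi

# Replace the placeholders below with your own FTP server details.
export NEWFILE="cuic_backup_`date '+%m-%d-%Y-%H-%M-%S'`.tar.gz"
export FTPSERVER=xxx.xxx.xxx.xxx
export FTPLOGIN="<ftp user name>"
export FTPPASS="<ftp password>"

doftpput $FTPSERVER $FTPLOGIN $FTPPASS $BKFILE $NEWFILE

nohup /opt/infra/startInfraAll.sh &
exit 0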
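The script only checks that the backup file exists, which wouldn't catch an empty or truncated archive. If you want a slightly stronger check, something like the sketch below might do. Note that verify_backup is my own hypothetical helper, not part of the Cisco tooling, and it assumes the backup really is a gzip-compressed tar archive, as the .tar.gz name suggests.

```shell
#!/bin/sh
# Hypothetical helper (not part of UCS Director): succeeds only if the file
# exists, is non-empty, and can be listed as a gzip-compressed tar archive.
# Assumes the backup is a .tar.gz, as its file name suggests.
verify_backup() {
    file=$1
    # -s is true only for a file that exists and has a size greater than zero.
    [ -s "$file" ] || return 1
    # Listing the contents reads the archive without extracting anything.
    tar -tzf "$file" > /dev/null 2>&1
}
```

In the backup script you could then swap the existence test for if ! verify_backup "$BKFILE", keeping the rest of the failure handling as it is.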
Next, make the script executable (chmod +x /usr/local/etc/DB_BACKUP), since cron runs it directly, and then edit your cron jobs on the appliance. You can use the crontab -e command to edit the schedule settings and enter:
1 2 * * 0 /usr/local/etc/DB_BACKUP > /dev/null 2>&1
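If you'd rather not edit the crontab interactively, a rough sketch of a scripted alternative is below. The add_backup_entry function is a hypothetical helper of my own: it reads an existing crontab on stdin and prints it back with the backup line appended, unless that exact line is already there.

```shell
#!/bin/sh
# Hypothetical helper: reads an existing crontab on stdin and prints it back
# with the weekly backup entry appended, unless the entry is already present.
CRON_ENTRY='1 2 * * 0 /usr/local/etc/DB_BACKUP > /dev/null 2>&1'

add_backup_entry() {
    existing=$(cat)
    # Re-emit the current entries unchanged (nothing to print if there are none).
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # Append the backup line only if no existing line matches it exactly.
    if ! printf '%s\n' "$existing" | grep -qxF -e "$CRON_ENTRY"; then
        printf '%s\n' "$CRON_ENTRY"
    fi
}
```

You would then run it as crontab -l 2>/dev/null | add_backup_entry | crontab - and check the result with crontab -l. Running it twice should leave a single backup entry.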
And there you go: you now have a weekly scheduled backup of your UCS Director database, running at 02:01 every Sunday.