Vertica Backup And Restore
disclaimer: use at your own risk
If you pay close attention to the HP Vertica documentation you will see this requirement for doing a restore:
The cluster to which you are restoring the backup has the same number of hosts as the one used to create the backup. The node names and the IP addresses must also be identical.
UPDATE: Since Vertica 9.2 most of this information is obsolete. See Restoring Objects to an Alternate Cluster.
Using the same node names is not complicated, even if you run your Vertica cluster on Amazon AWS. Keeping the IP addresses identical is difficult, though.
Here is a way to solve the problem. It works for me. You might want to test it for yourself before relying on it. Don’t come back to me and tell me you lost your data. It’s your data and in the end you are responsible for it. Enough said.
Let’s dive into the details:
In the first step we need a configuration file for vbr:
[Misc]
snapshotName = backup
dest_verticaBinDir = /opt/vertica/bin
restorePointLimit = 6
objectRestoreMode = createOrReplace
tempDir = /tmp/vbr
retryCount = 2
retryDelay = 1
passwordFile = /opt/vertica/config/passwd

[Database]
dbName = testdb
dbUser = dbadmin

[Transmission]
encrypt = False
checksum = False
port_rsync = 50000
serviceAccessUser = None
total_bwlimit_backup = 0
concurrency_backup = 1
total_bwlimit_restore = 0
concurrency_restore = 1
hardLinkLocal = True

[Mapping]
v_testdb_node0001 = []:/mnt/data/testdb/backup
v_testdb_node0002 = []:/mnt/data/testdb/backup
v_testdb_node0003 = []:/mnt/data/testdb/backup
v_testdb_node0004 = []:/mnt/data/testdb/backup
v_testdb_node0005 = []:/mnt/data/testdb/backup
v_testdb_node0006 = []:/mnt/data/testdb/backup
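A note on passwordFile: it keeps the database password out of the main config. A minimal sketch of such a file, assuming the [Passwords] section format (set it to mode 600, it holds a plain-text password):

[Passwords]
dbPassword = your_password_here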
You might wonder why there are empty brackets in the [Mapping] section. They tell vbr to write to a local directory on each node, so rsync runs without its TCP transport. While researching this approach I ran several performance tests: even a normal backup (without hardLinkLocal) gains a big performance boost from going local instead of over TCP, but the biggest time savings came from the hardLinkLocal backup.
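For comparison: if you wanted vbr to ship the backup to a dedicated backup host over rsync's TCP port instead, the mapping entries would name that host. A sketch, with backuphost01 and the target path as placeholders:

[Mapping]
v_testdb_node0001 = backuphost01:/mnt/backup/testdb
v_testdb_node0002 = backuphost01:/mnt/backup/testdb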
Now you can run your backup:
/opt/vertica/bin/vbr.py --config-file /opt/vertica/config/backup.ini --task backup >> /opt/vertica/log/backup.log 2>&1
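To run this nightly you can put it (or the wrapper script shown further below) into the dbadmin crontab; the 02:30 schedule is only an example:

# run the vbr backup every night at 02:30
30 2 * * * /opt/vertica/bin/vbr.py --config-file /opt/vertica/config/backup.ini --task backup >> /opt/vertica/log/backup.log 2>&1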
The bonus with the hardlink backup: you get faster restore times if you have to restore, for example, a single table after a human error.
The downside: you still have the backup on the same disks as the original data. If the disks go down for whatever reason you are without a backup. Sure, on AWS you could just rely on direct EBS volume snapshots of the data volumes. But in the worst case that means your new cluster greets you with a failing filesystem check, because the snapshots caught the data in an inconsistent state.
So let's extend the vbr command with some extra steps and use some additional EBS volumes for your backup.
With a wrapper script like this you can get the files to your backup volumes:
#!/bin/bash

BACKUPCONFIG=/opt/vertica/config/backup.ini
BACKUPLOG=/opt/vertica/log/backup.log

# run the vbr backup and log its runtime
date >> ${BACKUPLOG}
{ time /opt/vertica/bin/vbr.py --config-file ${BACKUPCONFIG} --task backup; } &>> ${BACKUPLOG}

# IP of this node, so we can skip it in the ssh loop below
LOCALIP=`ip -f inet -o addr show dev eth0 | cut -d' ' -f7 | cut -d/ -f1`

# for every node in the [Mapping] section, look up its IP in
# admintools.conf and kick off the rsync to its backup volume via ssh
for HOST in `grep --color=never ^v_ ${BACKUPCONFIG} | awk '{print $1}'`; do
    IP=`grep --color=never ^${HOST} /opt/vertica/config/admintools.conf | grep -v ${LOCALIP} | sed -e "s/${HOST} = //" | awk -F, '{print $1}'`
    if [[ -n "${IP}" ]]; then
        echo ${IP} >> ${BACKUPLOG}
        ssh ${IP} "nohup nice -n 5 ionice -c 3 rsync -aH --exclude=lost+found --delete /mnt/data/backup /mnt/backup/ < /dev/null &" &
    fi
done

# the local node gets its rsync directly
echo "starting local rsync." >> ${BACKUPLOG}
{ time nice -n 5 ionice -c 3 rsync -aH --exclude=lost+found --delete /mnt/data/backup /mnt/backup/; } &>> ${BACKUPLOG}
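To check which restore points exist before and after playing with the backup volumes, vbr's listbackup task comes in handy:

/opt/vertica/bin/vbr.py --config-file /opt/vertica/config/backup.ini --task listbackup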
After this script has finished you can take EBS volume snapshots of these backup volumes. In case of a recovery you can use the backup volumes as your new data volumes.
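Taking the snapshot can be scripted as well. A minimal sketch with the AWS CLI, assuming you know the volume ID of the backup volume on each node (vol-0123456789abcdef0 is a placeholder):

aws ec2 create-snapshot \
    --volume-id vol-0123456789abcdef0 \
    --description "testdb backup $(date +%F)"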
If you do a recovery test on a new cluster with different IPs, follow the documentation and create the database first. After shutting it down again, adjust the backup files with this script:
# rewrite the old node IPs (.11-.16) in every snapshot file that contains one
for file in `grep -lr '172.16.50.1[1-6]' /mnt/data/backup/testdb/Snapshots/*`; do
    sed -i \
        -e 's/172.16.50.11/172.16.55.61/g' \
        -e 's/172.16.50.12/172.16.55.62/g' \
        -e 's/172.16.50.13/172.16.55.63/g' \
        -e 's/172.16.50.14/172.16.55.64/g' \
        -e 's/172.16.50.15/172.16.55.65/g' \
        -e 's/172.16.50.16/172.16.55.66/g' \
        ${file}
done

# the vbr config file needs the same treatment
if [ -e /opt/vertica/config/backup.ini ]; then
    sed -i \
        -e 's/172.16.50.11/172.16.55.61/g' \
        -e 's/172.16.50.12/172.16.55.62/g' \
        -e 's/172.16.50.13/172.16.55.63/g' \
        -e 's/172.16.50.14/172.16.55.64/g' \
        -e 's/172.16.50.15/172.16.55.65/g' \
        -e 's/172.16.50.16/172.16.55.66/g' \
        /opt/vertica/config/backup.ini
fi
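A quick sanity check that no old addresses survived the rewrite:

grep -r '172.16.50.1[1-6]' /mnt/data/backup/testdb/Snapshots/ \
    && echo "old IPs still present" \
    || echo "all IPs rewritten"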
Now you can proceed, follow the documentation again, and run something like:
vbr -t restore --archive 20170221_161405 --config-file /opt/vertica/config/backup.ini
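Once the restore has finished, start the database again. With admintools that looks something like this:

/opt/vertica/bin/admintools -t start_db -d testdb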