Amanda backup
Many moons ago I had to convince my former employer that we should have backups. There are only two kinds of companies: those without backups and those losing data. Wait, those are the same. Apart from the lack of backups, we also needed more storage. In the end I got my backup system and learned a bit about fibre optics, SAN storage and tape libraries. Of course an IBM TS3200 with two LTO-5 drives was not cheap, and we had 50 tapes for the unit. Naturally there was no budget left for commercial backup software. But ours was the department that ran almost completely on open source. After some research we picked AMANDA, the Advanced Maryland Automatic Network Disk Archiver. I wrote a small script that read the LTO barcode labels and added the tapes to the AMANDA tape inventory, and we were good to go.
Around the same time I bought an HP StorageWorks DAT160 for my home server.
Back to the future. Recently we moved to a new flat, and my corner in the living room turned into a dedicated office room. Having 30 DAT160 tapes, I decided to reconfigure AMANDA. You never know what challenge your next job will bring, and knowing how to configure a reliable backup solution doesn’t hurt. 160 GB doesn’t sound like much today, but if you focus on the really important stuff - configuration files, git repositories, text files - it is OK. For my webserver I run rsnapshot. The chance that both my webserver and my home server die together is very low.
At home I don’t have that many changes. Being lazy, I decided to run one backup per week.
This is my crontab:
    0 2 * * 2 /usr/sbin/amcheck -m -M my_email DailySet1
    0 12 * * 3 /usr/sbin/amdump DailySet1
Every Tuesday at 02:00 we check for the presence of a usable tape, and every Wednesday at 12:00 we run the backup. If the backup needs more than one tape, there is plenty of time to finish it before the next run.
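To double-check the schedule, here is a small Python sketch that works out how much time lies between the check and the dump. Only the weekday and time fields come from the crontab; the helper name and the reference week are arbitrary choices for illustration (the week was picked so that the dump lands on 5 February 2020, the date of the report shown further down).

```python
from datetime import datetime, timedelta

def next_weekday_time(start: datetime, cron_dow: int, hour: int, minute: int = 0) -> datetime:
    """First moment at or after `start` that falls on the given cron
    day-of-week (0 or 7 = Sunday, 1 = Monday, ...) at hour:minute."""
    # cron counts from Sunday = 0, Python's weekday() from Monday = 0.
    py_dow = (cron_dow - 1) % 7
    candidate = start.replace(hour=hour, minute=minute, second=0, microsecond=0)
    candidate += timedelta(days=(py_dow - start.weekday()) % 7)
    if candidate < start:
        candidate += timedelta(days=7)
    return candidate

# Arbitrary reference point: Monday, 3 February 2020, 00:00.
week_start = datetime(2020, 2, 3)
check = next_weekday_time(week_start, cron_dow=2, hour=2)   # 0 2 * * 2
dump = next_weekday_time(check, cron_dow=3, hour=12)        # 0 12 * * 3

print(check)                                  # 2020-02-04 02:00:00
print(dump)                                   # 2020-02-05 12:00:00
print((dump - check).total_seconds() / 3600)  # 34.0
```

So a failed check leaves 34 hours to put the right tape into the drive.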
AMANDA turns 30 next year, which means a lot of thought has gone into the software. As it’s open source, many eyes have scanned the code over the years. I wouldn’t expect too many bugs to be left, and not much on the wishlist either. Deduplication is probably the missing cherry on top of the ice cream. Of course not everybody wants to handle tapes anymore, even though LTO-8 can store 12 TB of raw or 30 TB of compressed data. But Amanda supports virtual tapes on hard drives and AWS S3 as well.
What sets Amanda apart from the commercial competition is the way she handles your data. Commercial products typically write the data in a proprietary format, so without the software you cannot read your data. Amanda is just a wrapper around open source tools. If you write unencrypted, uncompressed data to your tape, all you need to access the data are standard tools such as mt, dd and tar.
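To illustrate what "just a wrapper" means, the following Python sketch builds a stand-in for one Amanda tape file and reads it back without any Amanda code. It assumes the usual layout of a single 32 KiB text header block followed by the unmodified tar archive; the header text, file name and function names are made up for the demonstration. On a real tape you would position the drive with mt and skip the header block with dd before piping the rest into tar.

```python
import io
import tarfile
from typing import Dict

HEADER_SIZE = 32 * 1024  # assumption: one 32 KiB header block per tape file

def make_fake_dump(files: Dict[str, bytes]) -> bytes:
    """Build an illustrative dump image: padded text header + plain tar."""
    archive = io.BytesIO()
    with tarfile.open(fileobj=archive, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    # Placeholder header text, not the real Amanda header format.
    header = b"AMANDA: FILE (illustrative header only)\n"
    return header.ljust(HEADER_SIZE, b"\0") + archive.getvalue()

def read_dump(image: bytes) -> Dict[str, bytes]:
    """Skip the header block and read the rest with a plain tar reader."""
    payload = io.BytesIO(image[HEADER_SIZE:])
    with tarfile.open(fileobj=payload, mode="r:") as tar:
        return {m.name: tar.extractfile(m).read() for m in tar if m.isfile()}

image = make_fake_dump({"etc/hostname": b"server1\n"})
restored = read_dump(image)
print(restored)
```

Nothing in read_dump knows about Amanda beyond the size of the header to skip, which is the whole point.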
For the most recent version of Amanda - 3.5.1 - you should visit AMANDA.org. The release date of December 1, 2017 sounds dated, but keep in mind that the software has been around for a long time and doesn’t require that much attention anymore. The community is quite active though.
If you want to use encryption - highly recommended if your data leaves your datacenter - there is a pull request for the automatic creation of the encryption keys during package installation. My guess is that this feature is not widely used. Otherwise I could not explain this bug.
The software consists of a server part and a client part. The client must be configured on every system you want to back up. In the world of localhost or a direct VPN link you might consider using xinetd. On the server:
    # default: on
    #
    # description: Amanda services for Amanda server and client.
    #
    service amanda
    {
        disable     = no
        flags       = IPv4
        socket_type = stream
        protocol    = tcp
        wait        = no
        user        = amandabackup
        group       = disk
        groups      = yes
        server      = /usr/libexec/amanda/amandad
        server_args = -auth=bsdtcp amdump amindexd amidxtaped senddiscover
    }
or on the client:
    service amanda
    {
        only_from   = your.amanda.server.ip
        socket_type = stream
        protocol    = tcp
        wait        = no
        user        = amandabackup
        group       = disk
        groups      = yes
        server      = /usr/sbin/amandad
        server_args = -auth=bsdtcp amdump
        disable     = no
    }
Additionally, in /var/lib/amanda/ or wherever your amanda user has its home directory, you need a .amandahosts file:
    your.amanda.server.ip.or.dns amandabackup amdump
    your.amanda.server.ip.or.dns amindexd amidxtaped
In case you wonder why I add links to “well known” software: in recent years I have noticed a change in our industry. Twenty years ago your typical car mechanic had a deep understanding of mechanics and of the basics of a combustion engine, which needs gas, air and a spark at the right time. Today many can only swap parts in the hope of hitting the broken one. Our IT industry is changing in the same way. Some coworkers are able to operate on AWS and run EKS but have never heard of dd or its noerror flag, and have trouble recovering data from a half-broken spinning hard disk. But I don’t blame anybody for that. The amount of information is just overwhelming. I am a generalist, but there were always colleagues focusing on a specific topic. Think of those caring for the backups of a company with thousands of coworkers, or those guys heavily into Oracle databases. They have always relied on the help of their coworkers for topics outside their universe.
In /etc/amanda/ you can have multiple configurations, each one in its own folder. The folders contain 3 files:
- amanda.conf
- amanda-client.conf
- disklist
There are plenty of other files, but these three are the ones you nurse with your editor.
Here is a basic amanda.conf:
    org "DailySet1"
    mailto "root"
    dumpuser "amandabackup"
    inparallel 1
    dumporder "sssS"
    taperalgo first
    displayunit "m"
    netusage 8000 Kbps
    dumpcycle 4 weeks
    runspercycle 4
    tapecycle 30 tapes
    bumpsize 20 Mb
    bumppercent 20
    bumpdays 1
    bumpmult 4
    etimeout 28800
    dtimeout 1800
    ctimeout 30
    device_output_buffer_size 1280k
    flush-threshold-dumped 100
    flush-threshold-scheduled 100
    taperflush 0
    autoflush yes
    runtapes 10
    tapedev "tape:/dev/tape/by-id/usb-HP_DAT160_4855450922344348-0:0-nst"
    maxdumpsize -1
    tapetype hp_dat160
    labelstr "^DailySet1-[0-9][0-9]*$"
    amrecover_changer "changer"

    holdingdisk hd1 {
        comment "main holding disk"
        directory "/opt/amanda"
        use -100 Mb
        chunksize 1Gb
    }

    infofile "/etc/amanda/DailySet1/curinfo"
    logdir "/etc/amanda/DailySet1"
    indexdir "/etc/amanda/DailySet1/index"

    define interface local {
        comment "a local disk"
        use 8000 kbps
    }

    define tapetype hp_dat160 {
        comment "Created by amtapetype; compression enabled"
        length 66420608 kbytes
        filemark 617 kbytes
        speed 5346 kps
        blocksize 32 kbytes
        part-size 8 gbytes
        part-cache-type memory
        part-cache-max-size 512 mbytes
    }

    define dumptype normal {
        comment "gnutar backup"
        program "GNUTAR"
        auth "bsdtcp"
        index yes
        holdingdisk yes    # on by default
        compress client best
        priority medium
        exclude list ".amanda.excludes"
    }

    define dumptype all {
        normal
        exclude ""
    }
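The labelstr setting is a regular expression that every tape label of this configuration has to match. As a quick illustration, this Python sketch applies the same pattern to a few candidate labels. Python's re module stands in for Amanda's own matcher here, an assumption that should hold for a simple pattern like this one; apart from DailySet1-21 and DailySet1-1 the sample labels are invented.

```python
import re

# Pattern copied from the labelstr line in amanda.conf.
LABELSTR = r"^DailySet1-[0-9][0-9]*$"

def is_valid_label(label: str) -> bool:
    """True if the label is acceptable under the labelstr pattern."""
    return re.match(LABELSTR, label) is not None

samples = ["DailySet1-21", "DailySet1-1", "DailySet1-", "DailySet2-05", "DailySet1-21a"]
for label in samples:
    print(f"{label:15} {'ok' if is_valid_label(label) else 'rejected'}")
```

The pattern requires at least one digit after the dash, so a bare DailySet1- is rejected, as is any label from another set.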
amanda-client.conf:
    conf "DailySet1"
    index_server "amandahost"
    tape_server "amandahost"
    tapedev "tape:/dev/tape/by-id/usb-HP_DAT160_4855450922344348-0:0-nst"
    auth "bsdtcp"
    unreserved-tcp-port 1024,65535
disklist:
    server1.foo.com opt /opt/ normal
    server1.foo.com system / normal
    server2.foo.com etc /etc/ normal
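Each entry in this disklist has four whitespace-separated fields: host, disk name, the path to back up, and the dumptype defined in amanda.conf. The Python sketch below parses exactly that four-field form. It is only an illustration and ignores the optional extra fields and inline dumptype blocks a real disklist may contain; DiskEntry and parse_disklist are made-up names.

```python
from typing import List, NamedTuple

class DiskEntry(NamedTuple):
    host: str
    diskname: str
    device: str
    dumptype: str

def parse_disklist(text: str) -> List[DiskEntry]:
    """Parse simple 'host diskname device dumptype' lines."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields, got {len(fields)}: {raw!r}")
        entries.append(DiskEntry(*fields))
    return entries

DISKLIST = """\
server1.foo.com opt    /opt/ normal
server1.foo.com system /     normal
server2.foo.com etc    /etc/ normal
"""

for entry in parse_disklist(DISKLIST):
    print(f"{entry.host}: {entry.device} as '{entry.diskname}' with dumptype {entry.dumptype}")
```

The disk name is the short label that shows up in reports, which is why the root filesystem appears there as system rather than as /.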
Usually you don’t want to back up everything; think of /sys/ or /tmp/. In that case you need an exclude file on the client.
Here is one example for system:
    ./proc/*
    ./sys/*
    ./tmp/*
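The patterns start with ./ because they are matched against member names relative to the top of the disk being dumped. As a rough illustration, the Python sketch below approximates the check with shell-style matching from the fnmatch module. This is an assumption for demonstration purposes, since GNU tar's own matching rules have more options than fnmatch; the sample paths are invented.

```python
from fnmatch import fnmatchcase

# Patterns from the exclude file for the system entry.
EXCLUDES = ["./proc/*", "./sys/*", "./tmp/*"]

def is_excluded(member: str) -> bool:
    """Approximate the exclude check with shell-style pattern matching."""
    return any(fnmatchcase(member, pattern) for pattern in EXCLUDES)

samples = ["./proc/cpuinfo", "./tmp/build/output.log", "./proc", "./etc/fstab"]
for member in samples:
    print(f"{member:25} {'excluded' if is_excluded(member) else 'kept'}")
```

In this approximation the directory ./proc itself is kept while everything below it is skipped, so the mount points still exist after a restore.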
If you want to back up everything, don’t make the mistake of using an empty exclude file: you will end up with a full backup in every run. Instead use a different dumptype, like the all dumptype in the amanda.conf above.
Each run will send you a report via email.
Most commands must be executed as the amanda user.
amstatus DailySet1 will give you the current status while a backup is running.
amreport DailySet1 will give you the report of the last backup:
    Hostname: server1.foo.com
    Org     : DailySet1
    Config  : DailySet1
    Date    : February 5, 2020

    These dumps were to tape DailySet1-21.
    The next 10 tapes Amanda expects to use are: 9 new tapes, DailySet1-1.
    The next 9 tape already labelled are: DailySet1-30,DailySet1-29,DailySet1-28,DailySet1-27,DailySet1-26,DailySet1-25,DailySet1-24,DailySet1-23,DailySet1-22.

    STATISTICS:
                               Total       Full      Incr.  Level:#
                            --------   --------   --------  --------
    Estimate Time (hrs:min)     0:00
    Run Time (hrs:min)          0:12
    Dump Time (hrs:min)         0:04       0:04       0:00
    Output Size (meg)          152.4      140.7       11.7
    Original Size (meg)        187.8      171.8       16.0
    Avg Compressed Size (%)     81.2       81.9       73.3
    DLEs Dumped                    4          2          2  1:2
    Avg Dump Rate (k/s)        581.7      554.6     1405.1

    Tape Time (hrs:min)         0:06       0:03       0:04
    Tape Size (meg)            152.4      140.7       11.7
    Tape Used (%)                0.2        0.2        0.0
    DLEs Taped                     4          2          2  1:2
    Parts Taped                    4          2          2  1:2
    Avg Tp Write Rate (k/s)    412.8      867.9       56.5

    USAGE BY TAPE:
      Label               Time         Size      %  DLEs Parts
      DailySet1-21        0:04         152M    0.2     4     4

    NOTES:
      planner: Last full dump of server1.foo.com:opt on tape DailySet1-18 overwritten in 2 runs.
      planner: Last full dump of server1.foo.com:system on tape DailySet1-20 overwritten in 2 runs.
      planner: Last full dump of server2.foo.com:git on tape DailySet1-19 overwritten in 2 runs.
      planner: Last full dump of server2.foo.com:etc on tape DailySet1-19 overwritten in 2 runs.
      planner: Full dump of server2.foo.com:git promoted from 14 days ahead.
      planner: Full dump of server2.foo.com:etc promoted from 14 days ahead.
      taper: Slot 1 with label DailySet1-21 is usable
      taper: tape DailySet1-21 kb 156055 fm 4 [OK]

    DUMP SUMMARY:
                                                  DUMPER STATS   TAPER STATS
    HOSTNAME        DISK   L ORIG-MB OUT-MB COMP% MMM:SS   KB/s MMM:SS   KB/s
    ------------------------ -------------------- ------------- -------------
    server2.foo.com etc    0      74     46  61.9   1:29  525.5   1:51  421.9
    server2.foo.com git    0      98     95  96.9   2:51  569.8   0:55 1768.0
    server1.foo.com opt    1      12     11  93.5   0:04 3123.0   1:08  164.5
    server2.foo.com system 1       4      1  18.3   0:05  161.7   2:24    5.6

    (brought to you by Amanda version 3.5.1)
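Some of the figures in the report can be reproduced from the others, which is a nice sanity check when reading it for the first time. A small Python calculation, assuming that "meg" means 1024 kbytes and taking the tape length from the hp_dat160 tapetype in amanda.conf:

```python
TAPE_LENGTH_KB = 66420608   # "length" of the hp_dat160 tapetype
TAPER_KB = 156055           # "kb" value from the taper note
ORIGINAL_MB = 187.8         # Original Size (meg), Total column
OUTPUT_MB = 152.4           # Output Size (meg), Total column

tape_size_mb = TAPER_KB / 1024
tape_used_pct = 100 * TAPER_KB / TAPE_LENGTH_KB
compressed_pct = 100 * OUTPUT_MB / ORIGINAL_MB

print(f"Tape Size (meg)         {tape_size_mb:.1f}")    # 152.4
print(f"Tape Used (%)           {tape_used_pct:.1f}")   # 0.2
print(f"Avg Compressed Size (%) {compressed_pct:.1f}")  # 81.2
```

All three values match the Total column, so a weekly run of this size barely scratches a single tape.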
Overall I am happy with Amanda. Some of the documentation out there might be for older versions, as the configuration files have changed a bit over the years. O’Reilly’s Backup & Recovery spends a whole chapter on Amanda. Having used some other, commercial backup products in the past, I wouldn’t consider the configuration more complicated than that of other products; sometimes it is maybe even simpler. Aside from sending emails, amreport can write JSON as well. You might want to use that option to get the information into a central logging system like Splunk.