Setting up Backups
Introduction
Creating appropriate backups of critical data is extremely important. Any time you have only one copy of something, the risk of losing your only copy is high. Without data, a publishing house is useless.
Data can be lost in several ways including:
- Carelessness. A person can get on the file server and accidentally delete files or folders. Or a person could be editing a file and accidentally run the palm of his hand over the touch pad; suddenly a block of text is selected, and then he presses a letter on the keyboard, replacing the block of text with a single letter. Unless the error is promptly caught, that block of text is gone.
- Theft. A computer or phone or thumb drive or external hard drive can be stolen when you least expect it. If the only copy of your data is on the device that is stolen, you have only a slim chance of recovering that data.
- Destruction. Perhaps a fire burns your house down while you are away and your computer with all your data is burned. Perhaps a flood gets your devices wet. Or you were in an accident and your device was crushed.
- Viruses. Viruses can corrupt or delete data on infected systems.
- Software corruption. Hard drives or thumb drives can become corrupted, and sometimes it is not possible to recover the data. You might be required to reformat your drive in order to keep using your drive, but you lose all the data in the process.
- Hardware failure. Hard drives or external drives can sometimes fail due to age. They can also fail due to being dropped, or due to a power surge such as a lightning strike during a storm.
Philosophy
Since data can be lost in so many ways, it is a good idea to always have backups of whatever data is important to you. A common rule is the “3-2-1 backup rule”.
- 3 copies (minimum) of your data. One is the original data and the other two are backups. That way if one backup fails, you still have one backup while you fix the failed backup. Try to never have all three copies with you at the same time.
- 2 different storage devices (or types). Try to store your backup on different media types when possible. Consider keeping a copy online if practical.
- 1 offsite backup. Keep at least one backup at another physical location somewhat far from your original. In the event of the destruction of the print shop by fire, for example, the offsite copy would not be affected.
This is the minimum recommendation, but you are encouraged to make more copies when possible. For example, try to keep one offsite backup in a different city or village. In unstable areas, consider keeping a copy outside the country.
We recommend setting up automatic backups, and then testing your backup system regularly (weekly or monthly) to verify that it is still working as you intend.
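One possible way to test a backup is to compare it against the original data. The sketch below is only an illustration: the function name verify_backup and both paths are hypothetical, and it assumes the backup is expected to be an exact copy of the source at the moment of the check.

```shell
# verify_backup SOURCE BACKUP
# Returns 0 only when both directory trees contain the same files
# with the same content.
verify_backup() {
    src=$1
    dst=$2
    # diff -r walks both trees; -q only reports which files differ.
    if diff -rq "$src" "$dst"; then
        echo "backup matches source"
    else
        echo "backup differs from source" >&2
        return 1
    fi
}
```

It could be called as verify_backup /path/to/data /path/to/backup/copy. Files that changed after the last backup ran will be reported as differences, so a check like this is most useful right after a backup finishes.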
Incremental Backups
Using a simple copy for backups presents some significant challenges. Let's say we had 20 GB of data yesterday. Today, we modified two small documents and added 0.5 GB of new data. If we simply copy all of the data every day, we need 20 GB for yesterday's backup and 20.5 GB for today's backup. Not much has changed since yesterday, but our backups now take up about 40.5 GB, or about twice as much as yesterday. Just 5 days of backups would more than fill 100 GB! This is definitely not an efficient way of making backups of large amounts of data.
An incremental backup copies only the data that changed since the preceding backup. In the same example, our first backup yesterday would be 20 GB. But today's backup would include only the 0.5 GB of new data plus a few kilobytes for the two small documents we modified. So the total combined size of yesterday's and today's backups would be about 20.5 GB. Over the course of 5 days, the total size probably won't be over 21 GB, assuming we don't add a huge amount of data. An incremental system is clearly the more space-efficient method of making backups.
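The comparison above can be checked with a few lines of shell arithmetic. This is a rough sketch under stated assumptions: sizes are in MB with 1 GB counted as 1000 MB, no data is added after the second day, and the few kilobytes for the modified documents are ignored. The variable names are made up for the example.

```shell
initial_mb=20000      # data present on day 1 (20 GB)
added_day2_mb=500     # new data added on day 2 (0.5 GB)
days=5

# Full copies: day 1 stores the initial data, and every later day
# stores the whole data set again.
full_total_mb=$(( initial_mb + (days - 1) * (initial_mb + added_day2_mb) ))

# Incremental: the initial data is stored once, plus only what was added.
incremental_total_mb=$(( initial_mb + added_day2_mb ))

echo "full copies: ${full_total_mb} MB"         # prints "full copies: 102000 MB"
echo "incremental: ${incremental_total_mb} MB"  # prints "incremental: 20500 MB"
```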
Implementation
DTM uses incremental backups on all standard file servers. We will prepare two backup drives, but only use one at a time. The other will be kept offsite. There will be two backup scripts, one script for each drive. Both scripts will be run at regular times. The script matching the attached drive will run normally, while the script matching the detached drive that is currently offsite will exit with a harmless error. This way either drive can be plugged in at any time and a backup will be made.
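To make the idea concrete, here is a minimal sketch of how such a script could be organized. It is not the actual IDTM-backup1.bash script. It assumes rsync with its --link-dest option is used for the incremental copy and that mountpoint (from util-linux) is available; the function names, the MOUNT_POINT variable, and the "latest" link are illustrative choices.

```shell
# Hypothetical values; the real script defines its own paths.
SOURCE_DIR="/home/idtm/IDTM-Library"
MOUNT_POINT="/media/idtm/IDTM-Backup1"
DESTINATION_BASE="$MOUNT_POINT/Server/IDTM-Library"

# Succeeds only when a filesystem is actually mounted at the directory.
drive_attached() {
    mountpoint -q "$1"
}

# Copies SOURCE into a new timestamped folder under DEST_BASE. Files that
# are unchanged since the previous run are hard-linked to the previous
# snapshot instead of being stored again.
incremental_backup() {
    src=$1
    dest_base=$2
    snapshot="$dest_base/$(date +%Y-%m-%d_%H%M%S)"
    mkdir -p "$dest_base"
    if [ -e "$dest_base/latest" ]; then
        rsync -a --link-dest="$dest_base/latest" "$src/" "$snapshot/"
    else
        rsync -a "$src/" "$snapshot/"
    fi
    # Point "latest" at the new snapshot. This is a symbolic link, which
    # is one reason the drive cannot use exFAT.
    ln -sfn "$snapshot" "$dest_base/latest"
}

main() {
    if ! drive_attached "$MOUNT_POINT"; then
        echo "$MOUNT_POINT is not mounted; skipping backup" >&2
        return 1
    fi
    incremental_backup "$SOURCE_DIR" "$DESTINATION_BASE"
}
```

A script built this way would call main on its last line. When the drive is offsite, main prints a message and returns a non-zero status without copying anything, which matches the "harmless error" described above.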
Set Up the Backup Drives
Our standard backup system uses two identical external USB solid-state drives, each with a different name.
Verify that the drives are not formatted with exFAT, which does not support symbolic links. We recommend formatting the drives with the ext4 filesystem.
- To edit or format the partitions, we recommend using gnome-disk-utility or gparted.
- You will need to label one drive IDTM-Backup1 and the other drive IDTM-Backup2 to avoid the automatic renaming of simultaneously connected identical drives.
If, after formatting the drive, you find that you cannot write to the drive:
- Plug it in to the USB port.
- In Nemo, right-click where there would normally be files and select Open as Root.
- Adjust the permissions to allow reading and writing by everyone.
- Eject the USB drive and try plugging it in again. It should be readable/writable.
To ensure that each drive is always mounted with the same name, add the appropriate line to the end of the /etc/fstab file. You will want to add one line each for IDTM-Backup1 and for IDTM-Backup2.
- Example:
UUID=uuid-of-disk /media/userid/IDTM-Backup ext4 auto,nofail,noatime,rw,users 0 0
- Find the device by running the command:
sudo lsblk
- Find the UUID by running the command:
sudo blkid
- With the user option, only the user that mounted a filesystem can unmount it again. If any user should be able to unmount it, use users instead of user in the fstab line.
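As an illustration, the two fstab lines can be produced from the format of the example above. In this sketch the function name fstab_line, the UUID placeholders, and the idtm user name are assumptions; substitute the values reported by sudo blkid and your own login name.

```shell
# fstab_line UUID LABEL USER
# Prints one /etc/fstab entry in the same format as the example line.
fstab_line() {
    printf 'UUID=%s /media/%s/%s ext4 auto,nofail,noatime,rw,users 0 0\n' "$1" "$3" "$2"
}

# Placeholder UUIDs; read the real ones from the output of `sudo blkid`.
fstab_line "uuid-of-disk-1" "IDTM-Backup1" "idtm"
fstab_line "uuid-of-disk-2" "IDTM-Backup2" "idtm"
```

The printed lines can then be pasted at the end of /etc/fstab. The nofail option is what lets the system boot normally while one of the two drives is offsite.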
Set Up the Backup Scripts
Download the backup script to the .bin directory of your home directory: IDTM-backup1.bash
Ensure the script is executable:
chmod u+x IDTM-backup1.bash
Edit the IDTM-backup1.bash script using a plain text editor as follows:
- Look for SOURCE_DIR="/home/idtm/IDTM-Library" (line 10).
  - Replace the path between the quotation marks with the full path that points to the parent folder of your data. Leave no trailing slash.
- Look for DESTINATION_BASE="/media/idtm/IDTM-Backup1/Server/IDTM-Library" (line 12).
  - Replace the path between the quotation marks with the full path that points to the destination folder on your backup drive 1. Leave no trailing slash.
Save your changes and exit the text editor.
Duplicate the IDTM-backup1.bash script and rename it to IDTM-backup2.bash. Now you should have two scripts: IDTM-backup1.bash and IDTM-backup2.bash in your .bin directory.
Edit the IDTM-backup2.bash script as follows:
- Look for the DESTINATION_BASE line and replace IDTM-Backup1 with IDTM-Backup2.
Save your changes and exit the text editor.
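The duplicate-and-edit steps for the second script could also be done in one command. This is a sketch only: the function name make_second_script is made up, and it assumes the DESTINATION_BASE assignment starts at the beginning of a line in the script.

```shell
# make_second_script FIRST SECOND
# Copies FIRST to SECOND, changing the drive name only on the
# DESTINATION_BASE line, then makes SECOND executable.
make_second_script() {
    sed '/^DESTINATION_BASE=/s/IDTM-Backup1/IDTM-Backup2/' "$1" > "$2"
    chmod u+x "$2"
}
```

From inside the .bin directory, it would be run as make_second_script IDTM-backup1.bash IDTM-backup2.bash. Limiting the substitution to the DESTINATION_BASE line leaves any other mention of the drive name untouched, so review the result by hand if the script refers to the drive elsewhere.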
You may run a script manually. For example, in your .bin directory, type:
./IDTM-backup1.bash
Schedule the Backup Scripts on crontab
Add two lines in your crontab, one for each backup script.
To modify (add or remove) crontabs run this command in the terminal: crontab -e
To see currently scheduled crontab jobs run this command in the terminal: crontab -l
The last two lines of your crontab should look something like this:
0 10,12,14,16,18 * * 1-5 /home/idtm/.bin/IDTM-backup1.bash
0 10,12,14,16,18 * * 1-5 /home/idtm/.bin/IDTM-backup2.bash
A quick and simple editor for experimenting with cron schedule expressions: https://crontab.guru
- This crontab expression works well for us: 0 10,12,14,16,18 * * 1-5 (“At minute 0 past hour 10, 12, 14, 16, and 18 on every day-of-week from Monday through Friday”)
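If you want a record of what each run did, the crontab lines can redirect the script output to a log file. The sketch below prints two such lines; the function name cron_entries and the log file names are hypothetical, and the schedule and directory are taken from the example above.

```shell
SCHEDULE='0 10,12,14,16,18 * * 1-5'
SCRIPT_DIR='/home/idtm/.bin'

# Prints one crontab line per backup script. Each line appends the
# script's normal output and error messages to its own log file.
cron_entries() {
    for n in 1 2; do
        printf '%s %s/IDTM-backup%s.bash >> %s/IDTM-backup%s.log 2>&1\n' \
            "$SCHEDULE" "$SCRIPT_DIR" "$n" "$SCRIPT_DIR" "$n"
    done
}

cron_entries
```

The printed lines can be pasted into the editor opened by crontab -e. With logging in place, the log for the offsite drive should show only the skip message from each scheduled run, which is a quick way to confirm that both jobs are firing.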
