The term Migration, as used in the context of Bacula, means moving data from one Volume to another. In particular it refers to a Job (similar to a backup job) that reads data that was previously backed up to a Volume and writes it to another Volume. As part of this process, the File catalog records associated with the first backup job are purged. In other words, Migration moves Bacula Job data from one Volume to another by reading the Job data from the Volume it is stored on, writing it to a different Volume in a different Pool, and then purging the database records for the first Job.
The Copy process is essentially identical to the Migration feature with the exception that the Job that is copied is left unchanged. This essentially creates two identical copies of the same backup. However, the copy is treated as a copy rather than a backup job, and hence is not directly available for restore. If Bacula finds a copy when a Job record is purged (deleted) from the catalog, it will promote the copy to a real backup Job and make it available for automatic restore. Note: to simplify the text below, we usually speak of a migration job; this means either a migration job or a copy job.
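For illustration, a minimal Copy Job resource might look like the following sketch. The resource names (copy-to-tape, Default, rufus-fd, Full Set) are illustrative, and the destination Pool is taken from the Next Pool directive of the Pool being copied:

# A sketch of a Copy Job (names are illustrative)
Job {
  Name = "copy-to-tape"
  Type = Copy
  Level = Full
  Client = rufus-fd                   # required, but the original Job's value is used
  FileSet = "Full Set"                # required, but the original Job's value is used
  Messages = Standard
  Pool = Default                      # source Pool; its Next Pool defines the destination
  Selection Type = PoolUncopiedJobs   # copy every Job in the Pool not yet copied
}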
The Copy and the Migration jobs run without using the File daemon by copying the data from the old backup Volume to a different Volume in a different Pool. It is not possible to run commands on the defined Client via a RunScript from within the Migration or Copy Job.
The selection process for which Job or Jobs are migrated can be based on quite a number of different criteria, such as the oldest or smallest Volume in the Pool, a regular expression matching a Client, Volume, or Job name, or the JobIds returned by an SQL query.
The details of these selection criteria will be defined below.
To run a Migration job, you must first define a Job resource very similar to a Backup Job but with Type = Migrate instead of Type = Backup. One of the key points to remember is that the Pool that is specified for the migration job is the only pool from which jobs will be migrated, with one exception noted below. In addition, the Pool to which the selected Job or Jobs will be migrated is defined by the Next Pool = ... in the Pool resource specified for the Migration Job.
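As a minimal sketch (the pool and job names are illustrative; a complete example appears later in this section), the two key pieces are the Migrate job type and the Next Pool directive in the source Pool:

Job {
  Name = "migrate-job"
  Type = Migrate            # instead of Type = Backup
  Pool = Default            # the only Pool from which Jobs are migrated
  # ... other required Job directives omitted in this sketch
}

Pool {
  Name = Default
  Pool Type = Backup
  Next Pool = Tape          # destination Pool for the migrated data
  Storage = File
}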
Bacula permits Pools to contain Volumes with different Media Types. However, when doing migration, this is a very undesirable condition. For migration to work properly, you should use Pools containing only Volumes of the same Media Type for all migration jobs.
The migration job is normally either started manually or started from a Schedule, much like a backup job. It searches for a previous backup Job or Jobs that match the parameters you have specified in the migration Job resource, primarily a Selection Type (detailed a bit later). Then for each previous backup JobId found, the Migration Job will run a new Job which copies the old Job data from the previous Volume to a new Volume in the Migration Pool. It is possible that no prior Jobs are found for migration, in which case the Migration job will simply terminate having done nothing, but normally at a minimum, three jobs are involved during a migration: the Migration control Job (the one you defined and started), the previous Backup Job (the one being migrated), and a new Migration Backup Job that reads the old Job data and writes it to the new Volume.
If the Migration control job finds a number of JobIds to migrate (e.g. it is asked to migrate one or more Volumes), it will start one new migration backup job for each JobId found on the specified Volumes. Please note that Migration doesn't scale too well since Migrations are done on a Job by Job basis. Thus if you select a very large volume or a number of volumes for migration, you may have a large number of Jobs that start. Because each job must read the same Volume, they will run consecutively (not simultaneously).
The following directives can appear in a Director's Job resource, and they are used to define a Migration job.
Once again, a Copy Job cannot be used to restore unless you explicitly specify the Copy JobId during the restore command. If the original backup Job is deleted and there is a Copy of that backup Job, the Copy Jobs will be changed to backup Jobs that can then be restored.
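For example, assuming the Copy was given JobId 1002 (a hypothetical number), you could restore from it in bconsole by naming that JobId explicitly, along the lines of:

restore jobid=1002 all done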
Jobs on Volumes will be considered for Migration only if the Volume is marked Full, Used, or Error. Volumes that are still marked Append will not be considered for migration. This prevents Bacula from attempting to read the Volume at the same time it is writing it. It also reduces other deadlock situations, as well as avoiding the problem of migrating a Volume and later finding new files appended to that Volume.
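If you want a particular Volume to become eligible, one option (a sketch; the Volume name File0001 is hypothetical) is to mark it Used from bconsole:

update volume=File0001 volstatus=Used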
For the OldestVolume and SmallestVolume keywords, the Selection Pattern is not used (ignored).
For the Client, Volume, and Job keywords, this pattern must be a valid regular expression that will filter the appropriate item names found in the Pool.
For the SQLQuery keyword, this pattern must be a valid SELECT SQL statement that returns JobIds.
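As a sketch, assuming the standard Bacula catalog schema (a Job table with JobId, Name, Type, and JobStatus columns), an SQLQuery selection might look like:

Selection Type = SQLQuery
Selection Pattern = "SELECT DISTINCT JobId FROM Job WHERE Name='NightlySave' AND Type='B' AND JobStatus='T' ORDER BY JobId"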
The following directives can appear in a Director's Pool resource, and they are used to define a Migration job.
The Next Pool directive may also be specified in the Job resource or on a Run directive in the Schedule resource. Any Next Pool directive in the Job resource will take precedence over the Pool definition, and any Next Pool specification on the Run directive in a Schedule resource will take ultimate precedence.
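As a sketch, assuming a Bacula version that accepts these overrides, the two forms might look like:

# In the Job resource (takes precedence over the Pool definition):
Job {
  Name = "migrate-volume"
  Type = Migrate
  Next Pool = Tape
  # ... other required Job directives omitted in this sketch
}

# On a Run directive in a Schedule (takes ultimate precedence):
Schedule {
  Name = "MigrateWeekly"
  Run = Level=Full NextPool=Tape sun at 03:05
}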
For disk Volumes, multiple simultaneous Jobs can read the same Volume at the same time, so the above restriction does not apply.
When you specify a Migration Job, you must specify all the standard directives as for a Job. However, certain directives, such as the Level, Client, and FileSet, though they must be defined, are ignored by the Migration job because the values from the original job are used instead.
As an example, suppose you have the following Job that you run every night. Note: there is no Storage directive in the Job resource; there is a Storage directive in each of the Pool resources; and the Pool to be migrated (Default, which uses File storage) contains a Next Pool directive that defines the output Pool (where the data is written by the migration job).
# Define the backup Job
Job {
  Name = "NightlySave"
  Type = Backup
  Level = Incremental                 # default
  Client = rufus-fd
  FileSet = "Full Set"
  Schedule = "WeeklyCycle"
  Messages = Standard
  Pool = Default
}

# Default pool definition
Pool {
  Name = Default
  Pool Type = Backup
  AutoPrune = yes
  Recycle = yes
  Next Pool = Tape
  Storage = File
  LabelFormat = "File"
}

# Tape pool definition
Pool {
  Name = Tape
  Pool Type = Backup
  AutoPrune = yes
  Recycle = yes
  Storage = DLTDrive
}

# Definition of File storage device
Storage {
  Name = File
  Address = rufus
  Password = "ccV3lVTsQRsdIUGyab0N4sMDavui2hOBkmpBU0aQKOr9"
  Device = "File"                     # same as Device in Storage daemon
  Media Type = File                   # same as MediaType in Storage daemon
}

# Definition of DLT tape storage device
Storage {
  Name = DLTDrive
  Address = rufus
  Password = "ccV3lVTsQRsdIUGyab0N4sMDavui2hOBkmpBU0aQKOr9"
  Device = "HP DLT 80"                # same as Device in Storage daemon
  Media Type = DLT8000                # same as MediaType in Storage daemon
}
Here we have included only the essential information; the Director, FileSet, Catalog, Client, Schedule, and Messages resources are omitted.
As you can see, by running the NightlySave Job, the data will be backed up to File storage, because the Default pool specifies File as its Storage.
Now, if we add the following Job resource to this conf file:
Job { Name = "migrate-volume" Type = Migrate Level = Full Client = rufus-fd FileSet = "Full Set" Messages = Standard Pool = Default Maximum Concurrent Jobs = 4 Selection Type = Volume Selection Pattern = "File" }
and then run the job named migrate-volume, all volumes in the Pool named Default (as specified in the migrate-volume Job) that match the regular expression pattern File will be migrated to tape storage DLTDrive, because the Next Pool in the Default Pool specifies that Migrations should go to the pool named Tape, which uses Storage DLTDrive.
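From bconsole, such a job could be started manually with something like:

run job=migrate-volume yes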
If instead, we use a Job resource as follows:
Job { Name = "migrate" Type = Migrate Level = Full Client = rufus-fd FileSet="Full Set" Messages = Standard Pool = Default Maximum Concurrent Jobs = 4 Selection Type = Job Selection Pattern = ".*Save" }
All jobs ending with the name Save will be migrated from the Default (File) Pool to the Tape Pool, or in other words, from File storage to Tape storage.
When the Job Level is set to VirtualFull, it permits you to consolidate the previous Full backup plus the most recent Differential backup and any subsequent Incremental backups into a new Full backup. This new Full backup will then be considered as the most recent Full for any future Incremental or Differential backups. The VirtualFull backup is accomplished without contacting the client by reading the previous backup data and writing it to a volume in a different pool.
Bacula's virtual backup feature is often called Synthetic Backup or Consolidation in other backup products.
In some respects the Virtual Backup feature works similar to a Migration job, in that Bacula normally reads the data from the pool specified in the Job resource, and writes it to the Next Pool specified in the Job resource. Note, this means that usually the output from the Virtual Backup is written into a different pool from where your prior backups are saved. Doing it this way guarantees that you will not get a deadlock situation attempting to read and write to the same volume in the Storage daemon. If you then want to do subsequent backups, you may need to move the Virtual Full Volume back to your normal backup pool. Alternatively, you can set your Next Pool to point to the current pool. This will cause Bacula to read and write to Volumes in the current pool. In general, this will work, because Bacula will not allow reading and writing on the same Volume. In any case, once a VirtualFull has been created, and a restore is done involving the most current Full, it will read the Volume or Volumes written by the VirtualFull regardless of which Pool the Volume is found in.
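As a sketch of the alternative just described (the pool and storage names are illustrative), the Pool's Next Pool can simply point back at itself:

Pool {
  Name = Default
  Pool Type = Backup
  Storage = File
  Next Pool = Default     # write the Virtual Full back into the same pool
}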
A typical Job resource definition might look like the following:
Job { Name = "MyBackup" Type = Backup Client=localhost-fd FileSet = "Full Set" Storage = File Messages = Standard Pool = Default SpoolData = yes } # Default pool definition Pool { Name = Default Pool Type = Backup Volume Retention = 365d # one year NextPool = Full Storage = File } Pool { Name = Full Pool Type = Backup Volume Retention = 365d # one year Storage = DiskChanger } # Definition of file storage device Storage { Name = File Address = localhost Password = "xxx" Device = FileStorage Media Type = File Maximum Concurrent Jobs = 5 } # Definition of DDS Virtual tape disk storage device Storage { Name = DiskChanger Address = localhost # N.B. Use a fully qualified name here Password = "yyy" Device = DiskChanger Media Type = DiskChangerMedia Maximum Concurrent Jobs = 4 Autochanger = yes }
Then in bconsole or via a Run schedule, you would run the job as:
run job=MyBackup level=Full
run job=MyBackup level=Incremental
run job=MyBackup level=Differential
run job=MyBackup level=Incremental
run job=MyBackup level=Incremental
So provided there were changes between each of those jobs, you would end up with a Full backup, a Differential backup (which includes the first Incremental backup), and then two Incremental backups. All the above jobs would be written to the Default pool.
To consolidate those backups into a new Full backup, you would run the following:
run job=MyBackup level=VirtualFull
And it would produce a new Full backup without using the client, and the output would be written to the Full Pool, which uses the DiskChanger Storage.
If the Virtual Full is run, and there are no prior Jobs, the Virtual Full will fail with an error.
Note, the Start and End time of the Virtual Full backup are set to the values for the last job included in the Virtual Full (in the above example, an Incremental). This is so that if another Incremental is done afterwards, which will be based on the Virtual Full, it will back up all files changed since the last Job included in the Virtual Full rather than since the time the Virtual Full was actually run.
For example, if you have the following Jobs in your catalog:
+-------+---------+-------+----------+----------+-----------+
| JobId | Name    | Level | JobFiles | JobBytes | JobStatus |
+-------+---------+-------+----------+----------+-----------+
|     1 | Vbackup | F     |     1754 | 50118554 | T         |
|     2 | Vbackup | I     |        1 |        4 | T         |
|     3 | Vbackup | I     |        1 |        4 | T         |
|     4 | Vbackup | D     |        2 |        8 | T         |
|     5 | Vbackup | I     |        1 |        6 | T         |
|     6 | Vbackup | I     |       10 |       60 | T         |
|     7 | Vbackup | I     |       11 |       65 | T         |
|     8 | Save    | F     |     1758 | 50118564 | T         |
+-------+---------+-------+----------+----------+-----------+
If you want to consolidate only the first 3 jobs and create a virtual backup equivalent to Job 1 + Job 2 + Job 3, you will use jobid=3 in the run command; Bacula will then select the previous Full backup, the previous Differential (if any), and all subsequent Incremental jobs.
run job=Vbackup jobid=3 level=VirtualFull
If you want to consolidate a specific job list, you must specify the exact list of jobs to merge in the run command line. For example, to consolidate the last Differential and all subsequent Incrementals, you will use jobid=4,5,6,7 or jobid=4-7 in the run command line. As one of the Jobs in the list is a Differential backup, Bacula will set the new job level to Differential. If the list is composed only of Incremental jobs, the new job will have a level set to Incremental.
run job=Vbackup jobid=4-7 level=VirtualFull
When using this feature, Bacula will automatically discard jobs that are not related to the current Job. For example, if you specify jobid=7,8, Bacula will discard JobId 8 (the Save job) because it is not related to the Vbackup Job.
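Continuing the example above, such a run command might look like:

run job=Vbackup jobid=7,8 level=VirtualFull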
If you know what you are doing and still want to consolidate jobs that have different names (and thus probably different Clients, FileSets, etc.), you must use the alljobid= keyword instead of jobid=.
run job=Vbackup alljobid=1-3,6-8 level=VirtualFull