Subsections

Database Tables

Filename
Column Name Data Type Remark
FilenameId integer Primary Key
Name Blob Filename

The Filename table shown above contains the name of each file backed up with the path removed. If different directories or machines contain the same filename, only one copy will be saved in this table.

Path
Column Name Data Type Remark
PathId integer Primary Key
Path Blob Full Path

The Path table contains shown above the path or directory names of all directories on the system or systems. The filename and any MSDOS disk name are stripped off. As with the filename, only one copy of each directory name is kept regardless of how many machines or drives have the same directory. These path names should be stored in Unix path name format.

Some simple testing on a Linux file system indicates that separating the filename and the path may be more complication than is warranted by the space savings. For example, this system has a total of 89,097 files, 60,467 of which have unique filenames, and there are 4,374 unique paths.

Finding all those files and doing two stats() per file takes an average wall clock time of 1 min 35 seconds on a 400MHz machine running RedHat 6.1 Linux.

Finding all those files and putting them directly into a MySQL database with the path and filename defined as TEXT, which is variable length up to 65,535 characters takes 19 mins 31 seconds and creates a 27.6 MByte database.

Doing the same thing, but inserting them into Blob fields with the filename indexed on the first 30 characters and the path name indexed on the 255 (max) characters takes 5 mins 18 seconds and creates a 5.24 MB database. Rerunning the job (with the database already created) takes about 2 mins 50 seconds.

Running the same as the last one (Path and Filename Blob), but Filename indexed on the first 30 characters and the Path on the first 50 characters (linear search done there after) takes 5 mins on the average and creates a 3.4 MB database. Rerunning with the data already in the DB takes 3 mins 35 seconds.

Finally, saving only the full path name rather than splitting the path and the file, and indexing it on the first 50 characters takes 6 mins 43 seconds and creates a 7.35 MB database.

File
Column Name Data Type Remark
FileId integer Primary Key
FileIndex integer The sequential file number in the Job
JobId integer Link to Job Record
PathId integer Link to Path Record
FilenameId integer Link to Filename Record
MarkId integer Used to mark files during Verify Jobs
LStat tinyblob File attributes in base64 encoding
MD5 tinyblob MD5/SHA1 signature in base64 encoding

The File table shown above contains one entry for each file backed up by Bacula. Thus a file that is backed up multiple times (as is normal) will have multiple entries in the File table. This will probably be the table with the most number of records. Consequently, it is essential to keep the size of this record to an absolute minimum. At the same time, this table must contain all the information (or pointers to the information) about the file and where it is backed up. Since a file may be backed up many times without having changed, the path and filename are stored in separate tables.

This table contains by far the largest amount of information in the Catalog database, both from the stand point of number of records, and the stand point of total database size. As a consequence, the user must take care to periodically reduce the number of File records using the retention command in the Console program.

Job
Column Name Data Type Remark
JobId integer Primary Key
Job tinyblob Unique Job Name
Name tinyblob Job Name
PurgedFiles tinyint Used by Bacula for purging/retention periods
Type binary(1) Job Type: Backup, Copy, Clone, Archive, Migration
Level binary(1) Job Level
ClientId integer Client index
JobStatus binary(1) Job Termination Status
SchedTime datetime Time/date when Job scheduled
StartTime datetime Time/date when Job started
EndTime datetime Time/date when Job ended
RealEndTime datetime Time/date when original Job ended
JobTDate bigint Start day in Unix format but 64 bits; used for Retention period.
VolSessionId integer Unique Volume Session ID
VolSessionTime integer Unique Volume Session Time
JobFiles integer Number of files saved in Job
JobBytes bigint Number of bytes saved in Job
JobErrors integer Number of errors during Job
JobMissingFiles integer Number of files not saved (not yet used)
PoolId integer Link to Pool Record
FileSetId integer Link to FileSet Record
PrioJobId integer Link to prior Job Record when migrated
PurgedFiles tiny integer Set when all File records purged
HasBase tiny integer Set when Base Job run

The Job table contains one record for each Job run by Bacula. Thus normally, there will be one per day per machine added to the database. Note, the JobId is used to index Job records in the database, and it often is shown to the user in the Console program. However, care must be taken with its use as it is not unique from database to database. For example, the user may have a database for Client data saved on machine Rufus and another database for Client data saved on machine Roxie. In this case, the two database will each have JobIds that match those in another database. For a unique reference to a Job, see Job below.

The Name field of the Job record corresponds to the Name resource record given in the Director's configuration file. Thus it is a generic name, and it will be normal to find many Jobs (or even all Jobs) with the same Name.

The Job field contains a combination of the Name and the schedule time of the Job by the Director. Thus for a given Director, even with multiple Catalog databases, the Job will contain a unique name that represents the Job.

For a given Storage daemon, the VolSessionId and VolSessionTime form a unique identification of the Job. This will be the case even if multiple Directors are using the same Storage daemon.

The Job Type (or simply Type) can have one of the following values:

Value Meaning
B Backup Job
M Migrated Job
V Verify Job
R Restore Job
C Console program (not in database)
I Internal or system Job
D Admin Job
A Archive Job (not implemented)
C Copy Job
M Migration Job
Note, the Job Type values noted above are not kept in an SQL table.

The JobStatus field specifies how the job terminated, and can be one of the following:

Value Meaning
C Created but not yet running
R Running
B Blocked
T Terminated normally
W Terminated normally with warnings
E Terminated in Error
e Non-fatal error
f Fatal error
D Verify Differences
A Canceled by the user
I Incomplete Job
F Waiting on the File daemon
S Waiting on the Storage daemon
m Waiting for a new Volume to be mounted
M Waiting for a Mount
s Waiting for Storage resource
j Waiting for Job resource
c Waiting for Client resource
d Wating for Maximum jobs
t Waiting for Start Time
p Waiting for higher priority job to finish
i Doing batch insert file records
a SD despooling attributes
l Doing data despooling
L Committing data (last despool)

FileSet
Column Name Data Type Remark
FileSetId integer Primary Key
FileSet tinyblob FileSet name
MD5 tinyblob MD5 checksum of FileSet
CreateTime datetime Time and date Fileset created

The FileSet table contains one entry for each FileSet that is used. The MD5 signature is kept to ensure that if the user changes anything inside the FileSet, it will be detected and the new FileSet will be used. This is particularly important when doing an incremental update. If the user deletes a file or adds a file, we need to ensure that a Full backup is done prior to the next incremental.

JobMedia
Column Name Data Type Remark
JobMediaId integer Primary Key
JobId integer Link to Job Record
MediaId integer Link to Media Record
FirstIndex integer The index (sequence number) of the first file written for this Job to the Media
LastIndex integer The index of the last file written for this Job to the Media
StartFile integer The physical media (tape) file number of the first block written for this Job
EndFile integer The physical media (tape) file number of the last block written for this Job
StartBlock integer The number of the first block written for this Job
EndBlock integer The number of the last block written for this Job
VolIndex integer The Volume use sequence number within the Job

The JobMedia table contains one entry at the following: start of the job, start of each new tape file, start of each new tape, end of the job. Since by default, a new tape file is written every 2GB, in general, you will have more than 2 JobMedia records per Job. The number can be varied by changing the "Maximum File Size" specified in the Device resource. This record allows Bacula to efficiently position close to (within 2GB) any given file in a backup. For restoring a full Job, these records are not very important, but if you want to retrieve a single file that was written near the end of a 100GB backup, the JobMedia records can speed it up by orders of magnitude by permitting forward spacing files and blocks rather than reading the whole 100GB backup.

Media
Column Name Data Type Remark
MediaId integer Primary Key
VolumeName tinyblob Volume name
Slot integer Autochanger Slot number or zero
PoolId integer Link to Pool Record
MediaType tinyblob The MediaType supplied by the user
MediaTypeId integer The MediaTypeId
LabelType tinyint The type of label on the Volume
FirstWritten datetime Time/date when first written
LastWritten datetime Time/date when last written
LabelDate datetime Time/date when tape labeled
VolJobs integer Number of jobs written to this media
VolFiles integer Number of files written to this media
VolBlocks integer Number of blocks written to this media
VolMounts integer Number of time media mounted
VolBytes bigint Number of bytes saved in Job
VolParts integer The number of parts for a Volume (DVD)
VolErrors integer Number of errors during Job
VolWrites integer Number of writes to media
MaxVolBytes bigint Maximum bytes to put on this media
VolCapacityBytes bigint Capacity estimate for this volume
VolStatus enum Status of media: Full, Archive, Append, Recycle, Read-Only, Disabled, Error, Busy
Enabled tinyint Whether or not Volume can be written  
Recycle tinyint Whether or not Bacula can recycle the Volumes: Yes, No
ActionOnPurge tinyint What happens to a Volume after purging
VolRetention bigint 64 bit seconds until expiration
VolUseDuration bigint 64 bit seconds volume can be used
MaxVolJobs integer maximum jobs to put on Volume
MaxVolFiles integer maximume EOF marks to put on Volume
InChanger tinyint Whether or not Volume in autochanger
StorageId integer Storage record ID
DeviceId integer Device record ID
MediaAddressing integer Method of addressing media
VolReadTime bigint Time Reading Volume
VolWriteTime bigint Time Writing Volume
EndFile integer End File number of Volume
EndBlock integer End block number of Volume
LocationId integer Location record ID
RecycleCount integer Number of times recycled
InitialWrite datetime When Volume first written
ScratchPoolId integer Id of Scratch Pool
RecyclePoolId integer Pool ID where to recycle Volume
Comment blob User text field

The Volume table (internally referred to as the Media table) contains one entry for each volume, that is each tape, cassette (8mm, DLT, DAT, ...), or file on which information is or was backed up. There is one Volume record created for each of the NumVols specified in the Pool resource record.

Pool
Column Name Data Type Remark
PoolId integer Primary Key
Name Tinyblob Pool Name
NumVols Integer Number of Volumes in the Pool
MaxVols Integer Maximum Volumes in the Pool
UseOnce tinyint Use volume once
UseCatalog tinyint Set to use catalog
AcceptAnyVolume tinyint Accept any volume from Pool
VolRetention bigint 64 bit seconds to retain volume
VolUseDuration bigint 64 bit seconds volume can be used
MaxVolJobs integer max jobs on volume
MaxVolFiles integer max EOF marks to put on Volume
MaxVolBytes bigint max bytes to write on Volume
AutoPrune tinyint yes|no for autopruning
Recycle tinyint yes|no for allowing auto recycling of Volume
ActionOnPurge tinyint Default Volume ActionOnPurge
PoolType enum Backup, Copy, Cloned, Archive, Migration
LabelType tinyint Type of label ANSI/Bacula
LabelFormat Tinyblob Label format
Enabled tinyint Whether or not Volume can be written  
ScratchPoolId integer Id of Scratch Pool
RecyclePoolId integer Pool ID where to recycle Volume
NextPoolId integer Pool ID of next Pool
MigrationHighBytes bigint High water mark for migration
MigrationLowBytes bigint Low water mark for migration
MigrationTime bigint Time before migration

The Pool table contains one entry for each media pool controlled by Bacula in this database. One media record exists for each of the NumVols contained in the Pool. The PoolType is a Bacula defined keyword. The MediaType is defined by the administrator, and corresponds to the MediaType specified in the Director's Storage definition record. The CurrentVol is the sequence number of the Media record for the current volume.

Client
Column Name Data Type Remark
ClientId integer Primary Key
Name TinyBlob File Services Name
UName TinyBlob uname -a from Client (not yet used)
AutoPrune tinyint yes|no for autopruning
FileRetention bigint 64 bit seconds to retain Files
JobRetention bigint 64 bit seconds to retain Job

The Client table contains one entry for each machine backed up by Bacula in this database. Normally the Name is a fully qualified domain name.

Storage
Column Name Data Type Remark
StorageId integer Unique Id
Name tinyblob Resource name of Storage device
AutoChanger tinyint Set if it is an autochanger

The Storage table contains one entry for each Storage used.

Counter
Column Name Data Type Remark
Counter tinyblob Counter name
MinValue integer Start/Min value for counter
MaxValue integer Max value for counter
CurrentValue integer Current counter value
WrapCounter tinyblob Name of another counter

The Counter table contains one entry for each permanent counter defined by the user.

JobHisto
Column Name Data Type Remark
JobId integer Primary Key
Job tinyblob Unique Job Name
Name tinyblob Job Name
Type binary(1) Job Type: Backup, Copy, Clone, Archive, Migration
Level binary(1) Job Level
ClientId integer Client index
JobStatus binary(1) Job Termination Status
SchedTime datetime Time/date when Job scheduled
StartTime datetime Time/date when Job started
EndTime datetime Time/date when Job ended
RealEndTime datetime Time/date when original Job ended
JobTDate bigint Start day in Unix format but 64 bits; used for Retention period.
VolSessionId integer Unique Volume Session ID
VolSessionTime integer Unique Volume Session Time
JobFiles integer Number of files saved in Job
JobBytes bigint Number of bytes saved in Job
JobErrors integer Number of errors during Job
JobMissingFiles integer Number of files not saved (not yet used)
PoolId integer Link to Pool Record
FileSetId integer Link to FileSet Record
PrioJobId integer Link to prior Job Record when migrated
PurgedFiles tiny integer Set when all File records purged
HasBase tiny integer Set when Base Job run

The bf JobHisto table is the same as the Job table, but it keeps long term statistics (i.e. it is not pruned with the Job).

Version
Column Name Data Type Remark
LogIdId integer Primary Key
JobId integer Points to Job record
Time datetime Time/date log record created
LogText blob Log text

The Log table contains a log of all Job output.

Location
Column Name Data Type Remark
LocationId integer Primary Key
Location tinyblob Text defining location
Cost integer Relative cost of obtaining Volume
Enabled tinyint Whether or not Volume is enabled

The Location table defines where a Volume is physically.

LocationLog
Column Name Data Type Remark
locLogIdId integer Primary Key
Date datetime Time/date log record created
MediaId integer Points to Media record
LocationId integer Points to Location record
NewVolStatus integer enum: Full, Archive, Append, Recycle, Purged Read-only, Disabled, Error, Busy, Used, Cleaning
Enabled tinyint Whether or not Volume is enabled

The Log table contains a log of all Job output.

Version
Column Name Data Type Remark
VersionId integer Primary Key

The Version table defines the Bacula database version number. Bacula checks this number before reading the database to ensure that it is compatible with the Bacula binary file.

BaseFiles
Column Name Data Type Remark
BaseId integer Primary Key
BaseJobId integer JobId of Base Job
JobId integer Reference to Job
FileId integer Reference to File
FileIndex integer File Index number

The BaseFiles table contains all the File references for a particular JobId that point to a Base file - i.e. they were previously saved and hence were not saved in the current JobId but in BaseJobId under FileId. FileIndex is the index of the file, and is used for optimization of Restore jobs to prevent the need to read the FileId record when creating the in memory tree. This record is not yet implemented.

MySQL Table Definition

The commands used to create the MySQL tables are as follows:

USE bacula;
CREATE TABLE Filename (
  FilenameId INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
  Name BLOB NOT NULL,
  PRIMARY KEY(FilenameId),
  INDEX (Name(30))
  );
CREATE TABLE Path (
   PathId INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
   Path BLOB NOT NULL,
   PRIMARY KEY(PathId),
   INDEX (Path(50))
   );
CREATE TABLE File (
   FileId INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
   FileIndex INTEGER UNSIGNED NOT NULL DEFAULT 0,
   JobId INTEGER UNSIGNED NOT NULL REFERENCES Job,
   PathId INTEGER UNSIGNED NOT NULL REFERENCES Path,
   FilenameId INTEGER UNSIGNED NOT NULL REFERENCES Filename,
   MarkId INTEGER UNSIGNED NOT NULL DEFAULT 0,
   LStat TINYBLOB NOT NULL,
   MD5 TINYBLOB NOT NULL,
   PRIMARY KEY(FileId),
   INDEX (JobId),
   INDEX (PathId),
   INDEX (FilenameId)
   );
CREATE TABLE Job (
   JobId INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
   Job TINYBLOB NOT NULL,
   Name TINYBLOB NOT NULL,
   Type BINARY(1) NOT NULL,
   Level BINARY(1) NOT NULL,
   ClientId INTEGER NOT NULL REFERENCES Client,
   JobStatus BINARY(1) NOT NULL,
   SchedTime DATETIME NOT NULL,
   StartTime DATETIME NOT NULL,
   EndTime DATETIME NOT NULL,
   JobTDate BIGINT UNSIGNED NOT NULL,
   VolSessionId INTEGER UNSIGNED NOT NULL DEFAULT 0,
   VolSessionTime INTEGER UNSIGNED NOT NULL DEFAULT 0,
   JobFiles INTEGER UNSIGNED NOT NULL DEFAULT 0,
   JobBytes BIGINT UNSIGNED NOT NULL,
   JobErrors INTEGER UNSIGNED NOT NULL DEFAULT 0,
   JobMissingFiles INTEGER UNSIGNED NOT NULL DEFAULT 0,
   PoolId INTEGER UNSIGNED NOT NULL REFERENCES Pool,
   FileSetId INTEGER UNSIGNED NOT NULL REFERENCES FileSet,
   PurgedFiles TINYINT NOT NULL DEFAULT 0,
   HasBase TINYINT NOT NULL DEFAULT 0,
   PRIMARY KEY(JobId),
   INDEX (Name(128))
   );
CREATE TABLE FileSet (
   FileSetId INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
   FileSet TINYBLOB NOT NULL,
   MD5 TINYBLOB NOT NULL,
   CreateTime DATETIME NOT NULL,
   PRIMARY KEY(FileSetId)
   );
CREATE TABLE JobMedia (
   JobMediaId INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
   JobId INTEGER UNSIGNED NOT NULL REFERENCES Job,
   MediaId INTEGER UNSIGNED NOT NULL REFERENCES Media,
   FirstIndex INTEGER UNSIGNED NOT NULL DEFAULT 0,
   LastIndex INTEGER UNSIGNED NOT NULL DEFAULT 0,
   StartFile INTEGER UNSIGNED NOT NULL DEFAULT 0,
   EndFile INTEGER UNSIGNED NOT NULL DEFAULT 0,
   StartBlock INTEGER UNSIGNED NOT NULL DEFAULT 0,
   EndBlock INTEGER UNSIGNED NOT NULL DEFAULT 0,
   VolIndex INTEGER UNSIGNED NOT NULL DEFAULT 0,
   PRIMARY KEY(JobMediaId),
   INDEX (JobId, MediaId)
   );
CREATE TABLE Media (
   MediaId INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
   VolumeName TINYBLOB NOT NULL,
   Slot INTEGER NOT NULL DEFAULT 0,
   PoolId INTEGER UNSIGNED NOT NULL REFERENCES Pool,
   MediaType TINYBLOB NOT NULL,
   FirstWritten DATETIME NOT NULL,
   LastWritten DATETIME NOT NULL,
   LabelDate DATETIME NOT NULL,
   VolJobs INTEGER UNSIGNED NOT NULL DEFAULT 0,
   VolFiles INTEGER UNSIGNED NOT NULL DEFAULT 0,
   VolBlocks INTEGER UNSIGNED NOT NULL DEFAULT 0,
   VolMounts INTEGER UNSIGNED NOT NULL DEFAULT 0,
   VolBytes BIGINT UNSIGNED NOT NULL DEFAULT 0,
   VolErrors INTEGER UNSIGNED NOT NULL DEFAULT 0,
   VolWrites INTEGER UNSIGNED NOT NULL DEFAULT 0,
   VolCapacityBytes BIGINT UNSIGNED NOT NULL,
   VolStatus ENUM('Full', 'Archive', 'Append', 'Recycle', 'Purged',
    'Read-Only', 'Disabled', 'Error', 'Busy', 'Used', 'Cleaning') NOT NULL,
   Recycle TINYINT NOT NULL DEFAULT 0,
   VolRetention BIGINT UNSIGNED NOT NULL DEFAULT 0,
   VolUseDuration BIGINT UNSIGNED NOT NULL DEFAULT 0,
   MaxVolJobs INTEGER UNSIGNED NOT NULL DEFAULT 0,
   MaxVolFiles INTEGER UNSIGNED NOT NULL DEFAULT 0,
   MaxVolBytes BIGINT UNSIGNED NOT NULL DEFAULT 0,
   InChanger TINYINT NOT NULL DEFAULT 0,
   MediaAddressing TINYINT NOT NULL DEFAULT 0,
   VolReadTime BIGINT UNSIGNED NOT NULL DEFAULT 0,
   VolWriteTime BIGINT UNSIGNED NOT NULL DEFAULT 0,
   PRIMARY KEY(MediaId),
   INDEX (PoolId)
   );
CREATE TABLE Pool (
   PoolId INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
   Name TINYBLOB NOT NULL,
   NumVols INTEGER UNSIGNED NOT NULL DEFAULT 0,
   MaxVols INTEGER UNSIGNED NOT NULL DEFAULT 0,
   UseOnce TINYINT NOT NULL,
   UseCatalog TINYINT NOT NULL,
   AcceptAnyVolume TINYINT DEFAULT 0,
   VolRetention BIGINT UNSIGNED NOT NULL,
   VolUseDuration BIGINT UNSIGNED NOT NULL,
   MaxVolJobs INTEGER UNSIGNED NOT NULL DEFAULT 0,
   MaxVolFiles INTEGER UNSIGNED NOT NULL DEFAULT 0,
   MaxVolBytes BIGINT UNSIGNED NOT NULL,
   AutoPrune TINYINT DEFAULT 0,
   Recycle TINYINT DEFAULT 0,
   PoolType ENUM('Backup', 'Copy', 'Cloned', 'Archive', 'Migration', 'Scratch') NOT NULL,
   LabelFormat TINYBLOB,
   Enabled TINYINT DEFAULT 1,
   ScratchPoolId INTEGER UNSIGNED DEFAULT 0 REFERENCES Pool,
   RecyclePoolId INTEGER UNSIGNED DEFAULT 0 REFERENCES Pool,
   UNIQUE (Name(128)),
   PRIMARY KEY (PoolId)
   );
CREATE TABLE Client (
   ClientId INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
   Name TINYBLOB NOT NULL,
   Uname TINYBLOB NOT NULL,       /* full uname -a of client */
   AutoPrune TINYINT DEFAULT 0,
   FileRetention BIGINT UNSIGNED NOT NULL,
   JobRetention  BIGINT UNSIGNED NOT NULL,
   UNIQUE (Name(128)),
   PRIMARY KEY(ClientId)
   );
CREATE TABLE BaseFiles (
   BaseId INTEGER UNSIGNED AUTO_INCREMENT,
   BaseJobId INTEGER UNSIGNED NOT NULL REFERENCES Job,
   JobId INTEGER UNSIGNED NOT NULL REFERENCES Job,
   FileId INTEGER UNSIGNED NOT NULL REFERENCES File,
   FileIndex INTEGER UNSIGNED,
   PRIMARY KEY(BaseId)
   );
CREATE TABLE UnsavedFiles (
   UnsavedId INTEGER UNSIGNED AUTO_INCREMENT,
   JobId INTEGER UNSIGNED NOT NULL REFERENCES Job,
   PathId INTEGER UNSIGNED NOT NULL REFERENCES Path,
   FilenameId INTEGER UNSIGNED NOT NULL REFERENCES Filename,
   PRIMARY KEY (UnsavedId)
   );
CREATE TABLE Version (
   VersionId INTEGER UNSIGNED NOT NULL
   );
-- Initialize Version
INSERT INTO Version (VersionId) VALUES (7);
CREATE TABLE Counters (
   Counter TINYBLOB NOT NULL,
   MinValue INTEGER,
   MaxValue INTEGER,
   CurrentValue INTEGER,
   WrapCounter TINYBLOB NOT NULL,
   PRIMARY KEY (Counter(128))
   );

Kern Sibbald 2010-08-30