Without proper setup and maintenance, your Catalog may continue to grow indefinitely as you run Jobs and backup Files, and/or it may become very inefficient and slow. How fast the size of your Catalog grows depends on the number of Jobs you run and how many files they backup. By deleting records within the database, you can make space available for the new records that will be added during the next Job. By constantly deleting old expired records (dates older than the Retention period), your database size will remain constant.
If you started with the default configuration files, they already contain reasonable defaults for a small number of machines (less than 5), so if you fall into that case, catalog maintenance will not be urgent if you have a few hundred megabytes of disk space free. Whatever the case may be, some knowledge of retention periods will be useful.
Bacula uses three Retention periods: the File Retention period, the Job Retention period, and the Volume Retention period. Of these three, the File Retention period is by far the most important in determining how large your database will become.
The File Retention and the Job Retention are specified in each Client resource as is shown below. The Volume Retention period is specified in the Pool resource, and the details are given in the next chapter of this manual.
Since File records in the database account for probably 80 percent of the size of the database, you should carefully determine exactly what File Retention period you need. Once the File records have been removed from the database, you will no longer be able to restore individual files in a Job. However, with Bacula version 1.37 and later, as long as the Job record still exists, you will be able to restore all files in the job.
Retention periods are specified in seconds, but as a convenience, there are a number of modifiers that permit easy specification in terms of minutes, hours, days, weeks, months, quarters, or years on the record. See the Configuration chapter of this manual for additional details of modifier specification.
The default File retention period is 60 days.
As mentioned above, once the File records are removed from the database, you will no longer be able to restore individual files from the Job. However, as long as the Job record remains in the database, you will be able to restore all the files backuped for the Job (on version 1.37 and later). As a consequence, it is generally a good idea to retain the Job records much longer than the File records.
The retention period is specified in seconds, but as a convenience, there are a number of modifiers that permit easy specification in terms of minutes, hours, days, weeks, months, quarters, or years. See the Configuration chapter of this manual for additional details of modifier specification.
The default Job Retention period is 180 days.
If you turn this off by setting it to no, your Catalog will grow each time you run a Job.
Over time, as noted above, your database will tend to grow. I've noticed that even though Bacula regularly prunes files, MySQL does not effectively use the space, and instead continues growing. To avoid this, from time to time, you must compact your database. Normally, large commercial database such as Oracle have commands that will compact a database to reclaim wasted file space. MySQL has the OPTIMIZE TABLE command that you can use. We leave it to you to explore the utility of the OPTIMIZE TABLE command in MySQL.
All database programs have some means of writing the database out in ASCII format and then reloading it. Doing so will re-create the database from scratch producing a compacted result, so below, we show you how you can do this for MySQL, PostgreSQL.
For a MySQL database, you could write the Bacula database as an ASCII file (bacula.sql) then reload it by doing the following:
mysqldump -f --opt bacula > bacula.sql mysql bacula < bacula.sql rm -f bacula.sql
Depending on the size of your database, this will take more or less time and a fair amount of disk space. For example, if I cd to the location of the MySQL Bacula database (typically /opt/mysql/var or something similar) and enter:
du bacula
I get 620,644 which means there are that many blocks containing 1024 bytes each or approximately 635 MB of data. After doing the mysqldump, I had a bacula.sql file that had 174,356 blocks, and after doing the mysql command to recreate the database, I ended up with a total of 210,464 blocks rather than the original 629,644. In other words, the compressed version of the database took approximately one third of the space of the database that had been in use for about a year.
As a consequence, I suggest you monitor the size of your database and from time to time (once every six months or year), compress it.
If you find that you are getting errors writing to your MySQL database, or Bacula hangs each time it tries to access the database, you should consider running MySQL's database check and repair routines. The program you need to run depends on the type of database indexing you are using. If you are using the default, you will probably want to use myisamchk. For more details on how to do this, please consult the MySQL document at: www.mysql.com/doc/en/Repair.html.
If the errors you are getting are simply SQL warnings, then you might try running dbcheck before (or possibly after) using the MySQL database repair program. It can clean up many of the orphaned record problems, and certain other inconsistencies in the Bacula database.
A typical cause of MySQL database problems is if your partition fills. In such a case, you will need to create additional space on the partition or free up some space then repair the database probably using myisamchk. Recently my root partition filled and the MySQL database was corrupted. Simply running myisamchk -r did not fix the problem. However, the following script did the trick for me:
#!/bin/sh for i in *.MYD ; do mv $i x${i} t=`echo $i | cut -f 1 -d '.' -` mysql bacula <<END_OF_DATA set autocommit=1; truncate table $t; quit END_OF_DATA cp x${i} ${i} chown mysql:mysql ${i} myisamchk -r ${t} done
I invoked it with the following commands:
cd /var/lib/mysql/bacula ./repair
Then after ensuring that the database was correctly fixed, I did:
cd /var/lib/mysql/bacula rm -f x*.MYD
If you are running into the error The table 'File' is full ..., it is probably because on version 4.x MySQL, the table is limited by default to a maximum size of 4 GB and you have probably run into the limit. The solution can be found at: dev.mysql.com/doc/refman/5.0/en/full-table.html
You can display the maximum length of your table with:
mysql bacula SHOW TABLE STATUS FROM bacula like "File";
If the column labeled "Max_data_length" is around 4Gb, this is likely to be the source of your problem, and you can modify it with:
mysql bacula ALTER TABLE File MAX_ROWS=281474976710656;
Alternatively you can modify your /etc/my.conf file before creating the Bacula tables, and in the [mysqld] section set:
set-variable = myisam_data_pointer_size=6
The above myisam data pointer size must be made before you create your Bacula tables or it will have no effect.
The row and pointer size changes should already be the default on MySQL version 5.x, so making these changes should only be necessary on MySQL 4.x depending on the size of your catalog database.
dev.mysql.com/doc/refman/5.0/en/gone-away.html
Fatal error: spool.c:402 Spool data read error. Fatal error: backup.c:892 Network send error to SD. ERR=Connection reset by peer
What you need to do is setup MySQL to use a different (larger) temp directory, which can be set in the /etc/my.cnf with these variables set:
tmpdir=/path/to/larger/tmpdir bdb_tmpdir=/path/to/larger/tmpdir
The same considerations apply that are indicated above for MySQL. That is, consult the PostgreSQL documents for how to repair the database, and also consider using Bacula's dbcheck program if the conditions are reasonable for using (see above).
There are a considerable number of ways each of the databases can be tuned to improve the performance. Going from an untuned database to one that is properly tuned can make a difference of a factor of 100 or more in the time to insert or search for records.
For each of the databases, you may get significant improvements by adding additional indexes. The comments in the Bacula make_xxx_tables give some indications as to what indexes may be appropriate. Please see below for specific instructions on checking indexes.
For MySQL, what is very important is to use the examine the my.cnf file (usually in /etc/my.cnf). You may obtain significant performances by switching to the my-large.cnf or my-huge.cnf files that come with the MySQL source code.
For PostgreSQL, you might want to consider turning fsync off. Of course doing so can cause corrupted databases in the event of a machine crash. There are many different ways that you can tune PostgreSQL, the following document discusses a few of them: www.varlena.com/varlena/GeneralBits/Tidbits/perf.html.
There is also a PostgreSQL FAQ question number 3.3 that may answer some of your questions about how to improve performance of the PostgreSQL engine: www.postgresql.org/docs/faqs.FAQ.html#3.3.
Also for PostgreSQL, look at what “effective_cache_size”. For a 2GB memory machine, you probably want to set it at 131072, but don't set it too high. In addition, for a 2GB system, work_mem = 256000 and maintenance_work_mem = 256000 seem to be reasonable values. Make sure your checkpoint_segments is set to at least 8.
The most important indexes for performance are the two indexes on the File table. The first index is on FileId and is automatically made because it is the unique key used to access the table. The other one is the (JobId, PathId, Filename) index. If these Indexes are not present, your performance may suffer a lot.
psql bacula select * from pg_indexes where tablename='file';
If you do not see output that indicates that all three indexes are created, you can create the two additional indexes using:
psql bacula CREATE INDEX file_jobid_idx on file (jobid); CREATE INDEX file_jpf_idx on file (jobid, pathid, filename);
Make sure that you doesn't have an index on File (filenameid, pathid).
mysql bacula show index from File;
If the indexes are not present, especially the JobId index, you can create them with the following commands:
mysql bacula CREATE INDEX file_jobid_idx on File (JobId); CREATE INDEX file_jpf_idx on File (JobId, PathId, Filename(255));
Though normally not a problem, you should ensure that the indexes defined for Filename and Path are both set to 255 characters. Some users reported performance problems when their indexes were set to 50 characters. To check, do:
mysql bacula show index from Path;
and what is important is that for Filename, you have an index with Key_name “Name” and Sub_part “255”. For Path, you should have a Key_name “Path” and Sub_part “255”. If one or the other does not exist or the Sub_part is less that 255, you can drop and recreate the appropriate index with:
mysql bacula DROP INDEX Path on Path; CREATE INDEX Path on Path (Path(255));
Over time, as noted above, your database will tend to grow. I've noticed that even though Bacula regularly prunes files, PostgreSQL has a VACUUM command that will compact your database for you. Alternatively you may want to use the vacuumdb command, which can be run from a cron job.
All database programs have some means of writing the database out in ASCII format and then reloading it. Doing so will re-create the database from scratch producing a compacted result, so below, we show you how you can do this for PostgreSQL.
For a PostgreSQL database, you could write the Bacula database as an ASCII file (bacula.sql) then reload it by doing the following:
pg_dump -c bacula > bacula.sql cat bacula.sql | psql bacula rm -f bacula.sql
Depending on the size of your database, this will take more or less time and a fair amount of disk space. For example, you can cd to the location of the Bacula database (typically /usr/local/pgsql/data or possible /var/lib/pgsql/data) and check the size.
There are certain PostgreSQL users who do not recommend the above procedure. They have the following to say: PostgreSQL does not need to be dumped/restored to keep the database efficient. A normal process of vacuuming will prevent the database from every getting too large. If you want to fine-tweak the database storage, commands such as VACUUM FULL, REINDEX, and CLUSTER exist specifically to keep you from having to do a dump/restore.
Finally, you might want to look at the PostgreSQL documentation on this subject at www.postgresql.org/docs/8.1/interactive/maintenance.html.
On some older Bacula you may begun using Bacula with SQLite then later find that you want to switch to MySQL or Postgres for any of a number of reasons: SQLite is no longer supported by Bacula; SQLite tends to use more disk than MySQL; when the database is corrupted it is often more catastrophic than with MySQL or PostgreSQL. Several users have succeeded in converting by exporting the SQLite data and then processing it with Perl scripts prior to putting it into MySQL or PostgreSQL. This is, however, not always a simple process. Scripts are available on bacula source distribution under examples/database.
If ever the machine on which your Bacula database crashes, and you need to restore from backup tapes, one of your first priorities will probably be to recover the database. Although Bacula will happily backup your catalog database if it is specified in the FileSet, this is not a very good way to do it, because the database will be saved while Bacula is modifying it. Thus the database may be in an unstable state. Worse yet, you will backup the database before all the Bacula updates have been applied.
To resolve these problems, you need to backup the database after all the backup jobs have been run. In addition, you will want to make a copy while Bacula is not modifying it. To do so, you can use two scripts provided in the release make_catalog_backup.pl and delete_catalog_backup.pl. These files will be automatically generated along with all the other Bacula scripts. The first script will make an ASCII copy of your Bacula database into bacula.sql in the working directory you specified in your configuration, and the second will delete the bacula.sql file.
The basic sequence of events to make this work correctly is as follows:
Assuming that you start all your nightly backup jobs at 1:05 am (and that they run one after another), you can do the catalog backup with the following additional Director configuration statements:
# Backup the catalog database (after the nightly save) Job { Name = "BackupCatalog" Type = Backup Client=rufus-fd FileSet="Catalog" Schedule = "WeeklyCycleAfterBackup" Storage = DLTDrive Messages = Standard Pool = Default # WARNING!!! Passing the password via the command line is insecure. # see comments in make_catalog_backup.pl for details. RunBeforeJob = "/opt/bacula/scripts/make_catalog_backup.pl MyCatalog" RunAfterJob = "/opt/bacula/scripts/delete_catalog_backup" Write Bootstrap = "/opt/bacula/working/BackupCatalog.bsr" } # This schedule does the catalog. It starts after the WeeklyCycle Schedule { Name = "WeeklyCycleAfterBackup Run = Level=Full sun-sat at 1:10 } # This is the backup of the catalog FileSet { Name = "Catalog" Include { Options { signature=MD5 } File = /opt/bacula/working/bacula.sql } }
Be sure to write a bootstrap file as in the above example. However, it is preferable to write or copy the bootstrap file to another computer. It will allow you to quickly recover the database backup should that be necessary. If you do not have a bootstrap file, it is still possible to recover your database backup, but it will be more work and take longer.
We provide make_catalog_backup.pl as an example of what can be used to backup your Bacula database. We expect you to take security precautions relevant to your situation. make_catalog_backup.pl is designed to take a password on the command line. This is fine on machines with only trusted users. It is not acceptable on machines without trusted users. Most database systems provide a alternative method, which does not place the password on the command line.
The make_catalog_backup.pl script contains some warnings about how to use it. Please read those tips.
To help you get started, we know PostgreSQL has a password file, .pgpass, and we know MySQL has .my.cnf.
Only you can decide what is appropriate for your situation. We have provided you with a starting point. We hope it helps.
If you are running a database in production mode on your machine, Bacula will happily backup the files, but if the database is in use while Bacula is reading it, you may back it up in an unstable state.
The best solution is to shutdown your database before backing it up, or use some tool specific to your database to make a valid live copy perhaps by dumping the database in ASCII format. I am not a database expert, so I cannot provide you advice on how to do this, but if you are unsure about how to backup your database, you might try visiting the Backup Central site, which has been renamed Storage Mountain (www.backupcentral.com). In particular, their Free Backup and Recovery Software page has links to scripts that show you how to shutdown and backup most major databases.
As mentioned above, if you do not do automatic pruning, your Catalog will grow each time you run a Job. Normally, you should decide how long you want File records to be maintained in the Catalog and set the File Retention period to that time. Then you can either wait and see how big your Catalog gets or make a calculation assuming approximately 154 bytes for each File saved and knowing the number of Files that are saved during each backup and the number of Clients you backup.
For example, suppose you do a backup of two systems, each with 100,000 files. Suppose further that you do a Full backup weekly and an Incremental every day, and that the Incremental backup typically saves 4,000 files. The size of your database after a month can roughly be calculated as:
Size = 154 * No. Systems * (100,000 * 4 + 10,000 * 26)
where we have assumed four weeks in a month and 26 incremental backups per month. This would give the following:
Size = 154 * 2 * (100,000 * 4 + 10,000 * 26)or
Size = 308 * (400,000 + 260,000)or
Size = 203,280,000 bytes
So for the above two systems, we should expect to have a database size of approximately 200 Megabytes. Of course, this will vary according to how many files are actually backed up.
Below are some statistics for a MySQL database containing Job records for five Clients beginning September 2001 through May 2002 (8.5 months) and File records for the last 80 days. (Older File records have been pruned). For these systems, only the user files and system files that change are backed up. The core part of the system is assumed to be easily reloaded from the Red Hat rpms.
In the list below, the files (corresponding to Bacula tables) with the extension .MYD contain the data records whereas files with the extension .MYI contain indexes.
You will note that the File records (containing the file attributes) make up the large bulk of the number of records as well as the space used (459 Mega Bytes including the indexes). As a consequence, the most important Retention period will be the File Retention period. A quick calculation shows that for each File that is saved, the database grows by approximately 150 bytes.
Size in Bytes Records File ============ ========= =========== 168 5 Client.MYD 3,072 Client.MYI 344,394,684 3,080,191 File.MYD 115,280,896 File.MYI 2,590,316 106,902 Filename.MYD 3,026,944 Filename.MYI 184 4 FileSet.MYD 2,048 FileSet.MYI 49,062 1,326 JobMedia.MYD 30,720 JobMedia.MYI 141,752 1,378 Job.MYD 13,312 Job.MYI 1,004 11 Media.MYD 3,072 Media.MYI 1,299,512 22,233 Path.MYD 581,632 Path.MYI 36 1 Pool.MYD 3,072 Pool.MYI 5 1 Version.MYD 1,024 Version.MYI
This database has a total size of approximately 450 Megabytes.