Catalog Maintenance

Without proper setup and maintenance, your catalog may keep growing as jobs run and data is backed up, and it can also become inefficient and slow. How fast the catalog grows depends on the number of jobs and on how many files each job saves. Deleting entries from the catalog frees space for the entries of subsequent jobs. Regularly deleting expired data (records older than the configured retention periods) keeps the catalog database at a roughly constant size.

You can start with the default configuration, which already contains reasonable settings for a small number of clients (fewer than 5). In that case, catalog maintenance will not be urgent as long as you have a few hundred megabytes of free disk space. In any case, some knowledge of the retention periods of the data in the catalog and on the Volumes is helpful.

Setting Retention Periods

Bacula uses three different retention periods: the File Retention period, the Job Retention period, and the Volume Retention period. Of these three, the File Retention period is the one that most determines how large the database will become.

The File Retention and Job Retention periods are specified in the Client resource, as shown below. The Volume Retention period is specified in the Pool resource; details are given in the next chapter of this manual.

File Retention = <time-period-specification>

The File Retention period defines how long File records are kept in the catalog database. If AutoPrune is set to yes in the Client resource, Bacula deletes the File records that are older than this period. Pruning takes place after a job for the corresponding client has finished. Note that the Client record in the database contains a copy of the File and Job retention periods, but Bacula uses the values from the current Client resource in the Director's configuration when it prunes old catalog records.

Since File records make up about 80 percent of the catalog database, you should consider carefully how long you want to keep them. Once the File records have been removed, individual files of that job can no longer be restored with a restore job. However, Bacula versions 1.37 and later can still restore all files of a job as long as its Job record is present in the catalog.

Retention periods are specified in seconds, but for convenience a number of modifiers are available so that you can specify minutes, hours, days, weeks, months, quarters, and years. See the Configuration chapter of this manual for details on these modifiers.

The default File Retention period is 60 days.
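To make the relationship between the modifiers and the underlying seconds concrete, here is a minimal shell sketch. The function name to_seconds is hypothetical and not part of Bacula. Only fixed-length units are handled; months, quarters, and years are deliberately left out, because their lengths depend on definitions given in the Configuration chapter.

```shell
#!/bin/sh
# Hypothetical helper: convert "<number> <unit>" into seconds.
# Only fixed-length units are covered in this sketch.
to_seconds() {
  n=$1
  case $2 in
    second|seconds) echo "$n" ;;
    minute|minutes) echo $((n * 60)) ;;
    hour|hours)     echo $((n * 3600)) ;;
    day|days)       echo $((n * 86400)) ;;
    week|weeks)     echo $((n * 604800)) ;;
    *) echo "unsupported unit: $2" >&2; return 1 ;;
  esac
}

to_seconds 60 days     # prints 5184000 (the 60-day File Retention default)
to_seconds 180 days    # prints 15552000
```

In the configuration itself you would normally write the modifier form (for example 60 days) rather than the number of seconds.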

Job Retention = <time-period-specification>

The Job Retention period defines how long Job records are kept in the catalog database. If AutoPrune is set to yes in the Client resource, Bacula deletes the Job records that are older than this period. Note that when a Job record is deleted, all File and JobMedia records belonging to that job are deleted from the catalog as well, regardless of the File Retention period. As a consequence, the File Retention period is normally shorter than the Job Retention period.

As mentioned above, once the File records have been removed from the catalog, you can no longer restore individual files of a job. However, as long as the Job record is still in the catalog, you can restore the complete job with all of its files (Bacula version 1.37 and later). It is therefore a good idea to keep Job records in the catalog longer than File records.

The default Job Retention period is 180 days.

AutoPrune = <yes/no>

If AutoPrune is set to yes (the default), Bacula checks after every job whether the retention period of any File and/or Job records of the client just backed up has expired, and removes those records from the catalog. If you turn AutoPrune off by setting it to no, your catalog database will grow with every job that runs.

Compacting Your MySQL Database

As noted above, your database will tend to grow over time. Even though Bacula regularly prunes File records, the MySQL database will keep growing in size. To avoid this, the database must be compacted. Large commercial databases such as Oracle normally provide commands that reclaim the wasted space. MySQL has the OPTIMIZE TABLE command, and SQLite (version 2.8.4 and later) has the VACUUM command for this purpose. We leave it to you to explore the usefulness of OPTIMIZE TABLE or VACUUM.

All database systems provide tools to write the data out in ASCII format to a file and to read that file back in. Doing so re-creates the database from scratch, which results in a very compact database. Below we show how to do this for MySQL, SQLite, and PostgreSQL.

For a MySQL database, you can write the catalog to an ASCII file (bacula.sql) and re-import it with the following commands:

mysqldump -f --opt bacula > bacula.sql
mysql bacula < bacula.sql
rm -f bacula.sql

Depending on the size of your database, this will take more or less time and disk space. For example, if I change to the directory where my MySQL database is located (typically /var/lib/mysql) and run:

du bacula

I get the output 620,644, which means that the bacula directory occupies 620,644 blocks of 1024 bytes on disk, so my database contains about 635 MB of data. After running mysqldump, the resulting file bacula.sql is 174,356 blocks in size. When this file is re-imported with mysql bacula < bacula.sql, the database shrinks to only 210,464 blocks. In other words, the compacted version of my database, which has been in use for about a year, is only about one third of its previous size.
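The arithmetic behind these figures can be checked with a few lines of shell. This is only a sketch; the function names blocks_to_mb and percent_of are made up for illustration, and the results are rounded down because shell arithmetic works on integers.

```shell
#!/bin/sh
# du reports 1024-byte blocks; convert a block count to decimal megabytes.
blocks_to_mb() {
  echo $(( $1 * 1024 / 1000000 ))
}

# Size of "after" as a whole-number percentage of "before" (rounded down).
percent_of() {
  echo $(( $1 * 100 / $2 ))
}

blocks_to_mb 620644          # prints 635 (size before compaction, in MB)
blocks_to_mb 210464          # prints 215 (size after compaction, in MB)
percent_of 210464 620644     # prints 33  (about one third)
```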

It is therefore recommended that you keep an eye on the size of your database and compact it from time to time (every six months or once a year).

Repairing Your MySQL Database

If you notice that writes to the MySQL database produce errors, or that the Director hangs when it accesses the database, you should look at the MySQL database check and repair programs. Which program to run depends on the database indexing scheme you use. If you use the default, you will probably want to run myisamchk. For more information, see: http://dev.mysql.com/doc/refman/5.1/de/client-utility-programs.html.

If the errors are simple SQL warnings, first run the dbcheck program supplied with Bacula before resorting to the MySQL repair programs. dbcheck can find orphaned database records and fix other inconsistencies in the catalog database.

A typical cause of database problems is a partition that fills up. In that case, you must either add space or free up used space before the database can be repaired with myisamchk.

Here is an example of how to repair a corrupted database if, after a partition has filled up, myisamchk -r fails to fix the problem.

Copy the following lines into a shell script named repair:

#!/bin/sh
for i in *.MYD ; do
  mv $i x${i}
  t=`echo $i | cut -f 1 -d '.' -`
  mysql bacula <<END_OF_DATA
set autocommit=1;
truncate table $t;
quit
END_OF_DATA
  cp x${i} ${i}
  chown mysql:mysql ${i}
  myisamchk -r ${t}
done

Then run the script as follows:

cd /var/lib/mysql/bacula
./repair

Once you have verified that the database works correctly again, you can delete the old database files:

cd /var/lib/mysql/bacula
rm -f x*.MYD

MySQL Table Is Full

If you get an error such as The table 'File' is full ..., the likely cause is that MySQL 4.x limits the table size to 4 GB by default and you have reached that limit. The maximum possible table size is discussed at http://dev.mysql.com/doc/refman/5.1/de/table-size.html

You can display the maximum table size with:

mysql bacula
SHOW TABLE STATUS FROM bacula like "File";

If the max_data_length column shows about 4 GB, this is the problem. You can raise the maximum size with:

mysql bacula
ALTER TABLE File MAX_ROWS=281474976710656;

Alternatively, you can edit /etc/my.cnf before creating the Bacula tables. In the [mysqld] section, set:

set-variable = myisam_data_pointer_size=6

The MyISAM data pointer size must be set before the Bacula catalog database or its tables are created in order to take effect.
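The numbers in this section are related by simple powers of two: an N-byte pointer can address 2^(8*N) bytes. The sketch below illustrates this under the assumption that the 4 GB limit corresponds to a 4-byte data pointer. The function name addressable_bytes is hypothetical, and the shell must support 64-bit arithmetic. Note that MAX_ROWS counts rows rather than bytes; the value used above merely coincides with 2^48.

```shell
#!/bin/sh
# Number of bytes addressable with an N-byte data pointer: 2^(8*N).
# Requires a shell with 64-bit integer arithmetic.
addressable_bytes() {
  echo $(( 1 << (8 * $1) ))
}

addressable_bytes 4    # prints 4294967296      (4 GB)
addressable_bytes 6    # prints 281474976710656 (the MAX_ROWS value above)
```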

The MAX_ROWS and pointer size changes should not be necessary with MySQL 5.x and later, so they are needed only for MySQL 4.x, depending on the size of your catalog database.

MySQL Server Has Gone Away

If your MySQL server becomes unreachable, or you see messages such as "MySQL server has gone away", please read the MySQL documentation at:

http://dev.mysql.com/doc/refman/5.1/de/gone-away.html

Repairing Your PostgreSQL Database

The same considerations given above for MySQL apply here. Consult the PostgreSQL documentation to learn how to repair your database. Also consider running Bacula's dbcheck program if it seems appropriate (see above).

Database Performance

There are many ways to tune the databases supported by Bacula to improve performance. Between a poorly tuned and a well tuned database, the speed of inserting or looking up records can differ by a factor of 100 or more.

For each of the databases, you can expect significant improvements from adding further indexes. The comments in the Bacula make_xxx_tables scripts (e.g. make_postgres_tables) give some hints as to which indexes are useful. See below for details on how to check your indexes.

For MySQL, it is very important to review the my.cnf file (usually /etc/my.cnf). You may get a substantial performance gain by using the my-large.cnf or my-huge.cnf configuration files that come with the MySQL source code.

For SQLite3, an important point is that the configuration contains "PRAGMA synchronous = NORMAL;". This lengthens the intervals at which the database flushes its in-memory cache to disk. Other PRAGMA settings can improve efficiency further, but they also increase the risk of database corruption if the system crashes.

For PostgreSQL, you may want to consider turning "fsync" off, but this too can lead to database problems after a system crash. There are many ways to improve PostgreSQL performance; the following page explains a few of them: http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html.

The PostgreSQL FAQ also contains hints on improving performance: http://www.postgresql.org/docs/faqs.FAQ.html.

For PostgreSQL, also pay attention to "effective_cache_size". On a system with 2 GB of memory you can set it to 131072, but do not set it too high. In addition, "work_mem = 256000" and "maintenance_work_mem = 256000" are reasonable values for systems with 2 GB of memory. Make sure that "checkpoint_segments" is set to at least 8.
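To see what these unitless PostgreSQL values mean in megabytes, the following sketch converts them. It assumes the usual PostgreSQL conventions that a unitless effective_cache_size is counted in 8 kB pages and that work_mem and maintenance_work_mem are counted in kilobytes; check the documentation of your PostgreSQL version before relying on this. The function names are hypothetical.

```shell
#!/bin/sh
# Assumption: unitless effective_cache_size is given in 8 kB pages.
pages_to_mb() {
  echo $(( $1 * 8 / 1024 ))
}

# Assumption: unitless work_mem / maintenance_work_mem are given in kB.
kb_to_mb() {
  echo $(( $1 / 1024 ))
}

pages_to_mb 131072   # prints 1024 (half of a 2 GB system)
kb_to_mb 256000      # prints 250
```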

Database Performance and Indexes

One of the most important steps in ensuring good performance of the Bacula database is to verify that all indexes are in place. Several users have reported that their database did not have all the necessary indexes in the default configuration. Depending on your usage, you may also need additional indexes.

The most important indexes for a fast database are the three indexes on the File table. The first index is on FileId and is created automatically because it is the unique key used to access the table. The other two are the JobId index and the (Filename, PathId) index. If either of these is missing, performance suffers enormously.

PostgreSQL Indexes

For PostgreSQL, you can check whether you have all the indexes with the following commands:

psql bacula
select * from pg_indexes where tablename='file';

If the output does not show all three indexes, you can create the two additional indexes with:

psql bacula
CREATE INDEX file_jobid_idx on file (jobid);
CREATE INDEX file_fp_idx on file (filenameid, pathid);

MySQL Indexes

For MySQL, you can check the indexes with:

mysql bacula
show index from File;

If any indexes are missing, in particular the JobId index, you can create them with:

mysql bacula
CREATE INDEX file_jobid_idx on File (JobId);
CREATE INDEX file_jpf_idx on File (JobId, FilenameId, PathId);

Although this is normally not a problem, you should make sure that your indexes on the Filename and Path tables both cover 255 characters. Some users have reported problems with indexes limited to 50 characters. To check, run:

mysql bacula
show index from Filename;
show index from Path;

For the file names, you need an index with Key_name "Name" and Sub_part "255". For the paths, you need an index with Key_name "Path" and Sub_part "255". If either index does not exist or its Sub_part is less than 255, you can re-create it with the following commands:

mysql bacula
DROP INDEX Path on Path;
CREATE INDEX Path on Path (Path(255));

DROP INDEX Name on Filename;
CREATE INDEX Name on Filename (Name(255));

SQLite Indexes

For SQLite, you can check the indexes as follows:

sqlite <path>bacula.db
select * from sqlite_master where type='index' and tbl_name='File';

If an index is missing, in particular the JobId index, you can create it with the following commands:

sqlite <path>bacula.db
CREATE INDEX file_jobid_idx on File (JobId);
CREATE INDEX file_jfp_idx on File (JobId, FilenameId, PathId);

Compacting Your PostgreSQL Database

As noted above, your database will grow over time. Even though Bacula regularly prunes old data, the PostgreSQL VACUUM command helps keep the database compact. Alternatively, you may want to use the vacuumdb command, which can be run from cron.

All database systems provide tools to write the data out to an ASCII file and read it back in. Doing so rebuilds the database from scratch and produces a more compact version. The following example shows how to do this for PostgreSQL.

For a PostgreSQL database, you can dump the data to an ASCII file and reload it with these commands:

pg_dump -c bacula > bacula.sql
cat bacula.sql | psql bacula
rm -f bacula.sql

Depending on the size of your database, this will take more or less time and disk space. Before you start, change to the working directory of your database (typically /var/lib/postgres/data) and check its size.

Some PostgreSQL users do not recommend the procedure above. In their view, dumping and reloading the data is unnecessary with PostgreSQL: running VACUUM regularly is enough to keep the database performing well. If you want to be thorough, you can use the VACUUM FULL, REINDEX, and CLUSTER commands instead of going through a dump and reload.

Finally, you may want to look at the relevant PostgreSQL documentation at: http://www.postgresql.org/docs/8.2/interactive/maintenance.html.

Compacting Your SQLite Database

First, please read the previous sections, which explain why a database needs to be compacted. SQLite version 2.8.4 and later provides the VACUUM command for compacting the database:

cd <working-directory>
echo 'vacuum;' | sqlite bacula.db

Alternatively, you can use the following commands, adapted to your system:

cd <working-directory>
echo '.dump' | sqlite bacula.db > bacula.sql
rm -f bacula.db
sqlite bacula.db < bacula.sql
rm -f bacula.sql

Here, working-directory is the directory that you specified in your Director's configuration. Note that with SQLite the old database must be deleted completely before the compacted version can be created.

Migrating from SQLite to MySQL

If you started using Bacula with SQLite, there are several reasons why you may later want to switch to MySQL: SQLite uses more disk space than MySQL for the same amount of data, and a damaged SQLite database is harder to recover than a MySQL or PostgreSQL one. Many users have migrated successfully from SQLite to MySQL by exporting the data and then converting it, for example with a Perl script, into a format suitable for import into MySQL. This is, however, not a simple process.

Backing Up Your Bacula Database

If the machine on which your Bacula installation runs ever crashes and you have to restore it, one of your first steps will be to restore the database. Although Bacula will happily back up the catalog database if it is listed in the FileSet, this is not a good approach, because Bacula modifies the database while it is being backed up. The saved database is therefore likely to be in an inconsistent state. Worse still, the database is backed up before Bacula has had a chance to apply all its updates.

To avoid this problem, back up the database after all backup jobs have run. In addition, you will want to make a copy of the database while Bacula is not updating it. To do so, you can use the two scripts make_catalog_backup and delete_catalog_backup that are supplied with your Bacula release. These files are generated automatically along with the other Bacula scripts. The first script creates an ASCII copy of your database named bacula.sql in the working directory that you specified in the configuration. The second script deletes the bacula.sql file again.

The basic sequence of steps for doing this correctly is as follows:

Run all your backup jobs.
Once all jobs have finished, start a Catalog backup job.
The Catalog backup job must be scheduled after the other backup jobs.
Use RunBeforeJob to create the ASCII backup file and RunAfterJob to delete it again.

Assuming that all your backup jobs start at 01:05 at night, you can run the Catalog backup with the following additional Director configuration:

# Catalog database backup (after the nightly save)
Job {
  Name = "BackupCatalog"
  Type = Backup
  Client=rufus-fd
  FileSet="Catalog"
  Schedule = "WeeklyCycleAfterBackup"
  Storage = DLTDrive
  Messages = Standard
  Pool = Default
  # WARNING!!! Passing the password via the command line is insecure.
  # Please read the comments in the make_catalog_backup file.
  RunBeforeJob = "/home/kern/bacula/bin/make_catalog_backup"
  RunAfterJob  = "/home/kern/bacula/bin/delete_catalog_backup"
  Write Bootstrap = "/home/kern/bacula/working/BackupCatalog.bsr"
}
# This schedule runs the catalog backup after the other backups
Schedule {
  Name = "WeeklyCycleAfterBackup"
  Run = Level=Full sun-sat at 1:10
}
# The FileSet for the ASCII copy of the database
FileSet {
  Name = "Catalog"
  Include {
    Options {
      signature=MD5
    }
    File = <working_directory>/bacula.sql
  }
}

Make sure that a bootstrap file is written, as in the example. Preferably, a copy of this bootstrap file is stored on another computer. This allows the database to be restored quickly if necessary. Without a bootstrap file, a restore is still possible, but it takes more work and more time.

Security Considerations

The make_catalog_backup script is provided as an example of how to back up your Bacula database. We expect you to take precautions appropriate to your situation. make_catalog_backup is designed so that the password is passed on the command line. This is fine as long as only trusted users can log in to the system; otherwise it is unacceptable. Most database systems offer an alternative way to supply the password without passing it on the command line.

The make_catalog_backup script contains some warnings about this. Please read the comments in the script.

With PostgreSQL, for example, you can use a password file (see .pgpass), and MySQL has .my.cnf.
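As a rough illustration of the two password files mentioned above, the sketch below writes a .my.cnf and a .pgpass into a given directory with owner-only permissions. The function name write_password_files, the user name, the password, and the host, port, and database values are placeholders chosen for this example. Whether make_catalog_backup picks these files up without modification depends on your Bacula version, so treat this as a starting point only.

```shell
#!/bin/sh
# Hypothetical helper: write password files into directory $1.
# $2 = database user, $3 = password (placeholders, adapt to your site).
# The body runs in a subshell so the umask change does not leak out.
write_password_files() (
  dir=$1; user=$2; pass=$3
  umask 077                      # new files are readable by the owner only
  # MySQL option file: credentials go into the [client] section
  printf '[client]\nuser=%s\npassword=%s\n' "$user" "$pass" > "$dir/.my.cnf"
  # PostgreSQL password file: hostname:port:database:username:password
  printf 'localhost:5432:bacula:%s:%s\n' "$user" "$pass" > "$dir/.pgpass"
  # Also tighten permissions in case the files already existed
  chmod 600 "$dir/.my.cnf" "$dir/.pgpass"
)

# Example call (not executed here):
#   write_password_files "$HOME" bacula 'your-password'
```

The files belong in the home directory of the account that runs the catalog backup, and they must not be readable by other users.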

We hope this helps a little, but only you can judge what is required in your situation.

Backing Up Other Databases

As mentioned above, backing up database files while the database is running is likely to leave the saved files in an inconsistent state.

The best solution is to shut down the database before the backup, or to use database-specific utilities to produce a valid backup file that Bacula can then write to the Volumes. If you are unsure how best to do this with your database, the Backup Central web site may help. Under Free Backup and Recovery Software you will find links to scripts that show how to back up most of the major databases.

Database Size

If you do not prune old records from your catalog database automatically, it will grow with every backup job that runs (see above). Normally, you should decide how long you want to keep File records in the catalog and set the File Retention period accordingly. You can then either wait and see how large your catalog database becomes, or estimate it. To do so, you need to know that each file backed up takes roughly 154 bytes in the catalog database, and how many files you back up on how many machines.

For example, suppose you back up two computers, each with 100,000 files. Suppose further that you run a full backup every week and an incremental backup every day, and that an incremental backup typically saves 10,000 files. The approximate size of your database after one month can then be calculated as follows:

   Size = 154 * Number of computers * (100,000 * 4 + 10,000 * 26)

This assumes a month of 4 weeks (4 full backups) and 26 incremental backups per month. This gives:

   Size = 154 * 2 * (100,000 * 4 + 10,000 * 26)
or
   Size = 308 * (400,000 + 260,000)
or
   Size = 203,280,000 bytes

So for the two computers assumed above, we can expect the database to reach roughly 200 megabytes. Of course, this figure depends on how many files are actually backed up.
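The estimate above can be reproduced, and adapted to your own numbers, with a small shell function. This is a sketch only: the function name estimate_catalog_bytes and its argument order are made up for this example, and the 154 bytes per file is the rough figure used in this chapter.

```shell
#!/bin/sh
# Hypothetical helper: estimate the catalog size in bytes.
# $1 = bytes per File record     $2 = number of computers
# $3 = files per full backup     $4 = full backups per month
# $5 = files per incremental     $6 = incremental backups per month
estimate_catalog_bytes() {
  echo $(( $1 * $2 * ($3 * $4 + $5 * $6) ))
}

estimate_catalog_bytes 154 2 100000 4 10000 26   # prints 203280000
```

Changing a single argument shows how sensitive the result is; the number of files per full backup dominates in this example.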

Below are some statistics for a MySQL database containing Job records for 5 clients covering 8.5 months and File records covering 80 days (older File records have already been pruned). For these 5 clients, only user and system files that change frequently are backed up. All other files are assumed to be easily restorable from the operating system's software packages.

In the listing, files with the extension .MYD (one per MySQL table) contain the actual data, and files with the extension .MYI contain the indexes.

You will notice that most of the records are in File.MYD (which holds the file attributes) and that this file also takes the most disk space. The File Retention period is therefore the main factor that determines how large the database becomes. A quick calculation shows that the database grows by roughly 154 bytes for every file backed up.

      Size in
        Bytes    Records  File
 ============  =========  ===========
          168          5  Client.MYD
        3,072             Client.MYI
  344,394,684  3,080,191  File.MYD
  115,280,896             File.MYI
    2,590,316    106,902  Filename.MYD
    3,026,944             Filename.MYI
          184          4  FileSet.MYD
        2,048             FileSet.MYI
       49,062      1,326  JobMedia.MYD
       30,720             JobMedia.MYI
      141,752      1,378  Job.MYD
       13,312             Job.MYI
        1,004         11  Media.MYD
        3,072             Media.MYI
    1,299,512     22,233  Path.MYD
      581,632             Path.MYI
           36          1  Pool.MYD
        3,072             Pool.MYI
            5          1  Version.MYD
        1,024             Version.MYI

The database is about 450 megabytes in size.

Had we used SQLite, determining the total database size would have been much easier, since SQLite keeps all data in a single file. On the other hand, we could not have seen so easily which of the tables takes the most space.

SQLite databases may be up to 50% larger than MySQL databases holding the same data, because SQLite stores all data as ASCII strings. Even binary data is represented as an ASCII string, which seems to increase the space used.

Automatic Volume Recycling

By default, once Bacula starts writing a Volume, it can append to the Volume, but it will not overwrite the existing data and thereby destroy it. However, when Bacula recycles a Volume, the Volume becomes available for reuse, and Bacula can at some later time overwrite its previous contents, so all previous data will be lost. If the Volume is a tape, the tape will be rewritten from the beginning. If the Volume is a disk file, the file will be truncated before being rewritten.

You may not want Bacula to automatically recycle (reuse) tapes. This would require a large number of tapes though, and in such a case, it is possible to manually recycle tapes. For more on manual recycling, see the section entitled Manually Recycling Volumes below in this chapter.

Most people prefer to have a Pool of tapes that are used for daily backups and recycled once a week, another Pool of tapes that are used for Full backups once a week and recycled monthly, and finally a Pool of tapes that are used once a month and recycled after a year or two. With a scheme like this, the number of tapes in your pool or pools remains constant.

By properly defining your Volume Pools with appropriate Retention periods, Bacula can manage the recycling (such as defined above) automatically.

Automatic recycling of Volumes is controlled by three records in the Pool resource definition in the Director's configuration file. These three records are:

AutoPrune = yes
VolumeRetention = <time>
Recycle = yes

Automatic recycling of Volumes is performed by Bacula only when it wants a new Volume and no appendable Volumes are available in the Pool. It will then search the Pool for any Volumes with the Recycle flag set and whose Volume Status is Full. At that point, the recycling occurs in two steps. The first is that the Catalog for a Volume must be purged of all Jobs and Files contained on that Volume, and the second step is the actual recycling of the Volume. The Volume will be purged if the VolumeRetention period has expired. When a Volume is marked as Purged, it means that no Catalog records reference that Volume, and the Volume can be recycled. Until recycling actually occurs, the Volume data remains intact. If no Volumes can be found for recycling for any of the reasons stated above, Bacula will request operator intervention (i.e. it will ask you to label a new volume).

A key point mentioned above, which can be a source of frustration, is that Bacula recycles purged Volumes only if no other appendable Volume is available. Otherwise, it always writes to an appendable Volume first, even if there are Volumes marked as Purged. The reason is that Bacula wants to preserve the data on your old tapes (even though it has been purged from the catalog) for as long as possible before overwriting it. So, if you wish to "force" Bacula to use a purged Volume, you must first ensure that no other Volume in the Pool is marked Append. If necessary, you can manually set a Volume to Full.

Automatic Pruning

As Bacula writes files to tape, it keeps a list of files, jobs, and volumes in a database called the catalog. Among other things, the database helps Bacula to decide which files to back up in an incremental or differential backup, and helps you locate files on past backups when you want to restore something. However, the catalog will grow larger and larger as time goes on, and eventually it can become unacceptably large.

Bacula's process for removing entries from the catalog is called Pruning. The default is Automatic Pruning, which means that once an entry reaches a certain age (e.g. 30 days old) it is removed from the catalog. Once a job has been pruned, you can still restore it from the backup tape, but one additional step is required: scanning the volume with bscan. The alternative to Automatic Pruning is Manual Pruning, in which you explicitly tell Bacula to erase the catalog entries for a volume. You'd usually do this when you want to reuse a Bacula volume, because there's no point in keeping a list of files that USED TO BE on a tape. Or, if the catalog is starting to get too big, you could prune the oldest jobs to save space. Manual pruning is done with the prune command in the console. (thanks to Bryce Denney for the above explanation).

Pruning Directives

There are three pruning durations. All apply to catalog database records and not to the actual data in a Volume. The pruning (or retention) durations are for Volumes (Media records), Jobs (Job records), and Files (File records). The durations are somewhat interdependent: if Bacula prunes a Volume, it automatically removes all the Job records and all the File records for that Volume. Likewise, when a Job record is pruned, all the File records for that Job are also pruned (deleted) from the catalog.

Having the File records in the database means that you can examine all the files backed up for a particular Job. They take the most space in the catalog (probably 90-95% of the total). When the File records are pruned, the Job records can remain, and you can still examine what Jobs ran, but not the details of the Files backed up. In addition, without the File records, you cannot use the Console restore command to restore the files.

When a Job record is pruned, the Volume (Media record) for that Job can still remain in the database, and if you do a "list volumes", you will see the volume information, but the Job records (and its File records) will no longer be available.

In each case, pruning removes information about where older files are, but it also prevents the catalog from growing too large. You choose the retention periods based on how many files you are backing up, how long you want to keep those records online, and the size of the database. You can always re-insert the records (with 98% of the original data) by using "bscan" to scan in a whole Volume or any part of the volume that you want.

By setting AutoPrune to yes you will permit Bacula to automatically prune all Volumes in the Pool when a Job needs another Volume. Volume pruning means removing records from the catalog. It does not shrink the size of the Volume or affect the Volume data until the Volume gets overwritten. When a Job requests another volume and there are no Volumes with Volume Status Append available, Bacula will begin volume pruning. This means that all Jobs that are older than the VolumeRetention period will be pruned from every Volume that has Volume Status Full or Used and has Recycle set to yes. Pruning consists of deleting the corresponding Job, File, and JobMedia records from the catalog database. No change to the physical data on the Volume occurs during the pruning process. When all files are pruned from a Volume (i.e. no records in the catalog), the Volume will be marked as Purged implying that no Jobs remain on the volume. The Pool records that control the pruning are described below.

AutoPrune = <yes|no>

If AutoPrune is set to yes (the default), Bacula will automatically apply the Volume retention period when a running Job needs a new Volume and no appendable Volumes are available. At that point, Bacula will prune all Volumes that can be pruned (i.e. those with AutoPrune set) in an attempt to find a usable Volume. If all files are pruned from a Volume during the autoprune, it will be marked with VolStatus Purged. Note that even when all its File and Job records have been pruned from the catalog, a Volume will be marked Purged (and hence ready for recycling) only if its Volume status is Append, Full, Used, or Error. If the Volume has another status, such as Archive, Read-Only, Disabled, Busy, or Cleaning, the Volume status will not be changed to Purged.
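
The status rule in the paragraph above can be written as a small predicate. This is a sketch of the rule as stated here, not Bacula's code; the function name and the remaining_jobs parameter are made up for the illustration.

```python
# Statuses the text lists as eligible to become Purged once no Jobs remain.
PURGEABLE_STATUSES = {"Append", "Full", "Used", "Error"}


def status_after_autoprune(vol_status, remaining_jobs):
    """Return the Volume status after an autoprune pass (illustrative helper)."""
    if remaining_jobs == 0 and vol_status in PURGEABLE_STATUSES:
        return "Purged"
    # Archive, Read-Only, Disabled, Busy, Cleaning, ... are left untouched,
    # as is any Volume that still has Job records in the catalog.
    return vol_status
```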

Volume Retention = <time-period-specification>

The Volume Retention record defines the length of time that Bacula will guarantee that the Volume is not reused counting from the time the last job stored on the Volume terminated.

When this time period expires, and if AutoPrune is set to yes, and a new Volume is needed, but no appendable Volume is available, Bacula will prune (remove) Job records that are older than the specified Volume Retention period.

The Volume Retention period takes precedence over any Job Retention period you have specified in the Client resource. It should also be noted that the Volume Retention period is obtained by reading the Catalog Database Media record rather than the Pool resource record. This means that if you change the VolumeRetention in the Pool resource record, you must ensure that the corresponding change is made in the catalog by using the update pool command. Doing so will ensure that any new Volumes are created with the changed Volume Retention period. Any existing Volumes will have their own copy of the Volume Retention period, which can only be changed on a Volume by Volume basis using the update volume command.

When all file catalog entries are removed from the volume, its VolStatus is set to Purged. The files remain physically on the Volume until the volume is overwritten.

Retention periods are specified in seconds, minutes, hours, days, weeks, months, quarters, or years on the record. See the Configuration chapter of this manual for additional details of time specification.

The default is 1 year.
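
Because the catalog listings later in this chapter show the retention period in seconds, it can help to convert by hand. The sketch below is a hypothetical helper, not Bacula's parser. The lengths used for months, quarters, and years (30, 91, and 365 days) are assumptions for the illustration; check the Configuration chapter for the exact definitions.

```python
SECONDS_PER_UNIT = {
    "seconds": 1,
    "minutes": 60,
    "hours": 60 * 60,
    "days": 24 * 60 * 60,
    "weeks": 7 * 24 * 60 * 60,
    # Assumed approximations, not taken from this chapter:
    "months": 30 * 24 * 60 * 60,
    "quarters": 91 * 24 * 60 * 60,
    "years": 365 * 24 * 60 * 60,
}


def retention_seconds(value, unit):
    """Convert a retention period such as (10, 'days') to seconds."""
    return value * SECONDS_PER_UNIT[unit]


print(retention_seconds(10, "days"))   # 864000
print(retention_seconds(1, "years"))   # 31536000 under the 365-day assumption
```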

Recycle = <yes|no>

This statement tells Bacula whether or not the particular Volume can be recycled (i.e. rewritten). If Recycle is set to no, then even if Bacula prunes all the Jobs on the volume and it is marked Purged, it will not consider the tape for recycling. If Recycle is set to yes and all Jobs have been pruned, the volume status will be set to Purged and the volume may then be reused when another volume is needed. If the volume is reused, it is relabeled with the same Volume Name; however, all previous data will be lost.

It is also possible to "force" pruning of all Volumes in the Pool associated with a Job by adding Prune Files = yes to the Job resource.

Recycling Algorithm

After all Volumes of a Pool have been pruned (as mentioned above, this happens when a Job needs a new Volume and no appendable Volumes are available), Bacula will look for the oldest Volume that is Purged (all Jobs and Files expired), and if the Recycle flag is on (Recycle=yes) for that Volume, Bacula will relabel it and write new data on it.

The full algorithm that Bacula uses when it needs a new Volume is:

1. Search the Pool for a Volume with VolStatus=Append (if there is more than one, the Volume with the oldest date last written is chosen. If two have the same date then the one with the lowest MediaId is chosen).
2. Search the Pool for a Volume with VolStatus=Recycle and the InChanger flag set to true (if there is more than one, the Volume with the oldest date last written is chosen. If two have the same date then the one with the lowest MediaId is chosen).
3. Try recycling any purged Volumes.
4. Prune volumes applying the Volume retention period (Volumes with VolStatus Full, Used, or Append are pruned).
5. Search the Pool for a Volume with VolStatus=Purged.
6. If InChanger was set, go back to the first step above, but this second time, ignore the InChanger flag in step 2.
7. Attempt to create a new Volume if automatic labeling is enabled. If Python is enabled, a Python NewVolume event is generated before the Label Format check is used.
8. If a Pool named "Scratch" exists, search for a Volume and if found move it to the current Pool for the Job and use it.
9. Prune the oldest Volume if RecycleOldestVolume=yes (the Volume with the oldest LastWritten date and VolStatus equal to Full, Recycle, Purged, Used, or Append is chosen). This record ensures that all retention periods are properly respected.
10. Purge the oldest Volume if PurgeOldestVolume=yes (the Volume with the oldest LastWritten date and VolStatus equal to Full, Recycle, Purged, Used, or Append is chosen). We strongly recommend against the use of PurgeOldestVolume as it can quite easily lead to loss of current backup data.
11. Give up and ask the operator.

The above occurs when Bacula has finished writing a Volume or when no Volume is present in the drive.
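
To make the selection order easier to follow, here is a deliberately simplified Python sketch of the search. It models only the Append and Recycle searches, the retention-based pruning, and the choice of a Purged Volume. The InChanger flag, automatic labeling, the Scratch Pool, RecycleOldestVolume, and PurgeOldestVolume are left out. The Volume class, its field names, and next_volume are hypothetical and are not part of Bacula.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    media_id: int
    status: str          # "Append", "Full", "Used", "Recycle", "Purged", ...
    last_written: int    # seconds since some epoch
    newest_job_end: int  # end time of the last Job stored on the Volume
    retention: int       # Volume Retention in seconds
    recycle: bool = True


def oldest(volumes, status):
    """Oldest LastWritten wins; ties go to the lowest MediaId."""
    matches = [v for v in volumes if v.status == status]
    return min(matches, key=lambda v: (v.last_written, v.media_id), default=None)


def next_volume(pool, now):
    """Pick the next Volume to write, or None if the operator must intervene."""
    # Prefer an appendable Volume, then one already marked Recycle.
    for status in ("Append", "Recycle"):
        vol = oldest(pool, status)
        if vol is not None:
            return vol
    # Apply the Volume Retention period. No Append Volume exists at this
    # point, so only Full and Used Volumes need to be examined.
    for vol in pool:
        if vol.status in ("Full", "Used") and now - vol.newest_job_end >= vol.retention:
            vol.status = "Purged"
    # Recycle the oldest Purged Volume whose Recycle flag is set.
    purged = [v for v in pool if v.status == "Purged" and v.recycle]
    vol = min(purged, key=lambda v: (v.last_written, v.media_id), default=None)
    if vol is not None:
        vol.status = "Recycle"
    return vol
```

Note that an appendable Volume always wins over a Purged one in this sketch, which is the behavior described at the start of this section.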

On the other hand, if you have inserted a different Volume after the last job, and Bacula recognizes the Volume as valid, it will request authorization from the Director to use this Volume. In this case, if you have set Recycle Current Volume = yes and the Volume is marked as Used or Full, Bacula will prune the volume and if all jobs were removed during the pruning (respecting the retention periods), the Volume will be recycled and used. The recycling algorithm in this case is:

If the VolStatus is Append or Recycle and Accept Any Volume is set, the volume will be used.
If Recycle Current Volume is set and the volume is marked Full or Used, Bacula will prune the volume (applying the retention period). If all Jobs are pruned from the volume, it will be recycled.
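
The two rules above amount to a short decision function. The following is a sketch under the assumption that pruning leaves behind only Jobs still inside their retention period; the function and parameter names are invented for the example.

```python
def can_use_mounted_volume(status, unexpired_jobs, accept_any_volume,
                           recycle_current_volume):
    """Decide whether the Volume found in the drive may be written (illustrative)."""
    if status in ("Append", "Recycle") and accept_any_volume:
        return True
    if recycle_current_volume and status in ("Full", "Used"):
        # Pruning removes only Jobs past their retention period; the Volume
        # is recycled only when nothing is left afterwards.
        return unexpired_jobs == 0
    return False
```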

This permits users to manually change the Volume every day and load tapes in an order different from what is in the catalog, and if the volume does not contain a current copy of your backup data, it will be used.

Recycle Status

Each Volume inherits the Recycle status (yes or no) from the Pool resource record when the Media record is created (normally when the Volume is labeled). This Recycle status is stored in the Media record of the Catalog. Using the Console program, you may subsequently change the Recycle status for each Volume. For example in the following output from list volumes:

+----------+-------+--------+---------+------------+--------+-----+
| VolumeNa | Media | VolSta | VolByte | LastWritte | VolRet | Rec |
+----------+-------+--------+---------+------------+--------+-----+
| File0001 | File  | Full   | 4190055 | 2002-05-25 | 14400  | 1   |
| File0002 | File  | Full   | 1896460 | 2002-05-26 | 14400  | 1   |
| File0003 | File  | Full   | 1896460 | 2002-05-26 | 14400  | 1   |
| File0004 | File  | Full   | 1896460 | 2002-05-26 | 14400  | 1   |
| File0005 | File  | Full   | 1896460 | 2002-05-26 | 14400  | 1   |
| File0006 | File  | Full   | 1896460 | 2002-05-26 | 14400  | 1   |
| File0007 | File  | Purged | 1896466 | 2002-05-26 | 14400  | 1   |
+----------+-------+--------+---------+------------+--------+-----+

all the volumes are marked as recyclable, and the last Volume, File0007 has been purged, so it may be immediately recycled. The other volumes are all marked recyclable and when their Volume Retention period (14400 seconds or 4 hours) expires, they will be eligible for pruning, and possibly recycling. Even though Volume File0007 has been purged, all the data on the Volume is still recoverable. A purged Volume simply means that there are no entries in the Catalog. Even if the Volume Status is changed to Recycle, the data on the Volume will be recoverable. The data is lost only when the Volume is re-labeled and re-written.

To modify Volume File0001 so that it cannot be recycled, you use the update volume pool=File command in the console program, or simply update and Bacula will prompt you for the information.

+----------+------+-------+---------+-------------+-------+-----+
| VolumeNa | Media| VolSta| VolByte | LastWritten | VolRet| Rec |
+----------+------+-------+---------+-------------+-------+-----+
| File0001 | File | Full  | 4190055 | 2002-05-25  | 14400 | 0   |
| File0002 | File | Full  | 1897236 | 2002-05-26  | 14400 | 1   |
| File0003 | File | Full  | 1896460 | 2002-05-26  | 14400 | 1   |
| File0004 | File | Full  | 1896460 | 2002-05-26  | 14400 | 1   |
| File0005 | File | Full  | 1896460 | 2002-05-26  | 14400 | 1   |
| File0006 | File | Full  | 1896460 | 2002-05-26  | 14400 | 1   |
| File0007 | File | Purged| 1896466 | 2002-05-26  | 14400 | 1   |
+----------+------+-------+---------+-------------+-------+-----+

In this case, File0001 will never be automatically recycled. The same effect can be achieved by setting the Volume Status to Read-Only.

Making Bacula Use a Single Tape

Most people will want Bacula to fill a tape and, when it is full, mount a new tape, and so on. However, as an extreme example, it is possible for Bacula to write on a single tape, and to rewrite it every night. To get this to work, you must do two things: first, set the VolumeRetention to less than your save period (one day), and second, make Bacula mark the tape as full after using it once. The latter is done using UseVolumeOnce = yes. If this record is not used and the tape is not full after the first time it is written, Bacula will simply append to the tape and eventually request another volume. Using the tape only once forces the tape to be marked Full after each use, and the next time Bacula runs, it will recycle the tape.

An example Pool resource that does this is:

Pool {
  Name = DDS-4
  Use Volume Once = yes
  Pool Type = Backup
  AutoPrune = yes
  VolumeRetention = 12h # expire after 12 hours
  Recycle = yes
}

A Daily, Weekly, Monthly Tape Usage Example

This example is meant to show you how one could define a fixed set of volumes that Bacula will rotate through on a regular schedule. There are an infinite number of such schemes, all of which have various advantages and disadvantages.

We start with the following assumptions:

A single tape has more than enough capacity to do a full save.
There are 10 tapes that are used on a daily basis for incremental backups. They are prelabeled Daily1 ... Daily10.
There are 4 tapes that are used on a weekly basis for full backups. They are labeled Week1 ... Week4.
There are 12 tapes that are used on a monthly basis for full backups. They are numbered Month1 ... Month12.
A full backup is done early every Saturday morning (tape inserted Friday evening before leaving work).
No backups are done over the weekend (this is easy to change).
The first Friday of each month, a Monthly tape is used for the Full backup.
Incremental backups are done Monday - Friday (actually Tue-Fri mornings).

We start the system by doing a Full save to one of the weekly volumes or one of the monthly volumes. The next morning, we remove the tape and insert a Daily tape. Friday evening, we remove the Daily tape and insert the next tape in the Weekly series. Monday, we remove the Weekly tape and re-insert the Daily tape. On the first Friday of the next month, we insert the next Monthly tape in the series rather than a Weekly tape, then continue. When a Daily tape finally fills up, Bacula will request the next one in the series, and the next day when you notice the email message, you will mount it and Bacula will finish the unfinished incremental backup.

What does this give? Well, at any point, you will have the last complete Full save plus several Incremental saves. For any given file you want to recover (or your whole system), you will have a copy of that file every day for at least the last 14 days. For older versions, you will have at least 3 and probably 4 Friday full saves of that file, and going back further, you will have a copy of that file made on the beginning of the month for at least a year.

So you have copies of any file (or your whole system) for at least a year, but as you go back in time, the time between copies increases from daily to weekly to monthly.

What would the Bacula configuration look like to implement such a scheme?

Schedule {
  Name = "NightlySave"
  Run = Level=Full Pool=Monthly 1st sat at 03:05
  Run = Level=Full Pool=Weekly 2nd-5th sat at 03:05
  Run = Level=Incremental Pool=Daily tue-fri at 03:05
}
Job {
  Name = "NightlySave"
  Type = Backup
  Level = Full
  Client = LocalMachine
  FileSet = "File Set"
  Messages = Standard
  Storage = DDS-4
  Pool = Daily
  Schedule = "NightlySave"
}
# Definition of file storage device
Storage {
  Name = DDS-4
  Address = localhost
  SDPort = 9103
  Password = XXXXXXXXXXXXX
  Device = FileStorage
  Media Type = 8mm
}
FileSet {
  Name = "File Set"
  Include = signature=MD5 {
    fffffffffffffffff
  }
  Exclude = { *.o }
}
Pool {
  Name = Daily
  Pool Type = Backup
  AutoPrune = yes
  VolumeRetention = 10d   # recycle in 10 days
  Maximum Volumes = 10
  Recycle = yes
}
Pool {
  Name = Weekly
  Use Volume Once = yes
  Pool Type = Backup
  AutoPrune = yes
  VolumeRetention = 30d  # recycle in 30 days
  Recycle = yes
}
Pool {
  Name = Monthly
  Use Volume Once = yes
  Pool Type = Backup
  AutoPrune = yes
  VolumeRetention = 365d  # recycle in 1 year
  Recycle = yes
}
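
To check which tape series a given night will ask for, the three Run lines of the NightlySave Schedule can be mirrored in a few lines of Python. This is only a reading aid for the Schedule above, not something Bacula executes; the function name pool_for is made up.

```python
from datetime import date


def pool_for(day):
    """Return (level, pool) for a calendar day under the NightlySave schedule, or None."""
    weekday = day.weekday()          # Monday == 0 ... Sunday == 6
    if weekday == 5:                 # Saturday
        # Days 1-7 of a month always contain its first Saturday.
        return ("Full", "Monthly") if day.day <= 7 else ("Full", "Weekly")
    if 1 <= weekday <= 4:            # Tuesday through Friday
        return ("Incremental", "Daily")
    return None                      # Sunday and Monday: nothing scheduled


print(pool_for(date(2002, 6, 1)))    # ('Full', 'Monthly')
print(pool_for(date(2002, 6, 8)))    # ('Full', 'Weekly')
print(pool_for(date(2002, 6, 4)))    # ('Incremental', 'Daily')
```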

Automatic Pruning and Recycling Example

Perhaps the best way to understand the various resource records that come into play during automatic pruning and recycling is to run a Job that goes through the whole cycle. If you add the following resources to your Director's configuration file:

Schedule {
  Name = "30 minute cycle"
  Run = Level=Full Pool=File Messages=Standard Storage=File
         hourly at 0:05
  Run = Level=Full Pool=File Messages=Standard Storage=File
         hourly at 0:35
}
Job {
  Name = "Filetest"
  Type = Backup
  Level = Full
  Client=XXXXXXXXXX
  FileSet="Test Files"
  Messages = Standard
  Storage = File
  Pool = File
  Schedule = "30 minute cycle"
}
# Definition of file storage device
Storage {
  Name = File
  Address = XXXXXXXXXXX
  SDPort = 9103
  Password = XXXXXXXXXXXXX
  Device = FileStorage
  Media Type = File
}
FileSet {
  Name = "Test Files"
  Include = signature=MD5 {
    fffffffffffffffff
  }
  Exclude = { *.o }
}
Pool {
  Name = File
  Use Volume Once = yes
  Pool Type = Backup
  LabelFormat = "File"
  AutoPrune = yes
  VolumeRetention = 4h
  Maximum Volumes = 12
  Recycle = yes
}

In the above, you will need to replace the ffffffffff's with the appropriate files to be saved for your configuration. For the FileSet Include, choose a directory that has one or two megabytes maximum since there will probably be approximately 8 copies of the directory that Bacula will cycle through.

In addition, you will need to add the following to your Storage daemon's configuration file:

Device {
  Name = FileStorage
  Media Type = File
  Archive Device = /tmp
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}

With the above resources, Bacula will start a Job every half hour that saves a copy of the directory you chose to /tmp/File0001 ... /tmp/File0012. After 4 hours, Bacula will start recycling the backup Volumes (/tmp/File0001 ...). You should see this happening in the output produced. Bacula will automatically create the Volumes (Files) the first time it uses them.
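
The figure of roughly 8 copies follows from the 4 hour Volume Retention and the 30 minute interval. The idealized simulation below shows the arithmetic. It assumes Jobs take no time, that a Volume becomes prunable exactly when its age reaches the retention period, and that pruning is tried before a new Volume is labeled; a real run may differ by one Volume. All names are hypothetical.

```python
def simulate(interval_min=30, retention_min=240, max_volumes=12, runs=24):
    """Return the Volume index used by each run (idealized: Jobs take no time)."""
    written_at = []          # written_at[i] = minute Volume i was last written
    used = []
    for run in range(runs):
        now = run * interval_min
        # Every Volume is Used after one Job, so pruning is tried first.
        expired = [i for i, t in enumerate(written_at) if now - t >= retention_min]
        if expired:
            vol = min(expired, key=lambda i: written_at[i])   # oldest first
            written_at[vol] = now
        elif len(written_at) < max_volumes:
            written_at.append(now)                            # label a new Volume
            vol = len(written_at) - 1
        else:
            raise RuntimeError("no Volume available; operator needed")
        used.append(vol)
    return used


print(len(set(simulate())))   # 8 distinct Volumes in this idealized model
```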

To turn it off, either delete all the resources you've added, or simply comment out the Schedule record in the Job resource.

Manually Recycling Volumes

Although automatic recycling of Volumes is implemented in version 1.20 and later (see the Automatic Recycling of Volumes chapter of this manual), you may want to manually force reuse (recycling) of a Volume.

Assuming that you want to keep the Volume name, but you simply want to write new data on the tape, the steps to take are:

Use the update volume command in the Console to ensure that the Recycle field is set to 1.
Use the purge jobs volume command in the Console to mark the Volume as Purged. Check by using list volumes.

Once the Volume is marked Purged, it will be recycled the next time a Volume is needed.

If you wish to reuse the tape by giving it a new name, use the following steps:

Use the purge jobs volume command in the Console to mark the Volume as Purged. Check by using list volumes.
In Bacula version 1.30 or greater, use the Console relabel command to relabel the Volume.

Please note that the relabel command applies only to tape Volumes.

For Bacula versions prior to 1.30 or to manually relabel the Volume, use the instructions below:

Use the delete volume command in the Console to delete the Volume from the Catalog.
If a different tape is mounted, use the unmount command, remove the tape, and insert the tape to be renamed.
Write an EOF mark in the tape using the following commands:
```
  mt -f /dev/nst0 rewind
  mt -f /dev/nst0 weof
```
where you replace /dev/nst0 with the appropriate device name on your system.
Use the label command to write a new label to the tape and to enter it in the catalog.

Please be aware that the delete command can be dangerous. Once it is done, to recover the File records, you must either restore your database as it was before the delete command, or use the bscan utility program to scan the tape and recreate the database entries.

Basic Volume Management

This chapter presents most of the features needed to do Volume management. Most of the concepts apply equally well to both tape and disk Volumes. However, the chapter was originally written to explain backing up to disk, so you will see it is slanted in that direction, but all the directives presented here apply equally well whether your volume is disk or tape.

If you have a lot of hard disk storage or you absolutely must have your backups run within a small time window, you may want to direct Bacula to back up to disk Volumes rather than tape Volumes. This chapter is intended to give you some of the options that are available to you so that you can manage either disk or tape volumes.

Key Concepts and Resource Records

Getting Bacula to write to disk rather than tape in the simplest case is rather easy. In the Storage daemon's configuration file, you simply define an Archive Device to be a directory. For example, if you want your disk backups to go into the directory /home/bacula/backups, you could use the following:

Device {
  Name = FileBackup
  Media Type = File
  Archive Device = /home/bacula/backups
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}

Assuming you have the appropriate Storage resource in your Director's configuration file that references the above Device resource,

Storage {
  Name = FileStorage
  Address = ...
  Password = ...
  Device = FileBackup
  Media Type = File
}

Bacula will then write the archive to the file /home/bacula/backups/<volume-name> where <volume-name> is the volume name of a Volume defined in the Pool. For example, if you have labeled a Volume named Vol001, Bacula will write to the file /home/bacula/backups/Vol001. Although you can later move the archive file to another directory, you should not rename it or it will become unreadable by Bacula. This is because each archive has the filename as part of the internal label, and the internal label must agree with the system filename before Bacula will use it.
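
The naming rule can be illustrated as follows. This sketch only restates the paragraph above: the helper names are hypothetical, and the real check is performed by the Storage daemon when it reads the Volume label.

```python
import posixpath


def volume_path(archive_device, volume_name):
    """File that holds a disk Volume: <Archive Device>/<volume-name>."""
    return posixpath.join(archive_device, volume_name)


def name_matches_label(path, internal_label_name):
    """Moving the file keeps its basename; renaming it breaks the match."""
    return posixpath.basename(path) == internal_label_name


print(volume_path("/home/bacula/backups", "Vol001"))   # /home/bacula/backups/Vol001
```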

Although this is quite simple, there are a number of problems. The first is that unless you specify otherwise, Bacula will always write to the same volume until you run out of disk space. This problem is addressed below.

In addition, if you want to use concurrent jobs that write to several different volumes at the same time, you will need to understand a number of other details. An example of such a configuration is given at the end of this chapter under Concurrent Disk Jobs.

Pool Options to Limit the Volume Usage

Some of the options you have, all of which are specified in the Pool record, are:

To write each Volume only once (i.e. one Job per Volume or file in this case), use:
UseVolumeOnce = yes.
To write nnn Jobs to each Volume, use:
Maximum Volume Jobs = nnn.
To limit the maximum size of each Volume, use:
Maximum Volume Bytes = mmmm.
To limit the use time (i.e. write the Volume for a maximum of 5 days), use:
Volume Use Duration = ttt.

Note that although you probably would not want to limit the number of bytes on a tape as you would on a disk Volume, the other options can be very useful in limiting the time Bacula will use a particular Volume (be it tape or disk). For example, the above directives can allow you to ensure that you rotate through a set of daily Volumes if you wish.

As mentioned above, each of those directives is specified in the Pool or Pools that you use for your Volumes. In the case of Maximum Volume Jobs, Maximum Volume Bytes, and Volume Use Duration, you can actually specify the desired value on a Volume by Volume basis. The value specified in the Pool record becomes the default when labeling new Volumes. Once a Volume has been created, it gets its own copy of the Pool defaults, and subsequently changing the Pool will have no effect on existing Volumes. You can either manually change the Volume values, or refresh them from the Pool defaults using the update volume command in the Console. As an example of the use of one of the above, suppose your Pool resource contains:

Pool {
  Name = File
  Pool Type = Backup
  Volume Use Duration = 23h
}

then if you run a backup once a day (every 24 hours), Bacula will use a new Volume for each backup, because each Volume it writes can only be used for 23 hours after the first write. Note, setting the use duration to 23 hours is not a very good solution for tapes unless you have someone on-site during the weekends, because Bacula will want a new Volume and no one will be present to mount it, so no weekend backups will be done until Monday morning.
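
The effect of the use duration on Volume turnover can be checked with a small model. It assumes that a Volume stops accepting new Jobs once the use duration has elapsed since its first write, and it ignores Job run time; the function name is made up.

```python
def volume_for_each_run(run_hours, use_duration_hours):
    """Assign a Volume number to each Job start time (hours since the first run)."""
    assignments = []
    volume = 0
    first_write = None
    for start in run_hours:
        if first_write is None or start - first_write >= use_duration_hours:
            volume += 1            # previous Volume's use duration has expired
            first_write = start
        assignments.append(volume)
    return assignments


print(volume_for_each_run([0, 24, 48, 72], 23))   # [1, 2, 3, 4]
print(volume_for_each_run([0, 24, 48, 72], 25))   # [1, 1, 2, 2]
```

With a 23 hour duration every daily backup lands on a fresh Volume, while a duration slightly longer than the backup interval lets two consecutive backups share one.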

Automatic Volume Labeling

Use of the above records brings up another problem -- that of labeling your Volumes. For automated disk backup, you can either manually label each of your Volumes, or you can have Bacula automatically label new Volumes when they are needed. The automatic Volume labeling in version 1.30 and prior is a bit simplistic, but it does allow for automation. The features added in version 1.31 permit automatic creation of a wide variety of labels, including information from environment variables and special Bacula Counter variables. In version 1.37 and later, it is probably much better to use Python scripting and the NewVolume event, since generating Volume labels in a Python script is much easier than trying to figure out Counter variables. See the Python Scripting chapter of this manual for more details.

Please note that automatic Volume labeling can also be used with tapes, but it is not nearly so practical since the tapes must be pre-mounted. This requires some user interaction. Automatic labeling from templates does NOT work with autochangers since Bacula will not access unknown slots. There are several methods of labeling all volumes in an autochanger magazine. For more information on this, please see the Autochanger chapter of this manual.

Automatic Volume labeling is enabled by making a change to both the Pool resource (Director) and to the Device resource (Storage daemon) shown above. In the case of the Pool resource, you must provide Bacula with a label format that it will use to create new names. In the simplest form, the label format is simply the Volume name, to which Bacula will append a four digit number. This number starts at 0001 and is incremented for each Volume the pool contains. Thus if you modify your Pool resource to be:

Pool {
  Name = File
  Pool Type = Backup
  Volume Use Duration = 23h
  LabelFormat = "Vol"
}

Bacula will create Volume names Vol0001, Vol0002, and so on when new Volumes are needed. Much more complex and elaborate labels can be created using variable expansion defined in the Variable Expansion chapter of this manual.
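
For the simplest form of the label format, the naming rule can be sketched as below. This covers only the fixed-prefix case described above; variable expansion and how Bacula handles names that already exist are not modeled, and the function is hypothetical.

```python
def next_label(label_format, volumes_in_pool):
    """Simplest Label Format case: a fixed prefix plus a four-digit counter."""
    return f"{label_format}{volumes_in_pool + 1:04d}"


print(next_label("Vol", 0))   # Vol0001
print(next_label("Vol", 1))   # Vol0002
```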

The second change that is necessary to make automatic labeling work is to give the Storage daemon permission to automatically label Volumes. Do so by adding LabelMedia = yes to the Device resource as follows:

Device {
  Name = File
  Media Type = File
  Archive Device = /home/bacula/backups
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
  LabelMedia = yes
}

You can find more details of the Label Format Pool record in the Label Format description of the Pool resource records.

Restricting the Number of Volumes and Recycling

Automatic labeling discussed above brings up the problem of Volume management. With the above scheme, a new Volume will be created every day. If you have not specified Retention periods, your Catalog will continue to fill keeping track of all the files Bacula has backed up, and this procedure will create one new archive file (Volume) every day.

The tools Bacula gives you to help automatically manage these problems are the following:

Catalog file record retention periods, the File Retention = ttt record in the Client resource.
Catalog job record retention periods, the Job Retention = ttt record in the Client resource.
The AutoPrune = yes record in the Client resource to permit application of the above two retention periods.
The Volume Retention = ttt record in the Pool resource.
The AutoPrune = yes record in the Pool resource to permit application of the Volume retention period.
The Recycle = yes record in the Pool resource to permit automatic recycling of Volumes whose Volume retention period has expired.
The Recycle Oldest Volume = yes record in the Pool resource tells Bacula to Prune the oldest volume in the Pool, and if all files were pruned to recycle this volume and use it.
The Recycle Current Volume = yes record in the Pool resource tells Bacula to Prune the currently mounted volume in the Pool, and if all files were pruned to recycle this volume and use it.
The Purge Oldest Volume = yes record in the Pool resource permits a forced recycling of the oldest Volume when a new one is needed. N.B. This record ignores retention periods! We highly recommend not using this record, but using Recycle Oldest Volume instead.
The Maximum Volumes = nnn record in the Pool resource to limit the number of Volumes that can be created.

The first three records (File Retention, Job Retention, and AutoPrune) determine the amount of time that Job and File records will remain in your Catalog, and they are discussed in detail in the Automatic Volume Recycling chapter of this manual.

Volume Retention, AutoPrune, and Recycle determine how long Bacula will keep your Volumes before reusing them, and they are also discussed in detail in the Automatic Volume Recycling chapter of this manual.

The Maximum Volumes record can also be used in conjunction with the Volume Retention period to limit the total number of archive Volumes (files) that Bacula will create. By setting an appropriate Volume Retention period, a Volume will be purged just before it is needed and thus Bacula can cycle through a fixed set of Volumes. Cycling through a fixed set of Volumes can also be done by setting Recycle Oldest Volume = yes or Recycle Current Volume = yes. In this case, when Bacula needs a new Volume, it will prune the specified volume.

Concurrent Disk Jobs

Above, we discussed how you could have a single device named FileBackup that writes to volumes in /home/bacula/backups. You can, in fact, run multiple concurrent jobs using the Storage definition given with this example, and all the jobs will simultaneously write into the Volume that is being written.

Now suppose you want to use multiple Pools, which means multiple Volumes, or suppose you want each client to have its own Volume and perhaps its own directory such as /home/bacula/client1 and /home/bacula/client2 ... With the single Storage and Device definition above, neither of these two is possible. Why? Because Bacula disk storage follows the same rules as tape devices. Only one Volume can be mounted on any Device at any time. If you want to simultaneously write multiple Volumes, you will need multiple Device resources in your bacula-sd.conf file, and thus multiple Storage resources in your bacula-dir.conf.

OK, so now you should understand that you need multiple Device definitions in the case of different directories or different Pools, but you also need to know that the catalog data that Bacula keeps contains only the Media Type and not the specific storage device. This permits a tape, for example, to be re-read on any compatible tape drive, with the compatibility being determined by the Media Type. The same applies to disk storage. Since a volume that is written by a Device in, say, directory /home/bacula/backups cannot be read by a Device with an Archive Device definition of /home/bacula/client1, you will not be able to restore all your files if you give both those devices Media Type = File. During the restore, Bacula will simply choose the first available device, which may not be the correct one. If this is confusing, just remember that the Director has only the Media Type and the Volume name. It does not know the Archive Device (or the full path) that is specified in the Storage daemon. Thus you must explicitly tie your Volumes to the correct Device by using the Media Type.
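
The following sketch shows why two disk Devices must not share a Media Type. It models the selection as "first device whose Media Type matches", which is how the paragraph above describes the restore. The device name Client1Backup and the function itself are hypothetical.

```python
def pick_restore_device(devices, media_type):
    """Return (name, archive_dir) of the first device with a matching Media Type."""
    for name, (dev_media_type, archive_dir) in devices.items():
        if dev_media_type == media_type:
            return name, archive_dir
    return None


# Both devices claim Media Type "File": a Volume stored in /home/bacula/client1
# is looked for in the wrong directory.
ambiguous = {
    "FileBackup": ("File", "/home/bacula/backups"),
    "Client1Backup": ("File", "/home/bacula/client1"),
}
print(pick_restore_device(ambiguous, "File"))    # ('FileBackup', '/home/bacula/backups')

# Distinct Media Types tie each Volume to the right directory.
distinct = {
    "FileBackup": ("File", "/home/bacula/backups"),
    "Client1Backup": ("File1", "/home/bacula/client1"),
}
print(pick_restore_device(distinct, "File1"))    # ('Client1Backup', '/home/bacula/client1')
```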

The example shown below shows a case where there are two clients, each using its own Pool and storing their Volumes in different directories.

An Example

The following example is not very practical, but can be used to demonstrate the proof of concept in a relatively short period of time. The example consists of two clients that are backed up to a set of 12 archive files (Volumes) for each client, in different directories on the Storage machine. Each Volume is used (written) only once, and there are four Full saves done every hour (so the whole thing cycles around after three hours).

What is key here is that each physical device on the Storage daemon has a different Media Type. This allows the Director to choose the correct device for restores ...

The Director's configuration file is as follows:

Director {
  Name = my-dir
  QueryFile = "~/bacula/bin/query.sql"
  PidDirectory = "~/bacula/working"
  WorkingDirectory = "~/bacula/working"
  Password = dir_password
}
Schedule {
  Name = "FourPerHour"
  Run = Level=Full hourly at 0:05
  Run = Level=Full hourly at 0:20
  Run = Level=Full hourly at 0:35
  Run = Level=Full hourly at 0:50
}
Job {
  Name = "RecycleExample"
  Type = Backup
  Level = Full
  Client = Rufus
  FileSet= "Example FileSet"
  Messages = Standard
  Storage = FileStorage
  Pool = Recycle
  Schedule = FourPerHour
}

Job {
  Name = "RecycleExample2"
  Type = Backup
  Level = Full
  Client = Roxie
  FileSet= "Example FileSet"
  Messages = Standard
  Storage = FileStorage1
  Pool = Recycle1
  Schedule = FourPerHour
}

FileSet {
  Name = "Example FileSet"
  Include = compression=GZIP signature=SHA1 {
    /home/kern/bacula/bin
  }
}
Client {
  Name = Rufus
  Address = rufus
  Catalog = BackupDB
  Password = client_password
}

Client {
  Name = Roxie
  Address = roxie
  Catalog = BackupDB
  Password = client1_password
}

Storage {
  Name = FileStorage
  Address = rufus
  Password = local_storage_password
  Device = RecycleDir
  Media Type = File
}

Storage {
  Name = FileStorage1
  Address = rufus
  Password = local_storage_password
  Device = RecycleDir1
  Media Type = File1
}

Catalog {
  Name = BackupDB
  dbname = bacula; user = bacula; password = ""
}
Messages {
  Name = Standard
  ...
}
Pool {
  Name = Recycle
  Use Volume Once = yes
  Pool Type = Backup
  LabelFormat = "Recycle-"
  AutoPrune = yes
  VolumeRetention = 2h
  Maximum Volumes = 12
  Recycle = yes
}

Pool {
  Name = Recycle1
  Use Volume Once = yes
  Pool Type = Backup
  LabelFormat = "Recycle1-"
  AutoPrune = yes
  VolumeRetention = 2h
  Maximum Volumes = 12
  Recycle = yes
}

and the Storage daemon's configuration file is:

Storage {
  Name = my-sd
  WorkingDirectory = "~/bacula/working"
  Pid Directory = "~/bacula/working"
  MaximumConcurrentJobs = 10
}
Director {
  Name = my-dir
  Password = local_storage_password
}
Device {
  Name = RecycleDir
  Media Type = File
  Archive Device = /home/bacula/backups
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}

Device {
  Name = RecycleDir1
  Media Type = File1
  Archive Device = /home/bacula/backups1
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}

Messages {
  Name = Standard
  director = my-dir = all
}

With a little bit of work, you can change the above example into a weekly or monthly cycle (take care about the amount of archive disk space used).

Backing up to Multiple Disks

Bacula can, of course, use multiple disks, but in general, each disk must be a separate Device specification in the Storage daemon's conf file, and you must then select which clients to back up to each disk. You will also want to give each Device specification a different Media Type so that during a restore, Bacula will be able to find the appropriate drive.

The situation is a bit more complicated if you want to treat two different physical disk drives (or partitions) logically as a single drive, which Bacula does not directly support. However, it is possible to back up your data to multiple disks as if they were a single drive by linking the Volumes from the first disk to the second disk.

For example, assume that you have two disks named /disk1 and /disk2. If you then create a standard Storage daemon Device resource for backing up to the first disk, it will look like the following:

Device {
  Name = client1
  Media Type = File
  Archive Device = /disk1
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}

Since there is no way to get the above Device resource to reference both /disk1 and /disk2, we do it by pre-creating symbolic links in /disk1 that point to Volumes on /disk2, with the following:

ln -s /disk2/Disk2-vol001 /disk1/Disk2-vol001
ln -s /disk2/Disk2-vol002 /disk1/Disk2-vol002
ln -s /disk2/Disk2-vol003 /disk1/Disk2-vol003
...

At this point, you can label the Volumes as Volume Disk2-vol001, Disk2-vol002, ... and Bacula will use them as if they were on /disk1 but actually write the data to /disk2. The only minor inconvenience with this method is that you must explicitly name the Volumes and cannot use automatic labeling unless you arrange to have the labels exactly match the links you have created.
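If many links are needed, they can be generated rather than typed. The following is a minimal sketch, assuming a Unix-like system where symbolic links are available. The function name `link_volumes` is hypothetical, and the demonstration runs in throwaway directories that merely stand in for /disk1 and /disk2.

```python
import os
import tempfile


def link_volumes(primary_dir, secondary_dir, volume_names):
    """Create symlinks in primary_dir that point at Volume files in secondary_dir."""
    links = []
    for name in volume_names:
        target = os.path.join(secondary_dir, name)
        link = os.path.join(primary_dir, name)
        if not os.path.lexists(link):   # leave existing files or links untouched
            os.symlink(target, link)    # same effect as: ln -s target link
        links.append(link)
    return links


# Demonstration in temporary directories standing in for /disk1 and /disk2.
with tempfile.TemporaryDirectory() as root:
    disk1 = os.path.join(root, "disk1")
    disk2 = os.path.join(root, "disk2")
    os.mkdir(disk1)
    os.mkdir(disk2)
    names = [f"Disk2-vol{n:03d}" for n in range(1, 4)]
    demo_links = link_volumes(disk1, disk2, names)
    demo_targets = [os.readlink(link) for link in demo_links]
    print(sorted(os.listdir(disk1)))
```

The Volume names must still be labeled in Bacula exactly as the links are named, as the text explains.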

An important thing to know is that Bacula treats disks like tape drives as much as it can. This means that you can only have a single Volume mounted at one time on a disk as defined in your Device resource in the Storage daemon's conf file. You can have multiple concurrent jobs running that all write to the one Volume that is being used, but if you want to have multiple concurrent jobs that are writing to separate disk drives (or partitions), you will need to define separate Device resources for each one, exactly as you would do for two different tape drives. There is one fundamental difference, however. The Volumes that you create on the two drives cannot be easily exchanged as they can for a tape drive, because they are physically resident (already mounted in a sense) on the particular drive. As a consequence, you will probably want to give them different Media Types so that Bacula can distinguish what Device resource to use during a restore. An example would be the following:

Device {
  Name = Disk1
  Media Type = File1
  Archive Device = /disk1
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}

Device {
  Name = Disk2
  Media Type = File2
  Archive Device = /disk2
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}

With the above device definitions, you can run two concurrent jobs each writing at the same time, one to /disk1 and the other to /disk2. The fact that you have given them different Media Types will allow Bacula to quickly choose the correct Storage resource in the Director when doing a restore.

Considerations for Multiple Clients

If we take the above example and add a second Client, here are a few considerations:

Although the second client can write to the same set of Volumes, you will probably want to write to a different set.
You can write to a different set of Volumes by defining a second Pool, which has a different name and a different LabelFormat.
If you wish the Volumes for the second client to go into a different directory (perhaps even on a different filesystem to spread the load), you would do so by defining a second Device resource in the Storage daemon. The Name must be different, and the Archive Device could be different. To ensure that Volumes are never mixed from one pool to another, you might also define a different Media Type (e.g. File1).

In this example, we have two clients, each with a different Pool and a different number of archive files retained. They also write to different directories with different Volume labeling.

The Director's configuration file is as follows:

Director {
  Name = my-dir
  QueryFile = "~/bacula/bin/query.sql"
  PidDirectory = "~/bacula/working"
  WorkingDirectory = "~/bacula/working"
  Password = dir_password
}
# Basic weekly schedule
Schedule {
  Name = "WeeklySchedule"
  Run = Level=Full fri at 1:30
  Run = Level=Incremental sat-thu at 1:30
}
FileSet {
  Name = "Example FileSet"
  Include = compression=GZIP signature=SHA1 {
    /home/kern/bacula/bin
  }
}
Job {
  Name = "Backup-client1"
  Type = Backup
  Level = Full
  Client = client1
  FileSet= "Example FileSet"
  Messages = Standard
  Storage = File1
  Pool = client1
  Schedule = "WeeklySchedule"
}
Job {
  Name = "Backup-client2"
  Type = Backup
  Level = Full
  Client = client2
  FileSet= "Example FileSet"
  Messages = Standard
  Storage = File2
  Pool = client2
  Schedule = "WeeklySchedule"
}
Client {
  Name = client1
  Address = client1
  Catalog = BackupDB
  Password = client1_password
  File Retention = 7d
}
Client {
  Name = client2
  Address = client2
  Catalog = BackupDB
  Password = client2_password
}
# Two Storage definitions with different Media Types
#  permit different directories
Storage {
  Name = File1
  Address = rufus
  Password = local_storage_password
  Device = client1
  Media Type = File1
}
Storage {
  Name = File2
  Address = rufus
  Password = local_storage_password
  Device = client2
  Media Type = File2
}
Catalog {
  Name = BackupDB
  dbname = bacula; user = bacula; password = ""
}
Messages {
  Name = Standard
  ...
}
# Two pools permit different cycling periods and Volume names
# Cycle through 15 Volumes (two weeks)
Pool {
  Name = client1
  Use Volume Once = yes
  Pool Type = Backup
  LabelFormat = "Client1-"
  AutoPrune = yes
  VolumeRetention = 13d
  Maximum Volumes = 15
  Recycle = yes
}
# Cycle through 8 Volumes (1 week)
Pool {
  Name = client2
  Use Volume Once = yes
  Pool Type = Backup
  LabelFormat = "Client2-"
  AutoPrune = yes
  VolumeRetention = 6d
  Maximum Volumes = 8
  Recycle = yes
}

and the Storage daemon's configuration file is:

Storage {
  Name = my-sd
  WorkingDirectory = "~/bacula/working"
  Pid Directory = "~/bacula/working"
  MaximumConcurrentJobs = 10
}
Director {
  Name = my-dir
  Password = local_storage_password
}
# Archive directory for Client1
Device {
  Name = client1
  Media Type = File1
  Archive Device = /home/bacula/client1
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}
# Archive directory for Client2
Device {
  Name = client2
  Media Type = File2
  Archive Device = /home/bacula/client2
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}
Messages {
  Name = Standard
  director = my-dir = all
}
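Because the Director and the Storage daemon are configured in separate files, it is easy for the Media Types to drift apart. The sketch below checks the two rules stressed in this chapter against simplified stand-ins for the resources above. The dictionaries and the function `check_media_types` are illustrative only; nothing here parses real Bacula configuration files.

```python
# Simplified stand-ins for the resources shown above.
director_storages = [
    {"Name": "File1", "Device": "client1", "Media Type": "File1"},
    {"Name": "File2", "Device": "client2", "Media Type": "File2"},
]
sd_devices = [
    {"Name": "client1", "Media Type": "File1", "Archive Device": "/home/bacula/client1"},
    {"Name": "client2", "Media Type": "File2", "Archive Device": "/home/bacula/client2"},
]


def check_media_types(storages, devices):
    """Return a list of problems; an empty list means the two sides agree."""
    problems = []
    by_name = {d["Name"]: d for d in devices}
    # Rule 1: each Director Storage must match its SD Device's Media Type.
    for s in storages:
        dev = by_name.get(s["Device"])
        if dev is None:
            problems.append(f"Storage {s['Name']}: no Device named {s['Device']}")
        elif dev["Media Type"] != s["Media Type"]:
            problems.append(f"Storage {s['Name']}: Media Type differs from Device {dev['Name']}")
    # Rule 2: Devices with different directories should not share a Media Type.
    seen = {}
    for d in devices:
        other = seen.setdefault(d["Media Type"], d)
        if other is not d and other["Archive Device"] != d["Archive Device"]:
            problems.append(f"Devices {other['Name']} and {d['Name']} share Media Type {d['Media Type']}")
    return problems


print(check_media_types(director_storages, sd_devices))  # []
```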

DVD Volumes

Bacula allows you to specify that you want to write to DVD. However, this feature is implemented only in version 1.37 or later. You may in fact write to DVD+RW, DVD+R, DVD-R, or DVD-RW media. The actual process used by Bacula is to first write the image to a spool directory, then when the Volume reaches a certain size or, at your option, at the end of a Job, Bacula will transfer the image from the spool directory to the DVD. The actual work of transferring the image is done by a script dvd-handler, and the heart of that script is a program called growisofs which allows creating or adding to a DVD ISO filesystem.

You must have dvd+rw-tools loaded on your system for DVD writing to work. Please note that the original dvd+rw-tools package does NOT work with Bacula. You must apply a patch which can be found in the patches directory of Bacula sources with the name dvd+rw-tools-5.21.4.10.8.bacula.patch.

The fact that Bacula cannot use the OS to write directly to the DVD makes the whole process a bit more error prone than writing to a disk or a tape, but nevertheless, it does work if you use some care to set it up properly. However, at the current time (28 October 2005) we still consider this code to be experimental and of BETA quality. As a consequence, please do careful testing before relying on DVD backups in production.

The remainder of this chapter explains the various directives that you can use to control the DVD writing.

DVD Specific SD Directives

The following directives are added to the Storage daemon's Device resource.

Requires Mount = Yes|No

You must set this directive to yes for DVD-writers, and to no for all other devices (tapes/files). This directive indicates whether the device must be mounted using the Mount Command. To be able to write a DVD, the following directives must also be defined: Mount Point, Mount Command, Unmount Command and Write Part Command.

Mount Point = directory

Directory where the device can be mounted.

Mount Command = name-string

Command that must be executed to mount the device. Although the device is written directly, the mount command is necessary in order to determine the free space left on the DVD. Before the command is executed, %a is replaced with the Archive Device, and %m with the Mount Point.

Most frequently, you will define it as follows:

  Mount Command = "/bin/mount -t iso9660 -o ro %a %m"

Unmount Command = name-string

Command that must be executed to unmount the device. Before the command is executed, %a is replaced with the Archive Device, and %m with the Mount Point.

Most frequently, you will define it as follows:

  Unmount Command = "/bin/umount %m"

Write Part Command = name-string

Command that must be executed to write a part to the device. Before the command is executed, %a is replaced with the Archive Device, %m with the Mount Point, %e is replaced with 1 if we are writing the first part, and with 0 otherwise, and %v with the current part filename.

For a DVD, you will most frequently specify the Bacula supplied dvd-handler script as follows:

  Write Part Command = "/path/dvd-handler %a write %e %v"

Where /path is the path to your scripts install directory, and dvd-handler is the Bacula supplied script file. This command will already be present, but commented out, in the default bacula-sd.conf file. To use it, simply remove the comment (#) symbol.
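The substitution of %a, %m, %e and %v described for these commands can be pictured with a short sketch. It only models the replacement rules stated in the text and is not the Storage daemon's code. The function `expand_command`, the mount point /mnt/dvd and the part file name are hypothetical.

```python
def expand_command(template, archive_device, mount_point, first_part, part_file):
    """Substitute %a, %m, %e and %v as described for the DVD command directives."""
    values = {
        "a": archive_device,
        "m": mount_point,
        "e": "1" if first_part else "0",
        "v": part_file,
    }
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template) and template[i + 1] in values:
            out.append(values[template[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


cmd = expand_command(
    "/path/dvd-handler %a write %e %v",
    archive_device="/dev/dvd",
    mount_point="/mnt/dvd",          # hypothetical mount point
    first_part=True,
    part_file="/tmp/Vol001.1",       # hypothetical part file name
)
print(cmd)  # /path/dvd-handler /dev/dvd write 1 /tmp/Vol001.1
```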

Free Space Command = name-string

Command that must be executed to check how much free space is left on the device. Before the command is executed, %a is replaced with the Archive Device, %m with the Mount Point, %e is replaced with 1 if we are writing the first part, and with 0 otherwise, and %v with the current part filename.

For a DVD, you will most frequently specify the Bacula supplied dvd-handler script as follows:

  Free Space Command = "/path/dvd-handler %a free"

Where /path is the path to your scripts install directory, and dvd-handler is the Bacula supplied script file. If you want to specify your own command, please look at the code in dvd-handler to see what output Bacula expects from this command. This command will already be present, but commented out, in the default bacula-sd.conf file. To use it, simply remove the comment (#) symbol.

If you do not set it, Bacula will assume that there is always free space on the device.

In addition to the directives specified above, you must also specify the other standard Device resource directives. Please see the sample DVD Device resource in the default bacula-sd.conf file. Be sure to specify the raw device name for Archive Device. It should be a name such as /dev/cdrom or /media/cdrecorder or /dev/dvd depending on your system. It will not be a name such as /mnt/cdrom.

DVD Specific Director Directives

The following directives are added to the Director's Job resource.

Write Part After Job = <yes|no>

If this directive is set to yes (default no), the data written to the temporary spool file for the current Job will be written to the DVD as a new part file after the job is finished.

It should be set to yes when writing to devices that require a mount (for example DVD), so you are sure that the current part, containing this job's data, is written to the device, and that no data is left in the temporary file on the hard disk. However, on some media, like DVD+R and DVD-R, a lot of space (about 10MB) is lost every time a part is written. So, if you run several jobs one after another, you could set this directive to no for all jobs except the last one, to avoid wasting too much space while still ensuring that the data is written to the medium when all jobs are finished.

This directive is ignored for devices other than DVDs.
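The trade-off can be put into numbers with a rough sketch. It uses the figure of about 10MB lost per written part from the text and assumes, for simplicity, that each job produces exactly one part when the directive is set to yes. The function name `overhead_mb` is illustrative.

```python
LOST_MB_PER_PART = 10   # approximate loss per written part, per the text


def overhead_mb(jobs, write_part_after_every_job):
    """Estimate space lost to part overhead for a run of consecutive jobs."""
    # Assumption: one part per job if every job writes its part,
    # otherwise a single part written after the last job.
    parts = jobs if write_part_after_every_job else 1
    return parts * LOST_MB_PER_PART


print(overhead_mb(5, True))    # 50
print(overhead_mb(5, False))   # 10
```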

Other Points

Writing and reading of DVD+RW seems to work quite reliably provided you are using the patched dvd+rw-mediainfo programs. On the other hand, we do not have enough information to ensure that DVD-RW or other forms of DVDs work correctly.
DVD+RW supports only about 1000 overwrites. Every time you mount the filesystem read/write, it counts as one write. This can add up quickly, so it is best to mount your DVD+RW filesystem read-only. Bacula does not need the DVD to be mounted read-write, since it uses the raw device for writing.
Reformatting DVD+RW 10-20 times can apparently make the medium unusable. Normally you should not have to format or reformat DVD+RW media. If it is necessary, current versions of growisofs will do so automatically.
We have had several problems writing to DVD-RWs (this does NOT concern DVD+RW), because these media have two writing-modes: Incremental Sequential and Restricted Overwrite. Depending on your device and the media you use, one of these modes may not work correctly (e.g. Incremental Sequential does not work with my NEC DVD-writer and Verbatim DVD-RW).
To retrieve the current mode of a DVD-RW, run:
```
  dvd+rw-mediainfo /dev/xxx
```
where you replace xxx with your DVD device name.
The Mounted Media line of the output should give you the information.
To set the device to Restricted Overwrite mode, run:
```
  dvd+rw-format /dev/xxx
```
If you want to set it back to the default Incremental Sequential mode, run:
```
  dvd+rw-format -blank /dev/xxx
```
Bacula will write only to blank DVDs. To quickly blank a DVD+/-RW, run this command:
```
  dd if=/dev/zero bs=1024 count=512 | growisofs -Z /dev/xxx=/dev/fd/0
```
Then try to mount the device. If it cannot be mounted, Bacula will consider it blank; if it can be mounted, try a full blank (see below).
If you wish to completely blank a DVD+/-RW, use the following:
```
  growisofs -Z /dev/xxx=/dev/zero
```
where you replace xxx with your DVD device name. However, note that this blanks the whole DVD, which takes quite a long time (16 minutes on mine).
DVD+RW and DVD-RW support only about 1000 overwrites (i.e. don't use the same medium for years if you don't want to have problems...).
For more information about DVD writing, please look at the dvd+rw-tools homepage.

Automated Disk Backup

If you manage 5 or 10 machines and have a nice tape backup, you don't need Pools, and you may wonder what they are good for. In this chapter, you will see that Pools can help you optimize disk storage space. The same techniques can be applied to a shop that has multiple tape drives, or that wants to mount various different Volumes to meet their needs.

The rest of this chapter will give an example involving backup to disk Volumes, but most of the information applies equally well to tape Volumes.

The Problem

A site that I administer (a charitable organization) had a DDS-3 tape drive that was failing. The exact reason for the failure is still unknown. Worse yet, their full backup size is about 15GB whereas the capacity of their broken DDS-3 was at best 8GB (rated 6/12). A new DDS-4 tape drive and the necessary cassettes were more expensive than their budget could handle.

The Solution

They want to maintain 6 months of backup data, and be able to access the old files on a daily basis for a week, a weekly basis for a month, then monthly for 6 months. In addition, offsite capability was not needed (well perhaps it really is, but it was never used). Their daily changes amount to about 300MB on average, or about 2GB per week.

As a consequence, the total volume of data they need to keep to meet their needs is about 100GB: (15GB x 6 + 2GB x 5 + 0.3GB x 7) = 102.1GB.
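The estimate can be reproduced in a few lines. The figures are the ones given in the text (a 15GB full backup, about 2GB of changes per week, about 0.3GB per day); the variable names are illustrative.

```python
full_gb, weekly_gb, daily_gb = 15.0, 2.0, 0.3

# Six monthly fulls, five weekly backups, seven daily backups.
total_gb = full_gb * 6 + weekly_gb * 5 + daily_gb * 7

print(round(total_gb, 1))  # 102.1
```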

The chosen solution was to buy a 120GB hard disk for next to nothing -- far less than 1/10th the price of a tape drive and the cassettes to handle the same amount of data, and to have Bacula write to disk files.

The rest of this chapter will explain how to set up Bacula so that it automatically manages a set of disk files with the minimum intervention on my part. The system has been running since 22 January 2004 until today (08 April 2004) with no intervention. Since we have not yet crossed the six month boundary, we still lack some data to be sure the system performs as desired.

Overall Design

Getting Bacula to write to disk rather than tape in the simplest case is rather easy, and is documented in the previous chapter. In addition, all the directives discussed here are explained in that chapter. We'll leave it to you to look at the details there. If you haven't read it and are not familiar with Pools, you probably should at least read it once quickly for the ideas before continuing here.

One needs to consider what happens if we have only a single large Bacula Volume defined on our hard disk. Everything works fine until the Volume fills, then Bacula will ask you to mount a new Volume. This same problem applies to the use of tape Volumes if your tape fills. Since the Volume is on a hard disk, and the only one you have, this will be a bit of a problem. It should be obvious that it is better to use a number of smaller Volumes and arrange for Bacula to automatically recycle them so that the disk storage space can be reused. The other problem with a single Volume is that at the current time (1.34.0) Bacula does not seek within a disk Volume, so restoring a single file can take more time than one would expect.

As mentioned, the solution is to have multiple Volumes, or files on the disk. To do so, we need to limit the use and thus the size of a single Volume, by time, by number of jobs, or by size. Any of these would work, but we chose to limit the use of a single Volume by putting a single job in each Volume with the exception of Volumes containing Incremental backup where there will be 6 jobs (a week's worth of data) per volume. The details of this will be discussed shortly.

The next problem to resolve is recycling of Volumes. As noted above, the requirements are to be able to restore monthly for 6 months, weekly for a month, and daily for a week. So to simplify things, why not do a Full save once a month, a Differential save once a week, and Incremental saves daily? Since each of these different kinds of saves needs to remain valid for a different period, the simplest way to do this (and possibly the only one) is to have a separate Pool for each backup type.

The decision was to use three Pools: one for Full saves, one for Differential saves, and one for Incremental saves, and each would have a different number of volumes and a different Retention period to accomplish the requirements.
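To make the scheme concrete, the sketch below maps a calendar day to a backup level and Pool. It assumes the convention used in the Schedule resource later in this chapter (Full on the first Sunday of the month, Differential on the other Sundays, Incremental from Monday to Saturday). The function `level_for` is hypothetical and is not part of Bacula.

```python
from datetime import date


def level_for(day):
    """Return (backup level, pool name) for a calendar day under the scheme above."""
    if day.weekday() == 6:                 # Sunday
        if day.day <= 7:                   # first Sunday of the month
            return "Full", "Full-Pool"
        return "Differential", "Diff-Pool"
    return "Incremental", "Inc-Pool"       # Monday through Saturday


print(level_for(date(2004, 2, 1)))   # ('Full', 'Full-Pool')
print(level_for(date(2004, 2, 8)))   # ('Differential', 'Diff-Pool')
print(level_for(date(2004, 2, 9)))   # ('Incremental', 'Inc-Pool')
```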

Full Pool

Putting a single Full backup on each Volume will require six Full save Volumes, and a retention period of six months. The Pool needed to do that is:

Pool {
  Name = Full-Pool
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 6 months
  Accept Any Volume = yes
  Maximum Volume Jobs = 1
  Label Format = Full-
  Maximum Volumes = 6
}

Since these are disk Volumes, no space is lost by having separate Volumes for each backup (done once a month in this case). The items to note are the retention period of six months (i.e. they are recycled after 6 months), that there is one job per Volume (Maximum Volume Jobs = 1), and that the Volumes will be labeled Full-0001, ... Full-0006 automatically. One could have labeled these manually from the start, but why not use the features of Bacula?
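The Volume names mentioned above follow a simple pattern: the Label Format prefix followed by a four-digit counter. The sketch below only reproduces that pattern for the names listed in the text; the function `expected_labels` is illustrative and does not call into Bacula.

```python
def expected_labels(label_format, maximum_volumes):
    """List Volume names built from a prefix plus a four-digit counter."""
    return [f"{label_format}{n:04d}" for n in range(1, maximum_volumes + 1)]


print(expected_labels("Full-", 6))
```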

Differential Pool

For the Differential backup Pool, we choose a retention period of a bit longer than a month and ensure that there is at least one Volume for each of the maximum of five weeks in a month. So the following works:

Pool {
  Name = Diff-Pool
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 40 days
  Accept Any Volume = yes
  Maximum Volume Jobs = 1
  Label Format = Diff-
  Maximum Volumes = 6
}

As you can see, the Differential Pool can grow to a maximum of six volumes, and the Volumes are retained 40 days and thereafter they can be recycled. Finally there is one job per volume. This, of course, could be tightened up a lot, but the expense here is a few GB which is not too serious.

Incremental Pool

Finally, here is the resource for the Incremental Pool:

Pool {
  Name = Inc-Pool
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 20 days
  Accept Any Volume = yes
  Maximum Volume Jobs = 6
  Label Format = Inc-
  Maximum Volumes = 5
}

We keep the data for 20 days rather than just the one week that the requirements call for. To reduce the proliferation of Volume names, we keep a week's worth of data (6 incremental backups) in each Volume. In practice, the retention period could be set to just a bit more than a week, keeping only two or three Volumes instead of five. Again, the loss is very little, and as the system reaches the full steady state, we can adjust these values so that the total disk usage doesn't exceed the disk capacity.
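A rough upper bound on disk usage for the three Pools can be estimated as follows. The Maximum Volumes and Maximum Volume Jobs values come from the Pool resources above. The per-job sizes are assumptions taken from the earlier figures (15GB per Full, about 2GB per Differential, about 0.3GB per Incremental); real Differential backups may be larger late in the month.

```python
DISK_GB = 120

pools = {
    # name: (Maximum Volumes, Maximum Volume Jobs, assumed GB per job)
    "Full-Pool": (6, 1, 15.0),
    "Diff-Pool": (6, 1, 2.0),
    "Inc-Pool": (5, 6, 0.3),
}

# Worst case: every Volume in every Pool is completely full.
usage = {name: vols * jobs * gb for name, (vols, jobs, gb) in pools.items()}
total = sum(usage.values())

for name, gb in usage.items():
    print(f"{name}: {gb:.1f} GB")
print(f"total: {total:.1f} GB of {DISK_GB} GB")
```

Under these assumptions the three Pools stay below the 120GB disk, with only a small margin to spare.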

The Actual Conf Files

The following example shows you the actual files used, with only a few minor modifications to simplify things.

The Director's configuration file is as follows:

Director {          # define myself
  Name = bacula-dir
  DIRport = 9101
  QueryFile = "/home/bacula/bin/query.sql"
  WorkingDirectory = "/home/bacula/working"
  PidDirectory = "/home/bacula/working"
  Maximum Concurrent Jobs = 1
  Password = " "
  Messages = Standard
}
#   This job backs up to disk through the File Storage resource below
Job {
  Name = client
  Type = Backup
  Client = client-fd
  FileSet = "Full Set"
  Schedule = "WeeklyCycle"
  Storage = File
  Messages = Standard
  Pool = Default
  Full Backup Pool = Full-Pool
  Incremental Backup Pool = Inc-Pool
  Differential Backup Pool = Diff-Pool
  Write Bootstrap = "/home/bacula/working/client.bsr"
  Priority = 10
}
# List of files to be backed up
FileSet {
  Name = "Full Set"
  Include = signature=SHA1 compression=GZIP9 {
    /
    /usr
    /home
  }
  Exclude = {
     /proc /tmp /.journal /.fsck
  }
}
Schedule {
  Name = "WeeklyCycle"
  Run = Full 1st sun at 1:05
  Run = Differential 2nd-5th sun at 1:05
  Run = Incremental mon-sat at 1:05
}
Client {
  Name = client-fd
  Address = client
  FDPort = 9102
  Catalog = MyCatalog
  Password = " "
  AutoPrune = yes      # Prune expired Jobs/Files
  Job Retention = 6 months
  File Retention = 60 days
}
Storage {
  Name = File
  Address = localhost
  SDPort = 9103
  Password = " "
  Device = FileStorage
  Media Type = File
}
Catalog {
  Name = MyCatalog
  dbname = bacula; user = bacula; password = ""
}
Pool {
  Name = Full-Pool
  Pool Type = Backup
  Recycle = yes           # automatically recycle Volumes
  AutoPrune = yes         # Prune expired volumes
  Volume Retention = 6 months
  Accept Any Volume = yes # write on any volume in the pool
  Maximum Volume Jobs = 1
  Label Format = Full-
  Maximum Volumes = 6
}
Pool {
  Name = Inc-Pool
  Pool Type = Backup
  Recycle = yes           # automatically recycle Volumes
  AutoPrune = yes         # Prune expired volumes
  Volume Retention = 20 days
  Accept Any Volume = yes
  Maximum Volume Jobs = 6
  Label Format = Inc-
  Maximum Volumes = 5
}
Pool {
  Name = Diff-Pool
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 40 days
  Accept Any Volume = yes
  Maximum Volume Jobs = 1
  Label Format = Diff-
  Maximum Volumes = 6
}
Messages {
  Name = Standard
  mailcommand = "bsmtp -h mail.domain.com -f \"\(Bacula\) %r\"
      -s \"Bacula: %t %e of %c %l\" %r"
  operatorcommand = "bsmtp -h mail.domain.com -f \"\(Bacula\) %r\"
      -s \"Bacula: Intervention needed for %j\" %r"
  mail = root@domain.com = all, !skipped
  operator = root@domain.com = mount
  console = all, !skipped, !saved
  append = "/home/bacula/bin/log" = all, !skipped
}

and the Storage daemon's configuration file is:

Storage {               # definition of myself
  Name = bacula-sd
  SDPort = 9103       # port on which the Storage daemon listens
  WorkingDirectory = "/home/bacula/working"
  Pid Directory = "/home/bacula/working"
}
Director {
  Name = bacula-dir
  Password = " "
}
Device {
  Name = FileStorage
  Media Type = File
  Archive Device = /files/bacula
  LabelMedia = yes;    # lets Bacula label unlabeled media
  Random Access = Yes;
  AutomaticMount = yes;   # when device opened, read it
  RemovableMedia = no;
  AlwaysOpen = no;
}
Messages {
  Name = Standard
  director = bacula-dir = all
}

Kern Sibbald 2008-01-31