bacula director crashing whole system

Bug #1026680 reported by Thomas Schweikle on 2012-07-19
This bug affects 2 people
Affects            Status         Importance   Assigned to   Milestone
bacula (Ubuntu)    Fix Released   Critical     Unassigned
  Precise          Fix Released   Critical     Unassigned

Bug Description

I've installed bacula-fd, bacula-sd, bacula-director, and bacula-console on an Ubuntu 10.04 system. Bacula works with a PostgreSQL database. Upgrading from 10.04 to 10.10, then 11.04 and 11.10, didn't reveal any problems or instability in bacula or postgres. After upgrading to Ubuntu 12.04 LTS the system started to crash randomly.
By now I am not able to finish even one backup cycle without bacula crashing. The host keeps running, but is no longer reachable over the network. Trying to log in at the console lets me type username and password, but after successfully logging in I am immediately logged out again. Running a shell directly on a terminal doesn't work either: if it is started on tty12, you can switch to tty12, but no keypress is accepted; the shell is dead.
Pinging the server times out. The only way to recover is to reset it or, if that is impossible, power cycle it. The system then starts again with bacula running, but all logs are truncated because of file-system repairs and log replay. There are no messages explaining why the system crashed.

Bacula runs fine while idle, and it also runs fine when backing up only a few systems; it starts crashing when backing up lots of systems with a really huge database. I assume the problem appears once the database exceeds some size but, since the logs are truncated back to the last commit, I do not have any information about which part of the system is responsible for the crashes.

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: bacula-director-pgsql 5.2.5-0ubuntu6.1
ProcVersionSignature: Ubuntu 3.2.0-26.41-virtual 3.2.19
Uname: Linux 3.2.0-26-virtual x86_64
ApportVersion: 2.0.1-0ubuntu11
Architecture: amd64
Date: Thu Jul 19 17:44:55 2012
InstallationMedia: Ubuntu-Server 10.04.1 LTS "Lucid Lynx" - Release amd64 (20100816.2)
ProcEnviron:
 TERM=screen-bce
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: bacula
UpgradeStatus: Upgraded to precise on 2012-04-12 (97 days ago)

Clint Byrum (clint-fewbar) wrote :

Thomas, this sounds pretty serious. We'll have to set up a test to try to recreate your scenario, which may take a while. Any information you can provide to help us reproduce the issue, such as config files (redacted, of course) and numbers (numbers of files, disks, etc.), will be helpful.

Thanks!

Changed in bacula (Ubuntu):
importance: Undecided → Critical
Thomas Schweikle (tps) wrote :

I'll post the configuration files. But since they contain confidential material I'd like to keep them out of public view ...

Thomas Schweikle (tps) wrote :

The problem arises as soon as the database is accessed. Shortly afterwards the bacula director exhausts all available memory. This gets various running tasks killed, including sshd, cutting off access to the server.
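
Tasks such as sshd being killed once memory runs out is what the kernel's OOM killer does, so whatever survives of the kernel log may name the victims. A minimal sketch for scanning log lines for such kills follows. The helper name `find_oom_kills` and the exact message wording (which varies between kernel versions) are assumptions, not something taken from this report.

```python
import re

# Assumed wording of the kernel's OOM-killer report line, for example:
#   "Out of memory: Kill process 812 (sshd) score 1 or sacrifice child"
# Other kernel versions phrase this differently; adjust the pattern as needed.
OOM_RE = re.compile(r"Out of memory: Kill process (\d+) \(([^)]+)\)")


def find_oom_kills(lines):
    """Return a list of (pid, process_name) for every OOM-killer report."""
    kills = []
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills
```

It could be fed an open log file, for instance `find_oom_kills(open("/var/log/kern.log", errors="replace"))`, where the path is again an assumption about where the kernel log ends up on the affected host.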

Thomas Schweikle (tps) wrote :

I did a few additional tests now. Some results:

1. I've installed bacula on a newly set up system (Ubuntu 12.04 LTS), with an empty database: works as expected.
2. I've installed bacula on a newly set up system (Ubuntu 10.04 LTS), with an empty database: works as expected.
3. I've installed bacula on a newly set up system (Ubuntu 12.04 LTS), then imported an old bacula database:
  it emits a message about a wrong database. After migrating the database it works as expected.
4. I've installed bacula on a newly set up system (Ubuntu 10.04 LTS), then imported an old bacula database:
  works as expected. Then migrated to Ubuntu 10.10, 11.04, 11.10: works as expected. Then migrated to
  Ubuntu 12.04: crashes after accessing the database to start a backup.
5. I've installed bacula on a newly set up system (Ubuntu 10.04 LTS), then imported an old bacula database:
  works as expected. Then migrated to Ubuntu 10.10, 11.04, 11.10: works as expected. Then migrated to
  Ubuntu 12.04. Wiped the database, then imported an old one, then migrated the database as described
  on the bacula pages: works as expected.
6. I've installed bacula on a newly set up Ubuntu 10.04, then imported an old database: works as expected.
  The next step was migrating to 10.10, 11.04, 11.10, testing after every migration. Each time it worked
  as expected. After migrating to 12.04 bacula crashed (as I expected). Now I tried to migrate the database
  as described on the bacula web site. This time without success: various error messages, then one stating
  that the database had already been migrated and nothing was changed.

To me it looks a lot like migrating to Ubuntu 12.04, which changes bacula from 5.0.x to 5.2.x, breaks the
database, causing bacula to crash. I did not look at what exactly causes bacula to crash; I only tried to
replicate the path I had taken that led to this crash.
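
One way to tell which of the cases above a given catalog is in is to compare the schema version stored in the catalog with the one the installed director expects. The sketch below assumes the catalog keeps that number in a `Version` table with a `VersionId` column, and that it has been read with something like `psql -t -A -c 'SELECT VersionId FROM Version;' bacula`. The helper names are illustrative, and the expected value is passed in by the caller: 14 for bacula 5.2.x and 12 for 5.0.x are my understanding, so check the release notes before relying on them.

```python
def parse_version_id(psql_output):
    """Extract the first all-digit line from the query output, or None."""
    for line in psql_output.splitlines():
        line = line.strip()
        if line.isdigit():
            return int(line)
    return None


def catalog_status(psql_output, expected):
    """Classify the catalog relative to the version the director expects."""
    found = parse_version_id(psql_output)
    if found is None:
        return "unknown"
    if found == expected:
        return "current"
    if found < expected:
        return "needs-migration"
    return "newer-than-expected"
```

A catalog that reports the expected version and still crashes the director, as in case 6, would suggest a partially applied migration rather than a skipped one.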

Clint Byrum (clint-fewbar) wrote :

Thanks for all the testing, Thomas. It sounds like there may also be something specific in that database that breaks because of the migrations. I wonder, can you also try 10.04 -> 12.04 without all the intermediate steps?

Changed in bacula (Ubuntu Precise):
importance: Undecided → Critical
milestone: none → ubuntu-12.04.2
Thomas Schweikle (tps) wrote :

This is related to postgresql not using passwords when the user "postgres" is used to connect to databases. All database upgrade scripts are written to use passwords, but the upgrade script used by Ubuntu switches to user postgres and then fails when a password is provided. Other parts of the script use the provided bacula user and then fail because of missing privileges (these are only needed while upgrading, not for normal operation). After this messy routine has run, you have a database that more or less works but breaks bacula in various places, simply because the database layout isn't what bacula expects (some upgrade steps fail because a password is given where none is defined, others because the necessary password isn't provided).
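
Whether a local connection as a given user needs a password is decided by the rules in pg_hba.conf. A rough sketch for listing the methods configured for local (Unix socket) connections follows, assuming the plain whitespace-separated rule format. The function name is made up for illustration, and quoted names containing spaces are not handled.

```python
def local_auth_methods(hba_text):
    """Return (database, user, method) for each 'local' rule in the text."""
    rules = []
    for raw in hba_text.splitlines():
        # Drop trailing comments, then split the rule into its fields.
        fields = raw.split("#", 1)[0].split()
        # A 'local' rule has the form: local DATABASE USER METHOD [options]
        if len(fields) >= 4 and fields[0] == "local":
            rules.append((fields[1], fields[2], fields[3]))
    return rules
```

A rule for user postgres with a method such as `peer` or `ident` next to a rule for other users with `md5` would be consistent with the mixed behaviour described above: one part of an upgrade script connects without a password while another part needs one.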

While searching for this error I also found that the scripts provided to create the bacula database are not part of the install: create-bacula-pgsql-db and create-bacula-mysql-db are simply missing. I expect these scripts where the bacula docs say they are: in /etc/bacula/scripts, nowhere else, and certainly without having to dig through additional packages to find them. It is bad behaviour not to install them with the bacula-director-mysql or bacula-director-pgsql packages. I have mentioned this before and now again find that these scripts are not provided. Are there any arguments against providing them?

James Page (james-page) wrote :

Hi Thomas

Thanks for the additional information. Could you please confirm how you would expect the database upgrade to take place in your environment?

The packages provide upgrade scripts for dbconfig-common; but if you are not using this method I can see how you might hit issues.

James Page (james-page) wrote :

I can see one thing that might be causing this even with dbconfig-common; this is present in the latest packages in quantal:

ALTER DATABASE _DBC_DBNAME_ SET datestyle TO 'ISO, YMD';

But it's not present in precise; neither is it applied during upgrades AFAICT.

James Page (james-page) wrote :

Hmm - but it actually appears to be OK anyway:

ubuntu@server-16254:/var/log/dist-upgrade$ sudo -u postgres psql bacula
Use of uninitialized value $lib_path in concatenation (.) or string at /usr/bin/psql line 116.
psql (8.4.13)
Type "help" for help.

bacula=# show datestyle;
 DateStyle
-----------
 ISO, YMD
(1 row)

bacula=#

This is from an installation upgraded from 10.04 to 12.04.

James Page (james-page) wrote :

Note that some of the standard scripts for maintaining a bacula postgresql database can be found in /usr/share/bacula-director.

However the create_* scripts are not installed with the packages.

James Page (james-page) wrote :

Thomas

FYI I suspect the reason the scripts are installed to /usr/share/bacula-director is that /etc/bacula is reserved for configuration files; these files are not configuration files that users would be expected to edit so they are installed to /usr/share/bacula-director instead.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in bacula (Ubuntu Precise):
status: New → Confirmed
Changed in bacula (Ubuntu):
status: New → Confirmed
Colin Watson (cjwatson) on 2013-02-13
Changed in bacula (Ubuntu Precise):
milestone: ubuntu-12.04.2 → ubuntu-12.04.3
James Page (james-page) on 2013-02-13
Changed in bacula (Ubuntu Precise):
milestone: ubuntu-12.04.3 → none
Thomas Schweikle (tps) on 2014-11-02
Changed in bacula (Ubuntu Precise):
status: Confirmed → Fix Released
Changed in bacula (Ubuntu):
status: Confirmed → Fix Released