Backups not compressed because no compression level set

Bug #1815450 reported by Nick Moffitt on 2019-02-11
This bug affects 1 person
Affects: PostgreSQL Charm
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

The templates/pg_backup_job.tmpl includes the following lines:

# Backup our databases. Go no further if this fails.
# And let's nice it too, since bzip has a tendency
# to spike the load average
nice -n 19 ${SCRIPTS}/dump-pg-db --dir=${DUMP_DIR} --compression=postgres ${DATABASES}

The dump-pg-db script is generated from pgbackup.py thusly:

    # The database backup script. Most of this is redundant now.
    source = os.path.join(hookenv.charm_dir(), 'scripts', 'pgbackup.py')
    destination = os.path.join(scripts_dir, 'dump-pg-db')
    with open(source, 'r') as f:
        helpers.write(destination, f.read(), mode=0o755)

The script in question has the following logic:

        if options.compression_cmd == 'postgres':
            cmd = " ".join([
                cmd,
                "--compress=%d" % options.compression_level
                    if options.compression_level else "",
                "--file=%s" % dest,
                database])

So, as there is no mechanism to pass a compression level to the pg_backup_job template, the compression level is assumed to be 0 and no compression option is passed down. I'd argue that a compression level of 1 is probably a better default: according to a friend of mine who did his doctoral dissertation on compression algorithms, going from 0 to 1 gives the best cost-to-benefit ratio. It would also be nice (especially as the command is niced) to be able to crank it up to 9 via a config setting.
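
For illustration, the branch quoted above can be exercised in isolation. This is only a minimal sketch: `build_dump_cmd`, the base command string and the paths are hypothetical stand-ins, not names taken from pgbackup.py. It shows that an unset level and a level of 0 are both falsy in Python, so neither produces a `--compress` flag; what pg_dump then does is decided by its own default rather than by the script.

```python
def build_dump_cmd(cmd, dest, database, compression_level=None):
    """Mirror the quoted branch: add --compress only for a truthy level."""
    return " ".join([
        cmd,
        "--compress=%d" % compression_level if compression_level else "",
        "--file=%s" % dest,
        database,
    ])


# Placeholder base command; the real one is built elsewhere in the script.
BASE = "pg_dump --format=c"

# No level configured: the flag is omitted entirely.
print(build_dump_cmd(BASE, "/tmp/mydb.dump", "mydb"))

# Level 0 is falsy, so it is dropped too and cannot disable compression.
print(build_dump_cmd(BASE, "/tmp/mydb.dump", "mydb", 0))

# Levels 1 to 9 are passed through.
print(build_dump_cmd(BASE, "/tmp/mydb.dump", "mydb", 9))
```

One consequence of the truthiness test is that a caller cannot request `--compress=0` through this branch, since 0 is treated the same as "not set".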

Stuart Bishop (stub) wrote :

pg_dump compresses by default when using custom-format output (at 'a moderate level'):

 --compress=0..9
     Specify the compression level to use. Zero means no compression.
     For the custom archive format, this specifies compression of
     individual table-data segments, and the default is to compress
     at a moderate level.

I think last time I touched this, I made the choice to stick with the PostgreSQL defaults, which have been pretty good for several revisions, and because doing better with something like parallel xz or bzip2 stopped being a good tradeoff in our deployments. It's cruft inherited from the pre-juju origins of this script.

We can certainly add a config option to pass through to increase or lower the compression level, but I'd like confirmation that someone is going to actually make use of it (so setting to needs information for now, I guess).

Changed in postgresql-charm:
status: New → Opinion

Nick Moffitt (nick-moffitt) wrote :

Hm, so I compressed one of these files and it came out no smaller. I suspect this is behaving correctly.
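
The same check can be repeated with a short script. This is a rough sketch rather than anything from the charm: the function names are made up for illustration, and zlib at level 6 is just one reasonable choice of compressor. A ratio close to 1.0 suggests the data is already compressed (or otherwise high-entropy); a much smaller ratio suggests it is not.

```python
import os
import zlib


def compression_ratio(data, level=6):
    """Return compressed size divided by original size for a byte string."""
    if not data:
        raise ValueError("need at least one byte to compute a ratio")
    return len(zlib.compress(data, level)) / len(data)


def file_compression_ratio(path, level=6, chunk_size=1024 * 1024):
    """Stream a file through zlib and return compressed/original size."""
    compressor = zlib.compressobj(level)
    raw_size = 0
    packed_size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            raw_size += len(chunk)
            packed_size += len(compressor.compress(chunk))
    packed_size += len(compressor.flush())
    if raw_size == 0:
        raise ValueError("file is empty")
    return packed_size / raw_size


# Repetitive SQL-like text shrinks a lot; random bytes stand in for
# data that has already been compressed and barely shrink at all.
print(compression_ratio(b"INSERT INTO t VALUES (1);\n" * 1000))
print(compression_ratio(os.urandom(100000)))
```

Running `file_compression_ratio` on a dump file gives the same kind of answer as compressing it by hand, without writing a second copy to disk.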

Changed in postgresql-charm:
status: Opinion → Invalid