Default options for pingest can lead to massive WAL use
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Evergreen | New | Undecided | Unassigned |
Bug Description
Evergreen version: 3.7.3
OpenSRF Version: N/A (using rel_3_2 branch)
PostgreSQL Version: 10.21-1.pgdg18.04+1
Running a parallel ingest (pingest) with the default options, which ingest all records using a maximum of 8 children and batches of 10,000 records, can lead to a massive backlog of WAL files on the database server.
As an example, our most recent use of pingest led to the files in the pg_wal directory consuming 838GB of space. Our database was 511GB at the time. This caused our PostgreSQL partition to be 94% full. It would have crashed a lesser server.
I think we should lower the defaults for the max children and batch size options, but I have not yet done any testing to determine better values.
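To put those figures in perspective, here is a minimal Python sketch that sums the size of a WAL directory and compares it to the database size. It is only an illustration: the helper names `dir_size_bytes` and `wal_to_db_ratio` are made up for this example, and the location of pg_wal on any given server is an assumption you would need to supply yourself.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all files under path.

    Hypothetical helper for illustration; point it at your own pg_wal
    directory. A missing or unreadable path simply yields 0.
    """
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # WAL segments can be recycled or removed while we scan.
                pass
    return total


def wal_to_db_ratio(wal_size, db_size):
    """Ratio of WAL size to database size (both in the same unit)."""
    return wal_size / db_size


# Figures from this report: 838GB of WAL against a 511GB database.
ratio = wal_to_db_ratio(838, 511)
print(f"WAL backlog is {ratio:.2f}x the database size")
```

Running `dir_size_bytes` against the pg_wal directory periodically during a pingest run would show how quickly the backlog grows for a given combination of max children and batch size.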
tags: added: cat-importexport