Modify SST code to use fork()/exec() to allow cleanup on fatal signals

Bug #1382797 reported by Alexey Kopytov on 2014-10-18
This bug affects 5 people
Affects: Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC), status tracked in 5.6

Series  Status        Importance  Assigned to     Milestone
5.5     Fix Released  Medium      Alexey Kopytov  -
5.6     Fix Released  Medium      Alexey Kopytov  -

Bug Description

The SST code in wsrep_util.cc uses posix_spawnp() to spawn a shell process that executes the actual SST script. I don't know whether there is a good reason to use posix_spawnp() rather than fork()/exec(), but switching to fork()/exec() would let us clean up child processes (i.e. processes spawned by the SST script) when mysqld crashes or is killed by an uncatchable signal such as SIGKILL.

See comment #4 in https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1380697.

Alex Yurchenko (ayurchen) wrote :

Alexey, the reason posix_spawn() is used to launch the SST script is that, at the time, it seemed to be the only way to spawn a process without creating a full copy of the parent's memory image. On dedicated servers where mysqld used more than 50% of RAM and there was no swap space, forking the SST script used to fail due to memory constraints.

I suppose that the problem with cleaning children arises from the not-entirely-correct usage of posix_spawn() in SST code. Perhaps passing proper flags would have solved it.

Alexey Kopytov (akopytov) wrote :

OK, thanks for the clarification.

The reason posix_spawn() is faster and consumes less RAM is that it is implemented via vfork()/exec() when the POSIX_SPAWN_USEVFORK flag is set, which is what the SST code passes to posix_spawnattr_setflags().
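
For reference, the spawning pattern described above looks roughly like the sketch below. This is not the actual wsrep_util.cc code: the helper name spawn_shell() is hypothetical, and the command is simply handed to "sh -c". POSIX_SPAWN_USEVFORK is a glibc extension, so it is guarded with #ifdef; as far as I know, glibc 2.24 and later ignore the flag because posix_spawn() always uses a vfork-like clone() there.

```c
#define _GNU_SOURCE            /* needed for POSIX_SPAWN_USEVFORK on glibc */
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper, not the real SST code: run `cmd` via "sh -c".
 * Returns 0 on success (child pid stored in *pid_out) or an errno value. */
static int spawn_shell(const char *cmd, pid_t *pid_out)
{
    posix_spawnattr_t attr;
    char *const argv[] = { "sh", "-c", (char *) cmd, NULL };

    int err = posix_spawnattr_init(&attr);
    if (err != 0)
        return err;

#ifdef POSIX_SPAWN_USEVFORK
    /* Ask glibc to avoid copying the parent's address space. */
    err = posix_spawnattr_setflags(&attr, POSIX_SPAWN_USEVFORK);
#endif
    if (err == 0)
        err = posix_spawnp(pid_out, "sh", NULL, &attr, argv, environ);

    posix_spawnattr_destroy(&attr);
    return err;
}
```

Note that nothing in this interface lets the caller run code in the child between process creation and exec, other than the fixed set of file actions and attributes.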

I'm not sure we can do anything about cleaning up children with the current code. For that, we need to make sure the _child_ SST process calls prctl(PR_SET_PDEATHSIG, ...) as the first thing after it starts up. I don't see a way to achieve that with posix_spawn().

However, we can "emulate" posix_spawn() with vfork()/exec() without the performance/resource penalty imposed by a regular fork().
We can then call prctl(PR_SET_PDEATHSIG, ...) in the child before we call exec().
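
A minimal, Linux-only sketch of that idea follows. It is an illustration under assumptions, not the fix that was committed: the helper name spawn_with_pdeathsig() is hypothetical, and the command is run via /bin/sh. The parent-death signal option is spelled PR_SET_PDEATHSIG in <sys/prctl.h>. Strictly speaking, POSIX only allows a vfork() child to call _exit() or an exec function; the extra calls here are plain system calls that do not touch the shared memory, which works on Linux in practice.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `cmd` via /bin/sh and ask the kernel to send
 * `sig` to the child when its parent dies. Returns the child pid, or -1
 * if vfork() fails. */
static pid_t spawn_with_pdeathsig(const char *cmd, int sig)
{
    pid_t parent = getpid();   /* remembered before vfork() */
    pid_t pid = vfork();

    if (pid == 0) {
        /* Child: shares the parent's memory until exec, so only make
         * system calls and do not modify any variables. */
        if (prctl(PR_SET_PDEATHSIG, (unsigned long) sig) != 0)
            _exit(126);
        /* If the parent died before prctl() took effect, we have been
         * reparented and no signal will ever arrive, so give up. */
        if (getppid() != parent)
            _exit(126);
        execl("/bin/sh", "sh", "-c", cmd, (char *) NULL);
        _exit(127);            /* exec failed */
    }
    return pid;                /* parent: child pid, or -1 on error */
}
```

Two caveats from the prctl(2) documentation are worth keeping in mind: the "parent" is the thread that created the child, so in a multithreaded server the signal is sent when that thread exits, and the setting is cleared when executing a set-user-ID or set-group-ID binary. The signal is also delivered only to the direct child (the shell here), which must itself terminate its own children.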

Changed in percona-xtradb-cluster:
status: New → Confirmed

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1114
