OOM errors in ZFS integration testing

Bug #1831634 reported by Dan Watkins
This bug affects 1 person
Affects   Status    Importance   Assigned to   Milestone
curtin    Triaged   Medium       Unassigned

Bug Description

In two separate curtin integration test runs, we've seen ZFS tests OOM:

ERROR: test suite for <class 'vmtests.test_zfsroot.CosmicTestZfsRootFsType'>
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/nose/plugins/multiprocess.py", line 788, in run
    self.setUp()
  File "/usr/lib/python3/dist-packages/nose/suite.py", line 293, in setUp
    self.setupContext(ancestor)
  File "/usr/lib/python3/dist-packages/nose/plugins/multiprocess.py", line 770, in setupContext
    super(NoSharedFixtureContextSuite, self).setupContext(context)
  File "/usr/lib/python3/dist-packages/nose/suite.py", line 316, in setupContext
    try_run(context, names)
  File "/usr/lib/python3/dist-packages/nose/util.py", line 471, in try_run
    return func()
  File "/var/lib/jenkins/slaves/torkoal/workspace/curtin-vmtest-devel-amd64/curtin-785/tests/vmtests/__init__.py", line 1220, in setUpClass
    '\n'.join(errors))
Exception: CosmicTestZfsRootFsType:Errors during curtin installer[ 260.793195] cloud-init[938]: Unpacking linux-headers-4.18.0-20-generic (4.18.0-20.21) ...
[ 262.860664] Out of memory: Kill process 938 (cloud-init) score 29 or sacrifice child
[ 262.861969] Killed process 973 (python3) total-vm:41748kB, anon-rss:14132kB, file-rss:0kB, shmem-rss:0kB

and

ERROR: test suite for <class 'vmtests.test_zfsroot.EoanTestZfsRootFsType'>
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/nose/plugins/multiprocess.py", line 788, in run
    self.setUp()
  File "/usr/lib/python3/dist-packages/nose/suite.py", line 293, in setUp
    self.setupContext(ancestor)
  File "/usr/lib/python3/dist-packages/nose/plugins/multiprocess.py", line 770, in setupContext
    super(NoSharedFixtureContextSuite, self).setupContext(context)
  File "/usr/lib/python3/dist-packages/nose/suite.py", line 316, in setupContext
    try_run(context, names)
  File "/usr/lib/python3/dist-packages/nose/util.py", line 471, in try_run
    return func()
  File "/var/lib/jenkins/slaves/torkoal/workspace/curtin-vmtest-devel-amd64-proposed/curtin-54/tests/vmtests/__init__.py", line 1220, in setUpClass
    '\n'.join(errors))
Exception: EoanTestZfsRootFsType:Errors during curtin installer[ 352.995578] cloud-init[756]: Setting up linux-headers-5.0.0-15 (5.0.0-15.16) ...
[ 352.997122] cloud-init[756]: Setting up linux-modules-5.0.0-15-generic (5.0.0-15.16) ...
[ 355.370729] Out of memory: Kill process 27158 (depmod) score 24 or sacrifice child
[ 355.372005] Killed process 27158 (depmod) total-vm:54192kB, anon-rss:48584kB, file-rss:1676kB, shmem-rss:0kB

(The source for these errors: https://jenkins.ubuntu.com/server/job/curtin-vmtest-devel-amd64/785/console and https://jenkins.ubuntu.com/server/job/curtin-vmtest-devel-amd64-proposed/54/console respectively.)
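When triaging console logs like the two above, it can help to pull out the kernel's OOM-killer lines automatically. The following is a minimal sketch, not part of curtin: the function name `find_oom_kills` and the regular expression are assumptions written against the "Killed process" line format shown in this report, and other kernel versions may format these lines differently.

```python
import re

# Matches kernel "Killed process" lines in the format seen in this report.
# The pattern is an assumption based on those two samples only.
OOM_KILL_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm_kb>\d+)kB, anon-rss:(?P<anon_rss_kb>\d+)kB"
)


def find_oom_kills(console_log):
    """Return one dict per 'Killed process' line found in console_log."""
    kills = []
    for match in OOM_KILL_RE.finditer(console_log):
        kills.append({
            "timestamp": float(match.group("ts")),
            "pid": int(match.group("pid")),
            "name": match.group("name"),
            "total_vm_kb": int(match.group("total_vm_kb")),
            "anon_rss_kb": int(match.group("anon_rss_kb")),
        })
    return kills


# The two "Killed process" lines quoted in this bug description.
SAMPLE_LOG = """\
[ 262.860664] Out of memory: Kill process 938 (cloud-init) score 29 or sacrifice child
[ 262.861969] Killed process 973 (python3) total-vm:41748kB, anon-rss:14132kB, file-rss:0kB, shmem-rss:0kB
[ 355.370729] Out of memory: Kill process 27158 (depmod) score 24 or sacrifice child
[ 355.372005] Killed process 27158 (depmod) total-vm:54192kB, anon-rss:48584kB, file-rss:1676kB, shmem-rss:0kB
"""

for kill in find_oom_kills(SAMPLE_LOG):
    print(kill["name"], kill["pid"], kill["anon_rss_kb"])
```

Note that in both samples the killed processes are small (under 50 MB resident), which suggests the guest as a whole was out of memory rather than one process growing without bound.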

Tags: ci
Ryan Harper (raharper) wrote :

We already bump disco and eoan to 2G VMs, but if there are a lot of *updates* since the daily image was released, we may see these.

We could bump Disco/Eoan and later to 4G, though I don't want to do that until we land something to address memory pressure on the host. Alternatively, we can bump to 4G in a branch and also see about reducing the total number of tests we run.

Dan Watkins (oddbloke) wrote : Re: [Bug 1831634] Re: OOM errors in ZFS integration testing

On Tue, Jun 04, 2019 at 03:07:44PM -0000, Ryan Harper wrote:
> We already bump disco and eoan to 2G VMs, but if there are a lot of
> *updates* since the daily image was released, we may see these.
>
> We could bump Disco/Eoan and up to 4G though I don't want to do that
> until we land something to address memory pressure on the host.
> Alternatively, we can bump to 4G in a branch and also see about reducing
> the total number of tests we run.

I've only seen this OOM in ZFS tests (admittedly in an extremely small
sample); are we able to bump memory on a per-test basis instead of
per-suite?

Ryan Harper (raharper) wrote :

On Tue, Jun 4, 2019 at 10:40 AM Dan Watkins <email address hidden>
wrote:

> On Tue, Jun 04, 2019 at 03:07:44PM -0000, Ryan Harper wrote:
> > We already bump disco and eoan to 2G VMs, but if there are a lot of
> > *updates* since the daily image was released, we may see these.
> >
> > We could bump Disco/Eoan and up to 4G though I don't want to do that
> > until we land something to address memory pressure on the host.
> > Alternatively, we can bump to 4G in a branch and also see about reducing
> > the total number of tests we run.
>
> I've only seen this OOM in ZFS tests (admittedly in an extremely small
> sample); are we able to bump memory on a per-test basis instead of
> per-suite?
>

Actually yes. It has shown up in a few other places but it's most common
with ZFS.
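A per-test memory bump of the kind discussed here could look roughly like the sketch below. This is an illustration only and not curtin's actual vmtest code: the class names, the `mem_mb` attribute, the `qemu_memory_args` helper and the 1024 MiB default are all hypothetical. Only the 2G and 4G figures come from this thread. The idea is that each test class carries its own memory setting as a class attribute, so a ZFS test can override it without raising the value for every other test on that release.

```python
class VMTestBase:
    """Hypothetical base class for VM-backed tests (not curtin's real API)."""

    mem_mb = 1024  # assumed default guest memory, in MiB

    @classmethod
    def qemu_memory_args(cls):
        # QEMU's -m option sets guest RAM; a bare number is read as MiB.
        return ["-m", str(cls.mem_mb)]


class EoanTestBasic(VMTestBase):
    """Release-wide setting: Eoan guests get 2G, as mentioned in the thread."""

    mem_mb = 2048


class EoanTestZfsRoot(EoanTestBasic):
    """Only the ZFS test is raised to 4G; other Eoan tests stay at 2G."""

    mem_mb = 4096


for test_class in (VMTestBase, EoanTestBasic, EoanTestZfsRoot):
    print(test_class.__name__, test_class.qemu_memory_args())
```

Keeping the override on the individual test class limits the extra host memory to the tests that appear to need it, which matters given the concern about memory pressure on the host.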


Dan Watkins (oddbloke)
Changed in curtin:
status: New → Triaged
importance: Undecided → Medium