Comment 14 for bug 1799497

Luis Rodriguez (laragones) wrote :

Hello, I submitted the report against LXD since that is the only thing I have installed on the server that is actively running, as Stéphane mentioned on https://github.com/lxc/lxd/issues/5197

I also thought it might be a hardware issue, but since upgrading to 18.04 in May I have experienced this on a variety of hardware. I then thought it might be an upgrade issue, but that is not the case either.

I also thought it was memory related, since it now occurs, as Stéphane mentions, around once a week, but in my case on different servers. The last server where it happened had not had any issue for maybe the last two months and was not that loaded in terms of memory, but it seems more frequent on servers that are actively used in both memory and CPU.

It doesn't happen on blade hosts that only have 2-4 LXD containers and 4GB of RAM. It has only happened on HP and Dell servers with 16GB, 24GB, 48GB and 128GB of RAM that have a little more load (a minimum of 6 containers, up to 20).

At least I am not alone, but I have no clue how to recreate or address this issue (since the logs also provide no information).

I could also try some kernels. On 4.4, as Stéphane mentioned, it didn't happen; it only started happening on the GA kernel of 18.04 (as he also mentions). I have been constantly upgrading the kernel to no avail. So it seems it could have been introduced before.

Strangely and thankfully it doesn't happen on my main production servers (except for a crash yesterday on one of them). It happens mostly on development servers that are actively used (the developers are not happy).