2019-07-17 14:09:21 |
Christian Ehrhardt |
bug |
|
|
added bug |
2019-07-18 10:54:21 |
Christian Ehrhardt |
summary |
crash (on ppc64) hen restarting numad while huge guest is active |
crash (on ppc64) when restarting numad while huge guest is active |
|
2019-07-18 10:55:47 |
Christian Ehrhardt |
description |
I found that "by accident" while verifying another fix for numad.
It seems (at least on a power 9 box) that numad crashes if you restart it while a huge kvm guest is running.
The crash seems related to some re-init of a static structure:
--- stack trace ---
#0 tcache_get (tc_idx=<optimized out>) at malloc.c:2950
e = 0x9a5ddc1950
e = <optimized out>
__PRETTY_FUNCTION__ = "tcache_get"
#1 __GI___libc_malloc (bytes=16) at malloc.c:3058
ar_ptr = <optimized out>
victim = <optimized out>
hook = <optimized out>
tbytes = <optimized out>
tc_idx = <optimized out>
__PRETTY_FUNCTION__ = "__libc_malloc"
#2 0x0000009a300279a0 in ?? ()
No symbol table info available.
#3 0x0000009a3002cad8 in ?? ()
No symbol table info available.
#4 0x0000009a30023794 in ?? ()
No symbol table info available.
#5 0x00007a6150998278 in generic_start_main (main=0x9a30022a00, argc=<optimized out>, argv=0x7fffe93a7828, auxvec=0x7fffe93a7880, init=<optimized out>, rtld_fini=<optimized out>, stack_end=<optimized out>, fini=<optimized out>) at ../csu/libc-start.c:308
self = 0x7a6150dc38d0
result = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {8465053667230565969, 134558384812288, 8465057470262718529, 0 <repeats 13 times>, 134558387008032, 0, 134558387008040, 662230455376, 0, 2449962883098869759, 0 <repeats 42 times>}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x7fffe93a7700, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = -382044416}}}
not_first_call = <optimized out>
#6 0x00007a6150998484 in __libc_start_main (argc=<optimized out>, argv=<optimized out>, ev=<optimized out>, auxvec=<optimized out>, rtld_fini=<optimized out>, stinfo=<optimized out>, stack_on_entry=<optimized out>) at ../sysdeps/unix/sysv/linux/powerpc/libc-start.c:116
No locals.
#7 0x0000000000000000 in ?? ()
No symbol table info available.
--- source code stack trace ---
#0 tcache_get (tc_idx=<optimized out>) at malloc.c:2950
[Error: malloc.c was not found in source tree]
#1 __GI___libc_malloc (bytes=16) at malloc.c:3058
[Error: malloc.c was not found in source tree]
#2 0x0000009a300279a0 in ?? ()
#3 0x0000009a3002cad8 in ?? ()
#4 0x0000009a30023794 in ?? ()
#5 0x00007a6150998278 in generic_start_main (main=0x9a30022a00, argc=<optimized out>, argv=0x7fffe93a7828, auxvec=0x7fffe93a7880, init=<optimized out>, rtld_fini=<optimized out>, stack_end=<optimized out>, fini=<optimized out>) at ../csu/libc-start.c:308
[Error: libc-start.c was not found in source tree]
#6 0x00007a6150998484 in __libc_start_main (argc=<optimized out>, argv=<optimized out>, ev=<optimized out>, auxvec=<optimized out>, rtld_fini=<optimized out>, stinfo=<optimized out>, stack_on_entry=<optimized out>) at ../sysdeps/unix/sysv/linux/powerpc/libc-start.c:116
[Error: libc-start.c was not found in source tree]
#7 0x0000000000000000 in ?? ()
I thought at first this would be related to my debug rebuilds, but it seems to appear as-is. |
While verifying bug 1832915 I found "by accident" that this crash seems to happen often (at least on our power 9 box).
Case:
- huge kvm guest running
- restart numad
=> Numad crashes.
The crash seems related to some re-init of a static structure:
--- stack trace ---
#0 tcache_get (tc_idx=<optimized out>) at malloc.c:2950
e = 0x9a5ddc1950
e = <optimized out>
__PRETTY_FUNCTION__ = "tcache_get"
#1 __GI___libc_malloc (bytes=16) at malloc.c:3058
ar_ptr = <optimized out>
victim = <optimized out>
hook = <optimized out>
tbytes = <optimized out>
tc_idx = <optimized out>
__PRETTY_FUNCTION__ = "__libc_malloc"
#2 0x0000009a300279a0 in ?? ()
No symbol table info available.
#3 0x0000009a3002cad8 in ?? ()
No symbol table info available.
#4 0x0000009a30023794 in ?? ()
No symbol table info available.
#5 0x00007a6150998278 in generic_start_main (main=0x9a30022a00, argc=<optimized out>, argv=0x7fffe93a7828, auxvec=0x7fffe93a7880, init=<optimized out>, rtld_fini=<optimized out>, stack_end=<optimized out>, fini=<optimized out>) at ../csu/libc-start.c:308
self = 0x7a6150dc38d0
result = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {8465053667230565969, 134558384812288, 8465057470262718529, 0 <repeats 13 times>, 134558387008032, 0, 134558387008040, 662230455376, 0, 2449962883098869759, 0 <repeats 42 times>}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x7fffe93a7700, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = -382044416}}}
not_first_call = <optimized out>
#6 0x00007a6150998484 in __libc_start_main (argc=<optimized out>, argv=<optimized out>, ev=<optimized out>, auxvec=<optimized out>, rtld_fini=<optimized out>, stinfo=<optimized out>, stack_on_entry=<optimized out>) at ../sysdeps/unix/sysv/linux/powerpc/libc-start.c:116
No locals.
#7 0x0000000000000000 in ?? ()
No symbol table info available.
--- source code stack trace ---
#0 tcache_get (tc_idx=<optimized out>) at malloc.c:2950
[Error: malloc.c was not found in source tree]
#1 __GI___libc_malloc (bytes=16) at malloc.c:3058
[Error: malloc.c was not found in source tree]
#2 0x0000009a300279a0 in ?? ()
#3 0x0000009a3002cad8 in ?? ()
#4 0x0000009a30023794 in ?? ()
#5 0x00007a6150998278 in generic_start_main (main=0x9a30022a00, argc=<optimized out>, argv=0x7fffe93a7828, auxvec=0x7fffe93a7880, init=<optimized out>, rtld_fini=<optimized out>, stack_end=<optimized out>, fini=<optimized out>) at ../csu/libc-start.c:308
[Error: libc-start.c was not found in source tree]
#6 0x00007a6150998484 in __libc_start_main (argc=<optimized out>, argv=<optimized out>, ev=<optimized out>, auxvec=<optimized out>, rtld_fini=<optimized out>, stinfo=<optimized out>, stack_on_entry=<optimized out>) at ../sysdeps/unix/sysv/linux/powerpc/libc-start.c:116
[Error: libc-start.c was not found in source tree]
#7 0x0000000000000000 in ?? ()
I thought at first this would be related to my debug rebuilds, but it seems to appear as-is. |
|
2019-07-18 11:02:56 |
Christian Ehrhardt |
numad (Ubuntu): importance |
Undecided |
Low |
|
2019-07-18 11:03:00 |
Christian Ehrhardt |
numad (Ubuntu): status |
New |
Confirmed |
|
2019-07-18 11:06:18 |
Christian Ehrhardt |
description |
While verifying bug 1832915 I found "by accident" that this crash seems to happen often (at least on our power 9 box).
Case:
- huge kvm guest running
- restart numad
=> Numad crashes.
The crash seems related to some re-init of a static structure:
--- stack trace ---
#0 tcache_get (tc_idx=<optimized out>) at malloc.c:2950
e = 0x9a5ddc1950
e = <optimized out>
__PRETTY_FUNCTION__ = "tcache_get"
#1 __GI___libc_malloc (bytes=16) at malloc.c:3058
ar_ptr = <optimized out>
victim = <optimized out>
hook = <optimized out>
tbytes = <optimized out>
tc_idx = <optimized out>
__PRETTY_FUNCTION__ = "__libc_malloc"
#2 0x0000009a300279a0 in ?? ()
No symbol table info available.
#3 0x0000009a3002cad8 in ?? ()
No symbol table info available.
#4 0x0000009a30023794 in ?? ()
No symbol table info available.
#5 0x00007a6150998278 in generic_start_main (main=0x9a30022a00, argc=<optimized out>, argv=0x7fffe93a7828, auxvec=0x7fffe93a7880, init=<optimized out>, rtld_fini=<optimized out>, stack_end=<optimized out>, fini=<optimized out>) at ../csu/libc-start.c:308
self = 0x7a6150dc38d0
result = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {8465053667230565969, 134558384812288, 8465057470262718529, 0 <repeats 13 times>, 134558387008032, 0, 134558387008040, 662230455376, 0, 2449962883098869759, 0 <repeats 42 times>}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x7fffe93a7700, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = -382044416}}}
not_first_call = <optimized out>
#6 0x00007a6150998484 in __libc_start_main (argc=<optimized out>, argv=<optimized out>, ev=<optimized out>, auxvec=<optimized out>, rtld_fini=<optimized out>, stinfo=<optimized out>, stack_on_entry=<optimized out>) at ../sysdeps/unix/sysv/linux/powerpc/libc-start.c:116
No locals.
#7 0x0000000000000000 in ?? ()
No symbol table info available.
--- source code stack trace ---
#0 tcache_get (tc_idx=<optimized out>) at malloc.c:2950
[Error: malloc.c was not found in source tree]
#1 __GI___libc_malloc (bytes=16) at malloc.c:3058
[Error: malloc.c was not found in source tree]
#2 0x0000009a300279a0 in ?? ()
#3 0x0000009a3002cad8 in ?? ()
#4 0x0000009a30023794 in ?? ()
#5 0x00007a6150998278 in generic_start_main (main=0x9a30022a00, argc=<optimized out>, argv=0x7fffe93a7828, auxvec=0x7fffe93a7880, init=<optimized out>, rtld_fini=<optimized out>, stack_end=<optimized out>, fini=<optimized out>) at ../csu/libc-start.c:308
[Error: libc-start.c was not found in source tree]
#6 0x00007a6150998484 in __libc_start_main (argc=<optimized out>, argv=<optimized out>, ev=<optimized out>, auxvec=<optimized out>, rtld_fini=<optimized out>, stinfo=<optimized out>, stack_on_entry=<optimized out>) at ../sysdeps/unix/sysv/linux/powerpc/libc-start.c:116
[Error: libc-start.c was not found in source tree]
#7 0x0000000000000000 in ?? ()
I thought at first this would be related to my debug rebuilds, but it seems to appear as-is. |
While verifying bug 1832915 I found "by accident" that this crash seems to happen often (at least on our power 9 box).
Case:
- huge kvm guest running
- restart numad
=> Numad crashes.
Steps to recreate:
1. deploy P9 Bionic (or later) system
2. install uvtool
$ apt install uvtool-libvirt
3. log out & in to get permissions right
4. sync images
$ uvt-simplestreams-libvirt --verbose sync --source http://cloud-images.ubuntu.com/daily arch=ppc64el label=daily release=eoan
5. install and manually start numad
$ apt install numad
$ systemctl start numad
6. spawn guest
$ uvt-kvm create --memory $((1024*64)) --cpu 64 --password ubuntu eoan arch=ppc64el release=eoan label=daily
7. restart numad
$ systemctl restart numad
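For convenience, the steps above could be scripted roughly as follows. This is only a sketch: the `run` helper, the `DRY_RUN` switch, the `reproduce` function and the final `systemctl is-active` check are my own additions for illustration and are not part of the original recreate steps. By default it only prints the commands; the log out/in step still has to be done by hand.

```shell
#!/bin/sh
# Sketch of the recreate steps. DRY_RUN=1 (the default) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"
GUEST_MEM_MIB=$((1024*64))   # 65536 MiB, i.e. a 64 GiB guest as in the steps above
GUEST_CPUS=64

# Hypothetical helper: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    run apt install uvtool-libvirt
    # Manual step, not scriptable here: log out & in to get permissions right.
    run uvt-simplestreams-libvirt --verbose sync --source http://cloud-images.ubuntu.com/daily arch=ppc64el label=daily release=eoan
    run apt install numad
    run systemctl start numad
    run uvt-kvm create --memory "$GUEST_MEM_MIB" --cpu "$GUEST_CPUS" --password ubuntu eoan arch=ppc64el release=eoan label=daily
    run systemctl restart numad
    # Assumption: after a crash the unit is no longer reported as active.
    run systemctl is-active numad
}

reproduce
```

Running it with `DRY_RUN=0` on a P9 test system would execute the commands for real; it is probably best to read through the printed commands first.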
The crash seems related to some re-init of a static structure:
--- stack trace ---
#0 tcache_get (tc_idx=<optimized out>) at malloc.c:2950
e = 0x9a5ddc1950
e = <optimized out>
__PRETTY_FUNCTION__ = "tcache_get"
#1 __GI___libc_malloc (bytes=16) at malloc.c:3058
ar_ptr = <optimized out>
victim = <optimized out>
hook = <optimized out>
tbytes = <optimized out>
tc_idx = <optimized out>
__PRETTY_FUNCTION__ = "__libc_malloc"
#2 0x0000009a300279a0 in ?? ()
No symbol table info available.
#3 0x0000009a3002cad8 in ?? ()
No symbol table info available.
#4 0x0000009a30023794 in ?? ()
No symbol table info available.
#5 0x00007a6150998278 in generic_start_main (main=0x9a30022a00, argc=<optimized out>, argv=0x7fffe93a7828, auxvec=0x7fffe93a7880, init=<optimized out>, rtld_fini=<optimized out>, stack_end=<optimized out>, fini=<optimized out>) at ../csu/libc-start.c:308
self = 0x7a6150dc38d0
result = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {8465053667230565969, 134558384812288, 8465057470262718529, 0 <repeats 13 times>, 134558387008032, 0, 134558387008040, 662230455376, 0, 2449962883098869759, 0 <repeats 42 times>}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x7fffe93a7700, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = -382044416}}}
not_first_call = <optimized out>
#6 0x00007a6150998484 in __libc_start_main (argc=<optimized out>, argv=<optimized out>, ev=<optimized out>, auxvec=<optimized out>, rtld_fini=<optimized out>, stinfo=<optimized out>, stack_on_entry=<optimized out>) at ../sysdeps/unix/sysv/linux/powerpc/libc-start.c:116
No locals.
#7 0x0000000000000000 in ?? ()
No symbol table info available.
--- source code stack trace ---
#0 tcache_get (tc_idx=<optimized out>) at malloc.c:2950
[Error: malloc.c was not found in source tree]
#1 __GI___libc_malloc (bytes=16) at malloc.c:3058
[Error: malloc.c was not found in source tree]
#2 0x0000009a300279a0 in ?? ()
#3 0x0000009a3002cad8 in ?? ()
#4 0x0000009a30023794 in ?? ()
#5 0x00007a6150998278 in generic_start_main (main=0x9a30022a00, argc=<optimized out>, argv=0x7fffe93a7828, auxvec=0x7fffe93a7880, init=<optimized out>, rtld_fini=<optimized out>, stack_end=<optimized out>, fini=<optimized out>) at ../csu/libc-start.c:308
[Error: libc-start.c was not found in source tree]
#6 0x00007a6150998484 in __libc_start_main (argc=<optimized out>, argv=<optimized out>, ev=<optimized out>, auxvec=<optimized out>, rtld_fini=<optimized out>, stinfo=<optimized out>, stack_on_entry=<optimized out>) at ../sysdeps/unix/sysv/linux/powerpc/libc-start.c:116
[Error: libc-start.c was not found in source tree]
#7 0x0000000000000000 in ?? ()
I thought at first this would be related to my debug rebuilds, but it seems to appear as-is in the version as it is in the Ubuntu Archive. |
|
2019-07-18 13:34:29 |
Manoj Iyer |
numad (Ubuntu): assignee |
|
bugproxy (bugproxy) |
|
2019-07-18 21:49:36 |
bugproxy |
tags |
|
architecture-ppc64le bugnameltc-179340 severity-low targetmilestone-inin--- |
|
2019-07-19 09:00:29 |
Frank Heimes |
bug task added |
|
ubuntu-power-systems |
|
2019-07-19 09:00:35 |
Frank Heimes |
ubuntu-power-systems: status |
New |
Confirmed |
|
2019-07-19 09:00:46 |
Frank Heimes |
ubuntu-power-systems: assignee |
|
bugproxy (bugproxy) |
|
2019-07-19 09:00:55 |
Frank Heimes |
ubuntu-power-systems: importance |
Undecided |
Medium |
|
2019-08-05 13:55:05 |
Frank Heimes |
tags |
architecture-ppc64le bugnameltc-179340 severity-low targetmilestone-inin--- |
architecture-ppc64le bugnameltc-179340 severity-low targetmilestone-inin--- universe |
|
2019-08-12 14:15:41 |
Frank Heimes |
tags |
architecture-ppc64le bugnameltc-179340 severity-low targetmilestone-inin--- universe |
architecture-ppc64le bugnameltc-179340 reverse-proxy-bugzilla severity-low targetmilestone-inin--- universe |
|
2019-08-19 13:54:24 |
Andrew Cloke |
ubuntu-power-systems: status |
Confirmed |
Incomplete |
|
2019-09-16 09:39:27 |
Andrew Cloke |
ubuntu-power-systems: status |
Incomplete |
Triaged |
|
2019-09-18 08:23:24 |
Manoj Iyer |
nominated for series |
|
Ubuntu Bionic |
|
2019-09-18 08:23:24 |
Manoj Iyer |
bug task added |
|
numad (Ubuntu Bionic) |
|
2019-09-24 12:51:10 |
Christian Ehrhardt |
numad (Ubuntu Bionic): status |
New |
Incomplete |
|
2019-09-24 13:23:26 |
Andrew Cloke |
ubuntu-power-systems: status |
Triaged |
Incomplete |
|
2019-09-30 13:58:49 |
Andrew Cloke |
ubuntu-power-systems: importance |
Medium |
Low |
|
2020-04-20 10:33:30 |
Frank Heimes |
tags |
architecture-ppc64le bugnameltc-179340 reverse-proxy-bugzilla severity-low targetmilestone-inin--- universe |
architecture-ppc64le bugnameltc-179340 hwe-long-running reverse-proxy-bugzilla severity-low targetmilestone-inin--- universe |
|
2020-08-24 13:05:48 |
Frank Heimes |
numad (Ubuntu): status |
Confirmed |
Incomplete |
|
2020-10-26 04:17:29 |
Launchpad Janitor |
numad (Ubuntu Bionic): status |
Incomplete |
Expired |
|
2022-12-06 15:14:46 |
Frank Heimes |
ubuntu-power-systems: assignee |
bugproxy (bugproxy) |
|
|
2022-12-06 15:14:49 |
Frank Heimes |
numad (Ubuntu): assignee |
bugproxy (bugproxy) |
|
|