xen kernel crashes in domU NX-protected page

Bug #236389 reported by mattsteven
This bug affects 4 people
Affects: xen-source (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Using the latest kernel (vmlinuz-2.6.24-17-xen, 386), and less frequently with the -19 kernel provided by Hirano at http://www.il.is.s.u-tokyo.ac.jp/~hiranotaka/, I get this error frequently. It leads to a complete system lock-up (both CPUs get stuck in "CPU Soft lock"). I work around it with a script that runs regularly, detects the error, and reboots the system automatically, but it is definitely a big problem.
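
The script itself is not attached to this report. As a rough sketch of that kind of check (not the reporter's actual script), the function below scans kernel log lines for the messages quoted in this report. The function name and the "soft lockup" pattern are assumptions; the exact wording of the soft-lockup message differs between kernel versions.

```python
import re

# Messages taken from the traces in this report, plus an ASSUMED
# soft-lockup pattern (the exact kernel wording varies by version).
CRASH_PATTERNS = [
    re.compile(r"kernel tried to execute NX-protected page"),
    re.compile(r"Fixing recursive fault but reboot is needed"),
    re.compile(r"soft lockup", re.IGNORECASE),  # assumption
]


def needs_reboot(log_lines):
    """Return True if any log line matches one of the crash patterns.

    needs_reboot is a hypothetical helper name used for illustration.
    """
    return any(
        pattern.search(line)
        for line in log_lines
        for pattern in CRASH_PATTERNS
    )
```

A cron job could feed the output of dmesg (or the tail of the kernel log) to this function and trigger a reboot when it returns True. This only helps while the machine is still able to run user-space processes.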

I am using Hardy with the default Xen installation. The crash is evidently triggered by something apache2 does and happens irregularly, possibly as a result of malicious input to apache2.

[18211.970960] kernel tried to execute NX-protected page - exploit attempt? (uid: 33)
[18211.970974] BUG: unable to handle kernel paging request at virtual address c1d5dbe0
[18211.970981] printing eip: c1d5dbe0
[18211.970989] 017bb000 -> *pde = 00000001:676f3001
[18211.970993] 017bc000 -> *pme = 00000001:42a7f067
[18211.970997] 00000000 -> *pte = 80000001:67152063
[18211.971004] Oops: 0011 [#1] SMP
[18211.971010] Modules linked in: nf_conntrack_ftp nf_conntrack_ipv4 xt_state nf_conntrack xt_multiport iptable_filter ip_tables x_tables quota_v1 ipv6 evdev ext3 jbd mbcache dm_mirror dm_snapshot dm_mod fuse
[18211.971044]
[18211.971049] Pid: 24093, comm: apache2 Not tainted (2.6.24-19-xen #2)
[18211.971054] EIP: 0061:[<c1d5dbe0>] EFLAGS: 00010206 CPU: 0
[18211.971062] EIP is at 0xc1d5dbe0
[18211.971066] EAX: c1d5cca0 EBX: c1d5cca0 ECX: 00000004 EDX: 00000000
[18211.971070] ESI: 0000000c EDI: 40040000 EBP: 00000000 ESP: dfc1de94
[18211.971074] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[18211.971079] Process apache2 (pid: 24093, ti=dfc1c000 task=ecb4e330 task.ti=dfc1c000)
[18211.971083] Stack: c01623a5 00000000 c03fe800 dfc1dee0 0000000c 0000000e dfc1ded0 c0162456
[18211.971100] c1d9fd60 c15bd86c 0000000e 0000000d c01658c8 0000000e 00000000 0000000e
[18211.971116] 00000000 c1bc8ea0 c1e85d00 c1d5cca0 c22951c0 c1bc8120 c1d9f4c0 c1e82f00
[18211.971133] Call Trace:
[18211.971137] [<c01623a5>] free_hot_cold_page+0x195/0x220
[18211.971155] [<c0162456>] __pagevec_free+0x26/0x30
[18211.971163] [<c01658c8>] release_pages+0x68/0x160
[18211.971171] [<c017a5f4>] free_pages_and_swap_cache+0x74/0xa0
[18211.971179] [<c01737b7>] exit_mmap+0xe7/0x100
[18211.971187] [<c0124303>] mmput+0x23/0x80
[18211.971195] [<c0129d95>] do_exit+0x165/0x8b0
[18211.971203] [<c0174250>] do_munmap+0x180/0x1f0
[18211.971210] [<c0183649>] filp_close+0x49/0x80
[18211.971217] [<c012a50a>] do_group_exit+0x2a/0xa0
[18211.971224] [<c0105832>] syscall_call+0x7/0xb
[18211.971232] [<c0320000>] vcc_def_wakeup+0x10/0x60
[18211.971240] =======================
[18211.971243] Code: 00 00 00 c0 02 40 ed 00 d0 65 e9 80 00 00 40 01 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 c0 13 42 ed 00 e0 65 e9 <00> 20 00 40 01 00 00 00 ff ff ff ff b4 cc d5 c1 00 00 00 00 a0
[18211.971340] EIP: [<c1d5dbe0>] 0xc1d5dbe0 SS:ESP 0069:dfc1de94
[18211.971352] ---[ end trace 627327b2b71cc16d ]---
[18211.971356] Fixing recursive fault but reboot is needed!
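
For anyone reading the trace: the value after "Oops:" is the x86 page-fault error code, printed in hex. The sketch below splits it into its architectural flag bits. The helper name is made up for illustration; the bit meanings are the standard x86 ones.

```python
def decode_page_fault_code(code_text):
    """Decode the hex error code printed after 'Oops:' in an x86 trace.

    decode_page_fault_code is a hypothetical helper name.
    """
    code = int(code_text, 16)
    return {
        "protection_violation": bool(code & 0x01),  # page was present
        "write": bool(code & 0x02),                 # 0 means a read
        "user_mode": bool(code & 0x04),             # 0 means kernel mode
        "reserved_bit": bool(code & 0x08),
        "instruction_fetch": bool(code & 0x10),
    }
```

For "0011" this reports a protection violation on an instruction fetch in kernel mode, which agrees with the "kernel tried to execute NX-protected page" line at the top of the trace.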

mattsteven (matthew-matts) wrote :

Additional crash info including the CPU soft lock that it degrades into. This is a pretty effective DoS in any case.

[27790.285754] kernel tried to execute NX-protected page - exploit attempt? (uid: 33)
[27790.285768] BUG: unable to handle kernel paging request at virtual address c1d7f160
[27790.285776] printing eip: c1d7f160
[27790.285784] 017bb000 -> *pde = 00000001:676f3001
[27790.285787] 017bc000 -> *pme = 00000001:435aa067
[27790.285791] 00000000 -> *pte = 80000001:67130063
[27790.285798] Oops: 0011 [#1] SMP
[27790.285804] Modules linked in: nf_conntrack_ftp nf_conntrack_ipv4 xt_state nf_conntrack xt_multiport iptable_filter ip_tables x_tables quota_v1 ipv6 evdev ext3 jbd mbcache dm_mirror dm_snapshot dm_mod fuse
[27790.285838]
[27790.285843] Pid: 21030, comm: apache2 Not tainted (2.6.24-19-xen #2)
[27790.285849] EIP: 0061:[<c1d7f160>] EFLAGS: 00010206 CPU: 0
[27790.285856] EIP is at 0xc1d7f160
[27790.285860] EAX: c1db96a0 EBX: c1db96a0 ECX: 00000004 EDX: 00000000
[27790.285864] ESI: 00000005 EDI: 40040000 EBP: 00000000 ESP: ea567e94
[27790.285868] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[27790.285872] Process apache2 (pid: 21030, ti=ea566000 task=dffbd190 task.ti=ea566000)
[27790.285876] Stack: c01623a5 00000000 c03fef80 ea567ef8 00000005 0000000d ea567ed0 c0162456
[27790.285893] c1d7b6e0 c15bd2f8 c03fef80 0000000e c0165997 0000000e 00000000 0000000d
[27790.285911] 00000000 c1e6f780 c1fcc660 c2187c60 c1fad920 c1fabbe0 c1d89aa0 c1fa9500
[27790.285928] Call Trace:
[27790.285932] [<c01623a5>] free_hot_cold_page+0x195/0x220
[27790.285948] [<c0162456>] __pagevec_free+0x26/0x30
[27790.285956] [<c0165997>] release_pages+0x137/0x160
[27790.285963] [<c017a5f4>] free_pages_and_swap_cache+0x74/0xa0
[27790.285972] [<c01737b7>] exit_mmap+0xe7/0x100
[27790.285979] [<c0124303>] mmput+0x23/0x80
[27790.285987] [<c0129d95>] do_exit+0x165/0x8b0
[27790.285995] [<c0183649>] filp_close+0x49/0x80
[27790.286002] [<c012a50a>] do_group_exit+0x2a/0xa0
[27790.286009] [<c0105832>] syscall_call+0x7/0xb
[27790.286016] [<c0320000>] vcc_def_wakeup+0x10/0x60
[27790.286024] =======================
[27790.286027] Code: 00 00 00 c0 08 40 ed 20 84 de e7 80 00 00 40 01 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 c0 3c 40 ed 00 a0 70 ea <00> 20 00 40 01 00 00 00 ff ff ff ff b4 96 db c1 00 00 00 00 c0
[27790.286125] EIP: [<c1d7f160>] 0xc1d7f160 SS:ESP 0069:ea567e94
[27790.286137] ---[ end trace 1f3ceb0e275558e5 ]---
[27790.286142] Fixing recursive fault but reboot is needed!
[27873.920667] kernel tried to execute NX-protected page - exploit attempt? (uid: 33)
[27873.920681] BUG: unable to handle kernel paging request at virtual address c1db8f60
[27873.920689] printing eip: c1db8f60
[27873.920696] 017bb000 -> *pde = 00000001:676f3001
[27873.920700] 017bc000 -> *pme = 00000001:435aa067
[27873.920704] 00000000 -> *pte = 80000001:670f7063
[27873.920710] Oops: 0011 [#2] SMP
[27873.920717] Modules linked in: nf_conntrack_ftp nf_conntrack_ipv4 xt_state nf_conntrack xt_multiport iptable_filter ip_tables x_tables quota_v1 ipv6 evdev ext3 jbd mbcache dm_mirror dm_snapshot dm_mod fuse
[27873.920750]
[27873.920755] Pid: 21305, comm:...

mattsteven (matthew-matts) wrote :

I can't reproduce this on another, nearly identical machine. The only difference is that the machine that doesn't crash has 1G of RAM while the one that crashes has 5G. Both are 32-bit, so I wonder if this is a high-memory-related bug.
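
One way to look at the memory question is to decode the "*pte" line from the traces. This is a sketch only: it assumes the log prints the high and low 32-bit halves of a 64-bit PAE entry in that order, and the helper name is illustrative.

```python
NX_BIT = 1 << 63                 # no-execute bit of a PAE entry
FRAME_MASK = 0x000FFFFFFFFFF000  # bits 12..51 hold the frame address
FOUR_GIB = 1 << 32


def decode_pae_entry(text):
    """Decode a 'high:low' PAE entry such as '80000001:67152063'.

    ASSUMPTION: the two halves are printed high first, then low.
    decode_pae_entry is a hypothetical helper name.
    """
    high, low = (int(part, 16) for part in text.split(":"))
    value = (high << 32) | low
    frame = value & FRAME_MASK
    return {
        "nx": bool(value & NX_BIT),
        "present": bool(value & 0x1),
        "writable": bool(value & 0x2),
        "frame_address": frame,
        "above_4gib": frame >= FOUR_GIB,
    }
```

For the entry "80000001:67152063" from the first trace, NX is set and the frame address is 0x167152000, which lies above 4 GiB. That is consistent with the high-memory suspicion but does not prove it; under Xen these addresses are presumably machine addresses rather than guest-physical ones.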

a.polli (a-polli) wrote :

I have a similar problem with vmlinuz-2.6.22-15-xen: at that moment I see many attempts to log in via ssh, then the server hangs and I can't even ping it.
It has 9 GB of RAM.

-------------------
Jul 10 01:32:48 hypervisor kernel: [1075570.401200] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
Jul 10 01:32:48 hypervisor kernel: [1075570.401262] BUG: unable to handle kernel paging request at virtual address c210a680
Jul 10 01:32:48 hypervisor kernel: [1075570.401317] printing eip:
Jul 10 01:32:48 hypervisor kernel: [1075570.401342] c210a680
Jul 10 01:32:48 hypervisor kernel: [1075570.401345] 01d2c000 -> *pde = 00000000:7fd30001
Jul 10 01:32:48 hypervisor kernel: [1075570.401375] 01d30000 -> *pme = 00000000:7e001067
Jul 10 01:32:48 hypervisor kernel: [1075570.401404] 00001000 -> *pte = 80000002:7b90a063
Jul 10 01:32:48 hypervisor kernel: [1075570.401436] Oops: 0011 [#1]
Jul 10 01:32:48 hypervisor kernel: [1075570.401461] SMP
Jul 10 01:32:48 hypervisor kernel: [1075570.401493] Modules linked in: xt_tcpudp xt_physdev iptable_filter ip_tables x_tables af_packet bridge lp loop sr_mod cdrom serial_core pcspkr parport_pc parport shpchp pci_hotplug ipv6 evdev ext3 jbd mbcache sg sd_mod ata_piix floppy tg3 ata_generic libata ehci_hcd uhci_hcd usbcore mptsas mptscsih mptbase scsi_transport_sas scsi_mod thermal processor fan fuse apparmor commoncap
Jul 10 01:32:48 hypervisor kernel: [1075570.401822] CPU: 0
Jul 10 01:32:48 hypervisor kernel: [1075570.401823] EIP: 0061:[<c210a680>] Not tainted VLI
Jul 10 01:32:48 hypervisor kernel: [1075570.401824] EFLAGS: 00210206 (2.6.22-15-xen #1)
Jul 10 01:32:48 hypervisor kernel: [1075570.401907] EIP is at 0xc210a680
Jul 10 01:32:48 hypervisor kernel: [1075570.401934] eax: c201a660 ebx: c201a660 ecx: c03c0c00 edx: 00000000
Jul 10 01:32:48 hypervisor kernel: [1075570.401968] esi: d356dec4 edi: 00040000 ebp: 00000000 esp: d356de94
Jul 10 01:32:48 hypervisor kernel: [1075570.402012] ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0069
Jul 10 01:32:48 hypervisor kernel: [1075570.402051] Process sshd (pid: 25153, ti=d356c000 task=dabf6000 task.ti=d356c000)
Jul 10 01:32:48 hypervisor kernel: [1075570.402085] Stack: c01569b9 00000000 c03c0c00 00000004 d356dec4 c14662a4 00000007 c0156a4f
Jul 10 01:32:48 hypervisor kernel: [1075570.402176] c201a180 c03c0c00 c01598e6 00000007 00000007 00000000 c2142300 c2018540
Jul 10 01:32:48 hypervisor kernel: [1075570.402261] c201bda0 c2018260 c20f05c0 c201a660 c201a180 c201ba60 d3474ff8 bffaafff
Jul 10 01:32:48 hypervisor kernel: [1075570.402347] Call Trace:
Jul 10 01:32:48 hypervisor kernel: [1075570.402392] [free_hot_cold_page+393/512] free_hot_cold_page+0x189/0x200
Jul 10 01:32:48 hypervisor kernel: [1075570.402433] [__pagevec_free+31/48] __pagevec_free+0x1f/0x30
Jul 10 01:32:48 hypervisor kernel: [1075570.402468] [release_pages+390/496] release_pages+0x186/0x1f0
Jul 10 01:32:48 hypervisor kernel: [1075570.402508] [free_pages_and_swap_cache+116/160] free_pages_and_swap_cache+0x74/0xa0
Jul 10 01:32:48 hypervisor kernel: [1075570.402546] [exit_mmap+225/240] exit_mmap+0xe1/0xf0
Jul 10 01:32:48 hypervisor kernel: [107...

Dimitrios Symeonidis (azimout) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. You reported this bug a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue for you. Can you try with the latest Ubuntu release? Thanks in advance.

Changed in xen-source (Ubuntu):
status: New → Incomplete
mattsteven (matthew-matts) wrote :

I'm sorry Dimitrios, I have switched to a different solution and the machine has been in production for some time now so I don't have it handy to test a more recent Ubuntu.

It really is a deadly serious bug, though, and it prevents using Ubuntu on servers with huge amounts of RAM. If I buy any new hardware in the near future I will try Ubuntu again, revisit the bug, and post my findings. Thanks for taking an interest.

Dimitrios Symeonidis (azimout) wrote :

closing this bug, as per your last comment. feel free to switch back to new if you upgrade to the latest version and still face this issue. thank you, and good luck.

Changed in xen-source (Ubuntu):
status: Incomplete → Invalid
Martynas Sklizmantas (saint-ghost) wrote :

I am experiencing this or a similar issue with 2.6.24-24.. (RAM: 4 GB)

Jul 13 18:15:57 saturn kernel: [254201.343326] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
Jul 13 18:15:57 saturn kernel: [254201.343336] BUG: unable to handle kernel paging request at virtual address c20632e0
Jul 13 18:15:57 saturn kernel: [254201.343341] printing eip: c20632e0
Jul 13 18:15:57 saturn kernel: [254201.343345] 1e35b000 -> *pde = 00000001:18120001
Jul 13 18:15:57 saturn kernel: [254201.343348] 1dabc000 -> *pme = 00000001:31636067
Jul 13 18:15:57 saturn kernel: [254201.343351] 00000000 -> *pte = 80000001:5ab79063
Jul 13 18:15:57 saturn kernel: [254201.343355] Oops: 0011 [#1] SMP
Jul 13 18:15:57 saturn kernel: [254201.343360] Modules linked in: ipmi_msghandler lock_dlm gfs2 dlm configfs ext3 jbd mbcache ipv6 softdog evdev reiserfs raid10 raid456 async_xor async_memcpy async_tx xor raid1 raid0 multipath linear md_mod dm_mirror dm_snapshot dm_mod fuse
Jul 13 18:15:57 saturn kernel: [254201.343397]
Jul 13 18:15:57 saturn kernel: [254201.343401] Pid: 25517, comm: sshfs Not tainted (2.6.24-24-xen #1)
Jul 13 18:15:57 saturn kernel: [254201.343405] EIP: 0061:[<c20632e0>] EFLAGS: 00010206 CPU: 1
Jul 13 18:15:57 saturn kernel: [254201.343410] EIP is at 0xc20632e0
Jul 13 18:15:57 saturn kernel: [254201.343413] EAX: c2033960 EBX: c2033960 ECX: 00000000 EDX: 00000000
Jul 13 18:15:57 saturn kernel: [254201.343416] ESI: 00000001 EDI: 40040000 EBP: 00000000 ESP: ddb6fea0
Jul 13 18:15:57 saturn kernel: [254201.343419] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
Jul 13 18:15:57 saturn kernel: [254201.343423] Process sshfs (pid: 25517, ti=ddb6e000 task=e6793350 task.ti=ddb6e000)
Jul 13 18:15:57 saturn kernel: [254201.343426] Stack: c0162395 80000001 c03fd800 ddb6fef0 00000001 00000004 ddb6fedc c0162446
Jul 13 18:15:57 saturn kernel: [254201.343439] c2033960 c19031b8 c03fd800 00000004 c0165987 00000004 00000000 00000004
Jul 13 18:15:57 saturn kernel: [254201.343453] 00000000 c1f66720 c1f66740 c1f84040 c2033960 c1f86f80 c1f86f00 c1f86f20
Jul 13 18:15:57 saturn kernel: [254201.343466] Call Trace:
Jul 13 18:15:57 saturn kernel: [254201.343470] [<c0162395>] free_hot_cold_page+0x195/0x220
Jul 13 18:15:57 saturn kernel: [254201.343481] [<c0162446>] __pagevec_free+0x26/0x30
Jul 13 18:15:57 saturn kernel: [254201.343488] [<c0165987>] release_pages+0x137/0x160
Jul 13 18:15:57 saturn kernel: [254201.343495] [<c017a654>] free_pages_and_swap_cache+0x74/0xa0
Jul 13 18:15:57 saturn kernel: [254201.343502] [<c01736bb>] unmap_region+0xfb/0x120
Jul 13 18:15:57 saturn kernel: [254201.343508] [<c0174277>] do_munmap+0x147/0x1f0
Jul 13 18:15:57 saturn kernel: [254201.343515] [<c017435c>] sys_munmap+0x3c/0x60
Jul 13 18:15:57 saturn kernel: [254201.343520] [<c0105832>] syscall_call+0x7/0xb
Jul 13 18:15:57 saturn kernel: [254201.343527] [<c0320000>] vcc_ioctl+0x120/0x2d0
Jul 13 18:15:57 saturn kernel: [254201.343534] =======================
Jul 13 18:15:57 saturn kernel: [254201.343536] Code: 7a 1f c2 00 01 10 00 00 02 20 00 00 00 00 40 01 00 00 00 ff ff ff ff 78 71 47 c0 00 00 00 00 a0 7c 05 c2 00 01 10 00 00 02 20 00 <00> 00 00 40 00 00 00 00 ff ff ff ...

Dimitrios Symeonidis (azimout) wrote :

martynas, can you please try with the latest ubuntu version? thank you

Martynas Sklizmantas (saint-ghost) wrote :

hi Dimitrios,

could you please be more specific about the package/kernel version you would like me to test as jaunty has no linux-image-xen..

m

Dimitrios Symeonidis (azimout) wrote :

you're right, sorry about that. setting back to new

Changed in xen-source (Ubuntu):
status: Invalid → New
Changed in xen-source (Ubuntu):
status: New → Invalid