kernel linux-lng-preempt-rt won't boot kvm guest when hugepages are enabled

Bug #1234718 reported by Anders Roxell
This bug affects 2 people
Affects            Status        Importance  Assigned to   Milestone
linaro-networking  Fix Released  Critical    Kim Phillips

Bug Description

When hugepages are enabled in the kernel, the KVM guest does not boot.

Revision history for this message
Mike Holmes (mike-holmes) wrote :
Changed in linaro-networking:
status: New → Triaged
importance: Undecided → Critical
Revision history for this message
Mike Holmes (mike-holmes) wrote :

There are two failure cases; this one has no initial network issue:

http://validation.linaro.org/dashboard/attachment/487796/view#L121

Revision history for this message
Kim Phillips (kim-phillips) wrote :

Perhaps that's because it's not supported yet?

https://lists.cs.columbia.edu/pipermail/kvmarm/2013-October/007278.html

Revision history for this message
Zi Shen Lim (zlim) wrote :

Is this (1) hugepage enabled in Guest kernel, (2) hugepage enabled in host kernel, or (3) both?

Revision history for this message
Anders Roxell (aroxell) wrote :

When hugepages are enabled on the host.

If it's not supported yet, the linux-lng kernel should also fail to bring up the KVM guest, right?
But according to the log, that works?
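
For anyone reproducing this, here is a minimal sketch of checking whether transparent hugepages (THP) are active on the host. It assumes the standard sysfs file /sys/kernel/mm/transparent_hugepage/enabled, which is not mentioned in this thread, and it does not cover hugetlbfs-style hugepages. The function names are made up for illustration.

```python
from pathlib import Path

# Standard sysfs location for the THP mode on Linux (an assumption here, not
# taken from the bug report); the file is absent if THP is not built in.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the active mode from text such as '[always] madvise never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def host_thp_mode(path=THP_ENABLED):
    """Return the host's THP mode, or None if the kernel does not expose it."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("THP mode on this host:", host_thp_mode())
```

A result of None would suggest the running kernel was built without THP, which is itself useful to know when comparing the RT and non-RT kernels.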

Revision history for this message
Kim Phillips (kim-phillips) wrote :

> when hugepages are enabled on the host.

what .conf to use to enable hugepages? I don't see any appropriate ones...do you do it manually? If so, what are the CONFIG_ symbols to set?

Also, is it known whether this still occurs with the 3.10.14 RT9 kernel (esp. the networking failure)?

Revision history for this message
Mike Holmes (mike-holmes) wrote :

See https://bugs.launchpad.net/linaro-networking/+bug/1236655, which also uses networking and also fails only in the RT case, with the same symptom seen in the KVM testing, i.e. the udhcpc failure.

Revision history for this message
Kim Phillips (kim-phillips) wrote :

note: I'm not experiencing the udhcpc problem on my local board, but I am in the LAVA lab.

I traced the kernel where qemu was hanging, occupying 99% CPU, and found that __get_user_pages_fast was being called:

##### CPU 0 buffer started ####
 qemu-system-arm-1743 [000] ....21. 99.008416: unpin_current_cpu <-migrate_enable
 qemu-system-arm-1743 [000] ....1.. 99.008417: handle_exit <-kvm_arch_vcpu_ioctl_run
 qemu-system-arm-1743 [000] ....1.. 99.008417: kvm_condition_valid <-handle_exit
 qemu-system-arm-1743 [000] ....1.. 99.008418: kvm_handle_guest_abort <-handle_exit
 qemu-system-arm-1743 [000] ....1.. 99.008418: __srcu_read_lock <-kvm_handle_guest_abort
 qemu-system-arm-1743 [000] ....1.. 99.008419: kvm_is_visible_gfn <-kvm_handle_guest_abort
 qemu-system-arm-1743 [000] ....1.. 99.008420: gfn_to_memslot <-kvm_handle_guest_abort
 qemu-system-arm-1743 [000] ....1.. 99.008421: gfn_to_hva <-kvm_handle_guest_abort
 qemu-system-arm-1743 [000] ....1.. 99.008421: __gfn_to_hva_many <-gfn_to_hva
 qemu-system-arm-1743 [000] ....1.. 99.008422: rt_down_read <-kvm_handle_guest_abort
 qemu-system-arm-1743 [000] ....1.. 99.008422: __rt_down_read <-kvm_handle_guest_abort
 qemu-system-arm-1743 [000] ....1.. 99.008423: rt_mutex_lock <-__rt_down_read
 qemu-system-arm-1743 [000] ....1.. 99.008424: rt_mutex_slowlock <-__rt_down_read
 qemu-system-arm-1743 [000] ....1.. 99.008424: _raw_spin_lock <-rt_mutex_slowlock
 qemu-system-arm-1743 [000] ....2.. 99.008425: __try_to_take_rt_mutex <-rt_mutex_slowlock
 qemu-system-arm-1743 [000] ....2.. 99.008426: _raw_spin_unlock <-rt_mutex_slowlock
 qemu-system-arm-1743 [000] ....1.. 99.008427: find_vma <-kvm_handle_guest_abort
 qemu-system-arm-1743 [000] ....1.. 99.008428: rt_up_read <-kvm_handle_guest_abort
 qemu-system-arm-1743 [000] ....1.. 99.008428: rt_mutex_unlock <-kvm_handle_guest_abort
 qemu-system-arm-1743 [000] ....1.. 99.008429: _raw_spin_lock <-rt_mutex_unlock
 qemu-system-arm-1743 [000] ....2.. 99.008430: _raw_spin_unlock <-kvm_handle_guest_abort
 qemu-system-arm-1743 [000] ....1.. 99.008431: gfn_to_pfn_prot <-kvm_handle_guest_abort
 qemu-system-arm-1743 [000] ....1.. 99.008431: __gfn_to_pfn <-gfn_to_pfn_prot
 qemu-system-arm-1743 [000] ....1.. 99.008432: __gfn_to_pfn_memslot <-gfn_to_pfn_prot
 qemu-system-arm-1743 [000] ....1.. 99.008432: __gfn_to_hva_many <-__gfn_to_pfn_memslot
 qemu-system-arm-1743 [000] ....1.. 99.008433: get_user_pages_fast <-__gfn_to_pfn_memslot
 qemu-system-arm-1743 [000] ....1.. 99.008434: rt_down_read <-get_user_pages_fast
 qemu-system-arm-1743 [000] ....1.. 99.008434: __rt_down_read <-get_user_pages_fast
 qemu-system-arm-1743 [000] ....1.. 99.008435: rt_mutex_lock <-__rt_down_read
 qemu-system-arm-1743 [000] ....1.. 99.008436: rt_mutex_slowlock <-__rt_down_read
 qemu-system-arm-1743 [000] ....1.. 99.008436: _raw_spin_lock <-rt_mutex_slowlock
 qemu-system-arm-1743 [000] ....2.. 99.008437: __try_to_take_rt_mutex <-rt_mutex_slowlock
 qemu-system-arm-1743 [000] ....2.. 99.008438: _raw_spin_unlock <-rt_mutex_slowlock
 qemu-system-arm-1743 [000] ....1.. 99.008439: g...
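
A trace of this shape is easier to read once it is summarized. The sketch below is one possible way to do that: it parses function-tracer lines in the format shown above and counts how often each function appears. The regular expression and helper names are illustrative, and the sample lines are copied from the trace above.

```python
import re
from collections import Counter

# Matches function-tracer lines: task-pid [cpu] flags timestamp: func <-caller
LINE_RE = re.compile(
    r"^\s*(?P<task>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+(?P<flags>\S+)\s+"
    r"(?P<ts>\d+\.\d+):\s+(?P<func>\S+)\s+<-(?P<caller>\S+)\s*$"
)


def parse_trace(lines):
    """Yield (timestamp, function, caller) for every line that parses."""
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            yield float(m.group("ts")), m.group("func"), m.group("caller")


def hottest_functions(lines, top=5):
    """Count how often each traced function appears, most frequent first."""
    counts = Counter(func for _, func, _ in parse_trace(lines))
    return counts.most_common(top)


sample = [
    " qemu-system-arm-1743 [000] ....1.. 99.008433: get_user_pages_fast <-__gfn_to_pfn_memslot",
    " qemu-system-arm-1743 [000] ....1.. 99.008434: rt_down_read <-get_user_pages_fast",
    " qemu-system-arm-1743 [000] ....1.. 99.008422: rt_down_read <-kvm_handle_guest_abort",
    "##### CPU 0 buffer started ####",  # header lines are skipped
]
print(hottest_functions(sample))
```

Run over a full capture, a function that dominates the counts while qemu sits at 99% CPU is a reasonable first suspect for a retry loop.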


Revision history for this message
Kim Phillips (kim-phillips) wrote :

On another note, one of the KVM guys brought my attention to this build warning:

  CC arch/arm/mm/dma-mapping.o
arch/arm/mm/dma-mapping.c:253:2: warning: #warning ARM Coherent DMA allocator does not (yet) support huge TLB [-Wcpp]

which appears to be cleared up with this:

http://lists.infradead.org/pipermail/linux-arm-kernel/2013-July/184116.html

which suggests hugepage support was added in v3.11-rc1, which I think is a typo because this commit:

commit 1355e2a6eb88f04d76125c057dc5fca64d4b6a9e
Author: Catalin Marinas <email address hidden>
Date: Wed Jul 25 14:32:38 2012 +0100

    ARM: mm: HugeTLB support for LPAE systems.

has a v3.10-rc3-3-g1355e2a description, and is in our v3.10.14 LNG tree.

In looking into getting rid of the build warning by applying those two patches, I see:

commit 4bfab2034bab9374eba1921cf7bd51fd8d48661b
Author: Steven Capper <email address hidden>
Date: Fri Jul 26 14:58:22 2013 +0100

    ARM: 7792/1: mm: Remove general hugetlb code from ARM

sitting in Linus' upstream tree, but no "ARM: mm: Remove HugeTLB warning from dma-mapping.c". Sigh, since there were no negative responses to the thread. I don't know enough about hugetlb: can anyone else tell what's going on here?

Revision history for this message
Maxim Uvarov (maxim-uvarov) wrote :

This patch removes this warning:
http://patches.linaro.org/18402/
[2/2] ARM: mm: Remove HugeTLB warning from dma-mapping.c

Revision history for this message
Steve Capper (steve-capper) wrote :

I've resent the HugeTLB warning fix patch just now:
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/203767.html

I am applying the finishing touches to the fast_gup patch for ARM. I expect to have some code available soon.

Revision history for this message
Steve Capper (steve-capper) wrote :

Ok, I've put up a branch at:
https://git.linaro.org/gitweb?p=people/stevecapper/linux.git;a=summary

for-lng/fast_gup is where I'll keep the implementation.
I've tagged the code to: fast_gup_lng_20131010

This works for me on my Arndale when I try a futex on THP tail.

I am going to read over this a bit more and then will submit the patches to lakml.

Please let me know if there are any problems with KVM.

Thanks,
--
Steve

Revision history for this message
Christoffer Dall (cdall) wrote :

Just to clarify the KVM support for THP.

What we are talking about is host kernel support for THP and how that interacts with KVM.

There were bugs in the previous huge page patch. Please keep an eye on kvm-arm-next over the next few days; we will be merging a more well-tested and reviewed patch soon.

For the record, without the THP patches for KVM, if THP is enabled on the host kernel, guest memory may be backed by pages that Linux groups as THPs, but the Stage-2 page tables would map the pages using 4K mappings. When the KVM THP patches are present, the Stage-2 entries will be 2MB huge mappings.

Hope this clarifies things.

-Christoffer
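
To put rough numbers on the 4K versus 2MB Stage-2 mappings described above, here is a small arithmetic sketch. The 512 MiB guest size is a hypothetical value chosen for illustration, not taken from this bug, and the helper name is made up.

```python
PAGE_4K = 4 * 1024
PAGE_2M = 2 * 1024 * 1024


def stage2_entries(guest_ram_bytes, mapping_size):
    """Number of Stage-2 leaf mappings needed to cover guest RAM (rounded up)."""
    return -(-guest_ram_bytes // mapping_size)


guest_ram = 512 * 1024 * 1024  # hypothetical 512 MiB guest, not from the bug report

print(stage2_entries(guest_ram, PAGE_4K))  # 131072 mappings of 4K
print(stage2_entries(guest_ram, PAGE_2M))  # 256 mappings of 2MB
print(PAGE_2M // PAGE_4K)                  # 512 4K pages per 2MB mapping
```

Each 2MB mapping stands in for 512 separate 4K entries, so with the KVM THP patches a guest of this size would need far fewer Stage-2 leaf entries.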

Revision history for this message
Kim Phillips (kim-phillips) wrote :

I seem to have found a potential fix to the problem in the LAVA lab.
I've since gone from 100% failure to 100% success, although I've only
tried two so far :) :

http://validation.linaro.org/scheduler/job/78394/log_file#L_29_365

http://validation.linaro.org/scheduler/job/78390/log_file#L_29_368

the change I made was to the bin/busybox.nosuid binary:

-udhcpc -R -n -p /var/run/udhcpc.%iface%.pid -i %iface%
+udhcpc -t 10 -p /var/run/udhcpc.%iface%.pid -i %iface%

that is, omit the:

-n,--now Exit with failure if lease is not immediately obtained

and jack up the retries parameter:

-t,--retries=N Send up to N request packets

(although I can't tell what the default retries is).

for more info on the parameters, go to
http://busybox.net/downloads/BusyBox.html and search the page for udhcp.

The fix-vs.-workaround argument here is that the lab's DHCP server has
a much higher latency than local development systems. If that's
acceptable, we need to amend the rootfs build to perform the above
changes in the busybox configuration. Any tips on where that lives in
the massively overloaded meta-maze called OE would be appreciated.

If not, I'd like to request root access to the DHCP server for diagnostics.
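
The latency argument above can be illustrated with a toy model. This is only a sketch under loud assumptions: it treats the client as sending a fixed number of requests at a fixed spacing and succeeding only if the server's first answer arrives before the client gives up. It is not how udhcpc is implemented, and every number in it is hypothetical, including the 3-second spacing, which is not a confirmed udhcpc default.

```python
def lease_obtained(server_latency_s, retries, wait_per_try_s):
    """Toy model: the client sends `retries` requests, `wait_per_try_s` apart,
    and succeeds only if the server's first answer arrives before it gives up."""
    give_up_after_s = retries * wait_per_try_s
    return server_latency_s < give_up_after_s


# All numbers below are hypothetical, chosen only to illustrate the effect.
slow_lab_server_s = 12.0
fast_local_server_s = 0.5

print(lease_obtained(fast_local_server_s, retries=3, wait_per_try_s=3.0))  # True
print(lease_obtained(slow_lab_server_s, retries=3, wait_per_try_s=3.0))    # False
print(lease_obtained(slow_lab_server_s, retries=10, wait_per_try_s=3.0))   # True
```

Under this model a fast local server succeeds with few retries while a slow one only succeeds once the retry budget is raised, which matches the local-board versus LAVA-lab difference reported in this thread.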

Revision history for this message
Mike Holmes (mike-holmes) wrote : Re: [Bug 1234718] Re: kernel linux-lng-preempt-rt wont boot kvm guest when hugepages are enabled

Matt has also reported sluggish DHCP response in the LNG lab, so we should
check whether the lab server is performing acceptably.
On Oct 11, 2013 7:10 PM, "Kim Phillips" <email address hidden> wrote:


Revision history for this message
Mike Holmes (mike-holmes) wrote :

This cannot be verified until the udhcp issue in the LNG lab is fixed.

Changed in linaro-networking:
status: Triaged → Fix Committed
assignee: nobody → Kim Phillips (kim-phillips)
Revision history for this message
Mike Holmes (mike-holmes) wrote :
Revision history for this message
Kim Phillips (kim-phillips) wrote :

Verify whether there is a genuine failure caused specifically by removing the (THP) && !PREEMPT_RT_FULL config exception clause in the LNG kernel. The outcome decides whether to switch from this bug to effectively reopening card LNG-17, i.e., new work to provide enough rationale to submit the patch upstream.
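
As a rough aid for that verification, the sketch below checks whether a kernel .config has both THP and PREEMPT_RT_FULL enabled, which is the combination the exception clause would normally prevent. The symbol names CONFIG_TRANSPARENT_HUGEPAGE and CONFIG_PREEMPT_RT_FULL are assumed from the clause quoted above, the helper names are made up, and the example config text is invented for illustration.

```python
def read_kconfig(text):
    """Parse kernel .config text into {symbol: value}; unset symbols map to 'n'."""
    suffix = " is not set"
    symbols = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            symbols[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            symbols[name] = value
    return symbols


def thp_with_rt_full(symbols):
    """True when both THP and PREEMPT_RT_FULL are enabled in the same config."""
    return (symbols.get("CONFIG_TRANSPARENT_HUGEPAGE") == "y"
            and symbols.get("CONFIG_PREEMPT_RT_FULL") == "y")


# Invented example text; a real check would read the build's .config file.
example = """\
CONFIG_PREEMPT_RT_FULL=y
CONFIG_TRANSPARENT_HUGEPAGE=y
# CONFIG_HUGETLBFS is not set
"""
cfg = read_kconfig(example)
print(thp_with_rt_full(cfg))
```

If this reports True for the failing kernel and False for a working one, that would point at the removed clause rather than at DHCP.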

Revision history for this message
Kim Phillips (kim-phillips) wrote :

this appears to be working now:

http://validation.linaro.org/scheduler/job/82285/log_file#L_23_785

Can anyone point to the original non-DHCP-related failure, if any? If none, this bug is just the same as all the other bugs suffering from DHCP problems, and should probably be closed/made duplicate of the bug that specifically targets the DHCP bug.

Revision history for this message
Mike Holmes (mike-holmes) wrote :

Gary is monitoring this patch upstream and it has been applied to the current LNG kernel.
It will be closed when it comes back from upstream.

Revision history for this message
Mike Holmes (mike-holmes) wrote :

Gary to check with Steve Capper, since it is not in 3.12, the next version LNG is moving to.

Revision history for this message
Gary S. Robertson (gary-robertson) wrote :

Checked with Steve Capper and his updated patches are still in progress. He expects them to be finished sometime after the first of 2014. In the meantime we are porting his existing patches to the 3.12 LNG kernel and will revert these and replace them with the updated patches as those are made available.

Revision history for this message
Gary S. Robertson (gary-robertson) wrote :

Need to check back with Steve Capper about latest patches. An intermediate version of the patches applied to the 3.10 kernel appeared to cause instability, so we are waiting on the official version to be available.

Revision history for this message
Gary S. Robertson (gary-robertson) wrote :

Oops - cancel that last comment about the instability in the 3.10 kernel - that actually involved the NO_HZ patches rather than the THP patches. However we still need an update on the status of Capper's new patches.

Revision history for this message
Gary S. Robertson (gary-robertson) wrote :

Steve Capper is actively working to get these patches accepted upstream, but is having to revise the patches to make them more palatable for the upstream maintainers. See Capper's comments in the email thread at:

https://mail.google.com/mail/u/0/?shva=1#inbox/1439148fabe7a5e2

Revision history for this message
Gary S. Robertson (gary-robertson) wrote :

Steve Capper's latest comments for those who can't follow the link in the previous comment:

The fast_gup is still being actively worked on. I've summarised what's
going on in the following page:
 https://wiki.linaro.org/Internal/People/SteveCapper/fast-gup

Essentially I'm trying to grab some database performance data to
justify the patches because I'm getting raised eyebrows. I am swearing
at databases as we speak...

The hugetlb warning patch has had the commit log rewritten to try and
make it more palatable for upstream and I've sent off a V2 just now
(with you on CC).

Revision history for this message
Mike Holmes (mike-holmes) wrote :

Still waiting on upstream

Revision history for this message
Gary S. Robertson (gary-robertson) wrote :

Friday 07 February - Added Steve Capper's latest patches (written for the 3.13 kernel) to our staging 3.12 kernel. First patch of four did not apply cleanly but after a couple of tries I managed to adapt it. As written it caused the compiler to attempt to pull in an include file <asm>/perf_regs.h, which does not exist in the 3.12.9 kernel. After resolving this issue and applying all four patches successfully, attempted to build and test the resulting kernel.

Unfortunately the kernel build dies with an 'error 2' somewhere in a sub-make trying to build either fs/ext4/built-in.o or fs/built-in.o. I was unable to quickly determine the source or details of the error and consequently tabled further efforts to build with the patch. Maybe I can resume this effort after completing some other tasks needed for LCA14.

Revision history for this message
Mike Holmes (mike-holmes) wrote :

Maybe Steve has a few cycles to help us backport it. It would be good to
get it onto 3.10 as well, so that the Keystone can take advantage of it.

On 7 February 2014 19:33, Gary S. Robertson <email address hidden> wrote:


Revision history for this message
Gary S. Robertson (gary-robertson) wrote :

Since KVM guests are booting okay now I think we might close this bug and open a new one instead which is specific to the fact that the latest THP fast-gup patches break our 3.12 staging kernel build. Priority could be lowered as well since our existing patches seem to be doing okay for now.

Revision history for this message
Mike Holmes (mike-holmes) wrote :

Closing, as the original issue is fixed. A new related issue has been found: https://bugs.launchpad.net/linaro-networking/+bug/1282658

Changed in linaro-networking:
status: Fix Committed → Fix Released