System locks when hot plugging CPUs in different clusters

Bug #1188778 reported by Mathieu Poirier on 2013-06-07
Affects: Linaro big.LITTLE
Status: In Progress
Importance: Undecided
Assigned to: Mathieu Poirier
Milestone: (none listed)

Bug Description

This is the same problem that we've seen before. It resurfaced because of:

b2deabe arm: arch_timer: add arch_counter_set_user_access

where the EVNTEN bit of register CNTKCTL is cleared as part of the PL0 access restriction. The fix is as simple as masking with 0x3 rather than 0x7, so that only the two low PL0 access bits are cleared and EVNTEN (bit 2) is left untouched. Investigation at ARM is under way to see whether doing so will have an effect on KVM.
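To make the difference between the two masks concrete, here is a minimal user-space sketch. It only models the low three bits of CNTKCTL on a plain integer; it does not touch the real register, and the helper names (clear_low_bits_0x7, clear_low_bits_0x3, evnten_is_set, check_masks) are hypothetical, not taken from the kernel commit. The actual kernel function may also clear other fields that are not modelled here.

```c
#include <assert.h>
#include <stdint.h>

/* CNTKCTL low bits: bit 0 and bit 1 gate PL0 counter access,
 * bit 2 (EVNTEN) enables the event stream. */
#define CNTKCTL_EVNTEN (1u << 2)

/* Hypothetical model of the behaviour described in this bug:
 * masking with 0x7 clears the PL0 access bits AND EVNTEN. */
static uint32_t clear_low_bits_0x7(uint32_t cntkctl)
{
    return cntkctl & ~0x7u;
}

/* Hypothetical model of the proposed fix:
 * masking with 0x3 clears only the two PL0 access bits. */
static uint32_t clear_low_bits_0x3(uint32_t cntkctl)
{
    return cntkctl & ~0x3u;
}

/* Returns non-zero when the event stream enable bit is still set. */
static int evnten_is_set(uint32_t cntkctl)
{
    return (cntkctl & CNTKCTL_EVNTEN) != 0;
}

/* Small self-check on an arbitrary sample value with bits 0..2 set. */
static void check_masks(void)
{
    uint32_t sample = 0x7u;

    assert(!evnten_is_set(clear_low_bits_0x7(sample))); /* EVNTEN lost */
    assert(evnten_is_set(clear_low_bits_0x3(sample)));  /* EVNTEN kept */
}
```

Calling check_masks() from a test harness exercises both variants: with the 0x7 mask the event stream enable bit is dropped, with the 0x3 mask it survives.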

The solution should be pushed upstream.

Changed in linaro-big-little-system:
assignee: nobody → Mathieu Poirier (mathieu.poirier)
status: New → In Progress

Just to add a little more information on this problem: it has nothing to do with the MMC/DMA/cache flush issue associated with the "reboot hotplug" bug (#1166246) currently being investigated by ARM.

This bug is caused by event notifications between clusters being too short to be noticed when the frequency delta between the clusters is too big. As a result, events are lost and processors keep waiting in wfe() calls, resulting in a system lockup. Moving to a stream-based system to deliver events prevents this situation from occurring.

Clearing the EVNTEN bit of register CNTKCTL prevents those events from being delivered, once again causing notifications to be lost.

A patch was submitted internally to alleviate the problem. The real fix is coming from ARM.

The real patch set was submitted:

https://lkml.org/lkml/2013/6/18/538

tags: added: bl-mp