Isolation might be broken for SoCs due to different arch-timers

Bug #1333545 reported by viresh kumar
Affects: linaro-networking
Status: Fix Released
Importance: Medium
Assigned to: viresh kumar

Bug Description

Currently the isolation scripts depend strictly on the output of cat /proc/interrupts, from which they pick out the timer interrupt count for the isolated CPU with this pipeline:

cat /proc/interrupts | grep arch_timer | grep 30 | sed 's/\s\+/ /g' | sed 's/^\s//g' | cut -d' ' -f$((2+$ISOL_CPU))

This works only if arch_timer is present; otherwise it may report wrong results. Somebody needs to look into improving this part.
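For illustration, here is a minimal sketch of one way to read the same count without hard-coding the IRQ number or the column offset. The helper name irq_count_for_cpu and its arguments are hypothetical and are not taken from the isolation scripts. It assumes the usual /proc/interrupts layout: a header row of CPU0, CPU1, ... followed by rows that start with an "IRQ:" label.

```shell
# irq_count_for_cpu NAME CPU [FILE]
# Hypothetical helper (not from the isolation scripts): print the interrupt
# count for CPU on the first row of FILE that contains NAME.
# FILE defaults to /proc/interrupts.
irq_count_for_cpu() {
    awk -v name="$1" -v cpu="$2" '
        NR == 1 {
            # Header row: find the field named CPU<cpu>. Data rows carry an
            # extra leading "IRQ:" label, so the count sits one field later.
            for (i = 1; i <= NF; i++)
                if ($i == ("CPU" cpu))
                    col = i + 1
            next
        }
        # First data row mentioning NAME: print the count and stop.
        col && index($0, name) { print $col; exit }
    ' "${3:-/proc/interrupts}"
}
```

A call such as irq_count_for_cpu arch_timer "$ISOL_CPU" would then stand in for the pipeline above. The timer name is still passed in by the caller, so the dependency on arch_timer itself remains.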

Hongbo Zhang (hongbo-zhang) wrote :

On Snowball with dual A9 cores, this works:
cat /proc/interrupts | grep twd | sed 's/\s\+/ /g' | sed 's/^\s//g' | cut -d' ' -f$((2+$ISOL_CPU))

Hongbo Zhang (hongbo-zhang) wrote :

I guess A9 uses twd, while A15 uses arch_timer. This needs to be verified.
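One way to check this on a given board is to probe /proc/interrupts for each candidate name. The sketch below is only an illustration: the function name find_timer_name is hypothetical, and the candidate list contains just the two names mentioned in this report, so other SoCs may need more entries.

```shell
# find_timer_name [FILE]
# Hypothetical helper: print the first known per-CPU timer name found in
# FILE (default /proc/interrupts). Returns non-zero if none is present.
find_timer_name() {
    timer_file=${1:-/proc/interrupts}
    # Candidates come from this bug report only: arch_timer (A15), twd (A9).
    for timer_name in arch_timer twd; do
        if grep -qF -- "$timer_name" "$timer_file"; then
            echo "$timer_name"
            return 0
        fi
    done
    return 1
}
```

Running find_timer_name on each platform would show which name, if any, its kernel reports. Note that the match is a plain substring search, so an unrelated interrupt whose description happens to contain one of the names would also match.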

viresh kumar (viresh.kumar) wrote :

Yes, of course. I knew that. I am updating my scripts to get rid of this dependency.

Changed in linaro-networking:
assignee: nobody → viresh kumar (viresh.kumar)
importance: Undecided → Medium
status: New → Confirmed
Mike Holmes (mike-holmes) wrote :

New version up for review: https://review.linaro.org/2414

Mike Holmes (mike-holmes) wrote :

This has been merged.

Changed in linaro-networking:
status: Confirmed → Fix Committed
Mike Holmes (mike-holmes) wrote :

This patch may now be correct, but the result appears to be a reduction in the passing rate. The Arndale LE run on the 3rd had the patch, and the isolation time is down to 72 seconds in this one case. This appears to be reflected on the other platforms as well.

https://validation.linaro.org/dashboard/image-charts/LNG-NO_HZ

viresh kumar (viresh.kumar) wrote :

Something has surely changed with Arndale, as it has started getting timeout interruptions because the timer counter overflows after 90 seconds.

For all other platforms (x86, amarilo, etc.), the kernel version is incorrect. We must use 3.14, as the patches are present there.

--
viresh

Anders Roxell (aroxell) wrote :
Mike Holmes (mike-holmes) wrote :

x86 was apparently working prior to this patch; the failure after this patch is applied is OK.

Changed in linaro-networking:
status: Fix Committed → Fix Released