Preempt_rt kernel enters idle loop even when there are processes ready

Bug #1224318 reported by Magnus Karlsson
Affects: linaro-networking
Status: Invalid
Importance: High
Assigned to: viresh kumar
Milestone: (none)

Bug Description

Hi,

I am comparing the core isolation properties of 3.6/3.7 and 3.10 with preempt_rt, and for some reason the numbers are much worse for the 3.10 preempt_rt kernel (LNG version). This is the experiment: I boot Linux with the following boot options.

setenv bootargs "isolcpus=1 nohz_full=1 rcu_nocbs=1 root=/dev/mmcblk1p2 rw rootwait console=ttySAC2,115200n8 init --no-log"

Linux config options can be found in the attached file (or so I thought; it seems I can only attach a single file. How do I attach multiple?). I run a heavy load on core 0 and only a busy loop on core 1, where I measure how long it takes to traverse the loop. Usually an iteration is very quick, but once in a while there is a tick or some other disturbance and traversing the loop takes much longer. I am only interested in the maximum latency for traversing the loop, measured over a minute or so.
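
For reference, a minimal sketch of the kind of measurement loop described above (my reconstruction, not the actual benchmark; it assumes CLOCK_MONOTONIC timestamps and prints whenever a new maximum is seen):

  /* Sketch of the busy loop: time each iteration and report the
   * worst case. Not the actual benchmark; assumes CLOCK_MONOTONIC. */
  #include <stdint.h>
  #include <stdio.h>
  #include <time.h>

  int main(void)
  {
      struct timespec prev, now;
      int64_t max_ns = 0;

      clock_gettime(CLOCK_MONOTONIC, &prev);
      for (;;) {
          clock_gettime(CLOCK_MONOTONIC, &now);
          int64_t delta_ns = (now.tv_sec - prev.tv_sec) * 1000000000LL
                           + (now.tv_nsec - prev.tv_nsec);
          if (delta_ns > max_ns) {
              max_ns = delta_ns;
              printf("Max latency: %lld ns\n", (long long)max_ns);
          }
          prev = now;
      }
      return 0;
  }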

In 3.6/3.7 the max latency was 15 usec, but with 3.10 it is 10 times as much: 150 usec. If you take a look at the attached function trace log (filtered to show only core 1), it seems like core 1 goes into idle even though I am executing a busy loop at the highest real-time priority, locked to core 1 with SCHED_FIFO, and there is nothing else on that core. It should get 100% of that core, but if I run top, I see that it gets around 95% and sometimes even less. It is also dependent on the load I run on core 0: if I do not load it, the numbers get slightly better. This did not happen in 3.6/3.7.

I am currently running linux-lng-preempt-rt-v3.10.10-rt7. When I was running 3.10.6-rt3 this behavior was even worse: as soon as I started to load core 0, core 1 went into idle for long periods for some reason. It is much better in 3.10.10-rt7, but still no cigar. Could this be a problem with the RCU implementation in preempt_rt? It would be great if you could take a look at this, since the latencies are way too high at the moment.
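
For context, running the loop at the highest real-time priority as described would typically be set up with something like this (an assumed sketch, not taken from the benchmark; sched_setscheduler and sched_get_priority_max are standard POSIX calls):

  /* Sketch: switch the calling thread to SCHED_FIFO at the highest
   * real-time priority. Assumed setup, not from the actual benchmark. */
  #include <sched.h>
  #include <stdio.h>
  #include <string.h>

  static void set_fifo_max(void)
  {
      struct sched_param sp;

      memset(&sp, 0, sizeof(sp));
      sp.sched_priority = sched_get_priority_max(SCHED_FIFO);
      if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1)
          perror("sched_setscheduler");
  }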

In 3.6/3.7, only the tick has an impact on the max latency of the loop.

BTW, I have tried without NO_HZ_FULL in the kernel with the same results.

Thanks: Magnus

Revision history for this message
Magnus Karlsson (magnus-karlsson) wrote :

Added config file.

Revision history for this message
Gary S. Robertson (gary-robertson) wrote :

Since this occurs with or without NO_HZ_FULL configured in, it actually sounds more like a CPU isolation failure than a NO_HZ_FULL failure. It rather sounds as though the scheduler isn't fully honoring the 'isolcpus=1' boot command line option, and is causing the idle task to run on the 'isolated' CPU since no other tasks are 'scheduled' there. But this is just conjecture until we can do some actual testing.

Revision history for this message
viresh kumar (viresh.kumar) wrote :

Magnus,

I expect the test case would behave the same on our non-RT kernel as well (i.e. linux-lng). I don't think there should be any difference here with or without RT support, as isolcpus should work for your busy task.

Can you give it a try on the non-RT kernel? That will help us isolate the offending piece of code.

Revision history for this message
viresh kumar (viresh.kumar) wrote :

Magnus,

Can you provide the test scripts that you run? Also, what hardware did you use for your tests? Exynos? Any other detail you think is important for reproducing this issue would also help, ideally all the minute steps to follow.

Revision history for this message
Magnus Karlsson (magnus-karlsson) wrote :

Viresh,

Here is the latency benchmark that was used.

Revision history for this message
Magnus Karlsson (magnus-karlsson) wrote :

Here is the configuration.

I am running on an Arndale board at 1.4 GHz. Boot cmdline options in previous posting. Once up and running, I load core 0 with:

find /usr -type f -exec scp -q \{\} mkarlsso@10.0.0.1:/dev/null \; &

But you can probably skip this, as the numbers will be bad even without it.

Then I launch the latency benchmark with ./latency. A max latency of around 15 us is expected on 3.6 and 3.7 (ignore the first run, though). On 3.10 I get nearly 5 times as much.

I am using linux-linaro-lng-preempt-rt-3.10.6-2013.08.

BTW, this might not have anything to do with preempt_rt as I can get bad numbers with the regular preempt in linux-linaro-lng-3.10.6-2013.08 too.

Let me know if you need more information.

/Magnus

Revision history for this message
Magnus Karlsson (magnus-karlsson) wrote :

Forgot to add that you have to launch the latency application on core 1, e.g. using the cpuset file system or taskset. Or you could add this to the benchmark itself:

  /* Pin the calling thread to core 1 (needs _GNU_SOURCE and <sched.h>) */
  cpu_set_t cpu_set;

  CPU_ZERO(&cpu_set);      /* clear the mask ...        */
  CPU_SET(1, &cpu_set);    /* ... then allow only cpu 1 */
  if (sched_setaffinity(0, sizeof(cpu_set_t), &cpu_set) == -1)
  {
      perror("sched_setaffinity");
  }

Revision history for this message
Kevin Hilman (khilman-deactivatedaccount) wrote : Re: [Bug 1224318] Re: Preempt_rt kernel enters idle loop even when there are processes ready

Magnus Karlsson <email address hidden> writes:

> Here is the configuration.
>
> I am running on an Arndale board at 1.4 GHz. Boot cmdline options in
> previous posting. Once up and running, I load core 0 with:
>
> find /usr -type f -exec scp -q \{\} mkarlsso@10.0.0.1:/dev/null \; &
>
> But you can probably skip this as the number will be bad even without
> it.
>
> Then I launch the latency benchmark with ./latency. A max latency around
> 15 us is expected from 3.6 and 3.7. ignore the first run though. On 3.10
> I get nearly 5 times as much.
>
> I am using linux-linaro-lng-preempt-rt-3.10.6-2013.08.
>
> BTW, this might not have anything to do with preempt_rt as I can get bad
> numbers with the regular preempt in linux-linaro-lng-3.10.6-2013.08 too.

Can you reproduce against mainline?

Kevin

Revision history for this message
Magnus Karlsson (magnus-karlsson) wrote :

I can reproduce this in mainline. I tried the tip of linux-3.10.y (3.10.13) from kernel.org, and the problem is still there. I have attached the function trace. Search for the "Max latency" marker that I print out when the latency in the loop goes above 500 us. If you go back in time in the trace, you can see that the "latency" app is context switched in from the idle thread on core 1.

Note that I am running with the standard preempt option (not the preempt_full patch or the server version option).

Revision history for this message
Magnus Karlsson (magnus-karlsson) wrote :

Here is the config.

Revision history for this message
Mike Holmes (mike-holmes) wrote :

The latency benchmark is being added to CI, pending B&B supporting out-of-tree kernel module builds.

Changed in linaro-networking:
importance: Undecided → Medium
Changed in linaro-networking:
importance: Medium → High
Changed in linaro-networking:
assignee: nobody → viresh kumar (viresh.kumar)
Revision history for this message
Magnus Karlsson (magnus-karlsson) wrote :

I cannot reproduce this in the latest kernel, so I am going to close it. I do not know exactly what fixed it.

Changed in linaro-networking:
status: New → Invalid
Revision history for this message
viresh kumar (viresh.kumar) wrote :

Magnus,

I tried the cpuset code you talked about earlier, but with it my task isn't actually pinned to cpu 1. I will attach my updated code as well.

I traced it using ps -aF while running latency in the background, and on a number of occasions it ran on cpu 0.

Then I used `taskset -c 1 ./latency` and it worked without any issues; the task sticks to cpu 1.
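
For completeness, one way to double-check from inside the program that the pinning took effect is to read the affinity mask back (a hedged sketch using standard glibc calls; sched_getaffinity, sched_getcpu, and CPU_ISSET all require _GNU_SOURCE):

  /* Sketch: read back the affinity mask to confirm the task is pinned
   * to cpu 1 (requires _GNU_SOURCE and <sched.h>). */
  cpu_set_t verify;

  CPU_ZERO(&verify);
  if (sched_getaffinity(0, sizeof(cpu_set_t), &verify) == -1)
      perror("sched_getaffinity");
  else
      printf("running on cpu %d, pinned to cpu 1: %s\n",
             sched_getcpu(), CPU_ISSET(1, &verify) ? "yes" : "no");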

Revision history for this message
viresh kumar (viresh.kumar) wrote :

The numbers I am getting on 3.10.13, with the topmost commit:
cff43fc Linux 3.10.13

are:

maxrange=16301 cycles = ~12 us (16301 cycles at the Arndale's 1.4 GHz is about 11.6 us).

So I am unable to reproduce the bug with the kernel where Magnus reported it.
