non-epicsThreads may receive wrong priority and crash

Bug #1816841 reported by Till Straumann
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
EPICS Base
New
Undecided
Unassigned

Bug Description

The following applies to the POSIX threading implementation.

If an application is launched with enough privilege to use the real-time scheduler (SCHED_FIFO) and creates some non-epicsThreads which use SCHED_OTHER then a bug is triggered when the non-epicsThread gets a 'pseudo-epics context' attached (via 'createImplicit()' e.g, by calling epicsThreadSuspendSelf()):

createImplicit() computes a thread priority for the non-epics thread assuming SCHED_FIFO -- which is actually wrong. Under some circumstances the 'illegal' priority value can lead to a crash.

The attached patch fixes this and cleans up some cruft. It also makes sure that whenever 'pcommonAttr' is used the scheduling policy of pcommonAttr matches the scheduling policy of the target thread.

The patch also adds a test program which exercises the described behavior - however, it is necessary to execute the test with sufficient privilege for epicsThreads to use SCHED_FIFO (if the privilege is lacking then the test is skipped).

The patch is against R7.0.2.

Revision history for this message
Till Straumann (strauman) wrote :
Revision history for this message
Andrew Johnson (anj) wrote :

Thanks Till.

Ralph, does ITER use SCHED_FIFO at all, and if so would you be able to test this patch?

There might be some conflicts between these changes and Heinz Junkes' modifications to support RTEMS-5 (using pthreads), although I doubt they would be large. We should however check whether this patch will apply along with his changes, and try to run the tests on RTEMS-5 too (which may be a little tricky to do).

The RTEMS question is orthogonal to the decision whether this patch would be applied to the 3.15 branch and merged up instead of applying it straight to the 7.0 branch.

Revision history for this message
Heinz Junkes (junkes) wrote :

Thanks Till,

I'll apply that for my RTESM5 port too and test it.
(Meanwhile I use the posix osdTthread variant also for RTEMS5).

If "we" are already there, the following call has been causing problems for a long time.

- pthreadInfo = init_threadInfo("_main_",0,epicsThreadGetStackSize(epicsThreadStackSmall),0,0);
+ pthreadInfo = init_threadInfo("_main_",0,0,0);

So that the "ur-task" is started, during POSIX_Init. But as non Epics-task it gets the OSI-prio "0".
Shouldn't something like IocShPrio 91 be set here?

Unfortunately I can't set it in rtems_init with epicsThreadSetPriority because it is a
non-epics-task.

But in some tests we read the epicsPrio from this thread (and you get it back as 0) and then e.g. in the ringPointerTest.c we create some threads with the prio 0 and 1 ;-)
Then they hang ...

"ringPointerTest.c" line 171 of 248

static void testPair(int locked)
{
    unsigned int myprio = epicsThreadGetPrioritySelf(), consumerprio;
    pairPvt pvt;

...
testDiag("single producer, single consumer with%s locking", locked?"":"out");

    /* give the consumer thread a slightly higher priority so that
     * it can preempt us on RTOS targets. On non-RTOS targets
     * we expect to be preempted at some random time
     */
    if(!epicsThreadLowestPriorityLevelAbove(myprio, &consumerprio))
        testAbort("Can't run test from thread with highest priority");

    epicsThreadMustCreate("pair", consumerprio,
                          epicsThreadGetStackSize(epicsThreadStackSmall),
                          &pairConsumer, &pvt);
    /* wait for worker to start */
    epicsEventMustWait(pvt.sync);

Revision history for this message
Till Straumann (strauman) wrote : Re: [Bug 1816841] Re: non-epicsThreads may receive wrong priority and crash
Download full text (3.3 KiB)

With the changes I made, epicsThreadSetPriority() should -- in principle
-- also work for non-EPICS threads,
as long as the target thread uses the same scheduling policy as pcommonAttr.

However, for sake of backwards compatibility I did not yet remove the test

     if(!pthreadInfo->isEpicsThread) {
         fprintf(stderr,"epicsThreadSetPriority called by non epics
thread\n");
         return;
     }

(but you can find the comment:

     /* T.S.: There is in principle no problem with changing the priority of
      * a non-EPICS thread -- as long as it uses the same scheduling policy.
      * For sake of backwards compatibility I leave this test for now but
      * it could as well be removed (and the 'isEpicsThread' member as
well).
      */
).

I.e., as a result of this discussion we could remove the above test and
you then should be
able to set the priority of the 'main' thread.

More comments in-line below...

On 2/21/19 9:33 AM, Heinz Junkes wrote:
> Thanks Till,
>
> I'll apply that for my RTESM5 port too and test it.
> (Meanwhile I use the posix osdTthread variant also for RTEMS5).
>
> If "we" are already there, the following call has been causing problems
> for a long time.
>
> - pthreadInfo = init_threadInfo("_main_",0,epicsThreadGetStackSize(epicsThreadStackSmall),0,0);
> + pthreadInfo = init_threadInfo("_main_",0,0,0);
>
> So that the "ur-task" is started, during POSIX_Init. But as non Epics-task it gets the OSI-prio "0".
> Shouldn't something like IocShPrio 91 be set here?
>
> Unfortunately I can't set it in rtems_init with epicsThreadSetPriority because it is a
> non-epics-task.
>
> But in some tests we read the epicsPrio from this thread (and you get it back as 0) and then e.g. in the ringPointerTest.c we create some threads with the prio 0 and 1 ;-)
> Then they hang ...
Sure, the fallback to '0' works under linux where the main thread uses
SCHED_OTHER. Under RTEMS, however,
the 'main' thread is IIRC a 'normal' SCHED_FIFO thread which has a
priority higher than 0 and thus the true
ordering of priorities is different from what it seems.

The aforementioned changes may allow you to fix this, but IMHO,
regardless or any decision (re. removal of the 'if ! isEpicsThread')
it might be better to change the test so that it uses epicsThreads only
(by spawning separate threads for producer and consumer) because
ultimately that is the only guarantee that the two threads use the same
scheduling policy (in a portable way).

- Till
> "ringPointerTest.c" line 171 of 248
>
> static void testPair(int locked)
> {
> unsigned int myprio = epicsThreadGetPrioritySelf(), consumerprio;
> pairPvt pvt;
>
> ...
> testDiag("single producer, single consumer with%s locking", locked?"":"out");
>
> /* give the consumer thread a slightly higher priority so that
> * it can preempt us on RTOS targets. On non-RTOS targets
> * we expect to be preempted at some random time
> */
> if(!epicsThreadLowestPriorityLevelAbove(myprio, &consumerprio))
> testAbort("Can't run test from thread with highest priority");
>
> epicsThreadMustCreate("pair", consumerprio,
> epicsThreadGetSta...

Read more...

Revision history for this message
Heinz Junkes (junkes) wrote :
Download full text (6.1 KiB)

Yes, I can confirm that with your changes the behavior has now changed and everything works even with the _main_ task
with prio 0. (Tested with RTEMS5)

I also already removed the test for isEpicsThread in epicsThreadSetPriority

 if(!pthreadInfo->isEpicsThread) {
>
> fprintf(stderr,"epicsThreadSetPriority called by non epics
> thread\n");
> return;
> }

in rtems_init.c I set the priority in the epicsThread-Structure:

...
    if (epicsRtemsInitPostSetBootConfigFromNVRAM(&rtems_bsdnet_config) != 0)
        delayedPanic("epicsRtemsInitPostSetBootConfigFromNVRAM");

    #if __RTEMS_MAJOR__>4

    /*
     * Override RTEMS Posix configuration, as it get started with posix prio 2
     */

    epicsThreadSetPriority(epicsThreadGetIdSelf(), epicsThreadPriorityIocsh);

#else
     /*
     * Override RTEMS configuration
     */
    rtems_task_set_priority (
                     RTEMS_SELF,
                     epicsThreadGetOssPriorityValue(epicsThreadPriorityIocsh),
                     &newpri);
#endif

    /*
     * Create a reasonable environment
     */
    initConsole ();

Works great for RTEMS5. Thanks for your changes. I've been sitting there for a long time.
Heinz

> On 21. Feb 2019, at 19:25, Till Straumann <email address hidden> wrote:
>
> With the changes I made, epicsThreadSetPriority() should -- in principle
> -- also work for non-EPICS threads,
> as long as the target thread uses the same scheduling policy as pcommonAttr.
>
> However, for sake of backwards compatibility I did not yet remove the
> test
>
> if(!pthreadInfo->isEpicsThread) {
> fprintf(stderr,"epicsThreadSetPriority called by non epics
> thread\n");
> return;
> }
>
> (but you can find the comment:
>
> /* T.S.: There is in principle no problem with changing the priority of
> * a non-EPICS thread -- as long as it uses the same scheduling policy.
> * For sake of backwards compatibility I leave this test for now but
> * it could as well be removed (and the 'isEpicsThread' member as
> well).
> */
> ).
>
> I.e., as a result of this discussion we could remove the above test and
> you then should be
> able to set the priority of the 'main' thread.
>
> More comments in-line below...
>
> On 2/21/19 9:33 AM, Heinz Junkes wrote:
>> Thanks Till,
>>
>> I'll apply that for my RTESM5 port too and test it.
>> (Meanwhile I use the posix osdTthread variant also for RTEMS5).
>>
>> If "we" are already there, the following call has been causing problems
>> for a long time.
>>
>> - pthreadInfo = init_threadInfo("_main_",0,epicsThreadGetStackSize(epicsThreadStackSmall),0,0);
>> + pthreadInfo = init_threadInfo("_main_",0,0,0);
>>
>> So that the "ur-task" is started, during POSIX_Init. But as non Epics-task it gets the OSI-prio "0".
>> Shouldn't something like IocShPrio 91 be set here?
>>
>> Unfortunately I can't set it in rtems_init with epicsThreadSetPriority because it is a
>> non-epics-task.
>>
>> But in some tests we read the epicsPrio from this thread (and you get it back as 0) and then e.g. in the ringPointerTest.c we create some threads with the prio 0 and 1 ;-)
>> Then they hang ...
> Su...

Read more...

Revision history for this message
Till Straumann (strauman) wrote :
Download full text (6.5 KiB)

On 2/21/19 7:42 PM, Heinz Junkes wrote:
> Yes, I can confirm that with your changes the behavior has now changed and everything works even with the _main_ task
> with prio 0. (Tested with RTEMS5)
>
> I also already removed the test for isEpicsThread in
> epicsThreadSetPriority
Just for the record: IIRC, this is the only place where 'isEpicsThread'
is used. If the
test is removed then we should remove the struct member, too, IMHO.

- Till
>
> if(!pthreadInfo->isEpicsThread) {
>> fprintf(stderr,"epicsThreadSetPriority called by non epics
>> thread\n");
>> return;
>> }
> in rtems_init.c I set the priority in the epicsThread-Structure:
>
> ...
> if (epicsRtemsInitPostSetBootConfigFromNVRAM(&rtems_bsdnet_config) != 0)
> delayedPanic("epicsRtemsInitPostSetBootConfigFromNVRAM");
>
> #if __RTEMS_MAJOR__>4
>
> /*
> * Override RTEMS Posix configuration, as it get started with posix prio 2
> */
>
> epicsThreadSetPriority(epicsThreadGetIdSelf(),
> epicsThreadPriorityIocsh);
>
> #else
> /*
> * Override RTEMS configuration
> */
> rtems_task_set_priority (
> RTEMS_SELF,
> epicsThreadGetOssPriorityValue(epicsThreadPriorityIocsh),
> &newpri);
> #endif
>
> /*
> * Create a reasonable environment
> */
> initConsole ();
> …
>
> Works great for RTEMS5. Thanks for your changes. I've been sitting there for a long time.
> Heinz
>
>
>> On 21. Feb 2019, at 19:25, Till Straumann <email address hidden> wrote:
>>
>> With the changes I made, epicsThreadSetPriority() should -- in principle
>> -- also work for non-EPICS threads,
>> as long as the target thread uses the same scheduling policy as pcommonAttr.
>>
>> However, for sake of backwards compatibility I did not yet remove the
>> test
>>
>> if(!pthreadInfo->isEpicsThread) {
>> fprintf(stderr,"epicsThreadSetPriority called by non epics
>> thread\n");
>> return;
>> }
>>
>> (but you can find the comment:
>>
>> /* T.S.: There is in principle no problem with changing the priority of
>> * a non-EPICS thread -- as long as it uses the same scheduling policy.
>> * For sake of backwards compatibility I leave this test for now but
>> * it could as well be removed (and the 'isEpicsThread' member as
>> well).
>> */
>> ).
>>
>> I.e., as a result of this discussion we could remove the above test and
>> you then should be
>> able to set the priority of the 'main' thread.
>>
>> More comments in-line below...
>>
>> On 2/21/19 9:33 AM, Heinz Junkes wrote:
>>> Thanks Till,
>>>
>>> I'll apply that for my RTESM5 port too and test it.
>>> (Meanwhile I use the posix osdTthread variant also for RTEMS5).
>>>
>>> If "we" are already there, the following call has been causing problems
>>> for a long time.
>>>
>>> - pthreadInfo = init_threadInfo("_main_",0,epicsThreadGetStackSize(epicsThreadStackSmall),0,0);
>>> + pthreadInfo = init_threadInfo("_main_",0,0,0);
>>>
>>> So that the "ur-task" is started, during POSIX_Init. But as non Epics-task it gets the OSI-prio "0".
>>> Shouldn't something like IocShPri...

Read more...

Revision history for this message
Ralph Lange (ralph-lange) wrote :

We are able to use SCHED_FIFO on our fast controllers.
I need to get my hands on such a machine, though. Will keep you posted.

Revision history for this message
Heinz Junkes (junkes) wrote :

I removed isEpicsThread from the struct too.

Put #if defined(_POSIX_THREAD_PRIORITY_SCHEDULING) ... around
the two new functions osi2posixPriority and posix2osiPriority to prevent warnings
if _POSIX_THREAD_PRIORITY_SCHEDULING is not set.

#if defined(_POSIX_THREAD_PRIORITY_SCHEDULING) && _POSIX_THREAD_PRIORITY_SCHEDULING > 0
static int osi2posixPriority(unsigned osiPriority)
{

    double maxPriority,minPriority,slope;
    int oss;

    if (!pcommonAttr->valid)
        return 0; /* assume SCHED_OTHER is in effect */

    if(pcommonAttr->maxPriority==pcommonAttr->minPriority)
        return(pcommonAttr->minPriority);

    maxPriority = (double)pcommonAttr->maxPriority;
    minPriority = (double)pcommonAttr->minPriority;
    slope = (double)(maxPriority - minPriority)/(double)(epicsThreadPriorityMax - epicsThreadPriorityMin);
    oss = (int)round( (double)(osiPriority - epicsThreadPriorityMin) * slope + minPriority );
    if ( oss < pcommonAttr->minPriority )
        oss = pcommonAttr->minPriority;
    if ( oss > pcommonAttr->maxPriority )
        oss = pcommonAttr->maxPriority;
    return oss;
}

static unsigned posix2osiPriority(int posixPriority)
{
double slope, osid;
unsigned osi;

    if ( ! pcommonAttr->valid )
        return epicsThreadPriorityMedium;

    if ( pcommonAttr->maxPriority == pcommonAttr->minPriority )
        return epicsThreadPriorityMedium;

    slope = (double)(epicsThreadPriorityMax - epicsThreadPriorityMin)
            / (double)(pcommonAttr->maxPriority - pcommonAttr->minPriority);

    osid = round( (double)(posixPriority - pcommonAttr->minPriority) * slope + epicsThreadPriorityMin );
    if ( osid < (double) epicsThreadPriorityMin ) {
        /* may be negative! */
        return epicsThreadPriorityMin;
    }
    osi = (unsigned)osid;

    if ( osi > epicsThreadPriorityMax )
        osi = epicsThreadPriorityMax;

    return osi;
}
#endif /* _POSIX_THREAD_PRIORITY_SCHEDULING */

Heinz

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

@Till, What is the status of this patch? Does it need a refresh?

imo. it is too large to be reviewed as a single patch. Could you commit it and start a merge request here on launchpad, or a PR on github? (hint, a github PR will also run our unit tests)

Revision history for this message
Heinz Junkes (junkes) wrote :

I created Till's patch as a separate commit:

https://github.com/epics-base/epics-base/commit/ddaf464de8e899220fed87dc5fd6e2e7af5278e0

Till's patch made it very easy for me. I had to make a lot of changes to the code before to set the priorities correctly, especially in relation to RTEMS priorities (eg. used by nfs tasks)
and nonEpicsTask. The elimination of "nonEpicsTasks" was very helpful.

> On 7. Mar 2019, at 03:42, mdavidsaver <email address hidden> wrote:
>
> @Till, What is the status of this patch? Does it need a refresh?
>
> imo. it is too large to be reviewed as a single patch. Could you commit
> it and start a merge request here on launchpad, or a PR on github?
> (hint, a github PR will also run our unit tests)
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1816841
>
> Title:
> non-epicsThreads may receive wrong priority and crash
>
> Status in EPICS Base:
> New
>
> Bug description:
> The following applies to the POSIX threading implementation.
>
> If an application is launched with enough privilege to use the real-
> time scheduler (SCHED_FIFO) and creates some non-epicsThreads which
> use SCHED_OTHER then a bug is triggered when the non-epicsThread gets
> a 'pseudo-epics context' attached (via 'createImplicit()' e.g, by
> calling epicsThreadSuspendSelf()):
>
> createImplicit() computes a thread priority for the non-epics thread
> assuming SCHED_FIFO -- which is actually wrong. Under some
> circumstances the 'illegal' priority value can lead to a crash.
>
> The attached patch fixes this and cleans up some cruft. It also makes
> sure that whenever 'pcommonAttr' is used the scheduling policy of
> pcommonAttr matches the scheduling policy of the target thread.
>
> The patch also adds a test program which exercises the described
> behavior - however, it is necessary to execute the test with
> sufficient privilege for epicsThreads to use SCHED_FIFO (if the
> privilege is lacking then the test is skipped).
>
> The patch is against R7.0.2.
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/epics-base/+bug/1816841/+subscriptions

Revision history for this message
Till Straumann (strauman) wrote :

@Michael D.: Heinz has already created a commit with my patch applied. He added an #ifdef (see his comment above) and removed the 'isEpicsThread' member (except from os/Linux/osdThread.h from where it should be removed as well). I had suggested this step as it then allows non-Epics threads to user epicsThreadSetPriority().

I'll do this removal shortly and submit a PR on github.

Revision history for this message
Till Straumann (strauman) wrote :
Revision history for this message
Till Straumann (strauman) wrote :

Hi Michael.

So I created a PR on github...

https://github.com/epics-base/epics-base/pull/24

On 3/6/19 6:42 PM, mdavidsaver wrote:
> @Till, What is the status of this patch? Does it need a refresh?
>
> imo. it is too large to be reviewed as a single patch. Could you commit
> it and start a merge request here on launchpad, or a PR on github?
> (hint, a github PR will also run our unit tests)
>

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.