aarch64 clock_gettime with CLOCK_REALTIME_COARSE or CLOCK_MONOTONIC_COARSE fails with SIGBUS or SIGSEGV

Bug #1239109 reported by William Grant
This bug affects 1 person
Affects          Status         Importance   Assigned to   Milestone
libqb (Ubuntu)   Fix Released   Undecided    Unassigned
linux (Ubuntu)   Fix Released   Medium       Unassigned

Bug Description

The aarch64 vDSO __kernel_clock_gettime implementation crashes when clock_gettime is called with CLOCK_MONOTONIC_COARSE or CLOCK_REALTIME_COARSE, with a SIGSEGV or SIGBUS respectively.

In the implementation (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/kernel/vdso/gettimeofday.S#n89) a value other than CLOCK_REALTIME or CLOCK_MONOTONIC branches past the usual "mov x2, x30" which preserves lr for return later. Anything other than CLOCK_REALTIME_COARSE and CLOCK_MONOTONIC_COARSE then branches directly to the svc call, which correctly returns to the caller. But CLOCK_REALTIME_COARSE and CLOCK_MONOTONIC_COARSE execute the special coarse path then fall through to the normal CLOCK_REALTIME/CLOCK_MONOTONIC path, which does a 'ret x2' at the end, despite not having saved x30 to x2 in the _COARSE case. So it ends up setting pc to clk_id, which is either 4 or 5, giving a translation or alignment fault.

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1239109

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Adam Conrad (adconrad) wrote :

This bug doesn't need logs; the description has everything one should need (except maybe a simple reproducer, but that's literally a 1-liner).

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

It sounds like this bug exists in Linus' tree as well as all the stable kernels. Have you also sent a message upstream, or opened an upstream bug report?

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: bot-stop-nagging saucy
tags: added: kernel-da-key
Logan Rosen (logan) wrote :
Tim Gardner (timg-tpi) wrote :

git describe --contains 069b918623e1510e58dacf178905a72c3baa3ae4
v3.14-rc2~12^2~7

Jon Grimm (jgrimm) wrote :

Any idea whether this is really fixed now? I'm asking because libqb appears to carry a workaround patch that could be dropped if so:

"
    - debian/patches/aarch64_no_coarse_clock.patch: Avoid
      CLOCK_REALTIME_COARSE on aarch64 due to a kernel bug.
"

Adding a libqb task to help remember to remove that patch if this bug closes.

Christian Ehrhardt  (paelzer) wrote :

This very very likely was fixed a long time ago, but I wanted to be sure.
So I used this simple test program:

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <time.h>

void timespec_diff(const struct timespec *start, const struct timespec *stop,
                   struct timespec *result)
{
    if ((stop->tv_nsec - start->tv_nsec) < 0) {
        result->tv_sec = stop->tv_sec - start->tv_sec - 1;
        result->tv_nsec = stop->tv_nsec - start->tv_nsec + 1000000000L;
    } else {
        result->tv_sec = stop->tv_sec - start->tv_sec;
        result->tv_nsec = stop->tv_nsec - start->tv_nsec;
    }

    return;
}

void timecheck(clockid_t clkid, int argc, char **argv)
{
    struct timespec start, stop, dur;

    if( clock_gettime( clkid, &start) == -1 ) {
      perror( "clock gettime" );
      exit( EXIT_FAILURE );
    }

    system( argv[1] );

    if( clock_gettime( clkid, &stop) == -1 ) {
      perror( "clock gettime" );
      exit( EXIT_FAILURE );
    }

    timespec_diff(&start, &stop, &dur);
    /* zero-pad the nanoseconds so that e.g. 5 ms prints as 0.005000000 */
    printf( "%4ld.%09ld\n", (long)dur.tv_sec, dur.tv_nsec);
}

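int main( int argc, char **argv )
{
    /* the command to time is mandatory; argv[1] is passed to system() */
    if( argc < 2 ) {
      fprintf( stderr, "usage: %s <command>\n", argv[0] );
      exit( EXIT_FAILURE );
    }

    timecheck(CLOCK_REALTIME, argc, argv);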
    timecheck(CLOCK_REALTIME_COARSE, argc, argv);
    timecheck(CLOCK_MONOTONIC, argc, argv);
    timecheck(CLOCK_MONOTONIC_COARSE, argc, argv);
    timecheck(CLOCK_MONOTONIC_RAW, argc, argv);
    timecheck(CLOCK_BOOTTIME, argc, argv);
    timecheck(CLOCK_PROCESS_CPUTIME_ID, argc, argv);
    timecheck(CLOCK_THREAD_CPUTIME_ID, argc, argv);
    return( EXIT_SUCCESS );
}

$ gcc -Wall -o test test.c
$ ./test "sleep 0.3s"

All working on aarch64.

So we can drop this delta from libqb.
I have used a libqb synced from Debian (without this workaround) in a PPA, with no issues.
We can therefore drop it on the next merge/sync of libqb.
Setting the tasks here to Fix Released (the issue reported in this bug no longer exists).

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Changed in libqb (Ubuntu):
status: New → Fix Released
Christian Ehrhardt  (paelzer) wrote :

Actually, let's close libqb with the sync then (when the delta is dropped).

Changed in libqb (Ubuntu):
status: Fix Released → Triaged
Christian Ehrhardt  (paelzer) wrote :

This bug was fixed in the package libqb - 1.0.3-1

---------------
libqb (1.0.3-1) unstable; urgency=medium

  [ Christoph Berg ]
  * [3e6a1ea] Remove Richard and myself from Uploaders

  [ Ferenc Wágner ]
  * [2dbb472] Update old style gbp.conf section names
  * [c566381] New upstream release (1.0.3) (Closes: #871153, #877562)
  * [2fa9704] Remove upstreamed/obsoleted patches, refresh the Hurd support
    patch
  * [13e68c7] Update symbols file.
    Remove some internal symbols (see c011b12) and add a new one.
  * [dc0438b] New patch: hurd: definition of PATH_MAX must be included
    separately
  * [737a79d] Update Standards-Version to 4.1.3 (no changes required)
  * [1fc3d87] Switch to Debhelper compat level 11
  * [930dba8] Combat test failures with a world-writeable socket directory.
    On Linux systems libqb uses abstract sockets by default, which lack
    access control. However, they aren't available on other platforms.
    The other option is using file system sockets, by default under
    /var/run. This directory is only writable by root, though, which
    makes it inappropriate for unprivileged applications. So use /tmp
    instead.
    See also: https://github.com/ClusterLabs/libqb/issues/294
  * [b7d5dea] New patch: tests: always run the SHM suite, just expect failures
  * [419537a] New patch: hurd: the socket tests are expected to fail
    (Closes: #803777)
  * [ae9b078] Switch gbp dch to verbose changelog entries
  * [75fc9d2] Stop repeating the common description
  * [6e9aa99] Migrate to salsa.debian.org/ha-team
  * [2093569] Whitespace cleanup in debian/changelog
  * [5ad582f] Ship example code in the doc package
  * [c6d7de2] Use secure URI in the Homepage field
  * [0d73506] Modernize watch file, add signature checking
  * [343b790] qb-blackbox makes libqb-dev not co-installable
  * [fe6e555] Lintian does not emit embedded-javascript-library for Doxygen
    anymore
  * [42afbde] New patch: Fix spelling: plaform -> platform

 -- Ferenc Wágner <email address hidden> Wed, 14 Mar 2018 12:42:20 +0100

Changed in libqb (Ubuntu):
status: Triaged → Fix Released