random errors on aarch64 when executing __aarch64_cas8_acq_rel

Bug #1883268 reported by Christophe Lyon
This bug affects 1 person
Affects: QEMU
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

Hello,

Since upgrading to qemu-5.0, I've noticed random failures of
g++.dg/ext/sync-4.C when running the GCC testsuite.

I'm attaching the source of the testcase, the binary executable, and the qemu traces (huge, 111MB!) starting at main (captured with qemu-aarch64 -cpu cortex-a57 -R 0 -d in_asm,int,exec,cpu,unimp,guest_errors,nochain).

The traces were generated by a CI build. I built the executable manually, but I expect it to be the same as the one executed by CI.

It seems the problem occurs in f13, which leads to a call to abort().

The preprocessed versions of f13 and t13 are as follows:
static bool f13 (void *p) __attribute__ ((noinline));
static bool f13 (void *p)
{
  return (__sync_bool_compare_and_swap((ditype*)p, 1, 2));
}
static void t13 ()
{
  try {
    f13(0);
  }
  catch (...) {
    return;
  }
  abort();
}
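
For comparison, here is a minimal sketch of what the same builtin does when given a valid pointer. It is not part of the attached testcase: the `ditype` alias below is an assumed 64-bit stand-in for the testcase's type, and `cas_1_to_2`, `CasResult` and `run_cas_twice` are hypothetical names used only for illustration. The testcase itself passes a null pointer on purpose, so it expects a fault rather than either of these outcomes.

```cpp
#include <cstdint>

// Assumption: a 64-bit integer standing in for the testcase's `ditype`.
using ditype = std::int64_t;

// Same shape as f13 above, but intended to be called with a valid pointer.
static bool cas_1_to_2(void *p)
{
    // Atomically: if *p == 1, store 2 and return true;
    // otherwise leave *p unchanged and return false.
    return __sync_bool_compare_and_swap(static_cast<ditype *>(p), 1, 2);
}

// Hypothetical helper: records the outcome of two consecutive attempts.
struct CasResult {
    bool first;
    bool second;
    ditype final_value;
};

static CasResult run_cas_twice()
{
    ditype v = 1;
    bool first = cas_1_to_2(&v);   // v was 1: the swap happens, v becomes 2
    bool second = cas_1_to_2(&v);  // v is now 2: the compare fails, v stays 2
    return {first, second, v};
}
```

Calling `run_cas_twice()` from a small `main` should report one successful swap followed by one failed compare, which is the normal, non-faulting behaviour that f13(0) never reaches.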

In the execution traces, at address 0x00400c9c main calls f13, which in turn calls __aarch64_cas8_acq_rel (at 0x00401084).
__aarch64_cas8_acq_rel returns to f13 (address 0x0040113c), then f13 returns to main (0x0040108c), which then calls abort (0x00400ca0).

I'm not quite sure what's wrong :-(

I've not noticed such random problems with native aarch64 hardware.

Tags: arm testcase
Christophe Lyon (christophe-lyon) wrote :
Christophe Lyon (christophe-lyon) wrote :
  • Binary (23.2 KiB, application/x-msdos-program)
Christophe Lyon (christophe-lyon) wrote :
Alex Bennée (ajbennee)
tags: added: arm testcase
Richard Henderson (rth) wrote :

FWIW, I cannot reproduce the problem on an x86_64 host,
but I can reproduce it on a 32-bit i686 host.

Changed in qemu:
status: New → Confirmed
Richard Henderson (rth) wrote :

There's nothing wrong with the atomic operation, which
makes sense since it's against a NULL pointer. The
problem that I see is in the unwinding -- the catch
never happens and std::terminate gets called.

There must be some sort of 32-bit TCG error though,
because the same binary works on x86_64 host.

The most confusing thing about this test case is that
12 previous throws work correctly, but the 13th fails.

Changed in qemu:
status: Confirmed → In Progress
Christophe Lyon (christophe-lyon) wrote :

Hi Richard,

Thanks for taking a look and confirming that you managed to reproduce the problem.
I forgot to mention that I'm using x86_64 hosts, not i686. I hope there are not two unrelated issues...

Thomas Huth (th-huth) wrote :

The QEMU project is currently moving its bug tracking to another system.
For this we need to know which bugs are still valid and which could be
closed already. Thus we are setting the bug state to "Incomplete" now.

If the bug has already been fixed in the latest upstream version of QEMU,
then please close this ticket as "Fix released".

If it is not fixed yet and you think that this bug report here is still
valid, then you have two options:

1) If you already have an account on gitlab.com, please open a new ticket
for this problem in our new tracker here:

    https://gitlab.com/qemu-project/qemu/-/issues

and then close this ticket here on Launchpad (or let it expire
automatically after 60 days). Please mention the URL of this bug ticket on
Launchpad in the new ticket on GitLab.

2) If you don't have an account on gitlab.com and don't intend to get
one, but still would like to keep this ticket opened, then please switch
the state back to "New" or "Confirmed" within the next 60 days (otherwise
it will get closed as "Expired"). We will then eventually migrate
the ticket automatically to the new system (but you won't be the reporter
of the bug in the new system and thus you won't get notified on changes
anymore).

Thank you and sorry for the inconvenience.

Changed in qemu:
status: In Progress → Incomplete
Christophe Lyon (christophe-lyon) wrote :
Thomas Huth (th-huth) wrote :

Thanks for moving the ticket to gitlab! ... so I'm closing this on Launchpad now.

Changed in qemu:
status: Incomplete → Invalid