DEP8 segfault on arm64

Bug #1843804 reported by Andreas Hasenack
This bug affects 1 person
Affects               Status        Importance  Assigned to  Milestone
glibc (Ubuntu)        Invalid       Undecided   Unassigned
ruby-ferret (Ubuntu)  Fix Released  Low         Unassigned
ruby2.5 (Ubuntu)      Invalid       High        Unassigned

Bug Description

DEP8 tests are segfaulting on arm64:
...Stack trace:
Not available
E
===============================================================================
Error: test_key_used_for_id_field(IndexTest):
  StandardError: Signal occurred at <global.c>:422 in sighandler_crash
  Exiting on signal SIGSEGV (11)
/tmp/autopkgtest.emjxzb/build.gxX/src/test/unit/index/tc_index.rb:202:in `new'
/tmp/autopkgtest.emjxzb/build.gxX/src/test/unit/index/tc_index.rb:202:in `test_key_used_for_id_field'
     199: def test_key_used_for_id_field
     200: fs_path = File.expand_path(File.join(File.dirname(__FILE__), '../../temp/fsdir'))
     201:
  => 202: index = Index.new(:path => fs_path, :key => :my_id, :create => true)
     203: [
     204: {:my_id => "three", :id => "me"},
     205: {:my_id => "one", :field2 => "three"},
===============================================================================

The closest I got to debugging this is the following backtrace:
#0 0x0000ffffbf5a0744 in generic_ivar_get (undef=52, id=85987, obj=<optimized out>) at variable.c:990
        iv_index_tbl = <error reading variable iv_index_tbl (Cannot access memory at address 0x0)>
        index = 187649990798752
        ivtbl = 0xaaaaab0b2da0
        ivtbl = <optimized out>
        iv_index_tbl = <optimized out>
        index = <optimized out>
        ret = <optimized out>
#1 rb_ivar_lookup (obj=<optimized out>, id=id@entry=85987, undef=undef@entry=52) at variable.c:1205
        val = <optimized out>
        ptr = <optimized out>
        iv_index_tbl = <optimized out>
        len = <optimized out>
        index = 187649990798752
#2 0x0000ffffbf5a15e0 in rb_ivar_lookup (undef=52, id=85987, obj=<optimized out>) at variable.c:1184
        val = <optimized out>
        len = <optimized out>
        ptr = <optimized out>
        iv_index_tbl = <optimized out>
        index = <optimized out>
        val = <optimized out>
        ptr = <optimized out>
        iv_index_tbl = <optimized out>
        len = <optimized out>
        index = <optimized out>
#3 rb_ivar_get (obj=<optimized out>, id=85987) at variable.c:1214
        iv = <optimized out>
#4 0x0000ffffbeb1c12c in frb_fsdir_new (argc=<optimized out>, argv=<optimized out>, klass=<optimized out>) at r_store.c:375
        ref_cnt = <optimized out>
        self = 187649991624280
        rpath = 187649996041240
        rcreate = <optimized out>
        store = 0xaaaaab391410
        create = <optimized out>
        verify = <optimized out>

The full stack trace has 85 frames.

The NULL iv_index_tbl comes from:
        st_table *iv_index_tbl = RCLASS_IV_INDEX_TBL(rb_obj_class(obj));

Andreas Hasenack (ahasenack) wrote :

If I change this test to use another directory:

test/unit/index/tc_index.rb:
  def test_key_used_for_id_field
    fs_path = File.expand_path(File.join(File.dirname(__FILE__), '../../temp/fsdir'))
...

Like:
    fs_path = File.expand_path(File.join(File.dirname(__FILE__), '../../temp/fsdir2'))

Then the test passes without a segfault on arm64.
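
A minimal sketch of how the test could be pointed at a directory that no earlier test has touched, to check whether leftover state in the shared temp/fsdir is involved. `with_fresh_index_dir` is a hypothetical helper name, not something in ferret or its test suite; it only relies on Ruby's standard `Dir.mktmpdir`.

```ruby
require "tmpdir"

# Hypothetical helper (not part of ferret): yield the path of a newly
# created, empty directory, then delete it when the block returns.
def with_fresh_index_dir(prefix = "fsdir")
  Dir.mktmpdir(prefix) do |path|
    yield path
  end
end

# Possible use inside the test, replacing the fixed '../../temp/fsdir' path:
#   with_fresh_index_dir do |fs_path|
#     index = Index.new(:path => fs_path, :key => :my_id, :create => true)
#     ...
#   end
```

This only isolates the directory; it says nothing about why the shared path triggers the crash.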

Andreas Hasenack (ahasenack) wrote :

To reproduce the segfault without having to run the full test suite, run this from within the root of the ruby-ferret source package directory:

export RUBYLIB=.

And then:
ruby2.5 debian/ruby-tests.rb

or

gdb ruby2.5
r debian/ruby-tests.rb

Andreas Hasenack (ahasenack) wrote :

The test also passes if I drop this hunk from d/p/disable_load_path_manipulation.patch:

https://salsa.debian.org/ruby-team/ruby-ferret/blob/master/debian/patches/disable_load_path_manipulation.patch#L18

tags: added: update-excuse
Christian Ehrhardt  (paelzer) wrote :

Since this blocks the glibc migration, it also affects glibc (added for tracking).

Changed in glibc (Ubuntu):
status: New → Invalid
Andreas Hasenack (ahasenack) wrote :

Since this blocks the ruby2.5 migration, it also affects ruby2.5 (added for tracking).

Changed in ruby-ferret (Ubuntu):
status: New → Triaged
Changed in ruby2.5 (Ubuntu):
status: New → Triaged
Changed in ruby-ferret (Ubuntu):
importance: Undecided → High
Changed in ruby2.5 (Ubuntu):
importance: Undecided → High
tags: added: server-next
Bryce Harrington (bryce) wrote :

@Lucas, subscribing you to take a look at this. Maybe you can help it make progress upstream?

Lucas Kanashiro (lucaskanashiro) wrote :

In my tests, the attached debdiff fixed the DEP-8 tests on arm64 and the build on armhf. @Andreas, could you please test it on your side?

Andreas Hasenack (ahasenack) wrote :

This seems fine, but I can no longer reproduce the failure, at least not on arm64. I wanted to check in dmesg whether there was also a bus error.

Changed in ruby-ferret (Ubuntu):
assignee: nobody → Lucas Kanashiro (lucaskanashiro)
Lucas Kanashiro (lucaskanashiro) wrote :

Agreed, I had the same problem. This DEP-8 test does not fail on arm64 every time; for me it fails in about 2 out of 5 consecutive runs. That is probably because in some runs we get lucky and the memory access happens to be aligned. With the proposed patch applied, I have not seen any failure in more than 5 consecutive runs.

Andreas Hasenack (ahasenack) wrote :

The error does take a varying number of runs to show up. In the first three attempts it appeared after:
- 13 runs
- 170 runs
- 62 runs

The 4th attempt is still running, and the count is at 309 currently.
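
Counting runs by hand gets tedious, so a small loop can do it. A minimal sketch, assuming the reproduction command from earlier in this report (`ruby2.5 debian/ruby-tests.rb` with `RUBYLIB=.`, run from the root of the source package). `runs_until_failure` and its `max_runs` and `env` parameters are names made up for this sketch.

```ruby
# Run a command repeatedly and report how many runs it took to fail.
# Returns the 1-based number of the first failing run, or nil if the
# command succeeded max_runs times in a row.
def runs_until_failure(cmd, max_runs: 500, env: {})
  1.upto(max_runs) do |run|
    # Kernel#system returns true on exit status 0, nil if the command
    # could not be started, and false otherwise.
    ok = system(env, *cmd, out: File::NULL, err: File::NULL)
    raise ArgumentError, "could not execute #{cmd.first}" if ok.nil?
    return run unless ok
  end
  nil
end

# Example, from the root of the ruby-ferret source package:
#   n = runs_until_failure(%w[ruby2.5 debian/ruby-tests.rb],
#                          env: { "RUBYLIB" => "." })
#   puts(n ? "failed on run #{n}" : "no failure seen")
```

The cap on `max_runs` keeps the loop from spinning forever once a fix is applied and the failure no longer shows up.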

Changed in ruby2.5 (Ubuntu):
status: Triaged → Invalid
Lucas Kanashiro (lucaskanashiro) wrote :

This test has not failed on armhf since Cosmic [1], so it is not a blocker. Moreover, upstream is dead (more than 3 years with no new commits). I submitted the patch I mentioned above and there has been no answer so far [2].

This package has no reverse dependencies. If it turns out to be a problem, we should consider removing it from the archive.

Given the above, I believe this bug no longer qualifies for the server-next queue.

[1] http://autopkgtest.ubuntu.com/packages/ruby-ferret
[2] https://github.com/jkraemer/ferret/pull/17

tags: removed: server-next
Changed in ruby-ferret (Ubuntu):
assignee: Lucas Kanashiro (lucaskanashiro) → nobody
Balint Reczey (rbalint) wrote :

The package started failing the same way on ppc64el in Impish:

https://autopkgtest.ubuntu.com/packages/r/ruby-ferret/impish/ppc64el

Lucas Kanashiro (lucaskanashiro) wrote :

Since this no longer seems to be a blocker, I am lowering the importance to Low.

Changed in ruby-ferret (Ubuntu):
importance: High → Low
Lucas Kanashiro (lucaskanashiro) wrote :

ruby-ferret/0.11.8.7-2 in Jammy is behaving well on arm64:

https://autopkgtest.ubuntu.com/packages/ruby-ferret/jammy/arm64

Changed in ruby-ferret (Ubuntu):
status: Triaged → Fix Released