Test t/06190 hangs on arm64

Bug #1860592 reported by Andreas Hasenack
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
gscan2pdf (Debian)
Fix Released
Unknown
gscan2pdf (Ubuntu)
Fix Released
Critical
Unassigned

Bug Description

Test t/06190 hangs during an autopkgtest run on arm64, under xvfb-run, at least in our launchpad testers. The last output is from the previous t/0618_Dialog_Scan_Image_Sane.t one and then nothing, until the test is killed. This was reproduced on an arm64 VM in canonistack:

t/0618_Dialog_Scan_Image_Sane.t ............... ok
autopkgtest [16:37:37]: ERROR: timed out on command

While 0618 is on screen, process listing shows:
  13635 pts/0 S+ 0:00 \_ sudo autopkgtest -o dep8 -U -s -B ./gscan2pdf-2.6.3/ -- null
  13636 pts/0 S+ 0:00 \_ /usr/bin/python3 -u /usr/bin/autopkgtest -o dep8 -U -s -B ./gscan2pdf-2.6.3/ -- null
  13637 pts/0 S+ 0:00 \_ tee /tmp/autopkgtest-fifo.lijj3_tl
  13638 pts/0 S+ 0:00 \_ tee /tmp/autopkgtest-fifo.lijj3_tl -a /dev/stderr
  13639 pts/0 S+ 0:00 \_ cat /tmp/autopkgtest-fifo.lijj3_tl
  13640 pts/0 S+ 0:00 \_ /usr/bin/python3 /usr/bin/autopkgtest-virt-null
  50649 pts/0 S+ 0:00 \_ sh -ec exec 3>&1 >&2; cd /tmp/autopkgtest.ZoJL07/build.4GO/real-tree; DEB_BUILD_OPTIONS="parallel=1 $DEB_BUILD_OPTIONS" dpkg-buildpackage -us -uc -b; dpkg-source --before-build
  50650 pts/0 S+ 0:00 \_ /usr/bin/perl /usr/bin/dpkg-buildpackage -us -uc -b
  50666 pts/0 S+ 0:00 \_ /usr/bin/make -f debian/rules binary
  50667 pts/0 S+ 0:00 \_ /usr/bin/perl /usr/bin/dh binary
  50730 pts/0 S+ 0:00 \_ /usr/bin/make -f debian/rules override_dh_auto_test
  50731 pts/0 S+ 0:00 \_ /bin/sh /usr/bin/xvfb-run -a dh_auto_test
  50741 pts/0 S+ 0:01 \_ Xvfb :99 -screen 0 1280x1024x24 -nolisten tcp -auth /tmp/xvfb-run.M49Z5X/Xauthority
  50744 pts/0 S+ 0:00 \_ /usr/bin/perl /usr/bin/dh_auto_test
  50751 pts/0 S+ 0:00 \_ make -j1 test TEST_VERBOSE=1
  50767 pts/0 S+ 0:00 \_ /usr/bin/perl -MExtUtils::Command::MM -MTest::Harness -e undef *Test::Harness::Switches; test_harness(1, 'blib/lib', 'blib/arch') t/01_NetPBM.t t
  54751 pts/0 Sl+ 0:01 \_ /usr/bin/perl t/06190_Dialog_Scan_Image_Sane.t

strace shows:
root@focal-arm64:~# strace -f -p 54751
strace: Process 54751 attached with 2 threads
[pid 54755] futex(0xffff90000cbc, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 54751] ppoll([{fd=5, events=POLLIN}, {fd=10, events=POLLIN}], 2, NULL, NULL, 0l

Open fds:
root@focal-arm64:~# ll /proc/54751/fd
total 0
dr-x------ 2 root root 0 Jan 22 18:36 ./
dr-xr-xr-x 9 root root 0 Jan 22 18:30 ../
lr-x------ 1 root root 64 Jan 22 18:36 0 -> /dev/null
l-wx------ 1 root root 64 Jan 22 18:36 1 -> 'pipe:[77465]'
lrwx------ 1 root root 64 Jan 22 18:36 10 -> 'socket:[77473]'
l-wx------ 1 root root 64 Jan 22 18:36 2 -> 'pipe:[77466]'
l-wx------ 1 root root 64 Jan 22 18:36 3 -> 'pipe:[74229]'
lr-x------ 1 root root 64 Jan 22 18:36 4 -> /tmp/autopkgtest.ZoJL07/build.4GO/real-tree/t/06190_Dialog_Scan_Image_Sane.t
lrwx------ 1 root root 64 Jan 22 18:36 5 -> 'anon_inode:[eventfd]'
l-wx------ 1 root root 64 Jan 22 18:36 6 -> 'pipe:[77465]'
l-wx------ 1 root root 64 Jan 22 18:36 7 -> 'pipe:[77466]'
l-wx------ 1 root root 64 Jan 22 18:36 8 -> 'pipe:[77465]'
l-wx------ 1 root root 64 Jan 22 18:36 9 -> 'pipe:[77466]'

This is blocking the migration of sane-backends, which in turn is blocking the migration of net-snmp 5.8.

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

To more easily reproduce this, without having to run the whole test suite until you get to test 06190, run:

xvfb-run -a make TEST_FILES=t/06190_Dialog_Scan_Image_Sane.t test

from inside the deb source package directory.

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

I managed to get this to work if I run the test under strace (!).

I also enabled debugging, and have two files I'm going to attach. One where it worked (under strace), and the other one where it stalled.

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Output of this command:

xvfb-run -a strace -f -s 500 -o bla make TEST_FILES=t/06190_Dialog_Scan_Image_Sane.t test > works.txt 2>&1

the strace output I discarded

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Output of this command:

xvfb-run -a make TEST_FILES=t/06190_Dialog_Scan_Image_Sane.t test > fails.txt 2>&1

Changed in gscan2pdf (Debian):
status: Unknown → New
Revision history for this message
Jeffrey Ratcliffe (jeffreyratcliffe) wrote :

I've just downloaded an arm64 image for focal from here:

http://cloud-images.ubuntu.com/focal/20200126/focal-server-cloudimg-arm64.img

and booted it following the instructions here:

https://askubuntu.com/questions/281763/is-there-any-prebuilt-qemu-ubuntu-image32bit-online/1081171#1081171

i.e.

sudo qemu-system-aarch64 \
> -M virt \
> -cpu cortex-a57 \
> -device rtl8139,netdev=net0 \
> -m 4096 \
> -netdev user,id=net0 \
> -nographic \
> -smp 4 \
> -drive "if=none,file=focal-server-cloudimg-arm64.img,id=hd0" \
> -device virtio-blk-device,drive=hd0 \
> -drive "file=user-data.img,format=raw" \
> -pflash flash0.img \
> -pflash flash1.img

Having installed the build dependencies,

xvfb-run -a make TEST_FILES=t/06190_Dialog_Scan_Image_Sane.t test

runs without error.

Which focal image are you using?

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

I used an image from our internal cloud, but it should be the same published out there. I'll give it another run, maybe there was an update. If that still fails, I'll try upgrading a public cloud image/instance.

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

I tried an arm64 focal VM on scaleway, and there the test passed as well :/

So it must be something very particular to our infrastructure.

I think you can mark the debian bug as invalid then.

Changed in gscan2pdf (Debian):
status: New → Fix Released
Revision history for this message
Till Kamppeter (till-kamppeter) wrote :
Changed in gscan2pdf (Ubuntu):
importance: Undecided → Critical
milestone: none → ubuntu-20.04
status: New → Confirmed
Revision history for this message
Gianfranco Costamagna (costamagnagianfranco) wrote :

For some reasons in groovy and the new release, it works now...

Revision history for this message
Brian Murray (brian-murray) wrote :

Comment #5 indicates that 4096M of memory was used for test and I wonder how much memory was used in the test on scaleway mentioned in comment #7.

The Ubuntu autopkgtest infrastructure by default runs on systems with 2G of memory. I ran this on a m1.large system (8G of memory) and the test passed so I've added gscan2pdf to "big_packages" on arm64.

https://git.launchpad.net/autopkgtest-cloud/commit/worker-config-production?id=73c749d95286c0f3394b87aed8e507b5022404ee

Once this is deployed the tests should be retried and I suspect they will pass.

Revision history for this message
Brian Murray (brian-murray) wrote :

gscan2pdf has migrated to the release pocket as the test did end up passing on a system with more memory.

Changed in gscan2pdf (Ubuntu):
status: Confirmed → Fix Released
Revision history for this message
Jeffrey Ratcliffe (jeffreyratcliffe) wrote :

Thanks for debugging this.

I've looked at the test, and I don't know why it needs that much memory.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.