Version 5.2.0 causing Illegal instruction

Bug #2059910 reported by Michael Brunnbauer
This bug affects 7 people
Affects: lxml
Status: Fix Released
Importance: Medium
Assigned to: scoder
Milestone: 5.2.2

Bug Description

hi all,

Version 5.2.0 causes an "Illegal instruction" crash for me on a Xeon E5603, while version 5.1.1 works. I guess this is due to raising the minimum CPU architecture in 5.2.0, which I would strongly recommend rolling back.

Regards,

Michael Brunnbauer

Paulo Peres Jr (r4t40) wrote :

I'm having the same issue here. Four of my customers run my software on older processors, and it's causing the same problem for them.

scoder (scoder) wrote :

Ok, looks like 15-year-old systems are still in active use with up-to-date software.

lxml had targeted Intel "core2" for the last five years; that architecture was already 13 years old when I chose it back in 2019. For 5.2.0 I switched the target to "sandybridge" to enable AVX instructions, but these reports indicate that at least "nehalem" still needs to be supported, which is just one step up from "core2". I'll try that for 5.2.1, to get SSE4.2, which is on par with AMD's Bulldozer chips from 2011+.
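As a rough illustration of how these targets stack up, here is a minimal Python sketch. It assumes each target only adds extensions on top of the previous one, and it lists only the SSE and AVX extensions discussed in this report (the GCC documentation for -march gives the complete sets). The names TARGET_ADDS and newest_usable_target are made up for this example.

```python
# Extensions that each GCC -march target adds on top of the previous one.
# Deliberately trimmed to the SSE/AVX extensions discussed in this report.
TARGET_ADDS = [
    ("core2", {"SSE", "SSE2", "SSE3", "SSSE3"}),
    ("nehalem", {"SSE4.1", "SSE4.2"}),
    ("sandybridge", {"AVX"}),
]


def newest_usable_target(cpu_features):
    """Return the newest target whose cumulative requirements are met, or None."""
    features = set(cpu_features)
    required = set()
    usable = None
    for target, added in TARGET_ADDS:
        required |= added          # requirements accumulate from target to target
        if not required <= features:
            break                  # a missing extension rules out this and later targets
        usable = target
    return usable
```

A CPU that reports everything up to SSE4.2 but no AVX would resolve to "nehalem" here, which matches the situation described in the first reports: code built for "sandybridge" may use AVX and crash with "Illegal instruction" on such a machine.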

Changed in lxml:
assignee: nobody → scoder (scoder)
status: New → Confirmed
milestone: none → 5.2.1
Daniel Garcia Briseno (dgarciabriseno) wrote :

It's not just 15 year old systems.
This is also an issue on Macs with Apple silicon that need to run x86 programs.
Apparently the emulation doesn't support AVX instructions.

Daniel Garcia Briseno (dgarciabriseno) wrote :

My case is an edge case though. I know the Mac version works fine.
I have to deal with an old program that I only have an x86 binary for, which I run in an x86 Docker container that happens to also need lxml.

scoder (scoder) wrote :

> Apparently the emulation doesn't support AVX instructions.

A web search confirms that, but it's still unclear to me whether Apple's Rosetta-2 emulation supports SSE4.2. It might not. I found hints that it supports SSE2, but that's not what I'm after.

Can anyone get the "CPUID" flags from the emulation and post them here? There's supposed to be a "sysctl" command for this, but I can't tell whether it works in emulation mode:

    sysctl -a | grep machdep.cpu.features

https://stackoverflow.com/questions/6121792/how-to-check-if-a-cpu-supports-the-sse3-instruction-set

On Linux, there's the "cpuid" command or "lscpu", or you can read from "/proc/cpuinfo".
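For the Linux case, a minimal Python sketch of reading the flags from "/proc/cpuinfo" could look like this. It assumes an x86 kernel, where the relevant line is named "flags" and the extensions appear as "sse4_1", "sse4_2" and "avx"; the function names are made up for this example.

```python
from pathlib import Path


def parse_cpuinfo_flags(cpuinfo_text):
    """Return the flags of the first "flags" line found in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no "flags" line, e.g. on a non-x86 kernel


def missing_flags(cpuinfo_text, wanted=("sse4_1", "sse4_2", "avx")):
    """Return the names from `wanted` that the CPU does not report."""
    flags = parse_cpuinfo_flags(cpuinfo_text)
    return [name for name in wanted if name not in flags]


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():  # only present on Linux
        print("missing:", missing_flags(cpuinfo.read_text()) or "none")
```

Only the first "flags" line is read, on the assumption that all cores report the same set.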

Cédric Krier (cedk) wrote :

Some runners of our CI are still using Ivy Bridge processors.
Would it not be better to use the same minimal target as CPython and let users who want more optimizations compile the package from source with their own CFLAGS instead of using the wheel?

scoder (scoder) wrote :

Ivy Bridge supports SSE4.2.

After the initial report, I changed the options to this:

https://github.com/lxml/lxml/blob/a1b9c66891ca3a3ae6db01274a18bb1cc45cece1/pyproject.toml

    [[tool.cibuildwheel.overrides]]
    select = "*linux_i686"
    environment.CFLAGS = "-O3 -g1 -pipe -fPIC -flto -march=core2 -mtune=generic"

    [[tool.cibuildwheel.overrides]]
    select = "*linux_x86_64"
    environment.CFLAGS = "-O3 -g1 -pipe -fPIC -flto -march=core2 -msse4.1 -msse4.2 -mtune=generic"

Vasily (vasachi) wrote :

Here's what is shown on my M1 under Rosetta:

    $ sysctl -a | grep machdep.cpu.features
    machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTSE64 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1 SSE4.2 AES SEGLIM64

So it looks like SSE4.2 is supported.
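The same check can be done mechanically. The following Python sketch parses "sysctl" output of the form shown above; the function name parse_sysctl_features is made up for this example, and it assumes the features are listed space-separated after the "machdep.cpu.features:" key.

```python
def parse_sysctl_features(output):
    """Collect the feature names from "machdep.cpu.features: ..." lines."""
    features = set()
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "machdep.cpu.features":
            features.update(value.split())
    return features


# The line posted above, as reported under Rosetta on an M1.
ROSETTA_OUTPUT = (
    "machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR "
    "PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 "
    "PCLMULQDQ DTSE64 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1 SSE4.2 "
    "AES SEGLIM64"
)

features = parse_sysctl_features(ROSETTA_OUTPUT)
has_sse42 = "SSE4.2" in features
has_avx = any(name.startswith("AVX") for name in features)
```

For the posted line, has_sse42 comes out true and has_avx false, which is consistent with the AVX-related crashes reported for 5.2.0 under emulation.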

scoder (scoder) wrote :

> Here's what is shown on my M1 under rosetta

Cool, thanks! Then I think "core2" + SSE4.2 is a reasonable choice.
The reports so far were all for AVX instructions, not for SSE.

scoder (scoder) wrote :

Fix released in lxml 5.2.1.

Changed in lxml:
importance: Undecided → Medium
status: Confirmed → Fix Released
hhesse (hhesse) wrote :

Hi there,

Not to add more unneeded noise to this, but FYI we also still ran into this issue with 5.2.1, since our CI runner is using an AMD Opteron 6174 (from 2010) which seems to only support SSE4a, and not SSE4.2.

As a workaround we have set PIP_NO_BINARY=lxml in our pipelines to build the module ourselves, which seems to work fine. This does extend the runtime of our pipeline by ~7 mins which is a slight nuisance, but we can live with it if we need to. We realize our server is probably a statistical outlier :)
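A minimal Python sketch of that workaround, for pipelines that drive pip from a script, might look like the following. PIP_NO_BINARY is the setting named above; the function name source_build_invocation and the optional cflags parameter are made up for this example, and whether the build picks up CFLAGS from the environment is an assumption about the local toolchain. Building from source would presumably also need a C compiler and the libxml2/libxslt development headers.

```python
import os
import sys


def source_build_invocation(package="lxml", cflags=None):
    """Return (command, environment) for installing `package` from source with pip."""
    env = dict(os.environ)
    env["PIP_NO_BINARY"] = package   # tell pip not to use a prebuilt wheel for it
    if cflags is not None:
        env["CFLAGS"] = cflags       # assumption: the build honours CFLAGS
    command = [sys.executable, "-m", "pip", "install", package]
    return command, env
```

Passing the result to subprocess.run(command, env=env, check=True) would start the build on the machine that will run the code, so the compiler targets whatever that CPU supports by default.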

Martin (martin22) wrote :

I confirm that the issue persists with 5.2.1 on my machine (Intel Atom CPU N2800 with SSE and SSE2).

Trenton Holmes (stumpylog) wrote :

This issue also affects our users, particularly those using NAS hardware from Synology. One example is the Intel Atom D2700, which only supports instruction sets up to SSE3 according to Ark.

Given we don't control the hardware in this instance, rebuilding lxml isn't an option. Is there no way to detect instruction support and enable/disable as needed?

c0mputerguru (c0mputerguru) wrote :

I too am running hardware that is 10+ years old and still runs up-to-date software. Similar to hhesse, I've got a CPU that supports SSE4a and SSE2, but not SSE4.1 or SSE4.2 (AMD Phenom II X6 1090T).

I hit this via my use of certbot to automate creating/renewing SSL certificates (https://discourse.linuxserver.io/t/certbot-illegal-instruction/8832), which for me is a core component of my network and functionality that I don't feel comfortable leaving unpatched. I've also got other containers that use lxml; I don't like that they are also unpatched, but that is less of an immediate concern.

I'd rather not set up a separate build pipeline for all the Docker images I use that depend on lxml, so my options are either to upgrade hardware or for lxml to go back to supporting my CPU. Since my hardware currently meets my compute needs, I'd rather not upgrade it. However, I understand that supporting 10+ year old hardware isn't necessarily at the top of the requirements list for most folks.

Selfishly I'd ask that older instruction sets be supported, but if there's a good reason why the minimum required instruction set was changed, knowing it would help me sleep better at night.

What would make sense to me would be for lxml to match the supported instruction sets of Cython (https://github.com/cython/cython/blob/dbb4e6a0e36a6c190d67d6e829db9164503be5d4/pyproject.toml#L14), but I noticed that scoder is the maintainer of that as well, so maybe he's planning on changing Cython's minimum requirements too.

Edit: I forgot to mention, a big thank you to folks like scoder who maintain these core libraries that keep compute running around the world. I know it's not easy, but know that there are people that see the work you do and appreciate your tireless efforts.

scoder (scoder) wrote :

Ok, given the importance of lxml for all sorts of use cases and as a common (transitive) dependency, I'll set the CFLAGS back to the conservative "core2" that we had for years. I'll release a 5.2.2 soon.

scoder (scoder) wrote :

Changed back to plain "core2", without the additional SSE4.1/SSE4.2 flags, in the lxml 5.2.2 wheels.

Changed in lxml:
milestone: 5.2.1 → 5.2.2
Martin (martin22) wrote :

I confirm it works again for me with 5.2.2. Thanks a lot @scoder
