Apache2 crashes with SIGBUS in mod_ssl

Bug #2107254 reported by Alexis Wilke

This bug affects 1 person
Affects            Status      Importance  Assigned to  Milestone
apache2 (Ubuntu)   Incomplete  Undecided   Unassigned

Bug Description

$ lsb_release -rd
No LSB modules are available.
Description: Ubuntu 24.04.2 LTS
Release: 24.04

$ dpkg -l apache2
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-==============-=================-============-=================================
ii apache2 2.4.58-1ubuntu8.6 amd64 Apache HTTP Server

While running, about once a day, I get a SIGBUS from the SSL module.

Here is a partial stack trace:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/apache2 -k start'.
Program terminated with signal SIGBUS, Bus error.
#0 __memcpy_evex_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:265
warning: 265 ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory
(gdb) where
#0 __memcpy_evex_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:265
#1 0x000072c37cab74bf in ?? () from /usr/lib/apache2/modules/mod_ssl.so
#2 0x0000601c3919cb44 in ap_http_header_filter ()
#3 0x000072c37cf74748 in ?? () from /usr/lib/apache2/modules/mod_deflate.so
#4 0x000072c37cf596d2 in ?? () from /usr/lib/apache2/modules/mod_filter.so
#5 0x0000601c3916d8e8 in ?? ()
#6 0x0000601c39170c4a in ap_run_handler ()
#7 0x0000601c391744c6 in ap_invoke_handler ()
#8 0x0000601c3919c0e6 in ap_internal_redirect ()
#9 0x000072c37ceeb781 in ?? () from /usr/lib/apache2/modules/mod_rewrite.so
#10 0x0000601c39170c4a in ap_run_handler ()
#11 0x0000601c391744c6 in ap_invoke_handler ()
#12 0x0000601c3919b378 in ap_process_async_request ()
#13 0x0000601c3919b597 in ap_process_request ()
#14 0x0000601c3919b8fd in ?? ()
#15 0x0000601c3918724a in ap_run_process_connection ()
#16 0x000072c37cf412df in ?? () from /usr/lib/apache2/modules/mod_mpm_prefork.so

Alexis Wilke (alexis-m2osw) wrote :
Renan Rodrigo (renanrodrigo) wrote :

Hello, @alexis-m2osw, thanks for reporting this bug!
Are you aware of exact steps to reproduce this behavior from a fresh Ubuntu install?

Meanwhile, I checked the stack trace, and it seems this memory misalignment issue happens in ssl_io_filter_coalesce (from ssl_engine_io.c).

#0 __memcpy_evex_unaligned_erms ()
    at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:265
#1 0x000072c37cab74bf in memcpy (__len=<optimized out>,
    __src=<optimized out>, __dest=<optimized out>)
    at /usr/include/x86_64-linux-gnu/bits/string_fortified.h:29
#2 ssl_io_filter_coalesce (f=0x72c37ced3830, bb=0x72c37a160f48)
    at /build/apache2-qqIoZi/apache2-2.4.58/modules/ssl/ssl_engine_io.c:1897

What I found interesting is the #1 entry: it mentions `string_fortified.h`. There is this Debian bug from a long time ago:
https://lists.debian.org/debian-apache/2008/06/msg00118.html
It mentions that compiling with -DFORTIFY_SOURCE caused problems in earlier versions. Maybe this is related? Could GCC be wrongly optimizing things? I see this is amd64; which GCC version is installed?

tags: added: server-triage-discuss
Alexis Wilke (alexis-m2osw) wrote :

Well... I'm not sure that my version of gcc would be the same as the one used to compile a libc function?

    $ gcc --version
    gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
    Copyright (C) 2023 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions. There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Maybe the CPU spec would be more useful, although I think that would be part of the attachment?

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 85
model name : Intel(R) Xeon(R) Silver 4216 CPU @ 2.10GHz
stepping : 7
microcode : 0x5003707
cpu MHz : 799.966
cache size : 22528 KB
physical id : 0
siblings : 32
core id : 0
cpu cores : 16
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 22
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts vnmi pku ospke avx512_vnni md_clear flush_l1d arch_capabilities
vmx flags : vnmi preemption_timer posted_intr invvpid ept_x_only ept_ad ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid unrestricted_guest vapic_reg vid ple shadow_vmcs pml ept_violation_ve ept_mode_based_exec tsc_scaling
bugs : spectre_v1 spectre_v2 spec_store_bypass swapgs taa itlb_multihit mmio_stale_data retbleed eibrs_pbrsb gds bhi
bogomips : 4200.00
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

I've seen such bugs where one assumes an alignment X is sufficient when Y is required (where maybe Y = X * 8).

The Debian bug report you linked is from 2008 and they mention Sparc 64, a different type of CPU.

In the Apache logs, I see this:

[Mon Apr 14 07:06:17.459980 2025] [core:notice] [pid 173016] AH00051: child pid 919852 exit signal Bus error (7), possible coredump in /etc/apache2

The other Apache messages just say "loading," "configuring," "initializing," the usual.

Alexis Wilke (alexis-m2osw) wrote :

Oh, I see why you mentioned gcc... This is a builtin which looks like this in my includes:

  __fortify_function void *
  __NTH (memcpy (void *__restrict __dest, const void *__restrict __src,
          size_t __len))
  {
    return __builtin___memcpy_chk (__dest, __src, __len,
       __glibc_objsize0 (__dest));
  }

So not a libc function. That may prove difficult to fix if it's part of the C compiler!

Alexis Wilke (alexis-m2osw) wrote :

The header itself is defined in libc6

$ dpkg -S /usr/include/x86_64-linux-gnu/bits/string_fortified.h
libc6-dev:amd64: /usr/include/x86_64-linux-gnu/bits/string_fortified.h

$ dpkg -l libc6-dev
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-===============-===============-============-=====================================================
ii libc6-dev:amd64 2.39-0ubuntu8.4 amd64 GNU C Library: Development Libraries and Header Files

Renan Rodrigo (renanrodrigo) wrote :

Thanks for the extra information there.
So yes, this is indeed part of libc6. I mentioned the old Debian bug because, although I see it was on a different architecture, the flag that caused this kind of problem back then was also used to build this particular version. I was just wondering what could be causing a similar issue now...

Alexis Wilke (alexis-m2osw) wrote :

I get those crashes around once a day. This makes me think it could be because of a cron job which runs at that frequency. I'll try to find that and see if I can explicitly reproduce.
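One way to check the once-a-day hunch is to pull the crash timestamps out of the error log and compare them with the cron schedules. This is a minimal sketch, assuming the AH00051 notice format shown in the log excerpt above and English day and month names in the timestamps; the function name `crash_times` is only illustrative.

```python
import re
from datetime import datetime

# Matches the parent's notice about a child killed by a bus error, e.g.
# [Mon Apr 14 07:06:17.459980 2025] [core:notice] [pid 173016] AH00051: child pid 919852 exit signal Bus error (7), ...
CRASH_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[core:notice\] \[pid \d+\] "
    r"AH00051: child pid (?P<child>\d+) exit signal Bus error \(7\)"
)


def crash_times(lines):
    """Return a list of (timestamp, child_pid) for every SIGBUS notice."""
    found = []
    for line in lines:
        match = CRASH_RE.match(line)
        if match:
            # Apache's default error log timestamp, with microseconds.
            when = datetime.strptime(match["ts"], "%a %b %d %H:%M:%S.%f %Y")
            found.append((when, int(match["child"])))
    return found
```

Feeding it the lines of the error log and printing `when.time()` for each entry makes any clustering around a fixed time of day easy to spot.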

In the meantime, the very instruction that generates the SIGBUS error is:

    vmovdqu64 (%rsi), %ymm16

yet, in today's crash, the rsi register is set to:

    0x72c37a09b000

So I don't understand the error, unless the memory is not accessible (some buffer overflow type of error rather than an alignment issue), although I thought those would raise SIGSEGV instead.
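On the SIGBUS versus SIGSEGV question: vmovdqu64 is the unaligned form of the move, so it does not require an aligned address. On Linux, touching an address that is not mapped at all normally raises SIGSEGV, whereas touching a page of a file-backed mapping that now lies past the end of the file raises SIGBUS. Whether that is what happens in these crashes is not established; the sketch below only demonstrates the mechanism. It is Linux-only, and the function name is hypothetical.

```python
import mmap
import os
import signal
import tempfile


def signal_after_truncated_read(size=mmap.PAGESIZE):
    """Map a file, truncate it, read the mapping in a child process.

    Returns the number of the signal that killed the child, or None.
    """
    with tempfile.TemporaryFile() as tmp:
        tmp.write(b"x" * size)
        tmp.flush()
        mapping = mmap.mmap(tmp.fileno(), size, access=mmap.ACCESS_READ)
        # Shrink the file underneath the mapping: the mapped page no
        # longer has any file content behind it.
        os.ftruncate(tmp.fileno(), 0)

        pid = os.fork()
        if pid == 0:
            # Child: the first access faults on a page beyond end of file.
            try:
                mapping[0]
            finally:
                os._exit(0)

        _, status = os.waitpid(pid, 0)
        mapping.close()
        return os.WTERMSIG(status) if os.WIFSIGNALED(status) else None


if __name__ == "__main__":
    signum = signal_after_truncated_read()
    print(signal.Signals(signum).name if signum else "child exited normally")
```

If a file served through a memory mapping is being replaced or truncated by a periodic job, the symptom would look similar, so it may be worth checking what that daily job writes to.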

John Chittum (jchittum)
tags: removed: server-triage-discuss
Renan Rodrigo (renanrodrigo) wrote :

So, to understand it better,

- Do you know if that was working fine before and then started failing after an apache2 (+modules, or libc6) upgrade? Do you have a clue when this started happening?

- Are you able to test if a newer version (like 2.4.62-1ubuntu1.1, in Oracular (24.10)) has the same problem?

- Do you think it would be worth rebuilding version 2.4.58-1ubuntu8.6 without the FORTIFY_SOURCE flag (or with it set lower), just to rule out whether this is the cause of the problem? I know that would not be a solution, but maybe a step towards finding the problem.

Alexis Wilke (alexis-m2osw) wrote :

- When did it start?

When I got my websites back up and running on 24.04. I had the new version for about 3 weeks, but I have no clue whether it would happen before that since the websites were not running yet.

So I did not see an upgrade between the time I got the sites running and started to see these crashes.

- Can you test on 24.10?

Well... not really. I only have 24.04 and for my main server, I don't use intermediate versions (too much work). I could install a VM, but then it would probably not be the same and it would make the setup more complicated (i.e. have to forward the Apache requests to the VM and back...) and there is no sure way to see whether it would fail if it were on the main server.

- Compile without FORTIFY_SOURCE?

I would imagine that this way we could at least see whether that fixes the issue. But it would indeed not prove much. It would be cool to know why that very function gets called or rather which URL generates the error. I don't see how I could determine that at the moment. Also the crash did not happen for a couple of days... but I'm not holding my breath.

When recompiling locally, can we generate a package? If so, that would be fine. I could change the version slightly (like add a letter at the end) and install that package.

Alexis Wilke (alexis-m2osw) wrote :

- When did it start?

When I got my websites back up and running on 24.04. I had the new version for about 3 weeks, but I have no clue whether it would happen before that since the websites were not running yet and no crashes occurred.

So I did not see an upgrade between the time I got the sites running and started to see these crashes. However, I installed a few additional modules (see complete list below).

- Can you test on 24.10?

Well... not directly. I only have 24.04 and for my main server, I don't use intermediate versions (too much work). I could install a VM, but then it would probably not be the same and it would make the setup more complicated (i.e. have to forward the Apache requests to the VM and back...)

- Compile without FORTIFY_SOURCE?

I would imagine that this way we could at least see whether that fixes the issue. But it would indeed not prove much. It would be cool to know why that very function gets called or rather which URL generates the error. I don't see how I could determine that at the moment. Also the crash did not happen for a couple of days... but I'm not holding my breath.

When recompiling locally, can we generate a package? If so, that would be fine. I could change the version slightly (like add a letter at the end) and install that package.

Would it be possible to recompile with the version in 24.10? Then we could see whether that crashes (while still running 24.04 as the main OS). Would that make any sense? Or just compile the latest Apache2 source on 24.04 but still apply the available Ubuntu patches?

- New occurrence

It happened again this morning (Sat Apr 19), and this time it's in a different location (crc32_z in libz). However, it's the same signal (SIGBUS). So definitely some memory allocation issue or buffer overflow of some sort. I'll be rebooting, so maybe it will be better after that, although I doubt it.

The instruction was a simple:

    mov 0x20(%rcx), %rbx

And %rcx is 0x7f45126cb000, which looks like the start of a 4K page...
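The page-boundary observation can be checked with a little arithmetic. A minimal sketch, assuming 4 KiB pages (the x86-64 default); the helper name `page_offset` is only illustrative.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, the x86-64 default


def page_offset(addr, page_size=PAGE_SIZE):
    """Return the offset of addr inside its page (0 means page start)."""
    return addr & (page_size - 1)


# Register values reported in the two crashes of this bug report.
for name, addr in (("rsi", 0x72C37A09B000), ("rcx", 0x7F45126CB000)):
    print(f"%{name} = {addr:#x}: offset in page = {page_offset(addr):#x}")
```

Both registers have an offset of 0x0, so in both crashes the faulting access is the first touch of a new page: the vmovdqu64 reads at the very start of the page, and `mov 0x20(%rcx), %rbx` reads 0x20 bytes into it.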

The error.log has this line:

[Sat Apr 19 07:20:34.565773 2025] [core:notice] [pid 1046102] AH00051: child pid 1635807 exit signal Bus error (7), possible coredump in /etc/apache2

Around the same time, I found this line (anonymized URL):

[Sat Apr 19 07:20:30.521369 2025] [ssl:info] [pid 1635807] [client 127.0.0.1:50542] AH01964: Connection to child 14 established (server example.com:443)

In that same file, I see these errors once in a while:

[Sat Apr 19 06:00:59.321317 2025] [ssl:info] [pid 1635807] (70014)End of file found: [client 127.0.0.1:53594] AH01992: SSL library error 6 reading data

I'm afraid my other logs do not include the PID so I could not find anything with actual full URLs (not just a domain).
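Within error.log itself, the PID can at least tie a crash to what that child logged beforehand. A minimal sketch, assuming the `[pid N]` field and the AH00051 notice format shown above; the function name `lines_before_crash` is hypothetical.

```python
import re

PID_RE = re.compile(r"\[pid (\d+)\]")
CHILD_EXIT_RE = re.compile(r"AH00051: child pid (\d+) exit signal")


def lines_before_crash(lines):
    """Map each crashed child pid to the log lines that child wrote itself."""
    lines = list(lines)
    # First pass: collect the pids of children reported as killed by a signal.
    crashed = set()
    for line in lines:
        match = CHILD_EXIT_RE.search(line)
        if match:
            crashed.add(int(match.group(1)))
    # Second pass: group the lines logged by those children, in file order.
    # The AH00051 notice is logged by the parent, so it is not included.
    history = {pid: [] for pid in crashed}
    for line in lines:
        match = PID_RE.search(line)
        if match and int(match.group(1)) in history:
            history[int(match.group(1))].append(line)
    return history
```

For the access logs, mod_log_config's `%P` format string records the PID of the child that served the request; adding it to the LogFormat would let the same PID be matched against full URLs.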

I'm also sending the crash reports with the normal Ubuntu system. Would you need something from me to make it easier to find those reports?

---

List of INSTALLED modules:

$ ls /etc/apache2/mods-available/
access_compat.load authz_user.load dir.load log_debug.load proxy_connect.load session_crypto.load
actions.conf autoindex.conf dump_io.load log_forensic.load proxy_express.load session_dbd.load
acti...


Alexis Wilke (alexis-m2osw) wrote :

Just in case, I checked another one of my servers which runs on Ubuntu 22.04, and there was one crash (just one here!). The crash is similar and happened in the SSL library too. Here is the stack trace:

#0 __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:317
#1 0x00007fec7fea637b in ?? () from /usr/lib/apache2/modules/mod_ssl.so
#2 0x000056465524f112 in ap_http_header_filter ()
#3 0x00007fec805ab629 in ?? () from /usr/lib/apache2/modules/mod_deflate.so
#4 0x00007fec805926ce in ?? () from /usr/lib/apache2/modules/mod_filter.so
#5 0x00005646552216d5 in ?? ()
#6 0x0000564655224d18 in ap_run_handler ()
#7 0x0000564655226c06 in ap_invoke_handler ()
#8 0x000056465524e68e in ap_internal_redirect ()
#9 0x00007fec7fee4f59 in ?? () from /usr/lib/apache2/modules/mod_rewrite.so
#10 0x0000564655224d18 in ap_run_handler ()
#11 0x0000564655226c06 in ap_invoke_handler ()
#12 0x000056465524d8f8 in ap_process_async_request ()
#13 0x000056465524db33 in ap_process_request ()
#14 0x000056465524de77 in ?? ()
#15 0x0000564655239e88 in ap_run_process_connection ()
#16 0x00007fec80552128 in ?? () from /usr/lib/apache2/modules/mod_mpm_prefork.so
#17 0x00007fec805524a1 in ?? () from /usr/lib/apache2/modules/mod_mpm_prefork.so
#18 0x00007fec80552da7 in ?? () from /usr/lib/apache2/modules/mod_mpm_prefork.so
#19 0x00005646552050e8 in ap_run_mpm ()
#20 0x0000564655204609 in main ()

As a side note, that Ubuntu 22.04 crash did not include the `Package: ...` field. But I would imagine that was fixed since it works as expected with Ubuntu 24.04.

Bryce Harrington (bryce) wrote :

Hi alexis-m2osw, thanks for reporting this SIGBUS crash in mod_ssl, and especially for attaching the core dump; that's really useful. This type of error often suggests a problem with memory access. Since it's happening in ssl_io_filter_coalesce and appears intermittently, it's possible a specific SSL traffic pattern or a race condition might be triggering it.

To help us investigate further and get closer to a potential fix, could you provide any additional details that might help reproduce this? For example, are there specific types of SSL certificates, cipher suites, or client behaviors that seem to coincide with the crashes? Additionally, if you can collect Apache logs with LogLevel debug around the time of a crash, that could provide more specific clues about the state of mod_ssl's I/O leading up to the issue. Any information you can provide to help identify a reliable way to reproduce this SIGBUS would be greatly appreciated.

Changed in apache2 (Ubuntu):
status: New → Incomplete