segfault in libc-2.23.so netinstall installation pxe

Bug #1816846 reported by Rodrigo Fernandez
This bug affects 43 people
Affects: glibc (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

I have implemented a PXE server in Virtualbox 6.0.4 to perform automatic installations of different versions of Linux (particularly Ubuntu).

Until a few hours ago I could perform the automatic installation of Ubuntu Server 16.04.5 on a VM, but after updating libc6-udeb (2.23-0ubuntu11), I get the following error:

kernel: [ 186.927962] sh[8177]: segfault at 18 ip 00007fc21ff438df sp 00007ffe594a5e80 error 4 in libc-2.23.so[7fc21ff23000+1c0000]

I don't know how to fetch the installation log, so I attach a screenshot.
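
For readers unfamiliar with the kernel's segfault message, the log line quoted above can be decoded with a little arithmetic. This is only a sketch using the values from that line; the variable names are illustrative. Subtracting the start of the libc-2.23.so mapping (the first number in the brackets) from the instruction pointer gives the offset inside the library where the fault happened. The fault address 0x18 suggests a read through a near-NULL pointer, and error code 4 denotes a user-mode read of a page that was not mapped.

```shell
#!/bin/sh
# Values copied from the kernel log line in the report.
ip=0x00007fc21ff438df      # instruction pointer at the time of the fault
base=0x7fc21ff23000        # start address of the libc-2.23.so mapping

# Offset of the faulting instruction inside libc-2.23.so.
offset=$(printf '0x%x' $(($ip - $base)))
echo "fault offset in libc-2.23.so: $offset"
```

Assuming matching debug symbols for this exact libc build are available, an offset like this could be looked up with a tool such as addr2line; nobody in this report has done so.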

Revision history for this message
Rodrigo Fernandez (hipoloco) wrote :
Radek 'blufor' Slavicinsky (blufor) wrote :

I can verify this while doing an amd64 netboot and subsequent preseed install of a VM on 3 separate KVM hosts. It was also working for me several hours ago.

Radek 'blufor' Slavicinsky (blufor) wrote :
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in glibc (Ubuntu):
status: New → Confirmed
Johan Zaxmy (johan-zaxmy) wrote :

Same error as reporter, on several machines. Happens during manual installation from netinstall image (PXE booted).

Updating kernel and initrd.gz to xenial-proposed hwe-netboot 20101020ubuntu451.28 solves the problem.

Nanowish (nanowish) wrote :

Same issue here since yesterday afternoon.

summary: - segfault in libc-2.23.so during automatic installation
+ segfault in libc-2.23.so netboot installation
summary: - segfault in libc-2.23.so netboot installation
+ segfault in libc-2.23.so netinstall installation pxe
Rodrigo Fernandez (hipoloco) wrote :

I solved the problem with the solution proposed by johan-zaxmy.

Vladislav Khlopeniuk (vladhk) wrote :

The kernel and initrd.gz from xenial-proposed 20101020ubuntu451.28 work fine. Thank you, Johan Zaxmy, for the suggestion.

Vili Kosednar (safesize) wrote :

netboot from xenial-proposed 20101020ubuntu451.28 works!
Thanks Johan!

Mr Robot (meltedrobot) wrote :

Just tested Johan's workaround, and it works for us.

Thanks

Andrew Grasso (andrew-grasso) wrote :

This appears to be fixed in the latest netboot installer published today (2019-02-21 13:49 UTC).

http://archive.ubuntu.com/ubuntu/dists/xenial-updates/main/installer-amd64/current/images/netboot/

Our tests on installers before that were seeing the segfaults, but everything seems to be working again using the latest installer.

Turbo Fredriksson (turbo-bayour) wrote :

Mine's still not working. It downloads

http://archive.ubuntu.com/ubuntu//pool/main/g/glibc/libc6-udeb_2.23-0ubuntu11_amd64.udeb

which was uploaded *2019-02-05 20:38*. Not sure if this is still the same problem, or if it's my (older) installer image that is the problem.

Turbo Fredriksson (turbo-bayour) wrote :

Version 2.23-0ubuntu11 is currently the latest version (xenial-updates). Will there be a new, fixed upload? I'm guessing it's only the udeb that was botched?

Andrew Grasso (andrew-grasso) wrote :

@turbo-bayour

We had to upgrade our installer image to the one released 2019-02-21 at 13:49 UTC to get it working. We had downloaded fresh installers 15 minutes before that and were still seeing segfaults.

I'm not sure exactly what happened that started causing the segfaults, but it must have been a package update. As you note, libc6-udeb was last updated 2 weeks ago, so it doesn't make sense that it would only start causing issues this week. My suspicion is that something that depends on libc6-udeb is actually what was changed and is causing the fault. We spent some time trying to track down the exact root cause, but in the end just updating our installer image was the fastest way to get things working again.

I'd be really interested in a technical description of what went wrong here if anyone knows.

Turbo Fredriksson (turbo-bayour) wrote : Re: [Bug 1816846] Re: segfault in libc-2.23.so netinstall installation pxe

> I'm not sure exactly what happened that started causing the segfaults,
> but it must have been a package update. As you note, libc6-udeb was last
> updated 2 weeks ago, so it doesn't make sense that it would only start
> causing issues this week. My suspicion is that something that depends on
> libc6-udeb is actually what was changed and is causing the fault.

My install image (with a modified boot menu, that loads the preseeding etc)
is about a year old now, so not sure what the correct fix is. Updating our
customised boot image would be a lot of work, so I’d prefer to have it
fixed “correctly” (whatever it is).

We haven’t used the image in several months, so didn’t notice until yesterday.
Just bad timing. But if it started happening just a few days ago (and you’ve been
running it successfully on the new libc6 package), then I agree it can’t be that.

> I'd be really interested in a technical description of what went
> wrong here if anyone knows.

Yeah, me too!

Andrew Grasso (andrew-grasso) wrote :

> My install image (with a modified boot menu, that loads the preseeding etc)
> is about a year old now, so not sure what the correct fix is. Updating our
> customised boot image would be a lot of work, so I’d prefer to have it
> fixed “correctly” (whatever it is).

I agree, and that was my initial reaction, but we were able to just drop in the new kernel and initrd over top of our previous ones, and our menu configs and preseeds came forward without any tweaks. It depends on whether you've built a custom initrd or modified the kernel, but it might be worth a try since I don't know if a backward-compatible fix is on the way. (linux and initrd.gz for netinstall are available here: http://archive.ubuntu.com/ubuntu/dists/xenial-updates/main/installer-amd64/current/images/netboot/ubuntu-installer/amd64/)
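
For anyone wanting to script the drop-in replacement described above, here is a minimal sketch. It assumes the PXE server keeps the installer files under /srv/tftp/ubuntu-installer/amd64 (a hypothetical path, override it with TFTP_DIR) and that wget is installed; the function name refresh_netboot is illustrative. By default the script only prints what it would do. With APPLY=1 it downloads the files and keeps timestamped backups of the old ones.

```shell
#!/bin/sh
# Sketch: replace the netboot kernel and initrd with the xenial-updates builds.
# TFTP_DIR is an assumption; point it at wherever your PXE server keeps them.
BASE_URL="http://archive.ubuntu.com/ubuntu/dists/xenial-updates/main/installer-amd64/current/images/netboot/ubuntu-installer/amd64"
TFTP_DIR="${TFTP_DIR:-/srv/tftp/ubuntu-installer/amd64}"

refresh_netboot() {
    stamp=$(date +%Y%m%d%H%M%S)
    for f in linux initrd.gz; do
        # Download next to the target first so a failed transfer never
        # leaves a truncated boot file in place.
        wget -q -O "$TFTP_DIR/$f.new" "$BASE_URL/$f" || return 1
        # Keep the old file so the change is easy to roll back.
        if [ -f "$TFTP_DIR/$f" ]; then
            cp -p "$TFTP_DIR/$f" "$TFTP_DIR/$f.bak-$stamp" || return 1
        fi
        mv "$TFTP_DIR/$f.new" "$TFTP_DIR/$f" || return 1
    done
}

# Print the plan by default; run with APPLY=1 to actually download.
if [ "${APPLY:-0}" = "1" ]; then
    refresh_netboot
else
    for f in linux initrd.gz; do
        echo "would fetch $BASE_URL/$f -> $TFTP_DIR/$f"
    done
fi
```

Menu configs and preseed files are left untouched, which matches the experience reported above; a custom initrd would need to be rebuilt separately.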

> We haven’t used the image in several months, so didn’t notice until yesterday.
> Just bad timing. But if it started happening just a few days ago (and you’ve been
> running it successfully on the new libc6 package), then I agree it can’t be that.

Our netinstall image was about 6 months old, and we use it almost every day. We know it was working as recently as 2019-02-19 and first saw issues with it shortly before 22:00 UTC on 2019-02-20, around the time this bug report was opened.

Turbo Fredriksson (turbo-bayour) wrote :

> we were able to just drop in the new kernel and initrd over top of our previous
> ones and our menu configs and preseeds came forward without any tweaks.

When you put it like that, that sounds fairly simple and non-intrusive. I’m also
considering the impact as a whole - any new machines I create will be ever so
slightly different than the others. Probably doesn’t matter, but at the moment
we’re not in a super rush.

I’ll give it till Monday morning to see if the correct fix will arrive, otherwise I’ll
replace the kernel and initrd image.

> We know it was working as recently as 2019-02-19 and first saw issues with it
> shortly before 22:00 UTC on 2019-02-20.

That’s a very specific time frame. Should be easy enough for the Ubuntu devs
to try to pinpoint exactly what happened in the repo and from that trace the
fault.

Mykola Mogylenko (mmogylenko) wrote :

We are facing exactly the same issue. We run PXE installations on a daily basis as part of our CI/CD process. They started failing on Feb 21st.

Jesse Rhoads (jrhoads80) wrote :

Yes, this problem _started_ on the 21st. I am seeing it when we attempt netboot preseed installs of Xenial. Bionic, using the identical preseed, is working fine.

Michael Gregg (mgregg7) wrote :

This bug is still confirmed and active.

Other bugs are coming in that are dupes of this:
https://bugs.launchpad.net/ubuntu/+source/debian-installer/+bug/1817358

Jeremias Wong (jelly.noreply) wrote :

I’m also experiencing this issue.

Might be a bad daily build for 16.04.

Got this message when PXE-booting the Ubuntu 16.04 netinst:
Loading libc6-udeb failed for unknown reasons. Aborting.

Looking at the logs, there’s a segfault in libc-2.23.so, right after downloading ubuntu/pool/main/g/glibc/libc6-udeb_2.23-0ubuntu11_amd64.udeb

MD5 (libc6-udeb_2.23-0ubuntu11_amd64.udeb) = 75d6702f728f2f9fc01c248455d1b8e9

Jeremias Wong (jelly.noreply) wrote :

Thanks @johan-zaxmy for the proposed hotfix using the kernel and initrd from the xenial-proposed suite.

Michael Hudson-Doyle (mwhudson) wrote :

I'm not an expert in this bit of the installer, but it seems to me that the problem is that installation using older kernel/initrd is what is failing. If you download the initrd/kernel from http://archive.ubuntu.com/ubuntu/dists/xenial-updates/main/installer-amd64/current/images/netboot/ubuntu-installer/amd64/ today, it works. Can someone who is experiencing the problem post the SHA1 of their initrd/kernel?

Laurent Dumont (baconpackets) wrote :

What were the chances that I needed a netboot of 16.04 today...

Looks like I am facing the same issue with Foreman + Proxmox + Netboot. Is there a way to force the netboot from xenial-updates to test the fix?

mdes@hp.com (mdes) wrote :

I see this problem with the following initrd/kernel (I'm unable to trace the provenance of these files to their upstream sources):

$ sha1sum linux initrd.gz
6b11ddcba0f9a8a8dd1322e8c21ca185a7ac068e linux
8bf4a099d764b8b1482c590fd9081610079c4378 initrd.gz

I can also confirm that switching to use initrd/kernel from http://archive.ubuntu.com/ubuntu/dists/xenial-updates/main/installer-amd64/current/images/netboot/ubuntu-installer/amd64/, as mwhudson suggests above, works for me.

Thanks for the pointer.
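
To check whether a local kernel or initrd is the pair reported as failing above, something along these lines may help. It is a sketch only: the two checksums are the ones posted in this comment, the function name check_netboot_file is illustrative, and a result of "unlisted" says nothing about whether a file works, since only one failing pair has been posted here.

```shell
#!/bin/sh
# SHA1 sums of the kernel/initrd pair reported as failing in this thread.
BAD_LINUX=6b11ddcba0f9a8a8dd1322e8c21ca185a7ac068e
BAD_INITRD=8bf4a099d764b8b1482c590fd9081610079c4378

# Print "known-bad" if FILE hashes to one of the sums above, else "unlisted".
check_netboot_file() {
    [ -r "$1" ] || return 1
    sum=$(sha1sum "$1" | cut -d' ' -f1)
    case "$sum" in
        "$BAD_LINUX"|"$BAD_INITRD") echo "known-bad" ;;
        *) echo "unlisted" ;;
    esac
}
```

Usage would be, for example, `check_netboot_file linux` and `check_netboot_file initrd.gz` from the directory that holds the netboot files.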

Adam Conrad (adconrad) wrote :

When glibc was *uploaded* isn't meaningful here; what matters is when it was released to -updates, which was the 20th, and that is when people started seeing this issue. Unfortunately, the NSS ABI is easily perturbed and seemingly unrelated changes can break it.

The "right" fix would be to attempt to re-exec any long-running process under d-i after updating libc6-udeb (or just refuse to update libc6 at all in the installer environment), but that's not likely to happen in xenial.

The easier fix on events like this is, indeed, to use the netboot images built with the newer glibc. Expecting old netboot images to work forever is a nice pipe dream, but has never actually been realistic in practice.

TL;DR: Update to the latest xenial-updates netboot images, and the problem will resolve itself.
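
As I read the explanation above, the trouble comes from processes that were started against the old libc and keep running after libc6-udeb is replaced underneath them. On an ordinary Linux system (not necessarily inside d-i) that condition can be spotted by looking for processes that still map a libc file that has since been replaced on disk. The sketch below assumes a /proc filesystem and that the old library file was unlinked during the upgrade, in which case the kernel tags the mapping "(deleted)". The function names are illustrative, and nothing in this thread confirms that such a check was run in the installer.

```shell
#!/bin/sh
# Succeeds if the given maps file (same format as /proc/PID/maps) shows a
# libc mapping whose backing file has been removed or replaced on disk.
maps_has_stale_libc() {
    grep -Eq '/libc[-.][^ ]* \(deleted\)$' "$1" 2>/dev/null
}

# List the PIDs of running processes that still use a replaced libc.
list_stale_libc_pids() {
    for maps in /proc/[0-9]*/maps; do
        if maps_has_stale_libc "$maps"; then
            pid=${maps#/proc/}
            echo "${pid%/maps}"
        fi
    done
}
```

Processes listed by `list_stale_libc_pids` are the ones that would need a restart (or re-exec) to pick up the new library.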

Danny Berg (danny-berg) wrote :

Confirming the issue.
Using initrd.gz and linux from http://archive.ubuntu.com/ubuntu/dists/xenial-updates/main/installer-amd64/current/images/netboot/ubuntu-installer/amd64/, as suggested above, works.

Trent Lloyd (lathiat) wrote :

Can we get confirmation that this won't be fixed in terms of fixing older images, and that the correct solution (as suggested by Adam Conrad) is to update to the latest 'xenial-updates' netboot image?

And if so, change the bug status from Confirmed to something else (not sure if Fix Released or Won't Fix is appropriate, given the general issue is using the old netboot images)

From a usability standpoint, I briefly considered whether people are likely to download netboot images from the "xenial" pocket, which will no longer work, rather than the "xenial-updates" pocket. That seems likely if someone navigates to the archive manually. However, most of the documentation I found with a quick Google search for "Ubuntu Netboot" gives the download location as http://cdimage.ubuntu.com/netboot/xenial/, which does point to the xenial-updates pocket.

Adam Conrad (adconrad) wrote :

Fixing older images is literally impossible. If this bug were reported before the fix already existed, the fix would have been to build and publish a new netboot to -updates. Given that a new (and working) one already lives there, job done. The reason I haven't closed it yet (though, I should) is because it's much harder to find closed bugs, and I'd rather people see this one than file duplicates. At least for now.

Reinhard (reinhard-fink) wrote :

Question:
If images in
http://archive.ubuntu.com/ubuntu/dists/xenial/main/installer-amd64/current/images/netboot/ubuntu-installer/amd64/
are broken for nearly all of us, with about 30 postings in 2 threads, what is the reason to place the new images NOT in ...dists/xenial/main... ?
Or to give us at least a README.md in ...dists/xenial/main... ?
Thanks!

Nathaniel Priddy (ngpriddy) wrote :

I'm currently using Foreman to provision machines. I've updated the image to pull from xenial-updates, but now the installer is unable to install the base system (the latest linux-headers-4.4.0-143). Specifically, it is unable to find linux-headers-signed-generic. I did not have this issue before using xenial-updates. I'm just pointing it to us.archive.ubuntu.com/ubuntu. I'm not sure where to get help with this, but I am specifically trying to use the xenial-updates netboot install because the bug was introduced with plain xenial. Any help is greatly appreciated.

Derek Gibson (dgibson.cim) wrote :

Confirming the same issue as Nathaniel Priddy.

Using updated initrd/kernel from http://archive.ubuntu.com/ubuntu/dists/xenial-updates/main/installer-amd64/current/images/netboot/ubuntu-installer/amd64/

I get E: Unable to locate package linux-headers-signed-generic

Also using Foreman to provision.

Nathaniel Priddy (ngpriddy) wrote :

It looks like it might be related to the fact that there isn't a linux-signed-image-4.4.0-143-generic available. I was able to force it to use the 142 image by setting:
d-i base-installer/kernel/image string linux-image-4.4.0-142-generic

I could set the parameter preseed-kernel-image to linux-image-4.4.0-142-generic for this to be applied automatically. I was then able to fully image using xenial-updates.
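
For setups that edit the preseed file directly rather than through a provisioning parameter, the pin above could be added with a small helper like the one below. This is a sketch: the function name pin_kernel is illustrative, and it deliberately does nothing if the file already sets base-installer/kernel/image, so an existing choice is never overridden silently.

```shell
#!/bin/sh
# Kernel pin taken from the comment above.
PIN='d-i base-installer/kernel/image string linux-image-4.4.0-142-generic'

# Append the pin to a preseed file unless a kernel/image line already exists.
pin_kernel() {
    if grep -q '^d-i[[:space:]]\{1,\}base-installer/kernel/image[[:space:]]' "$1"; then
        echo "kernel image already set in $1"
    else
        printf '%s\n' "$PIN" >> "$1"
    fi
}
```

It would be called as `pin_kernel /path/to/preseed.cfg` (path hypothetical). The pin is a workaround for the missing 143 signed image, so it presumably should be removed again once that is no longer needed.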

Derekk (xnaveira) wrote :

I am having the same issues as above using netboot via pixiecore.

Derekk (xnaveira) wrote :

I tried Nathaniel Priddy's solution and it works for me. Thanks N!

Hopefully a fix will be released soon.

Christian Tucker (smph3) wrote :

Nathaniel Priddy's solution worked for me too - adding

d-i base-installer/kernel/image string linux-image-4.4.0-142-generic

to the preseed file just after the keyboard entry. However, I now have a new issue.

When GRUB loads it lists Ubuntu, Windows, and Ubuntu 16.4.0.

If I select Ubuntu it loads with no problem; if I select Ubuntu 16.4 it crashes when loading.

I think this is because Ubuntu is expecting a different kernel version. The problem is I have no idea how to fix that!
