Linux 6.8 fails to boot on ARM64 if any param is more than 146 chars

Bug #2069534 reported by Shantur Rathore
This bug affects 7 people
Affects                  Status        Importance  Assigned to      Milestone
linux (Ubuntu)           Fix Released  Undecided   Unassigned
  Jammy                  Invalid       Undecided   Unassigned
  Noble                  Fix Released  High        Matthew Ruffell
linux-hwe-6.8 (Ubuntu)   Invalid       Undecided   Unassigned
  Jammy                  Fix Released  High        Stefan Bader
  Noble                  Invalid       Undecided   Unassigned

Bug Description

BugLink: https://bugs.launchpad.net/bugs/2069534

[Impact]

The Linux 6.8 kernel fails to boot on ARM64 when any Linux command line param is 146 characters or longer.

This most notably affects MAAS deployments, as MAAS generates very long command line parameters for ARM64, e.g.:

nomodeset root=squash:http://10.254.131.130:5248/images/3b08252fa962c37a47d890fb5fe182b631a0c0478d758bf4573efa859cc2c548/ubuntu/arm64/ga-24.04/noble/stable/squashfs ip=::::sjc01-2b16-u07-mgx01b:BOOTIF ip6=off cc:\{'datasource_list': ['MAAS']\}end_cc cloud-config-url=http://10-254-131-128--25.maas-internal:5248/MAAS/metadata/latest/by-id/de6dn3/?op=get_preseed ro overlayroot=tmpfs overlayroot_cfgdisk=disabled log_host=10.254.131.130 log_port=5247 --- BOOTIF=01-${net_default_mac}

This was introduced in 6.8-rc1 by:

commit dc3f5aae06381b43bc9d0d416bd15ee1682940e9
Author: Ard Biesheuvel <email address hidden>
Date: Wed Nov 29 12:16:12 2023 +0100
Subject: arm64: idreg-override: Avoid parameq() and parameqn()
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dc3f5aae06381b43bc9d0d416bd15ee1682940e9

There is no workaround other than keeping every command line parameter shorter than 146 characters, which is not tenable for MAAS users.
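
For anyone checking whether an existing command line is affected, here is a minimal sketch; it is not part of the fix. It assumes parameters are separated by whitespace only, and long_params is a hypothetical helper name. The default limit of 146 comes from the analysis in the comments of this report.

```shell
#!/bin/sh
# Hypothetical helper: print "<length> <token>" for every whitespace-separated
# token of a command line whose length is at or above a limit (default 146).
long_params() {
    lp_limit=${2:-146}
    set -f                      # no filename expansion while word-splitting
    for lp_tok in $1; do
        if [ "${#lp_tok}" -ge "$lp_limit" ]; then
            printf '%s %s\n' "${#lp_tok}" "$lp_tok"
        fi
    done
    set +f
}

# Example on a running system:
#   long_params "$(cat /proc/cmdline)"
```

An empty result suggests no single parameter reaches the limit; any line printed is a candidate for shortening.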

[Fix]

The fix arrived in a major refactor of early ARM64 init, in which early init moved from assembly to the "pi" mini C runtime. The specific commit that fixed the issue is:

commit e223a449125571daa62debd8249fa4fc2da0a961
Author: Ard Biesheuvel <email address hidden>
Date: Wed Feb 14 13:28:50 2024 +0100
Subject: arm64: idreg-override: Move to early mini C runtime
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e223a449125571daa62debd8249fa4fc2da0a961

However, this needs a lot of dependencies, nearly all of the "mini C runtime" commits in the merge commit below:

commit 6d75c6f40a03c97e1ecd683ae54e249abb9d922b
Merge: fe46a7dd189e 1ef21fcd6a50
Author: Linus Torvalds <email address hidden>
Date: Thu Mar 14 15:35:42 2024 -0700
Subject: Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6d75c6f40a03c97e1ecd683ae54e249abb9d922b

That amount of code is generally unacceptable for an SRU due to regression risk. I also don't think that reverting "arm64: idreg-override: Avoid parameq() and parameqn()" is the right solution.

Thankfully, Tj did some debugging of the root cause in comment #20 [1], and found the issue occurs because of memcmp() in include/linux/fortify-string.h detecting an attempted out-of-bounds read when comparing buf and aliases[i].alias.

That triggers the fortified memcmp()'s:

    if (p_size < size || q_size < size)
        fortify_panic(__func__);

where q_size == 146, size == 147, and it crashes the kernel.

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2069534/comments/20
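
The arithmetic can be modelled outside the kernel. The sketch below is only an illustration in shell, not kernel code. would_panic is a hypothetical name, the two lengths are the FTR_ALIAS_* values quoted in the comments of this report, and P_SIZE=256 is an assumption based on the 255-character per-parameter limit mentioned there.

```shell
#!/bin/sh
# Values quoted in the comments of this report.
FTR_ALIAS_NAME_LEN=30
FTR_ALIAS_OPTION_LEN=116
Q_SIZE=$((FTR_ALIAS_NAME_LEN + FTR_ALIAS_OPTION_LEN))   # one aliases[] element
P_SIZE=256   # assumption: size of the parser's local buffer

# Hypothetical model of the quoted condition: succeeds (exit status 0) when
# the fortified memcmp() would call fortify_panic() for a parameter of the
# given length.
would_panic() {
    wp_size=$(($1 + 1))          # memcmp() is called with len + 1
    [ "$P_SIZE" -lt "$wp_size" ] || [ "$Q_SIZE" -lt "$wp_size" ]
}

for len in 144 145 146 147; do
    if would_panic "$len"; then
        echo "len=$len: panic"
    else
        echo "len=$len: ok"
    fi
done
```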

I know SAUCE patches are to be avoided if possible, but Tj's solution is minimal and fixes the root cause without the regression risk of backporting the entire mini C runtime, so I suggest we go with Tj's patch.

commit a4c616d2156c9c4cf7c91e6983c8bf0d51985df1
Author: Tj <email address hidden>
Date: Fri Jul 26 13:48:44 2024 +0000
Subject: UBUNTU: SAUCE: arm64: v6.8: cmdline param >= 146 chars kills kernel
Link: https://lore.kernel.org/stable/JsQ4W_o2R1NfPFTCCJjjksPED-8TuWGr796GMNeUMAdCh<email address hidden>/T/#u

[Testcase]

1) Deploy an ARM64 VM or use a bare metal ARM64 board with Noble, running 6.8.
2) Edit /boot/grub/grub.cfg and add the following param to any boot entry with Linux 6.8:

testparam=f081c381e7b54edcba27e5f790d47911a4cc3e726d8d256878d3df9175c020e0f081c381e7b54edcba27e5f790d47911a4cc3e726d8d256878d3df9175c020e0f081c381e7b5732f126a62b4232

3) Reboot the machine and select the boot entry in grub with the testparam as above.
4) Observe that the kernel never boots.
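
To test other lengths around the boundary, a parameter of an exact total length can be generated rather than typed. This is a rough sketch: make_param is a hypothetical helper, and the filler is a run of 'a' characters rather than the hex string used above. The length counts the whole token, including the name and the '='.

```shell
#!/bin/sh
# Hypothetical helper: emit "<name>=<filler>" whose total length, including
# the name and the '=', is exactly the requested number of characters.
make_param() {
    mp_name=$1
    mp_total=$2
    mp_fill=$((mp_total - ${#mp_name} - 1))
    [ "$mp_fill" -ge 1 ] || return 1      # not enough room for any value
    printf '%s=' "$mp_name"
    printf "%${mp_fill}s" '' | tr ' ' 'a' # pad with spaces, then turn them into 'a'
    printf '\n'
}

make_param testparam 146
```

Calling it with 145 and 146 gives one parameter on each side of the boundary described in this report.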

[Where problems could occur]

We are changing command line parsing on ARM64 systems so that memcmp() is only called against an aliased entry when the parameter being parsed has the same length as that entry. This should not change functionality at all.
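
The shape of that guard can be sketched as follows. This is a shell model of the idea only, not the patch itself: alias_matches is a hypothetical name and the alias strings in the loop are illustrative stand-ins.

```shell
#!/bin/sh
# Hypothetical model: compare a parsed parameter against an alias only when
# both have the same length, so an over-long parameter is never compared.
alias_matches() {
    am_buf=$1
    am_alias=$2
    [ "${#am_buf}" -eq "${#am_alias}" ] || return 1   # guard: skip the compare
    [ "$am_buf" = "$am_alias" ]
}

# Illustrative alias names; treat them as placeholders.
for a in nokaslr arm64.nosve; do
    if alias_matches "nokaslr" "$a"; then
        echo "matched alias: $a"
    fi
done
```

Because a string that equals an alias necessarily has the alias's length, the guard cannot reject a parameter that would otherwise have matched.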

If a regression were to occur, then command line parsing on ARM64 systems could be broken, and it could lead to early boot failures, likely caught on automated kernel tests.

[Other Info]

This fix is 6.8 specific. It is already fixed upstream by the mini C runtime in 6.9 and later. This patch is for noble only.

Changed in linux (Ubuntu):
status: New → Fix Released
Shantur Rathore (rathore4u) wrote :

Hi @Matthew

Thanks for fixing the bug.
Can you please let me know where I can get the fixed kernel release from?

Thanks

Matthew Ruffell (mruffell) wrote :

Hi Shantur,

It is not fixed, I just marked it as fixed for "Ubuntu" a.k.a the development release oracular, since it will pick up a 6.10+ kernel when it eventually becomes available.

I added a noble entry, since noble's kernel is the one that actually needs to be fixed.

I did have a look at "arm64: idreg-override: Move to early mini C runtime", but it fails to cherry pick on noble's kernel with quite a few conflicts, and I haven't had time yet to go looking for dependencies.

Thanks,
Matthew

Shantur Rathore (rathore4u) wrote :

Hi Matthew,

Thanks for the update.
Please find my comments below

> I added a noble entry, since noble's kernel is the one that actually needs to be fixed.

We would need to add one entry for the next proposed hwe 6.8 for jammy

> I did have a look at "arm64: idreg-override: Move to early mini C runtime", but it fails to cherry pick on noble's kernel with quite a few conflicts, and I haven't had time yet to go looking for dependencies.

I believe that will be quite a bit to backport for 6.8, maybe the solution would be to revert the commit

commit dc3f5aae06381b43bc9d0d416bd15ee1682940e9
Author: Ard Biesheuvel <email address hidden>
Date: Wed Nov 29 12:16:12 2023 +0100

    arm64: idreg-override: Avoid parameq() and parameqn()

Thanks,

Shantur Rathore (rathore4u) wrote :

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu Noble):
status: New → Confirmed
Chris MacNaughton (chris.macnaughton) wrote :

Hey Shantur,

Are you uploading that patch to Noble? I'm bumping into this issue at the moment as well.

Thanks!

Shantur Rathore (rathore4u) wrote :

Shantur Rathore (rathore4u) wrote :

Hi Chris,

I have added the Merge proposal but not sure how to bring this to developers' attention.
May I ask which device did you face this issue on and are you using MAAS ?

Thanks

Chris MacNaughton (chris.macnaughton) wrote :

Shantur: I'm using MAAS and encountering this on ARM64 hardware.

here's hoping we can find somebody to engage to get this merged and uploaded!

Shantur Rathore (rathore4u) wrote :

I have a fix for MAAS too, which I am trying to get merged.

https://bugs.launchpad.net/maas/+bug/2069059

Chris MacNaughton (chris.macnaughton) wrote :

At the request of iam_tj in the support:ubuntu.com Matrix room, this is the command line where I'm hitting this:

nomodeset root=squash:http://10.254.131.130:5248/images/3b08252fa962c37a47d890fb5fe182b631a0c0478d758bf4573efa859cc2c548/ubuntu/arm64/ga-24.04/noble/stable/squashfs ip=::::sjc01-2b16-u07-mgx01b:BOOTIF ip6=off cc:\{'datasource_list': ['MAAS']\}end_cc cloud-config-url=http://10-254-131-128--25.maas-internal:5248/MAAS/metadata/latest/by-id/de6dn3/?op=get_preseed ro overlayroot=tmpfs overlayroot_cfgdisk=disabled log_host=10.254.131.130 log_port=5247 --- BOOTIF=01-${net_default_mac}

TJ (tj) wrote :

This was brought to my attention by Chris. Looking at the code in commit dc3f5aae0638, parsing of an individual parameter will terminate prematurely if it is more than 255 characters, and it will not be recognised as expected, with the remaining characters being parsed as an additional parameter.
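
The splitting behaviour described above can be illustrated with a small model. This is a sketch only, based on the description in this comment rather than on the kernel source; split_token is a hypothetical name.

```shell
#!/bin/sh
BUF_CHARS=255   # per-parameter limit quoted in the comment above

# Hypothetical model: print the pieces a single over-long token would be
# parsed as, one per line, each at most BUF_CHARS characters long.
split_token() {
    st_rest=$1
    while [ -n "$st_rest" ]; do
        printf '%s\n' "$st_rest" | cut -c "1-$BUF_CHARS"
        st_rest=$(printf '%s\n' "$st_rest" | cut -c "$((BUF_CHARS + 1))-")
    done
}

# Example: a 300-character token comes out as two pieces.
split_token "$(printf '%0300d' 0)" | while read -r piece; do
    echo "piece of ${#piece} characters"
done
```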

For aarch64 the default command-line length is 2048 characters.

Having seen some examples of actual command-lines, I'm not yet convinced this commit is the cause. I'm currently building a reproducer to test some ideas. One such idea: looking at the Ubuntu 6.8 git commits, there's a patch from upstream that fixes a command-line overflow:

commit 4e38935f02fa0 "init/main.c: Fix potential static_command_line memory overflow" ( upstream commit 46dad3c1e57897)

We really need to see a complete kernel log capture using options "earlyprintk debug" to see at what stage it breaks (and what the kernel reports as the cmdline, and what exact kernel version it is). Currently I'm not convinced it's the kernel failing here, but rather the initialramfs processing, since most of the kernel command-line arguments shown are not kernel parameters at all; root= is, but the 'squash:' type prefix isn't handled by the kernel's init/ code.

I haven't looked at MAAS but the command lines indicate it may be adding scripts into the initialramfs that read the command-line.

Shantur Rathore (rathore4u) wrote :

@TJ

I share your disbelief that led me to checking this 3 times.

I came to this conclusion by git bisecting the mainline kernel 3 times between different tags. It took me a good 2 weeks to get there, but I could be wrong.

To confirm this I tested this fix with mainline kernel with 24.04 and 22.04 ( 6.8 HWE Next ).
Compiled mainline 6.8

Added a test param to command line with grub

testparam=f081c381e7b54edcba27e5f790d47911a4cc3e726d8d256878d3df9175c020e0f081c381e7b54edcba27e5f790d47911a4cc3e726d8d256878d3df9175c020e0f081c381e7b5

Kernel fails to boot.

Recompiled mainline 6.8 with dc3f5aae0638 reverted

Kernel boots fine.

-- WTH --- Agreed.

TJ (tj) wrote :

I can reproduce the issue in a QEMU aarch64 virtual machine with both Ubuntu and Debian v6.8* kernels booting in EFI mode via GRUB, but here the length of the parameter that triggers it is 146 characters; eg:

linux /boot/vmlinuz-6.8.12-arm64 root=uuid=06bc9a44-31ef-41b1-bfe1-e6383822dccd ro console=earlycon efi=debug debug earlyprintk param146=ni4ohneo0oothieyeef9vo4ieth4yeiz6ohsiemae6aoy2asu9xei5eethoh0igaitha7laeghoothaeph9xai7kier3aib7aejaengahghan2zojeebai3kad9meesh6eichaey2

And the kernel hangs in the EFI libstub just after exiting boot services.

I'm instrumenting the code to get to the root cause.

TJ (tj) wrote :

I think the cause here is staring us in the face; we have a do { ... } while(1) loop wrapping this code. The only escapes are the 2 return statements. If neither triggers it'll spin, which is what the symptoms seem to indicate.

1) "-- " as a parameter
2) !len (in other words, len==0)

In the first call to __parse_cmdline() bool parse_aliases == true, so the re-entrant call into itself with parse_aliases == false should be done if the memcmp() finds any of the aliases[i].alias in the current option.

Now, in the example command lines I've seen, none of the aliases is present.

It surely cannot be coincidence though that we have:

#define FTR_ALIAS_NAME_LEN 30
#define FTR_ALIAS_OPTION_LEN 116
...
static const struct {
    char alias[FTR_ALIAS_NAME_LEN];
    char feature[FTR_ALIAS_OPTION_LEN];
} aliases[]

that means each element is 146 characters long - the length at which I can trigger the bug.

TJ (tj) wrote :

With pr_debug() added into __parse_cmdline() there are no reports; last message is as always:

EFI stub: Exiting boot services...

Shantur Rathore (rathore4u) wrote :

Have you tried reverting the commit I mentioned to see if that fixes the issue?

TJ (tj) wrote :

My aim is not to revert a commit but to discover what the actual bug is and get it fixed in upstream stable 6.8 tree.

I've isolated it to kernel-only by eliminating both firmware (UEFI) and boot-loader (GRUB) from the equation with:

qemu-system-aarch64 -machine virt,gic-version=3 -cpu max,pauth-impdef=on -smp 2 -m 4096 -nographic -kernel ./vmlinuz-6.8.12-arm64-debug -append "debug param146=ni4ohneo0oothieyeef9vo4ieth4yeiz6ohsiemae6aoy2asu9xei5eethoh0igaitha7laeghoothaeph9xai7kier3aib7aejaengahghan2zojeebai3kad9meesh6eichaey2"

This will hang. Removing one character from param146= so it is actually 145 characters and the kernel will start.

Now I have a minimal reproducer I can attach gdb and debug it.

Shantur Rathore (rathore4u) wrote :

That's awesome. Thanks for sharing the details.

TJ (tj) wrote :

My hunch about the length of struct aliases was correct; when a parameter that is longer than the *entire* aliases struct element (146 characters) is compared the call to memcmp() is redirected to "include/linux/fortify-string.h" [0] where checks are done to ensure there are no out-of-bounds reads.

Because the 'buf' parameter is 146 characters long the call looks like:

memcmp("param146=...", aliases[i].alias, len + 1)

where 'len' is 146 and so 147 gets passed in. That triggers:

 if (p_size < size || q_size < size)
  fortify_panic(__func__);

because 'size' (from 'len + 1') is 147 and q_size is 146 ( size_t q_size = __struct_size(q) )

('p' is `buf`, 'q' is 'aliases[i].alias' )

So, with a guard case to avoid calling memcmp() at all unless the lengths match it works. I'll send the patch to the v6.8 stable tree upstream.

$ qemu-system-aarch64 -machine virt,gic-version=3 -cpu max,pauth-impdef=on -smp 2 -m 4096 -nographic -kernel /srv/NAS/Sunny/SourceCode/builds/linux-aarch64/arch/arm64/boot/Image -append "debug param146=ni4ohneo0oothieyeef9vo4ieth4yeiz6ohsiemae6aoy2asu9xei5eethoh0igaitha7laeghoothaeph9xai7kier3aib7aejaengahghan2zojeebai3kad9meesh6eichaey2"
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x000f0510]
[ 0.000000] Linux version 6.8.12 (<email address hidden>) (aarch64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #10 SMP Fri Jul 26 13:57:53 BST 2024
[ 0.000000] random: crng init done
[ 0.000000] Machine model: linux,dummy-virt
...
[ 0.000000] Kernel command line: debug param146=ni4ohneo0oothieyeef9vo4ieth4yeiz6ohsiemae6aoy2asu9xei5eethoh0igaitha7laeghoothaeph9xai7kier3aib7aejaengahghan2zojeebai3kad9meesh6eichaey2
[ 0.000000] Unknown kernel command line parameters "param146=ni4ohneo0oothieyeef9vo4ieth4yeiz6ohsiemae6aoy2asu9xei5eethoh0igaitha7laeghoothaeph9xai7kier3aib7aejaengahghan2zojeebai3kad9meesh6eichaey2", will be passed to user space.

[0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/fortify-string.h?h=v6.8#n659

Chris MacNaughton (chris.macnaughton) wrote :

In addition to proposing the fix upstream, what's the timeline for inclusion into Ubuntu's 6.8 kernel / Noble?

TJ (tj) wrote :

Upstream stable tree patch submission:

https://lore.kernel.org/stable/JsQ4W_o2R1NfPFTCCJjjksPED-8TuWGr796GMNeUMAdCh<email address hidden>/T/#u

TJ (tj) wrote :

@Chris: no idea - someone in the Canonical kernel team needs to deal with it.

Matthew Ruffell (mruffell) wrote :

Hi TJ,

Thanks for your fix, this looks much, much more palatable than backporting the entire mini C runtime, or reverting the commit that caused this problem.

Now, just as Greg K-H says, 6.8.y is EOL upstream, and is closed to new patches.

We can probably pick this up as a SAUCE patch for Ubuntu though.

I'll build you a test distro kernel with your patch on top for testing, and if it works well, we will submit a SAUCE patch for SRU.

I'll write back with a test kernel soon.

Thanks,
Matthew

Matthew Ruffell (mruffell) wrote :

Hi TJ, Shantur, Chris,

If you wait 3 hours from this message, the kernels will likely be ready. They are building in:

https://launchpad.net/~mruffell/+archive/ubuntu/lp2069534-test

They are 6.8.0-39-generic + your patch. Both for Noble and Jammy HWE.

Test whatever you like.

Please note this package is NOT SUPPORTED by Canonical, and is for TESTING PURPOSES ONLY. ONLY install in a dedicated test environment.

Instructions to install (on a Jammy or Noble system):
1) sudo add-apt-repository ppa:mruffell/lp2069534-test
2) sudo apt update
3) sudo apt install linux-image-unsigned-6.8.0-39-generic linux-modules-6.8.0-39-generic linux-modules-extra-6.8.0-39-generic linux-headers-6.8.0-39-generic
4) sudo reboot
5) uname -rv
Look for +TEST2069534v20240727b1.
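
Step 5 can be scripted so that a wrong kernel is hard to miss. This is just a convenience sketch: is_test_kernel is a hypothetical helper, and the marker string is the one given in step 5.

```shell
#!/bin/sh
MARKER='+TEST2069534v20240727b1'   # marker from step 5 above

# Hypothetical helper: succeed only if a "uname -rv" style string carries
# the test-build marker.
is_test_kernel() {
    case $1 in
        *"$MARKER"*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_test_kernel "$(uname -rv)"; then
    echo "running the test kernel"
else
    echo "NOT the test kernel: $(uname -rv)"
fi
```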

If you get asked to remove the currently running kernel, say no.

Can you boot it with more than 147 characters?

Let me know.

Thanks,
Matthew

TJ (tj) wrote :

I don't use Ubuntu kernels; I build my own mainline, but I work on kernel issues in mainline and Debian. I tested lengths 144..157 (interestingly, although arm64 professes to support 2048 characters, the kernel messages are truncated at around 1000 so they don't show the entire thing).

$ qemu-system-aarch64 -machine virt,gic-version=3 -cpu max,pauth-impdef=on -smp 2 -m 4096 -nographic -kernel /srv/NAS/Sunny/SourceCode/builds/linux-aarch64/arch/arm64/boot/Image -append "debug $( for l in {144..157}; do echo -n param$l=$(pwgen $((l-9)) 1)' '; done )" -initrd rootfs/boot/initrd.img-6.8.12-arm64-debug
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x000f0510]
[ 0.000000] Linux version 6.8.12 (<email address hidden>) (aarch64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #10 SMP Fri Jul 26 13:57:53 BST 2024
[ 0.000000] random: crng init done
[ 0.000000] Machine model: linux,dummy-virt
...
[ 0.000000] Kernel command line: debug param144=aixootoo9ii0dieghaem2eehahqu9Aejeuka3ui8chain9ief3Ooth9lahgeiyiew3dio3PhahpeiShoh1ootoch2rae9quushei9yu4ge5uasi9peizooT8JohjieMuGh7ohs7 param145=noohieL5DaeghaeGh4nueQuugoowohj6fa0Jaive7meaghukoog8tho6De6ahga7sheighah2Raing9eitai3eeHi4Ahr7aixaiLoh3cheeG
hosa9eR4sohkajahwe1tha9aotha param146=taesaekie3Vaiv2Neejohph0ozeile5daemu8beepha9Ojae8niev8nepaidaemu9uphaah4bongeiRahM5eichahTah6aegob8edee2xah6UaxahThee2puePua5ahchoqueixee param147=ahZai6EeW1EejaGh6hen6eu6oov6wo3ooph5theide6OhGh5oog2Iel8oong1ighooboo7ohthoh2le5eeloog1agha0phaid5enaeQuohfoo3EijaemeiZ
9ohG5aichoo9shuiFee param148=ieT6oogat5sheng9aeteigh4poohoongul5za7Eich5Abo2Aeraec0eingah5ahsh4Ooth9Phaithai3gethoo4piphie0zieHieYahngahbiitheingooshau4
chaepee2zeeWei5a param149=eenga3ku8deongie7Oovahsoo7ao1zail8remu2ieshai6haemee2eingoophev6eeY5KeeChiemeu2Eaquuqu8ahk3oohovoh6vaijaexoodeesoetuucie7geeba
h5cad5aikoh6mo param150=oopaiCh0thu1ioneed3apee9igieT7OaWedeemoop4izex9gaeRaequai6aavaephua4ahlooThaiquie4Gu8Eiyo8ohmai1aiye
...
[ 3.045515] Run /init as init process
[ 3.045717] with arguments:
[ 3.045852] /init
[ 3.045962] with environment:
[ 3.046098] HOME=/
[ 3.046206] TERM=linux
[ 3.046320] param144=aixootoo9ii0dieghaem2eehahqu9Aejeuka3ui8chain9ief3Ooth9lahgeiyiew3dio3PhahpeiShoh1ootoch2rae9quushei9yu4ge5uasi9peizooT8Johj
ieMuGh7ohs7
[ 3.046913] param145=noohieL5DaeghaeGh4nueQuugoowohj6fa0Jaive7meaghukoog8tho6De6ahga7sheighah2Raing9eitai3eeHi4Ahr7aixaiLoh3cheeGhosa9eR4sohkajah
we1tha9aotha
[ 3.047481] param146=taesaekie3Vaiv2Neejohph0ozeile5daemu8beepha9Ojae8niev8nepaidaemu9uphaah4bongeiRahM5eichahTah6aegob8edee2xah6UaxahThee2puePua
5ahchoqueixee
[ 3.047963] param147=ahZai6EeW1EejaGh6hen6eu6oov6wo3ooph5theide6OhGh5oog2Iel8oong1ighooboo7ohthoh2le5eeloog1agha0phaid5enaeQuohfoo3EijaemeiZ9ohG5
aichoo9shuiFee
[ 3.048466] param148=ieT6oogat5sheng9aeteigh4poohoongul5za7Eich5Abo2Aeraec0eingah5ahsh4Ooth9Phaithai3gethoo4piphie0zieHieYahngahbiitheingooshau4c
haepee2zeeWei5a
[ 3.048992] param149=eenga3ku8deongie7Oovahsoo7ao1zail8remu2ieshai6haemee2eingoophev6eeY5KeeChiemeu2Eaquuqu8ahk3oohovoh6vaijaexoodeesoetuucie7gee
bah5cad5aikoh6mo
[ 3.049508] param150=oo...


Shantur Rathore (rathore4u) wrote :

Oh dear, I don't have good news.

I installed the kernel on my RockPro64 with the steps mentioned by Matthew, it booted normally without the testparam from the report.

I added the testparam in grub

testparam=f081c381e7b54edcba27e5f790d47911a4cc3e726d8d256878d3df9175c020e0f081c381e7b54edcba27e5f790d47911a4cc3e726d8d256878d3df9175c020e0f081c381e7b5

And it still fails to boot.

UART console log below -

  Booting a command list

Loading Linux 6.8.0-39-generic ...
Loading initial ramdisk ...
EFI stub: Booting Linux Kernel...
EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path
EFI stub: Using DTB from configuration table
EFI stub: Exiting boot services...

Just hangs there.

TJ (tj) wrote :

@Shantur I confirm your results trying to boot that kernel image in QEMU and unfortunately there does not appear to be a debug symbols package generated so I cannot debug it.

I should do more thorough checks first! I had accidentally fetched the amd64 kernel image!! *red faced*

So, this build DOES boot successfully with long parameters with QEMU:

$ qemu-system-aarch64 -machine virt,gic-version=3 -cpu max,pauth-impdef=on -smp 2 -m 4096 -nographic -kernel ./vmlinuz-6.8.0-39-generic -append "debug $( for l in {144..157}; do echo -n param$l=$(pwgen $((l-9)) 1)' '; done )" -initrd rootfs/boot/initrd.img-6.8.12-arm64-debug
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x000f0510]
[ 0.000000] Linux version 6.8.0-39-generic (buildd@bos03-arm64-061) (aarch64-linux-gnu-gcc-13 (Ubuntu 13.2.0-23ubuntu4) 13.2.0, GNU ld (GNU Binutils for Ubuntu) 2.42) #39+TEST2069534v20240727b1-Ubuntu SMP PREEMPT_DYNAMIC Sat Jul 27 (Ubuntu 6.8.0-39.39+TEST2069534v20240727b1-generic 6.8.8)
[ 0.000000] KASLR enabled
[ 0.000000] random: crng init done
[ 0.000000] Machine model: linux,dummy-virt
...
[ 0.000000] Kernel command line: debug param144=voorainu7ohRiewieTh4mahk6thoh5tahheex0aewei5ohCahHa8eefae1Iuw1tei3aethu1aen6Aewoi6omoox0iexiebafieShaidohraozaigi4ohng1iLai4eeniz4Eh0ah param145=eesh4eequaequ2coovie2goa0fohZah3ahToop0Aihail7feeVohgaepi9EiGae7oongoo6queihoh3ukeinahh4io5peikae5iidohqueGh6Er0CaeP6kaWaezothahK2uX1af7 param146=ui2phaej7cho6Daip3ahhoze9noQuie1uzaiquaer9aPhae1TiShaekae7Cooj6an3be4aiQuae2euz7aehueQuoo1eThi9jahc1Ahsuraequ0Al4L
ietheeleso5ji6reich2hai param147=OoRohkei8eex5thair8KeeCh5kea1miepheigh2phahsiqueethoo0eequahs6Fiecoo6HeuQuohte7ooxipoo0seethai9quo7ogoo1caibeiphoh0ahth
8heiPo8nucohg5shewo param148=sheim7muoCaingiegeepeimuokooghab4oXouX3ge8leeN3Roog9eeMah8ahb5iuxaifaig5ahPoh1pa9Aebeech2go5JooN8xawah5ohLi7thee8lei5aiX6oh
kae4giexohcagee2 param149=ojae4cho8eec2kithu6aelai0ieshai6iepeiY7aiTh8Oophub5ahdouWaiguuY6deidecooShaiziew2fiegeCuukooz3AiCiVaid4shaaceixochejoozaiSaiYa
ipeiJ9eelechuj param150=Eipee9ohb7ifohcato7paisho9een3tiev7eexah2iephaey0zah0zahjeeV6Ca7theiPeapae5eengaeshoothaibooS7aesoo9
...
[ 3.083720] Run /init as init process
[ 3.084590] with arguments:
[ 3.085474] /init
[ 3.085815] with environment:
[ 3.086312] HOME=/
[ 3.086631] TERM=linux
[ 3.086959] param144=voorainu7ohRiewieTh4mahk6thoh5tahheex0aewei5ohCahHa8eefae1Iuw1tei3aethu1aen6Aewoi6omoox0iexiebafieShaidohraozaigi4ohng1iLai4
eeniz4Eh0ah
[ 3.088406] param145=eesh4eequaequ2coovie2goa0fohZah3ahToop0Aihail7feeVohgaepi9EiGae7oongoo6queihoh3ukeinahh4io5peikae5iidohqueGh6Er0CaeP6kaWaezo
thahK2uX1af7
[ 3.089842] param146=ui2phaej7cho6Daip3ahhoze9noQuie1uzaiquaer9aPhae1TiShaekae7Cooj6an3be4aiQuae2euz7aehueQuoo1eThi9jahc1Ahsuraequ0Al4Lietheeleso
5ji6reich2hai
[ 3.091349] param147=OoRohkei8eex5thair8KeeCh5kea1miepheigh2phahsiqueethoo0eequahs6Fiecoo6HeuQuohte7ooxipoo0seethai9quo7ogoo1caibeiphoh0ahth8heiP
o8nucohg5shewo
[ 3.092792] param148=sheim7muoCaingiegeepeimuokooghab4oXouX3ge8leeN3Roog9eeMah8ahb5iuxaifaig5ahPoh1pa9Aebeech2go5JooN8xawah5ohLi7thee8lei5aiX6ohk
ae4giexohcagee2
[ 3.094310] ...


TJ (tj) wrote :

@Shantur I also tested with your specific parameter in case it contained something to trigger this but it works fine:

(initramfs) uname -a; cat /proc/cmdline
Linux (none) 6.8.0-39-generic #39+TEST2069534v20240727b1-Ubuntu SMP PREEMPT_DYNAMIC Sat Jul 27 aarch64 GNU/Linux
debug testparam=f081c381e7b54edcba27e5f790d47911a4cc3e726d8d256878d3df9175c020e0f081c381e7b54edcba27e5f790d47911a4cc3e726d8d256878d3df9175c020e0f081c381e7b5

I have a suspicion you've got the wrong kernel! There's a 6.8.0-39 in Noble -updates|-security and I would not be surprised if that is what your system actually booted.

Shantur Rathore (rathore4u) wrote :

@TJ - Thanks for pointing that out.
I was indeed using the wrong kernel version.

With the test kernel it works perfectly.

Does anyone know how to engage someone from Canonical Kernel team to get this merged for Noble?

Thanks

Matthew Ruffell (mruffell) wrote :

Great, thanks for trying the test kernel.

I think the best way forward is to submit TJ's patch; it really is a better solution than backporting the entire mini C runtime or reverting the commit that introduced the problem.

I can write a SRU template and submit it tomorrow.

We need to try to catch the 2024.08.05 SRU cycle as per https://kernel.ubuntu.com/, which closes for patches on the 31st of July, which is really soon.

Chris MacNaughton (chris.macnaughton) wrote :

@mruffell is there anything I can do to help ensure this gets into that cycle? can certainly be one more "this patch works for me" but do let me know if there's more I can do to help!

summary: - linux 6.8 fails to boot on arm64 if any param is more than 140 chars
+ Linux 6.8 fails to boot on ARM64 if any param is more than 146 chars
description: updated
Changed in linux (Ubuntu Noble):
status: Confirmed → In Progress
importance: Undecided → High
assignee: nobody → Matthew Ruffell (mruffell)
tags: added: noble seg
description: updated
Matthew Ruffell (mruffell) wrote :

Hi everyone,

SRU template is written. The patch has been submitted to the Ubuntu Kernel mailing list.

Cover letter:
https://lists.ubuntu.com/archives/kernel-team/2024-July/152495.html
Patch:
https://lists.ubuntu.com/archives/kernel-team/2024-July/152496.html

TJ, I cc'd you in case the kernel team has any questions.

I will go speak to the kernel team now and make sure this makes 2024.08.05.

Thanks,
Matthew

TJ (tj) wrote :

Thanks Matthew. One note: I edited the SRU template to correct my name. It is just "Tj" (pronounced Teej), not the initials T.J.

description: updated
Stefan Bader (smb) wrote :

I will opportunistically pick this up for a HWE-6.8 re-spin I have to do anyway.

Changed in linux-hwe-6.8 (Ubuntu):
status: New → Invalid
Changed in linux (Ubuntu Jammy):
status: New → Invalid
Changed in linux-hwe-6.8 (Ubuntu Noble):
status: New → Invalid
Changed in linux-hwe-6.8 (Ubuntu Jammy):
status: New → In Progress
assignee: nobody → Stefan Bader (smb)
importance: Undecided → High
Stefan Bader (smb)
Changed in linux-hwe-6.8 (Ubuntu Jammy):
status: In Progress → Fix Committed
Stefan Bader (smb)
Changed in linux (Ubuntu Noble):
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-hwe-6.8/6.8.0-40.40~22.04.3 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-hwe-6.8' to 'verification-done-jammy-linux-hwe-6.8'. If the problem still exists, change the tag 'verification-needed-jammy-linux-hwe-6.8' to 'verification-failed-jammy-linux-hwe-6.8'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-hwe-6.8-v2 verification-needed-jammy-linux-hwe-6.8
Matthew Ruffell (mruffell) wrote :

Performing verification for jammy-hwe-6.8

I started two T2A instances on google cloud, which are arm64, with jammy.

One instance has:
6.8.0-39-generic #39~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Jul 10 16:59:11 UTC 2
The other, 6.8.0-40-generic from -proposed:
6.8.0-40-generic #40~22.04.3-Ubuntu SMP PREEMPT_DYNAMIC Tue Jul 30 17:53:10 UTC 2

I edited /etc/default/grub.d/50-cloudimg-settings.cfg and set:

GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0,115200"

to

GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0,115200 testparam=f081c381e7b54edcba27e5f790d47911a4cc3e726d8d256878d3df9175c020e0f081c381e7b54edcba27e5f790d47911a4cc3e726d8d256878d3df9175c020e0f081c381e7b5732f126a62b4232"

ran:

$ sudo update-grub

and rebooted.

Unfortunately, I never saw the 6.8.0-39-generic again.

The 6.8.0-40-generic instance came up just fine:

$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-6.8.0-40-generic root=PARTUUID=17337627-dfbd-4ce7-9f99-4dd1da2542eb ro console=ttyS0,115200 testparam=f081c381e7b54edcba27e5f790d47911a4cc3e726d8d256878d3df9175c020e0f081c381e7b54edcba27e5f790d47911a4cc3e726d8d256878d3df9175c020e0f081c381e7b5732f126a62b4232

The 6.8.0-40-generic in -proposed fixes the issue. Happy to mark verified for jammy-hwe-6.8.
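
For repeating this verification, the config edit can be previewed without touching the real file. This is a sketch under assumptions: append_grub_param is a hypothetical helper, it expects GRUB_CMDLINE_LINUX_DEFAULT to be double-quoted on a single line, and the parameter must not contain '|', '&' or '\' characters. It only prints the modified file to stdout.

```shell
#!/bin/sh
# Hypothetical helper: print the given file with one extra parameter appended
# to the GRUB_CMDLINE_LINUX_DEFAULT value. The input file is not modified.
append_grub_param() {
    sed "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\".*\)\"\$|\1 $2\"|" "$1"
}

# Example (review the output before installing it, then run update-grub):
#   append_grub_param /etc/default/grub.d/50-cloudimg-settings.cfg \
#       "testparam=..." > /tmp/50-cloudimg-settings.cfg.new
```

Writing to a scratch file first keeps a bad edit from leaving the instance unbootable before the parameter itself is even tested.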

tags: added: verification-done-jammy-linux-hwe-6.8
removed: verification-needed-jammy-linux-hwe-6.8
Chris MacNaughton (chris.macnaughton) wrote :

hey mruffell, smb:

This got pulled into Jammy's proposed HWE kernel but I don't see the kernel bot mentioning the Noble version yet; is that going into proposed there or is it following a different path?

Matthew Ruffell (mruffell) wrote :

Hi Chris,

Yes, jammy-hwe-6.8 got fixed because Stefan Bader had to respin the kernel for another regression anyway, so he opportunistically pulled it in.

For Noble, I think it will be part of the s2024.07.08 SRU cycle, as per https://kernel.ubuntu.com/, as Manuel Diewald mentioned when I spoke to him.

Thanks,
Matthew

Chris MacNaughton (chris.macnaughton) wrote : Re: [Bug 2069534] Re: Linux 6.8 fails to boot on ARM64 if any param is more than 146 chars

Thanks for the update, Matthew!

Chris

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/6.8.0-41.41 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-noble-linux' to 'verification-done-noble-linux'. If the problem still exists, change the tag 'verification-needed-noble-linux' to 'verification-failed-noble-linux'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-noble-linux-v2 verification-needed-noble-linux
Matthew Ruffell (mruffell) wrote :

Performing verification for Noble.

I again started two T2A instances on Google Cloud, both running Noble.

One instance has:
6.8.0-39-generic #39-Ubuntu SMP PREEMPT_DYNAMIC Sat Jul 6 02:50:39 UTC 2024
The other, 6.8.0-41-generic from -proposed2:
6.8.0-41-generic #41-Ubuntu SMP PREEMPT_DYNAMIC Fri Aug 2 23:26:06 UTC 2024

I edited /etc/default/grub.d/50-cloudimg-settings.cfg and changed:

GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0,115200"

to

GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0,115200 testparam=f081c381e7b54edcba27e5f790d47911a4cc3e726d8d256878d3df9175c020e0f081c381e7b54edcba27e5f790d47911a4cc3e726d8d256878d3df9175c020e0f081c381e7b5732f126a62b4232"

ran:

$ sudo update-grub

and rebooted.
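
For anyone repeating these steps with a different parameter length, a minimal sketch of a generator follows. The function name make_testparam and the filler character are assumptions made for illustration; the value used in this test was a fixed hex string, not output from this helper.

```shell
# Hypothetical helper: print "testparam=" followed by filler so that the
# whole parameter is exactly $1 characters long.
# If $1 is shorter than the "testparam=" prefix, only the prefix is printed.
make_testparam() {
    total=$1
    prefix="testparam="
    pad=$((total - ${#prefix}))
    awk -v p="$prefix" -v n="$pad" 'BEGIN {
        printf "%s", p
        for (i = 0; i < n; i++) printf "f"
        print ""
    }'
}
```

The output can be pasted into GRUB_CMDLINE_LINUX_DEFAULT after console=ttyS0,115200, followed by the same update-grub and reboot steps.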

As before, the 6.8.0-39-generic instance never came back up.

The 6.8.0-41-generic instance came up just fine:

$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-6.8.0-41-generic root=PARTUUID=e1ce6327-4835-4b2e-b73e-e7d6231d4869 ro console=ttyS0,115200 testparam=f081c381e7b54edcba27e5f790d47911a4cc3e726d8d256878d3df9175c020e0f081c381e7b54edcba27e5f790d47911a4cc3e726d8d256878d3df9175c020e0f081c381e7b5732f126a62b4232

The 6.8.0-41-generic in -proposed2 fixes the issue. Happy to mark verified for Noble.
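
To check how long the longest parameter on a given command line is, something like the following sketch could be used. The function name longest_param_len is an assumption for illustration and not part of any Ubuntu tool. It splits on whitespace only, so a quoted value containing spaces is counted as several words, which may not match how the kernel parses it.

```shell
# Hypothetical helper: print the length of the longest whitespace-separated
# word in the command line string passed as $1 (prints 0 for an empty string).
longest_param_len() {
    printf '%s\n' "$1" | awk '
        { for (i = 1; i <= NF; i++) if (length($i) > max) max = length($i) }
        END { print max + 0 }'
}
```

On a running system it can be called as longest_param_len "$(cat /proc/cmdline)" and the result compared against the 146-character figure from this bug's title.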

tags: added: verification-done-noble-linux
removed: verification-needed-noble-linux
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-hwe-6.8 - 6.8.0-40.40~22.04.3

---------------
linux-hwe-6.8 (6.8.0-40.40~22.04.3) jammy; urgency=medium

  * jammy/linux-hwe-6.8: 6.8.0-40.40~22.04.3 -proposed tracker (LP: #2075181)

  * Packaging resync (LP: #1786013)
    - [Packaging] debian.hwe-6.8/dkms-versions -- update from kernel-versions
      (main/2024.07.08)

  * Linux 6.8 fails to boot on ARM64 if any param is more than 146 chars
    (LP: #2069534)
    - SAUCE: arm64: v6.8: cmdline param >= 146 chars kills kernel

  * revert support for arbitrary symbol length in modversion in hwe kernels
    (LP: #2039010)
    - Revert "UBUNTU: SAUCE: modpost: Replace 0-length array with flex-array
      member"
    - Revert "UBUNTU: SAUCE: allows to enable Rust with modversions"
    - Revert "UBUNTU: SAUCE: modpost: support arbitrary symbol length in
      modversion"

linux-hwe-6.8 (6.8.0-40.40~22.04.2) jammy; urgency=medium

  * jammy/linux-hwe-6.8: 6.8.0-40.40~22.04.2 -proposed tracker (LP: #2073455)

  * net/sched: Fix conntrack use-after-free (LP: #2073092)
    - net/sched: Fix UAF when resolving a clash

linux-hwe-6.8 (6.8.0-40.40~22.04.1) jammy; urgency=medium

  * jammy/linux-hwe-6.8: 6.8.0-40.40~22.04.1 -proposed tracker (LP: #2072200)

  * Packaging resync (LP: #1786013)
    - [Packaging] Include parent config for HWE-6.5
    - [Packaging] update variants

  [ Ubuntu: 6.8.0-40.40 ]

  * noble/linux: 6.8.0-40.40 -proposed tracker (LP: #2072201)
  * FPS of glxgear with fullscreen is too low on MTL platform (LP: #2069380)
    - drm/i915: Bypass LMEMBAR/GTTMMADR for MTL stolen memory access
  * a critical typo in the code managing the ASPM settings for PCI Express
    devices (LP: #2071889)
    - PCI/ASPM: Restore parent state to parent, child state to child
  * [UBUNTU 24.04] IOMMU DMA mode changed in kernel config causes massive
    throughput degradation for PCI-related network workloads (LP: #2071471)
    - [Config] Set IOMMU_DEFAULT_DMA_STRICT=n and IOMMU_DEFAULT_DMA_LAZY=yes for
      s390x
  * UBSAN: array-index-out-of-bounds in
    /build/linux-D15vQj/linux-6.5.0/drivers/md/bcache/bset.c:1098:3
    (LP: #2039368)
    - bcache: fix variable length array abuse in btree_iter
  * Mute/mic LEDs and speaker no function on EliteBook 645/665 G11
    (LP: #2071296)
    - ALSA: hda/realtek: fix mute/micmute LEDs don't work for EliteBook 645/665
      G11.
  * failed to enable IPU6 camera sensor on kernel >= 6.8: ivsc_ace
    intel_vsc-5db76cf6-0a68-4ed6-9b78-0361635e2447: switch camera to host
    failed: -110 (LP: #2067364)
    - mei: vsc: Don't stop/restart mei device during system suspend/resume
    - SAUCE: media: ivsc: csi: don't count privacy on as error
    - SAUCE: media: ivsc: csi: add separate lock for v4l2 control handler
    - SAUCE: media: ivsc: csi: remove privacy status in struct mei_csi
    - SAUCE: mei: vsc: Enhance IVSC chipset stability during warm reboot
    - SAUCE: mei: vsc: Enhance SPI transfer of IVSC rom
    - SAUCE: mei: vsc: Utilize the appropriate byte order swap function
    - SAUCE: mei: vsc: Prevent timeout error with added delay post-firmware
      download
  * failed to probe camera sensor on Dell XPS 9315: ov01a10 i...

Changed in linux-hwe-6.8 (Ubuntu Jammy):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 6.8.0-41.41

---------------
linux (6.8.0-41.41) noble; urgency=medium

  * noble/linux: 6.8.0-41.41 -proposed tracker (LP: #2075611)

  * Packaging resync (LP: #1786013)
    - [Packaging] debian.master/dkms-versions -- update from kernel-versions
      (main/s2024.07.08)

  * md: nvme over tcp with a striped underlying md raid device leads to data
    corruption (LP: #2075110)
    - md/md-bitmap: fix writing non bitmap pages

  * Linux 6.8 fails to boot on ARM64 if any param is more than 146 chars
    (LP: #2069534)
    - SAUCE: arm64: v6.8: cmdline param >= 146 chars kills kernel

  * CVE-2024-39484
    - mmc: davinci: Don't strip remove function when driver is builtin

  * CVE-2024-39292
    - um: Add winch to winch_handlers before registering winch IRQ

 -- Manuel Diewald <email address hidden> Fri, 02 Aug 2024 16:15:19 +0200

Changed in linux (Ubuntu Noble):
status: Fix Committed → Fix Released
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-lowlatency-hwe-6.8/6.8.0-41.41.1~22.04.1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-lowlatency-hwe-6.8' to 'verification-done-jammy-linux-lowlatency-hwe-6.8'. If the problem still exists, change the tag 'verification-needed-jammy-linux-lowlatency-hwe-6.8' to 'verification-failed-jammy-linux-lowlatency-hwe-6.8'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-lowlatency-hwe-6.8-v2 verification-needed-jammy-linux-lowlatency-hwe-6.8
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-gcp-6.8/6.8.0-1014.15~22.04.1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-gcp-6.8' to 'verification-done-jammy-linux-gcp-6.8'. If the problem still exists, change the tag 'verification-needed-jammy-linux-gcp-6.8' to 'verification-failed-jammy-linux-gcp-6.8'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-gcp-6.8-v2 verification-needed-jammy-linux-gcp-6.8