Installation failed when Subiquity installs SSH server on noble

Bug #2056570 reported by Olivier Gayot
This bug affects 3 people
Affects    Status        Importance  Assigned to    Milestone
curtin     Fix Released  Undecided   Olivier Gayot
subiquity  Fix Released  Critical    Olivier Gayot

Bug Description

I'm trying to install Ubuntu Server 24.04 using today's daily + changes related to deb822. I'm using the following curtin revision:

https://git.launchpad.net/~ogayot/curtin/commit/?id=2ac55b7f594c5e73891e04a11ebcf9b1f7ec9e3e

Subiquity fails when installing openssh-server. Unfortunately, the logs do not help much to understand what's going on:

Mar 08 13:39:08 ubuntu-server subiquity_log.1547[11562]: system install failed for ['openssh-server']: Unexpected error while running comma>
Mar 08 13:39:08 ubuntu-server subiquity_log.1547[11562]: Command: ['unshare', '--fork', '--pid', '--mount-proc=/target/proc', '--', 'chroot>
Mar 08 13:39:08 ubuntu-server subiquity_log.1547[11562]: Exit code: 100
Mar 08 13:39:08 ubuntu-server subiquity_log.1547[11562]: Reason: -
Mar 08 13:39:08 ubuntu-server subiquity_log.1547[11562]: Stdout: ''
Mar 08 13:39:08 ubuntu-server subiquity_log.1547[11562]: Stderr: ''

I've added more logs to see what happens (see https://code.launchpad.net/~ogayot/curtin/+git/curtin/+merge/462051) and now I can see that dpkg fails during the postinst script:

Mar 08 14:51:10 ubuntu-server subiquity_log.1547[10961]: Stderr: perl: warning: Setting locale failed.
[...]
Mar 08 14:51:10 ubuntu-server subiquity_log.1547[10961]: Creating config file /etc/ssh/sshd_config with new version
Mar 08 14:51:10 ubuntu-server subiquity_log.1547[10961]: Creating SSH2 RSA key; this may take some time ...
Mar 08 14:51:10 ubuntu-server subiquity_log.1547[10961]: 3072 SHA256:tXx12vlm+iJZZUZzitch0ZdmXdYpmjw2eFG+vBmizWo root@ubuntu-server>
Mar 08 14:51:10 ubuntu-server subiquity_log.1547[10961]: Creating SSH2 ECDSA key; this may take some time ...
Mar 08 14:51:10 ubuntu-server subiquity_log.1547[10961]: 256 SHA256:aTGgNLJcS/gjoXyDbZGGw8Bksjm/ENHOcWwER6hZOYQ root@ubuntu-server >
Mar 08 14:51:10 ubuntu-server subiquity_log.1547[10961]: Creating SSH2 ED25519 key; this may take some time ...
Mar 08 14:51:10 ubuntu-server subiquity_log.1547[10961]: 256 SHA256:MZmITjwhkmfqHyu/U3x68Y9yw48UgJTfLAriavFznv4 root@ubuntu-server >
Mar 08 14:51:10 ubuntu-server subiquity_log.1547[10961]: Failed to connect to bus: No data available
Mar 08 14:51:10 ubuntu-server subiquity_log.1547[10961]: dpkg: error processing package openssh-server (--configure):
Mar 08 14:51:10 ubuntu-server subiquity_log.1547[10961]: installed openssh-server package post-installation script subprocess retu>
Mar 08 14:51:10 ubuntu-server subiquity_log.1547[10961]: Errors were encountered while processing:
Mar 08 14:51:10 ubuntu-server subiquity_log.1547[10961]: openssh-server
Mar 08 14:51:10 ubuntu-server subiquity_log.1547[10961]: E: Sub-process /usr/bin/dpkg returned an error code (1)

After adding set -x to the postinst script, we see that it is the call to systemctl daemon-reload which causes the failure:

        + [ -d /run/systemd/system ]
        + systemctl daemon-reload
        Failed to connect to bus: No data available
        + cleanup
        + [ /tmp/tmp.iebuhpLhg7 ]
        + rm -f /tmp/tmp.iebuhpLhg7
        + [ ]
        dpkg: error processing package openssh-server (--configure):
         installed openssh-server package post-installation script subprocess returned error exit status 1
        Errors were encountered while processing:
         openssh-server
        E: Sub-process /usr/bin/dpkg returned an error code (1)

I think it is a regression introduced by https://code.launchpad.net/~mitchellaugustin/curtin/+git/curtin/+merge/460960 but I have to confirm.

Tags: iso-testing

Olivier Gayot (ogayot)
description: updated
Olivier Gayot (ogayot)
description: updated
Changed in subiquity:
importance: Undecided → High
description: updated
Dan Bungert (dbungert)
Changed in curtin:
status: New → Confirmed
Changed in subiquity:
status: New → Confirmed
Dan Bungert (dbungert) wrote:

Thanks for reporting this. The problem needs to be understood better: is it broader than openssh? If we can't come up with an answer soon, I think we should revert the ischroot changes. If I can only fix one of openssh and the original DKMS problem, it's going to be openssh.

Changed in subiquity:
importance: High → Critical
tags: added: foundations-todo
Olivier Gayot (ogayot) wrote:

Without the --mount-proc option, calling `systemctl daemon-reload` in the chroot prints out "Running in chroot, ignoring command 'daemon-reload'" and then exits with status 0.

With the --mount-proc option, calling `systemctl daemon-reload` in the chroot prints "Failed to connect to bus: No data available" and exits with status 1, which is what makes the postinst script fail.

To determine whether we are running in a chroot, systemd calls fstatat(2) on / and then fstatat(2) on /proc/1/root. It then compares the resulting structures, looking specifically at the inode number, inode type and backing device. If any of these differ, systemd assumes we are in a chroot.
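
The comparison described above can be sketched in Python. This is only a minimal illustration of the heuristic as described in this comment, not systemd's actual implementation. The helper names `same_inode` and `looks_like_chroot` are hypothetical, and the sample values are copied from the stat(1) output in this comment.

```python
import os
import stat
from types import SimpleNamespace


def same_inode(a, b):
    """Return True when two stat results describe the same file system object.

    Compares the backing device, the inode number and the inode type,
    which are the three fields named in the description above.
    """
    return (
        a.st_dev == b.st_dev
        and a.st_ino == b.st_ino
        and stat.S_IFMT(a.st_mode) == stat.S_IFMT(b.st_mode)
    )


def looks_like_chroot(root="/", init_root="/proc/1/root"):
    """Hypothetical helper: apply the heuristic to the live system.

    Reading /proc/1/root normally requires root privileges.
    """
    return not same_inode(os.stat(root), os.stat(init_root))


# Sample values from the stat(1) output in this comment (Device major,minor and Inode).
DIR_MODE = stat.S_IFDIR | 0o755
ours = SimpleNamespace(st_dev=os.makedev(252, 0), st_ino=2, st_mode=DIR_MODE)
init_without_mount_proc = SimpleNamespace(
    st_dev=os.makedev(0, 28), st_ino=2, st_mode=DIR_MODE
)
init_with_mount_proc = SimpleNamespace(
    st_dev=os.makedev(252, 0), st_ino=2, st_mode=DIR_MODE
)

# Without --mount-proc the devices differ, so a chroot is assumed.
print(not same_inode(ours, init_without_mount_proc))
# With --mount-proc everything matches, so no chroot is detected.
print(not same_inode(ours, init_with_mount_proc))
```

Note that only the device differs in the first case; the inode number is 2 in both outputs, which is why the device has to be part of the comparison.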

Using stat(1), we can observe what happens:

Without the --mount-proc option, the backing device (i.e. "Device") is different, therefore systemd assumes we are in a chroot:

# stat -L / /proc/1/root
          File: /
          Size: 4096 Blocks: 8 IO Block: 4096 directory
=> Device: 252,0 Inode: 2 Links: 20
        Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
        Access: 2024-03-11 08:01:50.538756312 +0000
        Modify: 2024-03-11 08:01:49.398777854 +0000
        Change: 2024-03-11 08:01:49.398777854 +0000
         Birth: 2024-03-11 08:00:36.000000000 +0000
          File: /proc/1/root
          Size: 260 Blocks: 0 IO Block: 4096 directory
=> Device: 0,28 Inode: 2 Links: 1
        Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
        Access: 2024-03-11 08:06:22.017527026 +0000
        Modify: 2024-03-11 08:00:26.458886048 +0000
        Change: 2024-03-11 08:00:26.458886048 +0000
         Birth: 2024-03-11 07:58:30.876000000 +0000

But with the --mount-proc option, the structures look identical, therefore systemd thinks we are not running in a chroot:

          File: /
          Size: 4096 Blocks: 8 IO Block: 4096 directory
=> Device: 252,0 Inode: 2 Links: 20
        Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
        Access: 2024-03-11 08:01:50.538756312 +0000
        Modify: 2024-03-11 08:01:49.398777854 +0000
        Change: 2024-03-11 08:01:49.398777854 +0000
         Birth: 2024-03-11 08:00:36.000000000 +0000
          File: /proc/1/root
          Size: 4096 Blocks: 8 IO Block: 4096 directory
=> Device: 252,0 Inode: 2 Links: 20
        Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
        Access: 2024-03-11 08:01:50.538756312 +0000
        Modify: 2024-03-11 08:01:49.398777854 +0000
        Change: 2024-03-11 08:01:49.398777854 +0000
         Birth: 2024-03-11 08:00:36.000000000 +0000

Explanation
-----------
* When we run a command in a ChrootableTarget, we have:
  ** /proc bind mounted to /target/proc
  ** /sys bind mounted to /target/sys
  ** /run bind mounted to /target/run
  ** /dev bind mounted to /target/dev

* When we run `unshare --pid --fork chroot /target apt-get ...`:
  ** the content of /target/proc is inherited from outside the chroot, because of the bind-mount.
  ** /target/proc/1 corresponds to the process with PID 1 in the "parent" PID namespace (which is the ...

Olivier Gayot (ogayot)
Changed in curtin:
assignee: nobody → Olivier Gayot (ogayot)
Alex Wang (alexwang-bkc) wrote:

All right, we are also hitting this error with a user-data automation method that used to work on Ubuntu 22.04. Is there a workaround, or do we simply have to wait for a new ISO that includes the fix? Thanks a lot.

Dan Bungert (dbungert) wrote:

This will be fixed for 24.04. I expect a fix to land this week. Olivier researched it and opened an MP; we just need to follow through on that process.

Ken VanDine (ken-vandine) wrote:

A workaround would be to leave the "Install openssh-server" checkbox unchecked. You would then need to log in locally after installation, install the package manually and import your own SSH keys, but it could be a valid workaround.

For an automated install, you would need to change the ssh server option, though that may be a less feasible workaround there.

Frank Heimes (fheimes) wrote:

(Just bumped into this as well on s390x, where using an SSH server is very common,
 hence marked this bug as affecting me as well ...
 Ken's workaround helps to unblock further testing.)

Server Team CI bot (server-team-bot) wrote:

This bug is fixed with commit a5b815bc to curtin on branch master.
To view that commit see the following URL:
https://git.launchpad.net/curtin/commit/?id=a5b815bc

Changed in curtin:
status: Confirmed → Fix Committed
Olivier Gayot (ogayot)
Changed in subiquity:
status: Confirmed → In Progress
assignee: nobody → Olivier Gayot (ogayot)
Dan Bungert (dbungert)
Changed in subiquity:
status: In Progress → Triaged
status: Triaged → Fix Committed
Ubuntu QA Website (ubuntuqa) wrote:

This bug has been reported on the Ubuntu ISO testing tracker.

A list of all reports related to this bug can be found here:
http://iso.qa.ubuntu.com/qatracker/reports/bugs/2056570

tags: added: iso-testing
Olivier Gayot (ogayot)
tags: removed: foundations-todo
Olivier Gayot (ogayot) wrote:

This bug should be fixed in today's build of the Ubuntu Server installer.

Dan Bungert (dbungert) wrote:

Marking Fix Released for this Noble-only problem.

Changed in subiquity:
status: Fix Committed → Fix Released
Changed in curtin:
status: Fix Committed → Fix Released