15.10beta crashes encrypted swap partition
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| systemd (Ubuntu) | | High | Unassigned | |
Bug Description
Hi,
I usually use a setup with three partitions on a disk:
Partition 1: plain ext4 boot partition mounted on /boot
Partition 2: luks-encrypted swap
Partition 3: luks-encrypted btrfs for / /home ...
Both encrypted partitions are listed in /etc/crypttab like this:
sda2_crypt UUID=a7976d5c-
sda3_crypt UUID=339b9a90-
I have installed 15.10 beta on several machines, and in several cases I ran into the following problem: the swap is not activated at boot time, /dev/disk/by-uuid does not contain a link to the swap partition, and the previously created LUKS-encrypted swap is destroyed after boot. It is no longer a LUKS partition and is filled with random (presumably encrypted) bytes without structure.
I first thought that this was a problem of the setup process, and repaired the swap manually. But then I found the partition destroyed again. This happened several times on several machines.
I am not sure yet what exactly would destroy the partition.
ProblemType: Bug
DistroRelease: Ubuntu 15.10
Package: cryptsetup 2:1.6.6-5ubuntu2
ProcVersionSign
Uname: Linux 4.2.0-16-generic x86_64
ApportVersion: 2.19.1-0ubuntu2
Architecture: amd64
CurrentDesktop: XFCE
Date: Wed Oct 14 18:12:58 2015
InstallationDate: Installed on 2015-10-08 (5 days ago)
InstallationMedia: Xubuntu 15.10 "Wily Werewolf" - Alpha amd64 (20150924)
SourcePackage: cryptsetup
UpgradeStatus: No upgrade log present (probably fresh install)
crypttab:
sda2_crypt UUID=a7976d5c-
sda3_crypt UUID=339b9a90-
| Hadmut Danisch (hadmut) wrote : | #1 |
| Steve Langasek (vorlon) wrote : | #2 |
| affects: | cryptsetup (Ubuntu) → systemd (Ubuntu) |
| Changed in systemd (Ubuntu): | |
| assignee: | nobody → Martin Pitt (pitti) |
| importance: | Undecided → High |
| Hadmut Danisch (hadmut) wrote : | #3 |
Well, life would be much easier if there was some usable documentation about what's going on within systemd.
By the way, I did not put in that 'swap' option manually, it was inserted by the xubuntu 15.10 beta installer on cdrom/usb image. If you choose to encrypt a partition and put a swap inside, it automatically adds that swap option. So at least this crypttab option, the behaviour of the installer, and systemd don't fit together.
Since you mention it: on my other machine with 15.10 I noticed that the machine does not resume from hibernate but performs a fresh boot, which matches your hint that resume does not work with that style of encrypted swap.
Whatever it is that fills the device with random data should honor the luks option in the crypttab and use the device as intended (i.e. configure the device mapper and do a swapon).
| Martin Pitt (pitti) wrote : | #4 |
I tried to reproduce this on today's ubuntu desktop amd64 image (20151014). I think I set up partitions like you described: 1 GB /boot on partition 1, 1 GB LUKS on partition 2 (and put swap on vda2_crypt), 8 GB LUKS on partition 3 (and put btrfs / on vda3_crypt).
Both during install and after a few reboots I see correct partition/file system types in "blkid":
$ blkid
/dev/mapper/
/dev/mapper/
/dev/vda1: UUID="947e51a6-
/dev/vda2: UUID="7a5a8534-
/dev/vda3: UUID="aa700da9-
The only change was in the UUID of vda2_crypt as that gets re-mkswap-ed every time due to the "swap" option in crypttab. If that's undesired, this needs to be fixed in partman -- however, it doesn't sound like that's the actual issue you see.
My /etc/crypttab looks pretty much like yours:
vda2_crypt UUID=7a5a8534-
vda3_crypt UUID=aa700da9-
and /etc/fstab isn't surprising either:
dev/mapper/
# /boot was on /dev/vda1 during installation
UUID=947e51a6-
/dev/mapper/
/dev/mapper/
So I can't reproduce "destroys swap partition" just yet. From your description it sounds like something is destroying sda2 itself (i.e. the outer encrypted LUKS partition), *not* the unencrypted sda2_crypt, right?
Can you please give me some details:
- What do you precisely do to "repair the swap manually"?
- After that, please copy&paste the output of "sudo blkid", "sudo swapon -s", "cat /etc/crypttab", and "cat /etc/fstab".
- Reboot
- After that, please copy&paste all of the above commands again, so that we can compare.
- Run "sudo journalctl -b > /tmp/journal.txt" and attach /tmp/journal.txt as well.
Thanks!
| Changed in systemd (Ubuntu): | |
| status: | New → Incomplete |
| Hadmut Danisch (hadmut) wrote : | #5 |
> From your description it sounds like something is destroying sda2 itself (i.e. the outer encrypted LUKS partition), *not* the unencrypted sda2_crypt, right?
Right.
I've created the partitions with the graphical xubuntu installer from xubuntu 15.10 beta 1 cdrom put on a usb stick, and created both sda2 and sda3 as encrypted volumes, then put a swap in sda2_encrypted and btrfs in sda3_encrypted. This worked well with 14.04.
After booting I realized that the machine had no swap and not even a link to the partition under /dev/disk/by-uuid, so I could not open the device manually.
I found that the partition was completely filled with random data, with no LUKS header. cryptsetup isLuks said it is not a LUKS device, and xxd showed no trace of a LUKS header anymore; it was completely overwritten.
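The same check can be done by hand, because a LUKS1 header (the format cryptsetup 1.6.6 creates) begins with the six magic bytes "LUKS" 0xBA 0xBE. The following is only a minimal sketch, assuming a shell where `head -c`, `od` and `tr` are available; `has_luks_magic` is a made-up helper name, and `cryptsetup isLuks` remains the authoritative test.

```shell
# has_luks_magic FILE_OR_DEVICE
# Exits 0 if the first six bytes are the LUKS magic "LUKS" 0xBA 0xBE.
# Hypothetical helper for illustration; reading a real device needs root.
has_luks_magic() {
    # Dump the first 6 bytes as hex and strip all whitespace.
    magic=$(head -c 6 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \t\n')
    [ "$magic" = "4c554b53babe" ]
}
```

Usage would look like `has_luks_magic /dev/sda2 && echo "LUKS header present" || echo "no LUKS header"`. It only inspects the magic bytes, so it cannot tell whether the rest of the header is intact.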
I assumed it was a problem of the installer, not of the running system. My first suspicion was a corrupted partition table, but I did not find any problem with the partition itself. My next suspicion was a fault in the storage device, since I had replaced the old hard disk with a brand new SSD for the fresh install, but apart from this problem I do not see any trouble with storage, and I experienced it on two distinct machines. I do not see any problems on the other partitions and their file systems so far.
> - What do you precisely do to "repair the swap manually"?
cryptsetup luksFormat -c aes-xts-plain64 -s 512 /dev/sda2 (and enter the same password as for the root partition sda3)
cryptsetup luksOpen /dev/sda2 xxx
mkswap /dev/mapper/xxx
On one of the two machines (the office machine I'm using right now) this helped and the problem has not recurred so far. That's why I first assumed that it was just a problem of the installation process (graphical xubuntu installer), because I had experienced more trouble with the installer used in the lubuntu 15.10 beta cdrom image.
I did the very same thing on my machine at home, also ran into that problem, again assumed that it was a problem of the xubuntu installer, fixed it as described above, but it recurred. (Meanwhile there's more trouble with this machine: systemd hangs in the boot process, except when I open an emergency root session.)
>- After that, please copy&paste the output of ...
I'll reply to that once I am back home at that particular machine.
| Hadmut Danisch (hadmut) wrote : | #6 |
OK, I am back at my home machine: the problem occurred again, the machine destroyed LUKS on /dev/sda2 again.
Furthermore, I have another problem: during a regular boot, the boot process hangs after systemd has listed the names of several services. In most cases networking.service is the last one printed, which is not very useful, since these are, as I understand it, finished services, not the ones that cause trouble. I have not yet found a way to make that damned systemd tell what it's doing.
Strangely enough, the machine boots without problems if I choose recovery mode, choose to activate the network from the menu, and then continue, so it works when recovery mode is part of the boot chain. I guess Ubuntu will have lots of fun with that systemd.
sudo blkid (sda2 currently damaged again)
/dev/mapper/
/dev/sda1: UUID="19e9998b-
/dev/sda3: UUID="339b9a90-
/dev/sda2: PARTUUID=
swapon -s : no output
/etc/crypttab:
#sda2_crypt UUID=a7976d5c-
sda2_crypt /dev/disk/
sda3_crypt UUID=339b9a90-
(I've replaced the serial number of my disk with *********)
/etc/fstab
/dev/mapper/
UUID=19e9998b-
/dev/mapper/
/dev/mapper/
.
I'll now repair the partition as described, reboot and come again.
| Hadmut Danisch (hadmut) wrote : | #7 |
OK, freshly rebooted. This time, sda2 has survived as a valid and operating LUKS partition.
crypttab and fstab not changed.
# swapon -s
Filename Type Size Used Priority
/dev/dm-1 partition 16308220 0 -1
# dir /dev/mapper
insgesamt 0
crw------- 1 root root 10, 236 Okt 15 22:17 control
lrwxrwxrwx 1 root root 7 Okt 15 22:18 sda2_crypt -> ../dm-1
lrwxrwxrwx 1 root root 7 Okt 15 22:17 sda3_crypt -> ../dm-0
# blkid
/dev/mapper/
/dev/sda1: UUID="19e9998b-
/dev/sda2: UUID="e1a46217-
/dev/sda3: UUID="339b9a90-
/dev/mapper/
I'll attach journal.txt
| Martin Pitt (pitti) wrote : | #8 |
> This time, sda2 has survived as a valid and operating luks partition.
Then the journal won't show the bits where it destroys it (but it's still useful for comparison). I'd like to see a journal when it does destroy the device. One way would be to just keep rebooting until that happens.
However, there might be a faster and also more useful way. First, stop only the swap partition and luks device:
sudo systemctl stop systemd-
Now /dev/mapper/ should not have sda2_crypt any more, just sda3_crypt (for the root partition). Then you can run the commands in /run/systemd/
sudo SYSTEMD_
(enter passphrase)
sudo /lib/systemd/
# now check if the signature is still correct:
sudo blkid -p /dev/sda2
You can try running this several times until it destroys your partition (FTR, I ran it successfully some 20 times). Does that reproduce the bug for you? If so, please copy&paste the output from the command cycle that did the destruction. If not, then I guess it's something else in the boot process, and then please reboot until it happens and attach the journal output from this boot.
Thanks!
> Furthermore, I have another problem: When doing a regular boot, but boot process hangs after systemd listed the names of several services
Please file a separate bug report about that. /usr/share/
| Hadmut Danisch (hadmut) wrote : | #9 |
I've noticed something and I guess that both my problems - swap problem and systemd hanging while booting - are closely related.
Why?
The system does not hang at boot, when I choose the recovery mode from grub, and in the recovery mode select "network" to enable networking. Important: before activating the network, the console asks me to enter the password for sda2_crypt (swap). System then can boot up the regular way.
So it seems to be something in the systemd service order. The recovery menu's network option does something, the normal boot sequence doesn't.
| Hadmut Danisch (hadmut) wrote : | #10 |
OK,
I have narrowed this down and made big progress in identifying the problem.
An important step for debugging was to learn how to debug systemd.
http://
was quite helpful. In particular,
systemctl enable debug-shell.service
helps a lot. After that, one can get a root shell when the systemd boot process is hanging.
I have identified *two* problems, both in
/lib/systemd/
First problem:
The system boot procedure hangs because the process
/lib/
hangs. It waits for password input, but for some reason its prompt and input don't make their way to the boot console or boot splash prompt. There's a problem with the procedure for requesting a password.
Killing that process from the debug console makes the boot process continue immediately (of course without working swap).
Once you know that this is the process causing trouble, debugging gets much easier, since it is no longer necessary to try this within a boot process. You can use a running machine with any test partition for easy debugging.
BTW: systemd does not use /etc/crypttab directly, but converts the contents of /etc/crypttab to dynamically created units first, which can be found under /run/systemd. It shows
ExecStart=
ExecStop=
ExecStartPost=
So one knows what happens right here.
You can easily call the given command from anywhere as root with any partition, without the need to edit /etc/crypttab, because it's all command line parameters here. Makes testing pretty easy now.
Second problem:
That damned systemd-cryptsetup ignores luks (or is unable to cope with modern luks settings).
That's what the dmsetup looks like for my root partition setup in the initramfs:
0 903712768 crypt aes-xts-plain64 000000000000000
This looks good, because these are the same crypt parameters (aes-xts) that I used when creating the LUKS partition, and it uses an offset of 4096, which leaves the LUKS header untouched.
But after running that systemd-cryptsetup for the sda2 partition (even after freshly formatting it with cryptsetup), the dmsetup table shows this:
0 32616448 crypt aes-cbc-
which contains *two* wrong settings:
- it's the wrong cipher
- it's an offset of 0, which overwrites the luks header. That's why I am seeing garbage again and again.
So it turns out that systemd-cryptsetup is triple-buggy:
- Password dialog not working in the boot process, in neither splash nor non-splash mode (that's why the boot process hangs)
- wrong cipher
- no offset, thus overwriting the luks header.
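The two table lines can be compared mechanically. In a dm-crypt table line as printed by `dmsetup table <name>`, the fourth field is the cipher and the eighth is the start offset (in 512-byte sectors) of the encrypted data on the backing device. A small sketch follows; `crypt_table_summary` is a hypothetical helper, and the key and device number in the usage note are placeholders, not values from this report.

```shell
# crypt_table_summary "TABLE LINE"
# Prints the cipher and the data offset of a dm-crypt table line.
# Hypothetical helper for illustration only.
crypt_table_summary() {
    # Fields: start length "crypt" cipher key iv_offset device offset [options...]
    printf '%s\n' "$1" | awk '$3 == "crypt" && NF >= 8 {
        printf "cipher=%s offset=%s\n", $4, $8
    }'
}
```

For example, `crypt_table_summary '0 903712768 crypt aes-xts-plain64 0000 0 8:3 4096'` prints `cipher=aes-xts-plain64 offset=4096`. An offset of 0 on a device that was formatted with luksFormat means the mapping starts on top of the LUKS header.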
| Hadmut Danisch (hadmut) wrote : | #11 |
More info from the source code of systemd-cryptsetup:
else if (streq(option, "luks"))
...
} else if (STR_IN_SET(option, "plain", "swap", "tmp"))
so the swap argument overrides the luks argument and resets the encryption type to plain.
And indeed, when using
/lib/systemd/
(i.e. just changing the order of the parameters to swap,luks,discard instead of the luks,swap,discard that the Ubuntu installer creates), it works and uses the LUKS partition correctly.
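Going by this observation, the entries at risk are those that combine `luks` with `swap`, `plain` or `tmp`. A rough sketch of a check for such entries, assuming the usual four-column crypttab layout (name, device, key file, options); `lint_crypttab` is a hypothetical helper, not an existing tool, and it only reports entries without changing anything.

```shell
# lint_crypttab FILE
# Prints the name of every entry whose options contain "luks" together
# with "swap", "plain" or "tmp". Hypothetical helper for illustration.
lint_crypttab() {
    awk '
        /^[[:space:]]*#/ || NF < 4 { next }   # skip comments and entries without options
        {
            n = split($4, opts, ",")
            luks = 0; plain = 0
            for (i = 1; i <= n; i++) {
                if (opts[i] == "luks") luks = 1
                if (opts[i] == "swap" || opts[i] == "plain" || opts[i] == "tmp") plain = 1
            }
            if (luks && plain) print $1
        }
    ' "$1"
}
```

Running `lint_crypttab /etc/crypttab` on an installer-generated file with a luks,swap,discard entry would list that entry's name.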
| Hadmut Danisch (hadmut) wrote : | #12 |
OK, I've finally found the problem(s). It was a bunch of nasty little problems, which is why it was difficult to debug.
1)
The 15.10 beta installer had filled /etc/initramfs-
RESUME=
which is never updated after first installation. Since I had to repair the swap device several times, this was not correct anymore, and furthermore /usr/share/
RESUME=sda2_crypt
is entered.
2) That's why /usr/share/
Once it is correctly mentioned in this file (after fixing bug 1), the password is fetched and the device is opened at the initramfs phase, i.e. before systemd takes control. This works well.
3) If sda2_crypt is not mentioned in the initramfs' /conf/conf.
But this does not work, since both systemd and plymouthd have bugs. plymouthd can go into an endless loop or fail completely, depending on whether you have splash/graphical boot or textual boot.
Once bugs 1 and 2 are solved, this issue no longer occurs.
4) But then the system hangs while booting for another reason. systemd still tries to create a swap device and hangs forever. I could not reliably figure out why, but it looks as if it waits for systemd-cryptsetup to do some things which it doesn't do, since the crypt device is already open.
Solution: remove the swap option in /etc/crypttab
5) Finally seems to work.
Just for the record: systemd (and plymouth) is so buggy and opaque that it is far from being production-ready.
That cost me several evenings of work and headache.
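The fix in step 4 amounts to dropping `swap` from the comma-separated option field while leaving the other options alone. A minimal sketch of that edit on a single option string; `strip_swap_option` is a hypothetical helper, and it deliberately does not touch /etc/crypttab itself.

```shell
# strip_swap_option OPTIONS
# Prints the comma-separated option list without the "swap" option.
# Hypothetical helper for illustration only.
strip_swap_option() {
    # Split on commas, drop the exact word "swap", and join the rest again.
    printf '%s\n' "$1" | tr ',' '\n' | grep -vx 'swap' | paste -sd, -
}
```

For example, `strip_swap_option luks,swap,discard` prints `luks,discard`. The match is on the exact word, so other options are passed through unchanged.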
| Changed in systemd (Ubuntu): | |
| status: | Incomplete → New |
| Hadmut Danisch (hadmut) wrote : | #13 |
Further observations:
Meanwhile I have identified three modes:
1) putting the swap flag into /etc/crypttab -> crashes the partition every now and then, but not always.
2) removing the flag from /etc/crypttab, but keeping it in /etc/fstab and keeping it in
/etc/initramfs-
3) as 2, but removing /etc/initramfs-
Unfortunately the ubuntu installer produces mode 1, which does not really work.
This mess should really be fixed for 16.04 LTS. In my eyes it's a major problem of systemd, but the initramfs code could also be extended to use the password cache (which is there and caches anyway) to avoid asking twice.
regards
| Launchpad Janitor (janitor) wrote : | #14 |
Status changed to 'Confirmed' because the bug affects multiple users.
| Changed in systemd (Ubuntu): | |
| status: | New → Confirmed |
| Changed in systemd (Ubuntu): | |
| assignee: | Martin Pitt (pitti) → nobody |
| eviljoel (eviljoel-t) wrote : | #15 |
I'm also having this issue with Ubuntu 16.04.2.


The systemd package has taken over the handling of /etc/crypttab at boot from cryptsetup (without much coordination AFAICS), and it sounds like its interpretation of the crypttab is buggy.
"swap" is not synonymous with "random", and should not result in the device being clobbered, which is what is happening here. In particular, encrypted persistent swap needs to be supportable for users who wish to use this for suspend to disk, and this requires a LUKS header (with UUID).
Note however that for this use case, you *also* don't actually want to use 'swap' as an option in /etc/crypttab, because this is defined as "Run mkswap on the created device", and there's no need to do that if you have a persistent crypted swap.