multiple long delays during kernel and userspace boot

Bug #1858495 reported by Ryan Harper
Affects: linux-signed-azure (Ubuntu)
Status: Won't Fix
Importance: Medium
Assigned to: Colin Ian King

Bug Description

Booting some Bionic instances in Azure (gen1 machines), I see some large delays during kernel/userspace boot, and it would be good to understand what's going on. Additionally, the areas of boot that see delays differ for an image that's been created from a template vs. stock images.

I'm attaching some data: 10 runs of the same image in a scaling set, covering the initial boot. Processing the journal output and looking at delays of over 2.0 seconds shows some areas of concern.

[ 1.788581] localhost.localdomain kernel: * Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
                                             * this clock source is slow. Consider trying other clock sources
[ 3.545974] localhost.localdomain kernel: Unstable clock detected, switching default tracing clock to "global"
                                             If you want to keep using the local clock, then add:
                                               "trace_clock=local"
                                             on the kernel command line
[ 6.401684] localhost.localdomain kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
[ 15.280390] localhost.localdomain kernel: EXT4-fs (sda1): re-mounted. Opts: discard
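The delta filter described above (flagging gaps above a threshold between consecutive timestamped journal lines) is what the attached hotspots.py does; the script itself isn't inlined in this report, but a minimal sketch of the same idea might look like this (a hypothetical reimplementation, not the actual attachment):

```python
import re
import sys

# Matches the "[    1.788581]" monotonic timestamp at the start of a journal line.
TS_RE = re.compile(r"\[\s*(\d+\.\d+)\]")

def hotspots(lines, threshold=2.0):
    """Yield (prev_line, line) pairs whose timestamps differ by > threshold seconds."""
    prev_ts, prev_line = None, None
    for line in lines:
        m = TS_RE.search(line)
        if not m:
            continue
        ts = float(m.group(1))
        if prev_ts is not None and ts - prev_ts > threshold:
            yield prev_line, line
        prev_ts, prev_line = ts, line

if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        for before, after in hotspots(f, threshold=2.0):
            print(before.rstrip())
            print(after.rstrip())
            print()
```

Run against a journal.log, each printed pair brackets one suspicious gap, like the EXT4 remount jump above.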

After capturing a bionic image as a template and creating a new VM, we see new hot spots we didn't see before.

# HotSpot maximum delta between kernel messages: 2.00000
# [ 2.846188] localhost.localdomain kernel: AES CTR mode by8 optimization enabled
# [ 5.919313] localhost.localdomain kernel: raid6: avx2x4 gen() 21512 MB/s
#
# [ 6.591530] localhost.localdomain kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
# [ 9.031051] localhost.localdomain systemd[1]: systemd 237 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid)
#
# [ 13.773554] localhost.localdomain sh[871]: + exit 0
# [ 21.625467] localhost.localdomain kernel: UDF-fs: INFO Mounting volume 'UDF Volume', timestamp 2019/12/17 00:00 (1000)
#
# [ 24.919359] bugbif2be000001 systemd-timesyncd[771]: Synchronized to time server 91.189.89.198:123 (ntp.ubuntu.com).
# [ 29.787339] bugbif2be000001 cloud-init[1026]: Cloud-init v. 19.2-36-g059d049c-0ubuntu2~18.04.1 running 'init' at Mon, 16 Dec 2019 18:14:47 +0000. Up 25.20 seconds.

The easiest kernel-side comparison is the systemd-analyze value:

Grepping in the debug data:

% grep "Startup finished.*kernel" bug-bionic-baseline-no*.debug/*/journal.log | cut -d" " -f 7-
Startup finished in 3.209s (kernel) + 49.305s (userspace) = 52.515s.
Startup finished in 3.355s (kernel) + 51.732s (userspace) = 55.088s.
Startup finished in 3.287s (kernel) + 51.747s (userspace) = 55.035s.
Startup finished in 3.129s (kernel) + 50.066s (userspace) = 53.195s.
Startup finished in 3.350s (kernel) + 50.682s (userspace) = 54.032s.
Startup finished in 3.355s (kernel) + 49.322s (userspace) = 52.678s.
Startup finished in 3.219s (kernel) + 51.124s (userspace) = 54.343s.
Startup finished in 3.128s (kernel) + 49.226s (userspace) = 52.354s.
Startup finished in 3.193s (kernel) + 53.197s (userspace) = 56.390s.
Startup finished in 3.118s (kernel) + 46.203s (userspace) = 49.322s.

% grep "Startup finished.*kernel" bug-bionic-baseline-after*.debug/*/journal.log | cut -d" " -f 7-
Startup finished in 7.685s (kernel) + 32.463s (userspace) = 40.148s.
Startup finished in 7.041s (kernel) + 35.998s (userspace) = 43.040s.
Startup finished in 7.808s (kernel) + 35.444s (userspace) = 43.253s.
Startup finished in 7.206s (kernel) + 37.952s (userspace) = 45.159s.
Startup finished in 8.426s (kernel) + 36.976s (userspace) = 45.403s.
Startup finished in 6.731s (kernel) + 35.484s (userspace) = 42.216s.
Startup finished in 7.152s (kernel) + 32.664s (userspace) = 39.817s.
Startup finished in 7.429s (kernel) + 36.177s (userspace) = 43.606s.
Startup finished in 9.075s (kernel) + 32.494s (userspace) = 41.570s.
Startup finished in 7.281s (kernel) + 32.732s (userspace) = 40.013s.
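Averaging the two runs makes the shift obvious (roughly 3.2s kernel before templating vs. 7.6s after). A small sketch for summarizing these lines, assuming the simple "Xs (kernel) + Ys (userspace)" form shown here (systemd can also emit "Xmin Ys" forms, which this deliberately ignores):

```python
import re
from statistics import mean

# Matches systemd's "Startup finished in Xs (kernel) + Ys (userspace) = Zs." summary.
STARTUP_RE = re.compile(
    r"Startup finished in (?P<kernel>[\d.]+)s \(kernel\) \+ "
    r"(?P<user>[\d.]+)s \(userspace\)"
)

def boot_averages(lines):
    """Return (mean kernel seconds, mean userspace seconds) over matching lines."""
    kernel, user = [], []
    for line in lines:
        m = STARTUP_RE.search(line)
        if m:
            kernel.append(float(m.group("kernel")))
            user.append(float(m.group("user")))
    return mean(kernel), mean(user)
```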

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-5.0.0-1027-azure 5.0.0-1027.29~18.04.1
ProcVersionSignature: User Name 5.0.0-1027.29~18.04.1-azure 5.0.21
Uname: Linux 5.0.0-1027-azure x86_64
ApportVersion: 2.20.9-0ubuntu7.9
Architecture: amd64
Date: Mon Jan 6 18:44:10 2020
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=C.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-signed-azure
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Ryan Harper (raharper) wrote :
Changed in linux-signed-azure (Ubuntu):
assignee: nobody → Colin Ian King (colin-king)
Changed in linux-signed-azure (Ubuntu):
importance: Undecided → Medium
status: New → In Progress
Revision history for this message
Andrea Righi (arighi) wrote :

Hi Ryan, are these boot delays happening with a particular instance type or a particular configuration? Is the slowness consistent across reboots (rebooting the same instance multiple times)?

I'm doing some tests (profiling kernel initcalls and user-space boot time via systemd-analyze) using an instance created from a stock bionic image, but the boot time seems to be pretty consistent.

Revision history for this message
Colin Ian King (colin-king) wrote :

Hi Ryan,

We would like to reproduce this bug to debug it further. Can you answer the questions below relating to your initial comments in the bug:

"Booting some Bionic instances in Azure (gen1 machines).."
Q: What is a gen1 machine? What instance type is this?

"..I see some large delays during kernel/userspace boot, and it would be good to understand what's going on. Additionally, the areas of boot that see delays differ for an image that's been created from a template vs. stock images."

Q: I don't know what these are. Can you explain how these are created? Do you have any exact examples of a template and stock image?

Thanks, Colin

Changed in linux-signed-azure (Ubuntu):
status: In Progress → Incomplete
Revision history for this message
Ryan Harper (raharper) wrote :

Sorry for missing the questions earlier.

Azure has two "machine types": gen1, which boots a non-UEFI virtual hardware platform, and gen2, which is UEFI with newer virtual hardware; details here:

https://azure.microsoft.com/en-us/updates/generation-2-virtual-machines-in-azure-public-preview/?cdn=disable

I boot Ubuntu Server daily images; here's a gen1 launch:

az vm create --name=citest-singleboot-Standard-DS2-v2 \
   --image=Canonical:UbuntuServer:19.10-DAILY:19.10.201911270 \
   --admin-username=root --admin-username=ubuntu \
   -g cloud-init-sru --os-disk-size-gb 37 \
   --ssh-key-value /home/ubuntu/.ssh/id_rsa.pub \
   --boot-diagnostics-storage cibootdiagstor \
   --size Standard_DS2_v2 '--location=West US'

Gen2 launch looks like:

az vm create --name=cloudinit-azure-cloudtest \
   --image=Canonical:UbuntuServer:19_10-daily-gen2:19.10.202002050 \
   --admin-username=root --admin-username=ubuntu \
   -g cloud-init-sru --os-disk-size-gb 37 \
   --ssh-key-value /home/ubuntu/.ssh/id_rsa.pub \
   --boot-diagnostics-storage cibootdiagstor \
   --size Standard_DS2_v2 '--location=West US'

To make a template image after booting one of these images, you run the following:

# deallocate vm, generalize
az vm deallocate --resource-group $RESOURCE_GROUP --name $VM_NAME
az vm generalize --resource-group $RESOURCE_GROUP --name $VM_NAME
az image create --resource-group $RESOURCE_GROUP \
    $HV_GEN \
    --name $TEMPLATE_IMAGE --source $VM_NAME --location "$LOCATION"

And then you can launch an instance using your template image by
specifying the $TEMPLATE_IMAGE value (you pick the name) to
the az vm create --image parameter above.

HV_GEN=--hyper-v-generation=V2 # only if you're creating a gen2 image template.

Changed in linux-signed-azure (Ubuntu):
status: Incomplete → New
Revision history for this message
Ryan Harper (raharper) wrote :

> <arighi> 14:48:15> this slowness is happening with a particular instance type?

I've not tested extensively across all types, but it's common for any of the "fast" types with SSD backing. I've seen this on DS1_v2, DS2_v2, DS3_v3, D4-v2, B2s, A2s, L4s.

> <arighi> 14:49:38> the slowness is happening across multiple reboots or it's just the first boot?

Some issues are seen on first boot, others only on first boot of a template, and some on reboot.

Revision history for this message
Ryan Harper (raharper) wrote :

here is some debug data I captured:

-rw-rw-r-- 1 ubuntu ubuntu 729 Dec 16 12:15 bug-bionic-baseline-after-templating-Standard-DS2-v2.csv
drwxrwxr-x 12 ubuntu ubuntu 22 Dec 16 12:15 bug-bionic-baseline-after-templating-Standard-DS2-v2.csv.debug/
-rw-rw-r-- 1 ubuntu ubuntu 721 Dec 16 13:07 bug-bionic-baseline-no-templating-Standard-DS2-v2.csv
drwxrwxr-x 12 ubuntu ubuntu 22 Dec 16 13:07 bug-bionic-baseline-no-templating-Standard-DS2-v2.csv.debug/
-rw-rw-r-- 1 ubuntu ubuntu 2514 Jan 6 12:41 hotspots.py

% find . -name 'journal.log' | sort
./bug-bionic-baseline-after-templating-Standard-DS2-v2.csv.debug/1/journal.log
./bug-bionic-baseline-after-templating-Standard-DS2-v2.csv.debug/10/journal.log
./bug-bionic-baseline-after-templating-Standard-DS2-v2.csv.debug/2/journal.log
./bug-bionic-baseline-after-templating-Standard-DS2-v2.csv.debug/3/journal.log
./bug-bionic-baseline-after-templating-Standard-DS2-v2.csv.debug/4/journal.log
./bug-bionic-baseline-after-templating-Standard-DS2-v2.csv.debug/5/journal.log
./bug-bionic-baseline-after-templating-Standard-DS2-v2.csv.debug/6/journal.log
./bug-bionic-baseline-after-templating-Standard-DS2-v2.csv.debug/7/journal.log
./bug-bionic-baseline-after-templating-Standard-DS2-v2.csv.debug/8/journal.log
./bug-bionic-baseline-after-templating-Standard-DS2-v2.csv.debug/9/journal.log
./bug-bionic-baseline-no-templating-Standard-DS2-v2.csv.debug/1/journal.log
./bug-bionic-baseline-no-templating-Standard-DS2-v2.csv.debug/10/journal.log
./bug-bionic-baseline-no-templating-Standard-DS2-v2.csv.debug/2/journal.log
./bug-bionic-baseline-no-templating-Standard-DS2-v2.csv.debug/3/journal.log
./bug-bionic-baseline-no-templating-Standard-DS2-v2.csv.debug/4/journal.log
./bug-bionic-baseline-no-templating-Standard-DS2-v2.csv.debug/5/journal.log
./bug-bionic-baseline-no-templating-Standard-DS2-v2.csv.debug/6/journal.log
./bug-bionic-baseline-no-templating-Standard-DS2-v2.csv.debug/7/journal.log
./bug-bionic-baseline-no-templating-Standard-DS2-v2.csv.debug/8/journal.log
./bug-bionic-baseline-no-templating-Standard-DS2-v2.csv.debug/9/journal.log

You can run the hotspots.py on any of the journal.log files:
% python3 hotspots.py bug-bionic-baseline-after-templating-Standard-DS2-v2.csv.debug/1/journal.log 2>/dev/null | head
# HotSpot maximum delta between kernel messages: 0.50000
# [ 2.637087] localhost.localdomain kernel: AES CTR mode by8 optimization enabled
# [ 4.493987] localhost.localdomain kernel: raid6: avx2x4 gen() 22822 MB/s
#
# [ 5.159028] localhost.localdomain kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
# [ 8.385284] localhost.localdomain systemd[1]: systemd 237 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid)
#
# [ 8.551071] localhost.localdomain systemd[1]: Installed transient /etc/machine-id file.
# [ 10.292844] localhost.localdomain systemd[1]: Created slice System Slice.

Revision history for this message
Ryan Harper (raharper) wrote :

The primary concern is the time before mounting the rootfs and executing /sbin/init. The spots after that fall into systemd territory.

Revision history for this message
Colin Ian King (colin-king) wrote :

Can you update the bug with the output from the command:

cat /proc/cmdline

I wonder if an overly pessimistic rootdelay setting is being used.

Revision history for this message
Colin Ian King (colin-king) wrote :

Hi Ryan,

I analysed the data from comment #6 using the journal.log from bug-bionic-baseline-after-templating-Standard-DS2-v2, collated some notable places in the boot process, and compared boots 1 through 10. Attached are the results as a LibreOffice Calc spreadsheet. The data shows a little bit of timing variance (which is not surprising), but on the whole it looks very similar from boot to boot. So I can't see what's particularly wrong with the boot timings in that scenario. Any comments on that?

Data: see attached.

Revision history for this message
Colin Ian King (colin-king) wrote :

And also data for bug-bionic-baseline-no-templating-Standard-DS2-v2.csv attached

Revision history for this message
Colin Ian King (colin-king) wrote :

Some of the early kernel boot time variations occur when detecting and spinning up devices sda and sdb and reading the partition information. Given this is probably spinning-disk media, one would expect variations of this kind in early boot.

The other variations I can see are also due to probing of hardware, which again is probably expected. I'm finding it hard to see anything really problematic with the kernel-side boot timings, given that we are dealing with H/W-style variations in the boot. If there are glaring issues that I'm missing, can you point me to them?

Revision history for this message
Ryan Harper (raharper) wrote :

None of the VMs will be using spinning disks; it's all SSD, and virtual disks anyhow.

I would not expect much timing difference on virtual hardware: there aren't real device or PCI timing delays. The kernel may still wait for devices, but it should be consistent.

In terms of the things that caught my eye:

# HotSpot maximum delta between kernel messages: 2.00000
# [ 2.846188] localhost.localdomain kernel: AES CTR mode by8 optimization enabled
# [ 5.919313] localhost.localdomain kernel: raid6: avx2x4 gen() 21512 MB/s

# [ 5.159028] localhost.localdomain kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
# [ 8.385284] localhost.localdomain systemd[1]: systemd 237 running in

[ 1.788581] localhost.localdomain kernel: * Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
                                             * this clock source is slow. Consider trying other clock sources
[ 3.545974] localhost.localdomain kernel: Unstable clock detected, switching default tracing clock to "global"
                                             If you want to keep using the local clock, then add:
                                               "trace_clock=local"
                                             on the kernel command line

Now, I owe you a /proc/cmdline, I'll get to that.

I looked over the spreadsheet, and while the runs are consistent within each type (no template, after templating), the total time in the kernel is double after templating (according to systemd-analyze), so I'd like to understand why that's the case.

% grep "Startup finished.*kernel" bug-bionic-baseline-no*.debug/*/journal.log | cut -d" " -f 7-
Startup finished in 3.209s (kernel) + 49.305s (userspace) = 52.515s.
Startup finished in 3.355s (kernel) + 51.732s (userspace) = 55.088s.
Startup finished in 3.287s (kernel) + 51.747s (userspace) = 55.035s.
Startup finished in 3.129s (kernel) + 50.066s (userspace) = 53.195s.
Startup finished in 3.350s (kernel) + 50.682s (userspace) = 54.032s.
Startup finished in 3.355s (kernel) + 49.322s (userspace) = 52.678s.
Startup finished in 3.219s (kernel) + 51.124s (userspace) = 54.343s.
Startup finished in 3.128s (kernel) + 49.226s (userspace) = 52.354s.
Startup finished in 3.193s (kernel) + 53.197s (userspace) = 56.390s.
Startup finished in 3.118s (kernel) + 46.203s (userspace) = 49.322s.

vs

% grep "Startup finished.*kernel" bug-bionic-baseline-after*.debug/*/journal.log | cut -d" " -f 7-
Startup finished in 7.685s (kernel) + 32.463s (userspace) = 40.148s.
Startup finished in 7.041s (kernel) + 35.998s (userspace) = 43.040s.
Startup finished in 7.808s (kernel) + 35.444s (userspace) = 43.253s.
Startup finished in 7.206s (kernel) + 37.952s (userspace) = 45.159s.
Startup finished in 8.426s (kernel) + 36.976s (userspace) = 45.403s.
Startup finished in 6.731s (kernel) + 35.484s (userspace) = 42.216s.
Startup finished in 7.152s (kernel) + 32.664s (userspace) = 39.817s.
Startup finished in 7.429s (kernel) + 36.177s (userspace) = 43.606s.
Startup finished in 9.075s (kernel) + 32.494s (userspace) = 41.570s.
Startup finished in 7.281s (kernel) + 32.732s (userspace) = 40.013s.

Revision history for this message
Ryan Harper (raharper) wrote :

Ah, from the journal.log:

Command line: BOOT_IMAGE=/boot/vmlinuz-5.3.0-1008-azure root=PARTUUID=1261a2c6-48ca-43ee-9b70-197f5b89b82c ro console=tty1 console=ttyS0 earlyprintk=ttyS0 panic=-1
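So no rootdelay= appears in the captured cmdline, which rules out Colin's pessimistic-rootdelay theory. Checking for it can be scripted; a sketch (naive whitespace split, ignoring quoted values that real cmdlines can contain):

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {param: value-or-None} dict.

    Duplicate keys (e.g. the two console= entries above) collapse, last one wins.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params

cmdline = ("BOOT_IMAGE=/boot/vmlinuz-5.3.0-1008-azure "
           "root=PARTUUID=1261a2c6-48ca-43ee-9b70-197f5b89b82c "
           "ro console=tty1 console=ttyS0 earlyprintk=ttyS0 panic=-1")
params = parse_cmdline(cmdline)
assert "rootdelay" not in params  # no pessimistic root-wait configured
```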

Revision history for this message
Colin Ian King (colin-king) wrote :

Thanks for the useful feedback. I'll enable some finer-grained metrics tomorrow and see if I can pinpoint those longer delays.

Revision history for this message
Colin Ian King (colin-king) wrote :

I added initcall debug and got some more data:

The first delay:

# [ 2.846188] localhost.localdomain kernel: AES CTR mode by8 optimization enabled
# [ 5.919313] localhost.localdomain kernel: raid6: avx2x4 gen() 21512 MB/s

There are some silent initializations occurring between these two points, mainly crypto code such as ghash, crc32, crct10dif, raid0, raid, libcrc32c, etc. A fairly long pause of ~600ms occurs during the hv_netvsc initialization; this is the Hyper-V network driver, so it's expected to have some time variation and is dependent on the host.
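With initcall_debug on the kernel command line, dmesg logs one "returned ... after N usecs" line per initcall, which makes the silent gaps rankable. A sketch of such a ranking pass (the sample line format matches stock kernels; the specific function names in the test are made up):

```python
import re

# initcall_debug produces dmesg lines like:
#   initcall netvsc_drv_init+0x0/0x20 returned 0 after 600123 usecs
INITCALL_RE = re.compile(r"initcall (\S+) returned \d+ after (\d+) usecs")

def slowest_initcalls(lines, top=10):
    """Return up to `top` (initcall, seconds) pairs, longest first."""
    calls = []
    for line in lines:
        m = INITCALL_RE.search(line)
        if m:
            calls.append((m.group(1), int(m.group(2)) / 1e6))
    return sorted(calls, key=lambda c: c[1], reverse=True)[:top]
```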

The second delay:

# [ 5.159028] localhost.localdomain kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
# [ 8.385284] localhost.localdomain systemd[1]: systemd 237 running in..

I can't seem to reproduce the long delay; I'm seeing ~100-250ms between the invocation of the mount and the mount completion. I wonder if this is still a H/W issue, where my VMs are on different machines and performing faster. Anyhow, with the debug enabled I believe there is little going on between these two points apart from an ext4 mount.

Changed in linux-signed-azure (Ubuntu):
status: New → Incomplete
Revision history for this message
Marcelo Cerri (mhcerri) wrote :

The issue can't be reproduced anymore.

Changed in linux-signed-azure (Ubuntu):
status: Incomplete → Won't Fix