[regression] file injection is broken on centos computes with Cirros images

Bug #1587960 reported by Ivan Berezovskiy on 2016-06-01
This bug affects 2 people
Affects: Mirantis OpenStack (status tracked in 10.0.x)
Series   Importance   Assigned to
10.0.x   High         Albert Syriy
9.x      High         Albert Syriy

Bug Description

Steps to reproduce:
Deploy 1-3 controller nodes
Add any number (>1) of CentOS compute nodes

Run ostf tests

Expected result:
ostf tests are green

Actual result:
Launch instance with file injection (failure) Execution command on Instance fails with unexpected result. Please refer to OpenStack logs for more details.

Problem could be related to this commit https://review.openstack.org/#/c/322434/2

The variables are set according to the patch and the config drive appears in the qemu command line, but it looks like the instance ignores it.

ISO #424

Failed tests:
https://product-ci.infra.mirantis.net/job/9.0.system_test.ubuntu.rh/124/
https://product-ci.infra.mirantis.net/view/9.0_swarm/job/9.0.system_test.ubuntu.ol/6/

description: updated
Sergey Kolekonov (skolekonov) wrote :

btw, Ubuntu cloud images are not affected - file injection works as expected, so it's a Cirros issue

Roman Podoliaka (rpodolyaka) wrote :

Ivan, could you please clarify what image you are using? It definitely works with TestVM we ship, which is Cirros-based and the latter works in upstream.

Ivan Berezovskiy (iberezovskiy) wrote :

This is a failure in swarm, and it doesn't use any specific image (only the same Cirros one)

Roman Podoliaka (rpodolyaka) wrote :

I checked Ivan's env:

1) as Sergey pointed out ubuntu images work fine

2) we use the very same TestVM image as in Ubuntu deployments (checked md5 hashes)

3) with the cirros image used by devstack (which also forces usage of config drives) result is the same - the file's not injected

Will continue investigation, but the point is that only Cirros images are currently affected.

summary: - file injection is broken on centos computes
+ file injection is broken on centos computes with Cirros images

This looks interesting.

The only difference between booting the same image on CentOS and on Ubuntu is the label of the config-drive partition:

http://paste.openstack.org/show/507019/

Apparently, Cirros is looking for the label "config-2": http://bazaar.launchpad.net/~cirros-dev/cirros/trunk/view/head:/src/lib/cirros/ds/configdrive#L6

and string comparison fails due to case sensitivity...

Roman Podoliaka (rpodolyaka) wrote :

nova-compute uses the very same command on Ubuntu and CentOS:

2016-06-01 16:54:30.336 26950 DEBUG oslo_concurrency.processutils [req-5de854fe-2957-4307-8318-274a141363ce 035602374b8944efa0f6ff9a9584f3a6 c9dcf94a37d04a46adb2a7ef42f6b9f3 - - -] CMD "mkfs -t vfat -n config-2 /var/lib/nova/instances/9c5b8421-11f3-4ca4-8972-628f82a20674/disk.config" returned: 0 in 0.014s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:356

and

2016-06-01 17:36:48.900 27558 DEBUG oslo_concurrency.processutils [req-78e93714-49c4-4152-94a8-391b60a6f919 0cd9e79b3589452ea19ae5780a31a97d 8c0694c968144610be1c5897497fdc02 - - -] CMD "mkfs -t vfat -n config-2 /var/lib/nova/instances/0c2600cf-8cce-4969-a738-0b6e2b97c031/disk.config" returned: 0 in 0.050s execute /usr/lib/python2.7/dist-packages/oslo_concurrency/processutils.py:356

so this looks to be a difference in the mkfs implementations...

Roman Podoliaka (rpodolyaka) wrote :

Just checked inside a VM - the check that Cirros performs is indeed case sensitive.

$ blkid -tLABEL=CONFIG-2 -odevice
/dev/vdb
$ blkid -tLABEL=config-2 -odevice
$

http://bazaar.launchpad.net/~cirros-dev/cirros/trunk/view/head:/src/lib/cirros/shlib#L134

Roman Podoliaka (rpodolyaka) wrote :

^ so we need either to figure out why mkfs capitalizes the label, or support both labels in Cirros (or just drop cirros...)
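
The "support both labels" option can be sketched as follows. This is only an illustration in Python, not the Cirros code (which is shell and queries blkid): the function name, the device map and the "rootfs" label are hypothetical, and the only thing taken from this report is that the label may come back as either "config-2" or "CONFIG-2".

```python
from typing import Mapping, Optional

# Label named in the config drive documentation.
EXPECTED_LABEL = "config-2"


def find_config_drive(labels_by_device: Mapping[str, str]) -> Optional[str]:
    """Return the first device whose FS label matches, ignoring case.

    labels_by_device is a hypothetical stand-in for whatever the guest
    uses to list block devices and their labels (e.g. blkid output).
    """
    for device, label in labels_by_device.items():
        if label.lower() == EXPECTED_LABEL:
            return device
    return None


# The same lookup succeeds for both spellings of the label.
print(find_config_drive({"/dev/vda": "rootfs", "/dev/vdb": "CONFIG-2"}))
print(find_config_drive({"/dev/vda": "rootfs", "/dev/vdb": "config-2"}))
```

Comparing the lowercased label keeps the documented "config-2" as the canonical spelling while tolerating tools that store FAT labels in upper case.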

Roman Podoliaka (rpodolyaka) wrote :

And the Ubuntu image works correctly because it uses cloud-init, which actually tries multiple checks:

https://github.com/number5/cloud-init/blob/5c806580a7a4e3917ca582a44126c4c24b411880/cloudinit/sources/DataSourceConfigDrive.py#L248-L293
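
To show why trying several checks is more robust than a single label lookup, here is a rough sketch of that kind of fallback. It is not cloud-init's code: the BlockDevice type, both function names and the has_metadata callback are assumptions made for the example, and the ordering (label match first, then file system type, then a content check) is a guess at a reasonable policy.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class BlockDevice:
    """Hypothetical description of a guest block device."""
    path: str
    label: str
    fstype: str


def candidate_config_drives(devices: List[BlockDevice]) -> List[str]:
    """List likely config drives: label matches first, then by FS type."""
    by_label = [d for d in devices if d.label.lower() == "config-2"]
    by_fstype = [
        d for d in devices
        if d.fstype in ("vfat", "iso9660") and d not in by_label
    ]
    return [d.path for d in by_label + by_fstype]


def pick_config_drive(
    devices: List[BlockDevice],
    has_metadata: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate whose content looks like a config drive.

    has_metadata is a caller-supplied check (assumed here) that would
    mount the device and look for the expected files.
    """
    for path in candidate_config_drives(devices):
        if has_metadata(path):
            return path
    return None
```

With this approach a drive labelled "CONFIG-2", or even one with no usable label, can still be found as long as its file system type and content match.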

Roman Podoliaka (rpodolyaka) wrote :

Apparently, dosfstools won't let you use lowercase labels on CentOS, while it works fine on Ubuntu:

http://paste.openstack.org/show/507193/

All the descriptions of the config drive format mention "config-2", not "CONFIG-2":

http://cloudinit.readthedocs.io/en/latest/topics/datasources.html
https://coreos.com/os/docs/latest/config-drive.html
http://docs.openstack.org/user-guide/cli_config_drive.html

Nothing is said about whether FS labels should be compared case-sensitively or case-insensitively.

It looks like the FAT standard does not specify how labels should be treated, but Windows (at least XP) stores them in upper case:

"For FAT volumes, volume labels are stored as uppercase regardless of whether they contain lowercase letters. NTFS volume labels retain and display the case used when the label was created."

https://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/label.mspx?mfr=true
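
As a simplified illustration of the two behaviours, the sketch below builds the 11-byte, space-padded volume label field of a FAT file system with and without forced upper-casing. The function name and the force_upper flag are made up for the example, and it ignores the character restrictions a real mkfs applies.

```python
FAT_LABEL_SIZE = 11  # a FAT volume label occupies 11 bytes, space padded


def fat_label_field(label: str, force_upper: bool) -> bytes:
    """Return the on-disk label field for the given label (simplified)."""
    raw = label.encode("ascii")
    if len(raw) > FAT_LABEL_SIZE:
        raise ValueError("FAT labels are limited to 11 bytes")
    if force_upper:
        # Behaviour seen on the CentOS computes: the label is upper-cased.
        raw = raw.upper()
    return raw.ljust(FAT_LABEL_SIZE, b" ")


print(fat_label_field("config-2", force_upper=True))   # b'CONFIG-2   '
print(fat_label_field("config-2", force_upper=False))  # b'config-2   '
```

A guest that compares the stored bytes against "config-2" exactly will only match the second form.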

E.g. in Debian this was considered to be a bug and was fixed:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=714971;msg=2

It was even accepted upstream:

https://github.com/dosfstools/dosfstools/commit/465dd8cf8f643bdd39a732e7d7f819a6abdf3d83

and made it into the 3.0.22 release. We must be using an older version.

tags: added: area-linux

So does it mean that the root of the issue is in the CirrOS image only?

tags: added: swarm-blocker
tags: added: blocker-for-qa
summary: - file injection is broken on centos computes with Cirros images
+ [regression] file injection is broken on centos computes with Cirros
+ images
Roman Podoliaka (rpodolyaka) wrote :

MOS Linux team, could you please rebuild an image with this patch ^ applied?

Roman Podoliaka (rpodolyaka) wrote :

Sorry, the patch above contains a typo, please use this one instead - http://xsnippet.org/361760/raw/

Roman Podoliaka (rpodolyaka) wrote :

Timur,

>>> so does it mean that the root of the issue in CirrOS image only?

Well, it's both:

- mkfs.vfat / dosfslabel tools on RHEL / CentOS 7 automatically capitalize the label we pass, so instead of "config-2" we get "CONFIG-2", while on Debian / Ubuntu we get a string in lower case as expected

- the config drive spec doc states that FS label should be in lower case

- the init system of Cirros expects the config drive FS label to be in lower case and fails if it's not

- cloud-init (used in Ubuntu and other cloud images) is "smarter" and tries different options when looking for a config drive (e.g. a file system type and specific content stored), that's why Ubuntu / Fedora / other cloud images work correctly

Putting a workaround into Cirros seems to be the easiest way to proceed here, as we don't really want to change RHEL / CentOS packages.

At the same time, it means that some cloud images will not work on such environments because of the issue with cloud-init. Please do not forget to add the information about the issue to the release notes.

tags: added: release-notes

Does it affect deployments with Ubuntu compute nodes?

Fix proposed to branch: master
Change author: Albert Syriy <email address hidden>
Review: https://review.fuel-infra.org/21573

Reviewed: https://review.fuel-infra.org/21573
Submitter: Pkgs Jenkins <email address hidden>
Branch: master

Commit: e696511151afe50007562efb9d8d0d9f242216b1
Author: Albert Syriy <email address hidden>
Date: Thu Jun 2 15:05:45 2016

Fix for file injection is broken on centos computes with Cirros images LP #1587960

Closes-bug: #1587960

Change-Id: I0091f8644ef15f5b14fa2dbab4c866ab1507acb1

Fix proposed to branch: 9.0
Change author: Albert Syriy <email address hidden>
Review: https://review.fuel-infra.org/21615

Reviewed: https://review.fuel-infra.org/21615
Submitter: Pkgs Jenkins <email address hidden>
Branch: 9.0

Commit: 1bd7c31d6d70034b84484e6fe35faf810c3fb0c3
Author: Albert Syriy <email address hidden>
Date: Fri Jun 3 09:14:40 2016

Fix for file injection is broken on centos computes with Cirros images LP #1587960

Closes-bug: #1587960

Change-Id: I0091f8644ef15f5b14fa2dbab4c866ab1507acb1
(cherry picked from commit e696511151afe50007562efb9d8d0d9f242216b1)

Related fix proposed to branch: master
Change author: Evgeny Konstantinov <email address hidden>
Review: https://review.fuel-infra.org/22326

Reviewed: https://review.fuel-infra.org/22326
Submitter: Evgeny Konstantinov <email address hidden>
Branch: master

Commit: dcd0ac5631e06625c76d2d62c17bc1fef67a7072
Author: Evgeny Konstantinov <email address hidden>
Date: Wed Jun 22 10:51:45 2016

Add resolved issues to relnotes 9.0

Change-Id: I87df13ac06921547312dd2165097d080528ec864
Related-Bug: #1587960
Related-Bug: #1544446
Related-Bug: #1478310
Related-Bug: #1543050
Related-Bug: #1495699

tags: added: release-notes-done
removed: release-notes