Cobbler max length of kickstart/preseed 110kb ± 24hdd

Bug #1382364 reported by Andrey Kirilochkin
This bug affects 2 people
Affects: Fuel for OpenStack
Status: Invalid
Importance: Medium
Assigned to: Vladimir Kozhukalov

Bug Description

Hi guys,
I found an interesting bug involving a large kickstart/preseed file.

I want to deploy MOS with 10 Ceph nodes.

1. Success story: 5 of them have only 5 disks each, and each disk has an additional partition for the journal. Provisioning of these nodes works fine.
2. One node (my mistake) has 25 disks with the journal in a file; I simply forgot to create the additional partition. Provisioning of this node works fine too.
3. Unsuccessful: the other 4 nodes have 25 disks each, with an additional partition for the ceph-journal.

When I started searching for a solution, I found that:

1. In the first case the preseed file was about 80 KB.
2. In the second case: 110 KB.
3. In the third case: 155 KB.
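
As a rough illustration of how these sizes scale, the three figures can be fitted to a simple linear model. This is only a sketch: the model (a fixed base plus a constant cost per disk and per journal partition) is an assumption, and the input sizes are the approximate values listed above, not exact measurements.

```python
# Assumed model: size_kb = base + per_disk * disks + per_journal * journal_partitions
# Inputs are the approximate preseed sizes reported in the list above.
s1 = 80.0    # case 1: 5 disks, 5 journal partitions
s2 = 110.0   # case 2: 25 disks, no journal partitions (journal in a file)
s3 = 155.0   # case 3: 25 disks, 25 journal partitions

# Cases 2 and 3 differ only by the 25 journal partitions.
per_journal = (s3 - s2) / 25

# Subtract the journal share from case 1, then compare 5 disks with 25 disks.
per_disk = (s2 - (s1 - per_journal * 5)) / (25 - 5)

# Whatever is left in case 2 is the part that does not depend on disk count.
base = s2 - per_disk * 25


def estimate_kb(disks: int, journals: int) -> float:
    """Estimated preseed size in KB under the assumed linear model."""
    return base + per_disk * disks + per_journal * journals


if __name__ == "__main__":
    print(f"base={base:.2f} KB, per disk={per_disk:.2f} KB, "
          f"per journal partition={per_journal:.2f} KB")
    print(f"21 disks with journals: {estimate_kb(21, 21):.1f} KB")
```

Under these assumptions each disk adds a little under 2 KB and each journal partition slightly less, so a node with 21 disks and 21 journal partitions would land somewhere between the working 110 KB case and the failing 155 KB case.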

It seems we are hitting a Linux limit on the length of a single string.

After I removed 4 drives from the failing nodes, provisioning worked fine.
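
One way to check the single-string hypothesis would be to compare the total size of a generated preseed with the length of its longest line. The sketch below is a minimal helper for that; the function name and the sample text are hypothetical, and nothing in this report confirms which limit (if any) is actually being hit.

```python
def line_stats(text: str) -> tuple[int, int]:
    """Return (total size in bytes, length in bytes of the longest line)."""
    data = text.encode("utf-8")
    # Split on newlines so that a single very long command shows up clearly.
    longest = max((len(line) for line in data.split(b"\n")), default=0)
    return len(data), longest


if __name__ == "__main__":
    # Hypothetical sample; in practice, read the rendered preseed for a node.
    sample = "d-i first/setting string a\nd-i second/setting string " + "x" * 50
    total, longest = line_stats(sample)
    print(f"total={total} bytes, longest line={longest} bytes")
```

If the longest line accounts for most of the file, the partitioning commands are probably being emitted as one string, which would fit the observation that removing drives makes provisioning work again.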

So guys, it seems we should create something like a "helper.sh" that is downloaded during provisioning. This helper should take short command-line parameters, like this:

/tmp/part_create.sh 1:10GB:xfs:/var/lib/ceph/osd-1/journal 2:100GB:xfs:/var/lib/ceph/osd-1

This is my idea of how to fix it.
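
To make the proposed argument format concrete, here is a minimal sketch of how such a helper could parse its parameters. The field meanings (number, size, filesystem, mount point) are assumptions inferred from the example command line, and `PartSpec` and `parse_spec` are hypothetical names; no such helper exists in Fuel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartSpec:
    index: int    # assumed: partition (or disk) number
    size: str     # size as written on the command line, e.g. "10GB"
    fstype: str   # filesystem type, e.g. "xfs"
    mount: str    # mount point


def parse_spec(arg: str) -> PartSpec:
    """Parse one 'index:size:fstype:mount' argument.

    Raises ValueError if the argument has fewer than four fields
    or the index is not an integer.
    """
    index, size, fstype, mount = arg.split(":", 3)
    return PartSpec(int(index), size, fstype, mount)


if __name__ == "__main__":
    args = [
        "1:10GB:xfs:/var/lib/ceph/osd-1/journal",
        "2:100GB:xfs:/var/lib/ceph/osd-1",
    ]
    for spec in map(parse_spec, args):
        print(spec)
```

The point of the format is that each partition costs only a few dozen bytes in the preseed, while the actual partitioning logic lives in the downloaded script.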

Mike Scherbakov (mihgen)
Changed in fuel:
milestone: none → 6.0
tags: added: provision
description: updated
Changed in fuel:
importance: Undecided → Medium
Andrey Kirilochkin (andreika-mail) wrote :

This one is bigger than possible.

Changed in fuel:
assignee: nobody → Fuel Library Team (fuel-library)
status: New → Confirmed
Changed in fuel:
status: Confirmed → Triaged
Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Matthew Mosesohn (raytrac3r)
Matthew Mosesohn (raytrac3r) wrote :

Couldn't reproduce in CentOS (you wrote kickstart). I have now looked at the log, and it's actually Ubuntu. Trying to reproduce in VirtualBox with 15 disks.

Matthew Mosesohn (raytrac3r) wrote :

I'm having issues with ceph deployment after install, but the actual partitioning looks ok on the Cobbler side when deploying Ubuntu. What kind of errors are you seeing? What made you suspect Cobbler was to blame?

Changed in fuel:
status: Triaged → New
status: New → Incomplete
Changed in fuel:
milestone: 6.0 → 6.1
Changed in fuel:
status: Incomplete → Invalid
Oleksiy Molchanov (omolchanov) wrote :

This bug was incomplete for more than 4 weeks. We cannot investigate it further so we are setting the status to Invalid. If you think it is not correct, please feel free to provide requested information and reopen the bug, and we will look into it further.

Matthew Mosesohn (raytrac3r) wrote :

I was able to reproduce it in a VirtualBox environment with 25 1GB disks. Installation sometimes completed, but it took a very long time. I wasn't able to identify which specific delays caused the problem. It's definitely a problem, but image-based provisioning is coming and is probably our best way to work around this.

Andrey Kirilochkin (andreika-mail) wrote :

Sorry for the inactivity.
When a ceph-osd node has more than 23 partitions, the Ubuntu installation fails.

How to reproduce:
1. Configure 30+ partitions for ceph-osd and 30+ for the journal on the same disk (just to be sure).
2. Start the installation and wait until it finishes.
3. Ubuntu installs the boot sector and reboots the machine.

Expected result:
1. Installation is finished.

Actual result:
1. The node reboots, but after the reboot the installation starts again because the preseed was corrupted in the post-install section.

Dmitriy Novakovskiy (dnovakovskiy) wrote :

I think a workaround is required for this; relying on image-based provisioning is not enough (at least until it is declared fully stable and replaces the default Cobbler-based mechanism). Around 60% of the installations I'm aware of that are happening, or about to happen, with 6.0 in the near future use Ceph nodes with 20+ disks.

Either the ugly workaround that Vladimir has described, or something like "run a shell script to partition OSD disks and add them to OSD at the very end of deployment".

Vladimir Kozhukalov (kozhukalov) wrote :

The correct way to fix this issue is to use image-based provisioning (IBP). It is not quite stable at the moment, but we know most of its bugs, and some of them are already fixed and merged. Any other potential workaround, such as helper.sh, would be even less stable and would need more resources to implement.

By the way, our current IBP implementation is also Cobbler-based, and it is quite easy to enable. So I think we should cover all many-disk deployments with IBP rather than waste resources inventing ugly, buggy schemes.

Vladimir Kozhukalov (kozhukalov) wrote :
Changed in fuel:
assignee: Matthew Mosesohn (raytrac3r) → Vladimir Kozhukalov (kozhukalov)