cloud-init selects sysconfig netconfig renderer if network-manager is installed on Ubuntu
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
MAAS | Invalid | Undecided | Unassigned | |
Provider for Plainbox - Canonical Certification Server | Fix Released | Critical | Rod Smith | |
cloud-init | Fix Released | High | Unassigned | |
cloud-init (Ubuntu) | Fix Released | Undecided | Unassigned | |
Bug Description
Configuration:
UEFI/BIOS: TEE136S
IMM/BMC: CDI333V
CPU: Intel(R) Xeon(R) Platinum 8253 CPU @ 2.20GHz
Memory: 16G DIMM * 12
Raid card: ThinkSystem RAID 530-8i
NIC Card: Intel X722 LOM
Reproduce Steps:
1. Configure "network" as the first boot device
2. Power on the machine
3. Visit TC through a web browser and commission the machine
4. When commissioning completes, deploy Ubuntu 18.04 LTS on the SUT
5. The error appears during OS deployment.
Deploy errors look like the following (see the attachment for details):
cloud-init[xxxx] Date_and_time - handlers.
cloud-init[xxxx] Date_and_time - handlers.
Related branches
- Server Team CI bot: Approve (continuous-integration) on 2019-04-22
- Dan Watkins: Approve on 2019-04-11
- Diff: 70 lines (+39/-2), 2 files modified
  cloudinit/net/sysconfig.py (+4/-2)
  tests/unittests/test_net.py (+35/-0)
duanbenliang (duanbl1) wrote : | #1 |
duanbenliang (duanbl1) wrote : | #2 |
duanbenliang (duanbl1) wrote : | #3 |
duanbenliang (duanbl1) wrote : | #4 |
Blake Rouse (blake-rouse) wrote : | #5 |
Jeff Lane (bladernr) wrote : | #6 |
FYI, I've added a cert task for this. I don't know for sure this is curtin, it looks like something may have changed in one of the hundreds of dependency packages that checkbox pulls in causing curtin to fail.
Rod is investigating it on our side.
Changed in plainbox-provider-certification-server: | |
importance: | Undecided → Critical |
assignee: | nobody → Rod Smith (rodsmith) |
status: | New → Confirmed |
Jeff Lane (bladernr) wrote : | #7 |
We have a bug for this as well, 1189973, but duping to that kills the MAAS (possibly curtin) task, so I un-duped it for now.
Changed in maas: | |
status: | New → Incomplete |
Rod Smith (rodsmith) wrote : | #8 |
We've traced the problem to the network-manager package, which gets pulled in by a dependency in canonical-
Rod Smith (rodsmith) wrote : | #9 |
Rod Smith (rodsmith) wrote : | #10 |
Ryan Harper (raharper) wrote : | #11 |
Neither curtin nor cloud-init will *skip* generating networking. However, if some additional netplan config exists in the target system that cloud-init is not aware of (perhaps provided by the NetworkManager package or something else), then there may be a conflict in the configuration that prevents netplan apply from bringing up the network.
If possible, getting the systemd journal and what's in /etc/netplan and /run/systemd/
Rod Smith (rodsmith) wrote : | #12 |
Rod Smith (rodsmith) wrote : | #13 |
I've attached the /var/log/
Ryan Harper (raharper) wrote : | #14 |
2019-03-14 17:32:34,606 - __init__.py[DEBUG]: Selected renderer 'sysconfig' from priority list: None
This is a cloud-init bug. The sysconfig renderer has NetworkManager support, which triggered cloud-init to render sysconfig instead of netplan.
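To make the failure mode concrete, here is a minimal sketch of "first available renderer wins" selection. It is a simplified model, not cloud-init's actual code: `select_renderer`, `ubuntu_probes`, and the `DEFAULT_PRIORITY` order are assumptions made for illustration. The only behavior taken from this report is that sysconfig reported itself available once network-manager was installed, and that the priority list was `None`.

```python
from typing import Callable, Dict, List, Optional

# Assumed default probe order for this sketch; "sysconfig" is tried before "netplan".
DEFAULT_PRIORITY: List[str] = ["eni", "sysconfig", "netplan"]


def select_renderer(
    probes: Dict[str, Callable[[], bool]],
    priority: Optional[List[str]] = None,
) -> str:
    """Return the first renderer, in priority order, whose probe reports available."""
    order = DEFAULT_PRIORITY if priority is None else priority
    for name in order:
        probe = probes.get(name)
        if probe is not None and probe():
            return name
    raise LookupError("no available renderer in %s" % order)


def ubuntu_probes(network_manager_installed: bool) -> Dict[str, Callable[[], bool]]:
    """Hypothetical availability probes for an Ubuntu 18.04 target."""
    return {
        "eni": lambda: False,  # assumption: ifupdown is not installed
        "sysconfig": lambda: network_manager_installed,  # behavior reported in this bug
        "netplan": lambda: True,
    }


if __name__ == "__main__":
    print(select_renderer(ubuntu_probes(False)))
    print(select_renderer(ubuntu_probes(True)))
    print(select_renderer(ubuntu_probes(True), priority=["netplan", "eni", "sysconfig"]))
```

In this model, an explicit priority list that puts netplan first sidesteps the problem even when the sysconfig probe passes. A fix on the renderer side would instead tighten the sysconfig probe so it does not pass on Ubuntu.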
Changed in cloud-init: | |
importance: | Undecided → High |
status: | New → Confirmed |
Changed in maas: | |
status: | Incomplete → Invalid |
summary: |
- An error occurs when MAAS Deploy 18.04 on ThinkSystem SR590
+ cloud-init selects sysconfig netconfig renderer if network-manager is installed on Ubuntu
Ryan Harper (raharper) wrote : | #15 |
You can work around this issue by including the following curtin config when deploying.
write_files:
  policy:
    path: /etc/cloud/
    content: |
      #cloud-config
      system_info:
        network:
Changed in cloud-init: | |
status: | Confirmed → In Progress |
Rod Smith (rodsmith) wrote : | #16 |
Thanks for the quick fix, Ryan! I've confirmed that your curtin config workaround in comment #15 works. Do you have an estimate for how long it'll be before a fix goes live? (I ask so we can plan whether we should push your workaround through one of the certification packages.)
Ryan Harper (raharper) wrote : Re: [Bug 1819994] Re: cloud-init selects sysconfig netconfig renderer if network-manager is installed on Ubuntu | #17 |
On Fri, Mar 15, 2019 at 2:50 PM Rod Smith <email address hidden> wrote:
> Thanks for the quick fix, Ryan! I've confirmed that your curtin config
> workaround in comment #15 works. Do you have an estimate for how long
> it'll be before a fix goes live? (I ask so we can plan whether we should
> push your workaround through one of the certification packages.)
>
Depends on where you need it. It can likely land upstream either today
or on Monday; and would be available via the cloud-init-dev daily PPA;
however, an SRU will take at least another week after next; We're almost
done with an existing cloud-init SRU; so we'd likely not start another SRU
until the current one is in -updates.
Amy Gou (goujm1) wrote : | #18 |
hi Jeff and all,
After upgrading online, MAAS 0.4.0 is shown in the version table, but the log still shows 2.4.2. At the same time, the deploy fails again. Please double-check the log and let me know if you have any comments.
Best Regards,
Amy
Jeff Lane (bladernr) wrote : | #19 |
Hi Amy,
first, which machine failed? I see a bunch of machines in the /var/log/
Secondly, the version you posted in the screen shot looks correct, can you show me the output of:
ls -l /etc/maas/
Jeff Lane (bladernr) wrote : | #20 |
Amy: Also, could you send me a tarball containing /etc/maas/preseeds ??
duanbenliang (duanbl1) wrote : | #21 |
duanbenliang (duanbl1) wrote : | #22 |
Amy Gou (goujm1) wrote : | #23 |
hi Jeff,
It is the SR590 Cascadelake deploy that failed with the new MAAS 0.4.0. The attachment above was collected from the environment with the SR590 Cascadelake.
The same issue also occurs on the SR650 Cascadelake.
best Regards,
Amy
Rod Smith (rodsmith) wrote : | #24 |
Amy, I think you're confusing the MAAS version (which is 2.4.2 on one of our installations) and the maas-cert-server package version (the latest of which is 0.4.0). The maas-cert-server 0.3.9 package includes a workaround (but NOT A FIX) for this bug, and 0.4.0 provides some unrelated improvements, so the installation SHOULD succeed after you've upgraded maas-cert-server to version 0.3.9 or 0.4.0. If it's still failing, then it could be you'll need to apply the workaround described by Ryan Harper in comment #15, which is different from the workaround in maas-cert-server 0.3.9 and 0.4.0. (Post back if you need help applying Ryan's workaround.) It could also be that you're looking at a completely different problem.
Amy Gou (goujm1) wrote : | #25 |
hi Rod,
Thanks for your update. We will use the workaround to run the current certification test on Purley Cascadelake.
As for the deploy failure on MAAS 0.4.0, do you advise that we raise a separate defect to track it?
Best Regards,
Amy
This bug is fixed with commit 5de83fc5 to cloud-init on branch master.
To view that commit see the following URL:
https:/
Changed in cloud-init: | |
status: | In Progress → Fix Committed |
This bug is believed to be fixed in cloud-init in version 19.1. If this is still a problem for you, please make a comment and set the state back to New.
Thank you.
Changed in cloud-init: | |
status: | Fix Committed → Fix Released |
Amy Gou (goujm1) wrote : | #28 |
Sorry for the late reply. The issue does not occur with the current cloud-init v. 18.5-45-
Best Regards,
Amy
Jeff Lane (bladernr) wrote : | #29 |
Hi Amy, it's likely that you're still using our patched tooling that includes a workaround. cloud-init 18.5 should not work.
Jeff Lane (bladernr) wrote : | #30 |
Just a heads up, the fix is now in -updates, I've tested this locally on a couple deployments and it seems to resolve the issue we had before. Asking my team to verify on a couple more deployments for due diligence.
Rod Smith (rodsmith) wrote : | #31 |
I've tested this on three nodes on two MAAS servers (my own home MAAS server and maastiff, our MAAS server in the certification lab), using both 18.04 and 19.04. It looks good to me.
Changed in plainbox-provider-certification-server: | |
status: | Confirmed → Fix Committed |
Amy Gou (goujm1) wrote : | #32 |
Thanks for your kind update. I will double-check with the latest one.
Dan Watkins (oddbloke) wrote : | #33 |
Hi Amy et al,
I'm going to mark this Fix Released, as 19.1 has made its way into Ubuntu. Please let us know if you don't think this is fixed!
Dan
Changed in cloud-init (Ubuntu): | |
status: | New → Fix Released |
Changed in plainbox-provider-certification-server: | |
status: | Fix Committed → Fix Released |
Looks like it might be an issue either in curtin or MAAS based on the network configuration.
Once the machine fails to deploy can you provide the output of:
maas {profile} machine get-curtin-config {system_id}
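If the curtin config needs to be collected from several failed machines, the command above could be wrapped in a small script. The following is only a sketch: the helper names are made up for illustration, and it assumes the `maas` CLI is installed and already logged in under the given profile.

```python
import subprocess
from typing import List


def curtin_config_command(profile: str, system_id: str) -> List[str]:
    """Build the argv for `maas {profile} machine get-curtin-config {system_id}`."""
    return ["maas", profile, "machine", "get-curtin-config", system_id]


def fetch_curtin_config(profile: str, system_id: str) -> str:
    """Run the command and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(
        curtin_config_command(profile, system_id),
        check=True,
        capture_output=True,  # requires Python 3.7+
        text=True,
    )
    return result.stdout
```

For example, `fetch_curtin_config("admin", "abc123")` would return the rendered curtin config as text, where `admin` and `abc123` are placeholder values for the profile and system ID.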