CentOS node deployment fails

Bug #1502839 reported by Sagar Shukla
This bug affects 2 people
Affects: MAAS
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Milestone: 1.9.0

Bug Description

I am running MAAS 1.8.0 and was able to build a CentOS image using the maas-image-builder script. After building the image, I was able to push it into MAAS using the MAAS create APIs. Now, when I try to deploy a node using this CentOS image, deployment fails with the following error message in the regiond.log file:

2015-10-04 10:43:30 [maasserver] ERROR: Unable to identify boot image for (centos/amd64/generic/centos7/commissioning): cluster 'maas' does not have matching boot image.

The node also fails to boot, with the following error message at the boot prompt: unable to find metadata boot-kernel

This is what my centos image folder looks like:

# ls /var/lib/maas/boot-resources/current/centos/amd64/generic/centos7/generated/
root-tgz

It seems that either the image upload or the image deployment is broken. I was unable to figure out which one from the limited error logging.

Sagar Shukla (sa-shukla)
summary: - CentOS node deployment mails
+ CentOS node deployment fails
tags: added: centos
tags: added: maas
Gavin Panella (allenap)
tags: added: qap
tags: added: qa-missing
removed: qap
Sagar Shukla (sa-shukla) wrote :

Screenshot of the error observed during deployment. Judging by the error message, it looks like a network (subnet) address is being used instead of a host IP address when trying to fetch the custom image. I am not sure where this value is fetched from.

Sagar Shukla (sa-shukla) wrote :

Any quick workaround would really help.

Andres Rodriguez (andreserl) wrote :

Hi Sagar,

Is the machine 'ready'? Is your cluster connected? Have you downloaded any Ubuntu images?

In order to deploy CentOS you need to have Ubuntu images imported. It also seems that the error you are seeing occurs at commissioning time, which is supposed to happen before you can actually deploy. Is that the case?

Sagar Shukla (sa-shukla) wrote :

@Andres - Yes. I have downloaded Ubuntu images, and 70 nodes are already up and running with the Ubuntu 14.04 image. Ubuntu 12.04 also worked before the upgrade to MAAS 1.8.0.

I was able to complete the commissioning process without any issues. It looks like the commissioning process uses an Ubuntu image. The node went through the Ready and Allocated states without any issues.

I can try commissioning again, but it is not giving me any errors, so I am not sure what to check for.

Sagar Shukla (sa-shukla) wrote :

I tried deploying an Ubuntu image and it worked without any issues.

Sagar Shukla (sa-shukla) wrote :

Is it now confirmed that this is an issue?

Andres Rodriguez (andreserl) wrote :

Hi Sagar,

Sorry for the late reply. This might be due to the image that you have created. MAAS now supports importing and deploying CentOS automatically.

If you go to the Settings page and go to 'Sync URL' section, you can change the syncurl from 'releases' to 'daily'. If you do this and go to the 'Images' tab again, you should be able to see the CentOS image available for download. Can you please try that?

Sagar Shukla (sa-shukla) wrote :

Hi Andres,

I tried the following procedure:
- Removed my custom CentOS 7 image.
- Upgraded MAAS from 1.8.0 to 1.8.2.
- Configured the settings and downloaded the new CentOS images from the 'daily' section.
- Deployed a node using this new image.

I still see the same errors while deploying a CentOS node.

I even tried deploying CentOS 6.6 using your approach, but the error remains the same when deploying a new node.

Following is my curtin_userdata_centos file, which looks a little weird:
# cat curtin_userdata_centos
#cloud-config
debconf_selections:
 maas: |
  {{for line in str(curtin_preseed).splitlines()}}
  {{line}}
  {{endfor}}

late_commands:
  maas: [wget, '--no-proxy', '{{node_disable_pxe_url}}', '--post-data', '{{node_disable_pxe_data}}', '-O', '/dev/null']

power_state:
  mode: reboot

Any more thoughts? Does the deployment work in your environment?

Sagar Shukla (sa-shukla) wrote :

Although the OS installation fails, MAAS still tries to boot the node, and the boot fails with the following error message:

Could not find kernel image: centos/amd64/generic/centos66/no-such-image/boot-kernel

In the centos folder, after importing the image, only the root-tgz file exists. Files like boot-kernel, root-image, or boot-initrd do not exist.

root@server:/var/lib/maas/boot-resources/current/centos/amd64/generic/centos66/release# ls
root-tgz
root@server:/var/lib/maas/boot-resources/current# ls ubuntu/amd64/generic/trusty/daily/
boot-initrd boot-kernel di-initrd di-kernel root-image root-tgz

Is this the cause of the issue?
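
The comparison between the two directories can also be scripted. The following is only a minimal sketch: the helper name `missing_boot_files` and the `EXPECTED_FILES` list are made up for illustration, the file names are copied from the Ubuntu listing above, and whether MAAS actually needs all of them for a CentOS image is an assumption, not something confirmed in this report.

```python
import os

# File names copied from the Ubuntu directory listing in this report.
# Assumption: a deployable image directory is expected to contain these.
EXPECTED_FILES = ["boot-initrd", "boot-kernel", "root-image", "root-tgz"]


def missing_boot_files(image_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from image_dir.

    A directory that does not exist is treated as empty, so every
    expected file is reported as missing.
    """
    present = set(os.listdir(image_dir)) if os.path.isdir(image_dir) else set()
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    base = "/var/lib/maas/boot-resources/current"
    # The two directories compared in the comment above.
    for rel in ("centos/amd64/generic/centos66/release",
                "ubuntu/amd64/generic/trusty/daily"):
        path = os.path.join(base, rel)
        print(path, "missing:", missing_boot_files(path))
```

Run on the MAAS server, this prints which of the listed files each image directory lacks, which makes the difference between the CentOS and Ubuntu resources easy to see.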

Sagar Shukla (sa-shukla) wrote :

Any thoughts on why the service is using the network address instead of an IP address?

Sagar Shukla (sa-shukla)
no longer affects: maas-images-yarmouth (Ubuntu)
Changed in maas:
assignee: nobody → Sagar Shukla (sa-shukla)
assignee: Sagar Shukla (sa-shukla) → nobody
assignee: nobody → Sagar Shukla (sa-shukla)
Sagar Shukla (sa-shukla)
Changed in maas:
status: New → In Progress
Sagar Shukla (sa-shukla) wrote :

Okay, no worries. I found that the issue is with the function pick_cluster_controller_address() in the /usr/lib/python2.7/dist-packages/maasserver/preseed.py file. I have not yet been able to identify the exact fix, but I found a workaround in the interim.
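
For anyone trying to confirm the same symptom, a network address can be told apart from a host address with a few lines of Python. This is a sketch only: it is not the MAAS code and not the workaround, it uses the Python 3 `ipaddress` module (MAAS 1.8 itself runs on Python 2.7), the helper names are hypothetical, and the addresses are documentation examples rather than values from this deployment.

```python
import ipaddress


def is_network_address(address, cidr):
    """Return True if address is the network (subnet) address of cidr."""
    network = ipaddress.ip_network(cidr, strict=False)
    return ipaddress.ip_address(address) == network.network_address


def looks_like_host(address, cidr):
    """Return True if address lies inside cidr and is neither the
    network address nor the broadcast address of that subnet."""
    network = ipaddress.ip_network(cidr, strict=False)
    ip = ipaddress.ip_address(address)
    if ip not in network:
        return False
    return ip not in (network.network_address, network.broadcast_address)


if __name__ == "__main__":
    # Example values only (RFC 5737 documentation range).
    for candidate in ("192.0.2.0", "192.0.2.10"):
        print(candidate,
              "network address:", is_network_address(candidate, "192.0.2.0/24"),
              "usable host:", looks_like_host(candidate, "192.0.2.0/24"))
```

If the address that appears in the failing image URL satisfies `is_network_address` for the cluster interface's subnet, that matches the behaviour described in this report.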

Brian Sheets (brians-s) wrote :

I am having a similar issue with a Windows image and booting. Should I add the info here or submit a new bug report?

Blake Rouse (blake-rouse) wrote :

The pick_cluster_controller_address() function was removed in 1.9, and the logic for getting the cluster address is much better. Please test 1.9 to see if this solves your issue. I am going to mark this Fix Committed, as 1.9 changes this logic.

@Brian

Please test with 1.9 and if that issue still occurs please file a separate bug.

Changed in maas:
assignee: Sagar Shukla (sa-shukla) → nobody
status: In Progress → Fix Committed
milestone: none → 1.9.0
importance: Undecided → Medium
Brian Sheets (brians-s) wrote : Re: [Bug 1502839] Re: CentOS node deployment fails

Ok,

Will do.

Brian


Changed in maas:
status: Fix Committed → Fix Released