Fresh image import of 3 archs displaying multiple rows for armhf and amd64

Bug #1388373 reported by Christian Reis
This bug affects 1 person
Affects    Status          Importance    Assigned to    Milestone
MAAS       Fix Released    Medium        Blake Rouse

Bug Description

I just installed rc1 and after configuring my cluster went to upload images. I selected 3 archs: amd64, armhf and ppc64el.

The progress display shows multiple rows of amd64 and armhf, but not of ppc64el. I've seen this happen before.

I'm attaching a screenshot; this isn't critical but is an obvious bug that we could fix for release if it's trivial.

Revision history for this message
Blake Rouse (blake-rouse) wrote :

Can you please attach the output of:

maas admin boot-resources read

maas admin boot-sources read

Changed in maas:
status: New → Incomplete
Revision history for this message
Christian Reis (kiko) wrote :

async@malzinho:~$ sudo maas async boot-resources read
[
    {
        "name": "ubuntu/trusty",
        "kflavor": "generic",
        "architecture": "amd64/hwe-t",
        "subarches": "generic,hwe-p,hwe-q,hwe-r,hwe-s,hwe-t",
        "type": "Synced",
        "id": 1,
        "resource_uri": "/MAAS/api/1.0/boot-resources/1/"
    }
]

Revision history for this message
Christian Reis (kiko) wrote :

async@malzinho:~$ sudo maas async boot-sources read
[
    {
        "url": "http://maas.ubuntu.com/images/ephemeral-v2/releases/",
        "keyring_data": "",
        "resource_uri": "/MAAS/api/1.0/boot-sources/1/",
        "keyring_filename": "/usr/share/keyrings/ubuntu-cloudimage-keyring.gpg",
        "id": 1
    }
]

Revision history for this message
Christian Reis (kiko) wrote :

Okay, looking at the logs there are two obvious problems.

1. the cluster and region are located on the same machine, but are communicating over a DHCP-assigned IP address (allocated to wlan0) instead of localhost. I'm not sure how or why this is automatically chosen (all I did was dpkg -i *.deb from the MAAS build-area), but it will always break in a setup like this, where I have a machine with an upstream link over wlan0 and an eth0 to the MAAS-managed network.

  In my case, it seems the first IP assigned to wlan0 was 192.168.99.190, but it has now changed to 192.168.99.183.

2. this morning while I was asleep something triggered a GPG error we've seen before:

  subprocess.CalledProcessError: Command `gpg --batch --verify --keyring=/tmp/maas-I1SEJmkeyrings/maas.ubuntu.com-images-ephemeral-v2-releases.gpg -` returned non-zero exit status 2: (u'', u"gpg: fatal: can't create directory `/home/maas/.gnupg': No such file or directory\nsecmem usage: 0/0 bytes in 0/0 blocks of pool 0/32768\n")

HOWEVER, because the weird display this bug report is about happened right after I clicked on "import images", I'm not sure whether region-to-cluster communication being broken actually matters here; I am almost sure that the region and cluster were connected when I triggered the import.

Revision history for this message
Christian Reis (kiko) wrote :

I had left the import running overnight. This is what it looks like when I load the Images page now.

Changed in maas:
status: Incomplete → New
Revision history for this message
Julian Edwards (julian-edwards) wrote : Re: [Bug 1388373] Re: Fresh image import of 3 archs displaying multiple rows for armhf and amd64

On Sunday 02 November 2014 11:27:06 you wrote:
> async@malzinho:~$ sudo maas async boot-resources read

FWIW you don't need sudo to run the "maas" command, it's an API client.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

On Sunday 02 November 2014 11:40:17 you wrote:
> In my case, it seems the first IP assigned to wlan0 was
> 192.168.99.190, but it has now changed to 192.168.99.183.

It's really not a good idea to run a server on a dynamic IP address...

Revision history for this message
Julian Edwards (julian-edwards) wrote :

On Sunday 02 November 2014 11:40:17 you wrote:
> subprocess.CalledProcessError: Command `gpg --batch --verify
> --keyring=/tmp/maas-I1SEJmkeyrings/maas.ubuntu.com-images-
> ephemeral-v2-releases.gpg -` returned non-zero exit status 2: (u'',
> u"gpg: fatal: can't create directory `/home/maas/.gnupg': No such file
> or directory\nsecmem usage: 0/0 bytes in 0/0 blocks of pool 0/32768\n")

This is bug 1376024 (in 1.7.1), especially see comment 6. The code is not
setting GNUPGHOME where it should.
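
For illustration only, here is a minimal sketch of what setting the GnuPG home directory for that call could look like. GnuPG reads its home directory from the GNUPGHOME environment variable; the function names below are hypothetical and this is not the MAAS code or the eventual fix, just the general technique of pointing gpg at a private temporary directory instead of the missing /home/maas/.gnupg.

```python
import os
import subprocess
import tempfile


def gpg_environment(home):
    """Return a copy of the current environment with GNUPGHOME set to `home`."""
    env = dict(os.environ)
    env["GNUPGHOME"] = home
    return env


def verify_command(keyring):
    """Build the same gpg invocation that appears in the traceback above."""
    return ["gpg", "--batch", "--verify", "--keyring=%s" % keyring, "-"]


def verify_signed(data, keyring):
    """Verify `data` (bytes) against `keyring`, using a throwaway GnuPG home.

    Hypothetical helper: the temporary directory is created with mode 0700
    and removed afterwards, so gpg never needs to create ~/.gnupg.
    """
    with tempfile.TemporaryDirectory(prefix="gnupg-home-") as home:
        subprocess.run(
            verify_command(keyring),
            input=data,
            env=gpg_environment(home),
            capture_output=True,
            check=True,  # raises CalledProcessError on a non-zero exit status
        )
```

With something like this, the verification no longer depends on the home directory of whichever user the import runs as.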

Revision history for this message
Julian Edwards (julian-edwards) wrote :

On Sunday 02 November 2014 11:41:09 you wrote:
> I had left the import running overnight. This is what it looks like when
> I load the Images page now.

These are the hardware enablement images I guess. I would actually like to see
these all listed individually. Maybe not on a separate line, but at least I'd
like to know which ones it downloaded. (Especially since if you use sstream-
mirror to create a local repo, you have to specify subarch)
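
As a rough sketch of that idea, the "subarches" field in the boot-resources output earlier in this report can be expanded so each subarch is listed individually. The sample data is copied from that output; the helper name list_subarches is hypothetical and this is not how the Images page is implemented.

```python
import json

# Copied from the `boot-resources read` output earlier in this report.
SAMPLE = """[
    {
        "name": "ubuntu/trusty",
        "kflavor": "generic",
        "architecture": "amd64/hwe-t",
        "subarches": "generic,hwe-p,hwe-q,hwe-r,hwe-s,hwe-t",
        "type": "Synced",
        "id": 1,
        "resource_uri": "/MAAS/api/1.0/boot-resources/1/"
    }
]"""


def list_subarches(raw):
    """Return one (name, arch, subarch) tuple per subarch of each resource."""
    rows = []
    for resource in json.loads(raw):
        # "amd64/hwe-t" -> "amd64"
        arch = resource["architecture"].split("/", 1)[0]
        for subarch in resource.get("subarches", "").split(","):
            if subarch:
                rows.append((resource["name"], arch, subarch))
    return rows


for name, arch, subarch in list_subarches(SAMPLE):
    print("%s %s/%s" % (name, arch, subarch))
```

For the single resource above this prints six lines, from "ubuntu/trusty amd64/generic" through "ubuntu/trusty amd64/hwe-t".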

Revision history for this message
Blake Rouse (blake-rouse) wrote :

There is one thing that I do not understand. The boot-resources output shows only one image, yet the images page shows multiple. Since the boot-resources output comes directly from the database, it should not be possible for it to return fewer resources than are shown on the images page.

Revision history for this message
Christian Reis (kiko) wrote :

> It's really not a good idea to run a server on a dynamic IP address...

I don't agree -- it's only the wlan0 uplink which is dynamic, the host has a static IP on eth0 -- and I don't think it matters either, the fact we bind to the wlan0 interface when the cluster and region are on the same machine is broken.

Revision history for this message
Christian Reis (kiko) wrote :

Blake, yeah, it's weird but that's what it looks like. I guess I could start from scratch again unless you want to look into what's wrong?

Revision history for this message
Gavin Panella (allenap) wrote :

Slightly edited transcript from a conversation this morning:

<kiko> allenap, one of the problems with my install is that MAAS is
  choosing to use the wlan0 interface/IP to do region to cluster
  communication
<kiko> I was wondering why it did that and if we can work around it
<allenap> kiko: If that interface goes down, the cluster will reconnect
  via another interface.
<kiko> allenap, the problem is the interface will not go down, it will
  just change IP addresses
<kiko> allenap, could we give preference to ethX, emX, lo?
<kiko> (which are less likely to change)
<kiko> well
<kiko> lo would be the obvious one to use for region/cluster
<kiko> region+cluster
<allenap> kiko: The old IP address will be updated within 60 seconds.
  The cluster will also try to connect on another address in the
  meantime. But we could add a preference for interfaces not prefixed
  "wlan", for example.
<allenap> kiko: Connecting via lo is more complex than it sounds. The
  region advertises the addresses on which it's available. If we include
  lo then, in a multi-cluster environment, each cluster would try to
  connect to itself. We'd need to detect the situation where the cluster
  and region are on the same host, something we don't currently do.
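
The interface preference discussed in the transcript could be sketched roughly as below. This is an assumption-laden illustration, not the MAAS implementation: the function names are hypothetical, the eth0 address in the usage example is made up, and lo is deliberately not handled because of the multi-cluster problem described above.

```python
def interface_rank(ifname):
    """Lower ranks are preferred. Wired names first, "wlan*" last."""
    if ifname.startswith("wlan"):
        return 2
    if ifname.startswith(("eth", "em")):
        return 0
    return 1


def order_addresses(addresses):
    """Sort (ifname, ip) pairs so less volatile interfaces come first.

    sorted() is stable, so pairs with the same rank keep their input order.
    """
    return sorted(addresses, key=lambda pair: interface_rank(pair[0]))


# Usage: the wlan0 address is from this report; the eth0 one is hypothetical.
candidates = [("wlan0", "192.168.99.183"), ("eth0", "10.0.0.1")]
print(order_addresses(candidates))
```

The cluster would still fall back to the wlan0 address if nothing else is reachable; the ordering only changes which address is tried first.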

Revision history for this message
Julian Edwards (julian-edwards) wrote :

On Monday 03 November 2014 12:37:40 you wrote:
> > It's really not a good idea to run a server on a dynamic IP address...
>
> I don't agree -- it's only the wlan0 uplink which is dynamic, the host
> has a static IP on eth0 -- and I don't think it matters either, the fact
> we bind to the wlan0 interface when the cluster and region are on the
> same machine is broken.

That's fine, your "server" NIC is static.

You're right that we need to restrict the interfaces on which pserv binds. It
deliberately binds to all of them right now.

Revision history for this message
Blake Rouse (blake-rouse) wrote :

It is the subarch images that are not hwe kernels.

Changed in maas:
status: New → Triaged
Christian Reis (kiko)
Changed in maas:
milestone: 1.7.0 → 1.7.1
Revision history for this message
Christian Reis (kiko) wrote :

I restored to a btrfs snapshot at the point of import and applied Blake's patch to maasserver/views.py. It seems to do exactly what I told it to, though I haven't let it run to completion yet.

Changed in maas:
status: Triaged → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released