Standard users cannot see all unallocated nodes.

Bug #1300294 reported by Blake Rouse
Affects: MAAS
Status: Invalid
Importance: High
Assigned to: Unassigned
Milestone: none

Bug Description

Each standard user in MAAS sees a different number of unallocated nodes. Admin users can see all of the nodes.

How to reproduce:
1. Log in as user1 and see X ready nodes.
2. Log in as user2 and see Y ready nodes.
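
To compare the UI counts against the database itself, a ground-truth query can be run in the region database (a sketch assuming the maasserver_node table referenced later in this report, and that Ready maps to status code 4 in this MAAS version):

    SELECT COUNT(*) AS ready_nodes
    FROM maasserver_node
    WHERE status = 4;  -- assumed NODE_STATUS.READY

If the two users see different figures while this count stays the same, the problem is in what each user is allowed to see rather than in the node records themselves.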

This was noticed on a MAAS that is controlled with Juju, where each standard user performs bootstrap and destroy-environment frequently.

Revision history for this message
Gavin Panella (allenap) wrote:

Each non-admin user should see the same number of unallocated nodes as other non-admin users. Indeed, each admin user should see the same number of unallocated nodes as other admin users.

However, if there's a lot of churn, I can see that the count might not be stable enough to compare accurately. Were the figures wildly different? When destroying an environment, Juju may release many machines over a fairly short period. If the steady-state figures differ then I'd be worried.

Was this observed on the front-page cake chart? That may diverge from reality ;)

I'm not sure that this is a critical bug. Is it preventing use of MAAS or causing data loss or corruption?

Revision history for this message
Raphaël Badin (rvb) wrote:

As Gavin said, this is really weird. Is there anything that differentiates the nodes seen by user1 but not by user2 (or the reverse)? Can you reliably reproduce this using the CLI (with two different profiles)?

Revision history for this message
Blake Rouse (blake-rouse) wrote:

Just for some background:
The MAAS server this was noticed on is used in a testing workflow. First, an admin user connects to MAAS to see which nodes are available. Based on the available nodes, it selects a number of nodes for the test. These nodes are then given to the slave users. The slave users then perform the Juju bootstrap using the nodes selected by the admin user, and once they finish, destroy-environment is performed.

Using the workflow described above, the nodes should have been available when they were passed to the standard users. But any time a command was issued, it would fail with an edit-permission error, making me think the nodes had been given to another user, even though they were in the unallocated state.

Looking at the UI, the difference in available nodes was very noticeable: 4 vs. 14.

Using this workflow, if the standard users are set to admin, then everything works fine.

Changed in maas:
importance: Critical → High
Changed in maas:
assignee: nobody → Jeroen T. Vermeulen (jtv)
status: New → In Progress
Revision history for this message
Jeroen T. Vermeulen (jtv) wrote:

Were those 4 vs. 14 nodes definitely in the Ready state? Initially we thought the discrepancy might be caused by nodes being in the Ready state but still having an owner set. But there doesn't seem to be any way to achieve that whether through the API or the UI, even when taking data races, error paths, and the low transaction isolation level into account. Moreover, a long-standing assertion in the acquire() call would have called attention to the situation.
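
A direct check for that suspected condition is a query for Ready nodes that still carry an owner (a sketch against the maasserver_node table referenced below, assuming Ready is status code 4 in this MAAS version):

    SELECT hostname, status, owner_id
    FROM maasserver_node
    WHERE status = 4            -- assumed NODE_STATUS.READY
      AND owner_id IS NOT NULL;

Any rows returned would mean a Ready node still has an owner set, which would explain the permission errors described above.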

I do wonder whether the situation was related to Juju still being in the process of allocating nodes, one at a time, with the different users observing different stages of that process; or perhaps to allocated nodes being included in the counts (in which case the difference would be reasonable and expected). Bear in mind that Juju returns as soon as it accepts a command and continues executing it asynchronously, which may take a while.

And so, my follow-up questions are:

0. What MAAS and Juju versions are these?

1. How exactly did these users obtain their listings of available nodes?

2. Do we know that the MAAS was in a steady state at that time, without Juju trying to allocate or deallocate nodes?

3. What exactly do you mean by nodes being “given to the slave users” — given that it's not possible to allocate a node on another user's behalf?

4. At a time when the problem manifests itself, could you start a database shell using “sudo maas-region-admin dbshell --installed” and give us the output of this command:

    SELECT hostname, status, owner_id FROM maasserver_node ORDER BY hostname;

This will help us verify that there really are no stray ownership settings. It would be best to confirm that the discrepancy is present right before the command is executed, as well as right after, to rule out transient numbers as an explanation.
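
As a complement, a summary grouped by status and owner makes it easier to compare two snapshots taken a short while apart (same assumptions as the query above; only the maasserver_node table is used):

    SELECT status, owner_id, COUNT(*) AS nodes
    FROM maasserver_node
    GROUP BY status, owner_id
    ORDER BY status, owner_id;

Running it before and after the node listing should show whether the counts are stable or whether Juju is still churning through allocations and releases.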

Changed in maas:
status: In Progress → Incomplete
Changed in maas:
milestone: 14.04 → 14.10
milestone: 14.10 → none
Christian Reis (kiko)
Changed in maas:
assignee: Jeroen T. Vermeulen (jtv) → nobody
milestone: none → next
Revision history for this message
Graham Binns (gmb) wrote:

Is this still a bug in 1.7? If so, please try out jtv's instructions in #4.

I haven't been able to reproduce this yet.

Christian Reis (kiko)
Changed in maas:
milestone: next → 1.7.1
Revision history for this message
Graham Binns (gmb) wrote:

I still haven't been able to reproduce this. I'm going to mark it Invalid; we can reopen it if necessary.

Changed in maas:
status: Incomplete → Invalid
Changed in maas:
milestone: 1.7.1 → 1.7.2
Changed in maas:
milestone: 1.7.2 → none