reporting messages can slow down operations greatly
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Confirmed | High | Unassigned |
Bug Description
When looking into bug 1604962 I investigated cloud-init and curtin logs and looked at timestamps.
In Adam's scenario, shown at
https:/
posting a message back to MAAS was taking up to 7 seconds.
During an installation, cloud-init and curtin can be expected to post dozens of messages. If each of those took just 5 seconds, a group of only 12 would make the installation take 60 seconds longer than it needed to.
This can be considered a "client" problem (curtin and cloud-init) in some respects. These clients could definitely post their data in the background so that they can carry on with other work. However, if they do that, at some point they should probably verify that all messages were posted correctly, so it's possible that backgrounding the posts wouldn't actually help.
Adam's system was, I think, an "orange box" with 10 clients all installing. That does not seem like enough load to account for 5+ second posts.
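A minimal sketch of the backgrounding idea, not code from curtin or cloud-init: events are queued and posted by a worker thread, and a final `flush()` waits until every post has been attempted. The names `BackgroundReporter`, `slow_post`, and `run_demo` are hypothetical, and `slow_post` only sleeps to stand in for a slow HTTP POST.

```python
import queue
import threading
import time


class BackgroundReporter:
    """Queue events and post them from a worker thread (illustrative only)."""

    def __init__(self, post):
        self._post = post            # callable that performs the slow POST
        self._queue = queue.Queue()
        self.failed = []             # events whose post raised an exception
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def report(self, event):
        """Return immediately; the worker thread posts the event later."""
        self._queue.put(event)

    def _run(self):
        while True:
            event = self._queue.get()
            try:
                self._post(event)
            except Exception:
                self.failed.append(event)
            finally:
                self._queue.task_done()

    def flush(self):
        """Block until every queued event has been attempted."""
        self._queue.join()
        return not self.failed


def slow_post(event, delay=0.05):
    """Hypothetical stand-in for a slow HTTP POST to the reporting endpoint."""
    time.sleep(delay)


def run_demo(n_events=12):
    reporter = BackgroundReporter(slow_post)
    start = time.monotonic()
    for i in range(n_events):
        reporter.report({"name": "step-%d" % i})
    queued = time.monotonic() - start   # time the caller spent reporting
    ok = reporter.flush()               # wait for outstanding posts
    total = time.monotonic() - start
    return queued, total, ok
```

The caller returns from `report()` almost immediately, but `flush()` still has to wait for whatever posts remain. If the client has little other work to overlap with the posts, the total time barely changes, which is the caveat raised above.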
Related bugs:
* bug 1604962: node set to "failed deployment" for no visible reason
In my branch that adds reporting to vmtests, I can reproduce this by adding a
time.sleep(1) to our in-tree webhook server.
XenialBasic normally takes ~65 seconds to install.
With the sleep added to the webserver on each post, the total time is 160 seconds.
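The reproduction can be approximated with a standalone sketch. This is not the in-tree webhook server; `make_handler`, `time_posts`, and `implied_posts` are hypothetical names, and the server is a plain standard-library `http.server` that sleeps before answering each POST. `implied_posts` assumes that all of the extra install time comes from the added sleep, which the numbers above do not prove.

```python
import http.server
import threading
import time
import urllib.request


def make_handler(delay):
    class DelayedHandler(http.server.BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            self.rfile.read(length)      # consume the posted event
            time.sleep(delay)            # simulate a slow reporting endpoint
            self.send_response(200)
            self.end_headers()

        def log_message(self, fmt, *args):
            pass                         # keep the demo quiet

    return DelayedHandler


def time_posts(n_posts, delay):
    """Post n_posts events sequentially to a delayed local server."""
    server = http.server.HTTPServer(("127.0.0.1", 0), make_handler(delay))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_port
    # Bypass any configured proxy so the request goes straight to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    start = time.monotonic()
    try:
        for i in range(n_posts):
            req = urllib.request.Request(
                url, data=b'{"event": %d}' % i, method="POST")
            opener.open(req).close()
    finally:
        server.shutdown()
        server.server_close()
    return time.monotonic() - start


def implied_posts(baseline_s, slowed_s, delay_s):
    """Posts implied if the whole slowdown is due to the per-post delay."""
    return round((slowed_s - baseline_s) / delay_s)
```

Under that assumption, `implied_posts(65, 160, 1)` gives 95, so the XenialBasic run would be making on the order of 95 blocking posts.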
On Wed, Jul 27, 2016 at 12:21 PM, Scott Moser <email address hidden> wrote:
> Public bug reported:
> ** Affects: cloud-init
> Importance: Undecided
> Status: New
>
> ** Affects: curtin
> Importance: Undecided
> Status: New
>
> ** Affects: maas
> Importance: Undecided
> Status: New