Comment 8 for bug 1239820

Mike Holmes (mike-holmes) wrote : Re: [Bug 1239820] Re: udhcp is randomly failing on the Arndale in lava-ssh sessions

I hear you Matt, but given the very widespread use of OE, why is this not
seen elsewhere, by very many people, unless our setup is considerably
slower than normal?

>>
The DHCP server then only has a short amount of time to respond. On a
smaller network that is probably fine, but with the large amount of devices
in our lab and the sheer amount of DHCP traffic it's likely taking just a tiny
bit too long to respond and the board has already given up.
>>
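
To put rough numbers on that, here is a minimal sketch of the timing argument. It assumes a simplified model in which the client sends a fixed number of discover attempts at a fixed interval starting at t = 0, and gives up one interval after the last attempt. The function names and the values used are illustrative only; they are not taken from the logs or from the image configuration.

```python
def requests_after_link_up(retries, interval_s, link_up_s):
    """Return the send times of the attempts that actually leave the board.

    Simplified model (an assumption): attempts go out at
    t = 0, interval_s, 2 * interval_s, ... and anything sent before
    the link is up is lost.
    """
    send_times = [i * interval_s for i in range(retries)]
    return [t for t in send_times if t >= link_up_s]


def remaining_window(retries, interval_s, link_up_s):
    """Seconds left for the server to answer once the link is up."""
    total_s = retries * interval_s
    return max(0.0, total_s - link_up_s)
```

Under this model, with 3 attempts spaced 3 seconds apart and a link that comes up at 5 seconds, only the attempt at 6 seconds makes it out, and the server has 4 of the original 9 seconds in which to respond.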

If we can be sure that the root cause is our lab being slow, but "acceptably
slow" rather than slow in a way that needs fixing, then modifying all the
Linaro OE images to wait longer makes sense, but it feels like a band aid.
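
For comparison, one alternative to a longer wait would be to hold off starting the DHCP client until the kernel reports the link as up. The sketch below is only an illustration of that idea, not something anyone in this thread has proposed or tested. It polls the Linux sysfs `carrier` file for an interface; the function name, the timeout values and the `sysfs_root` parameter are hypothetical.

```python
import time
from pathlib import Path


def wait_for_carrier(iface, timeout_s=10.0, poll_s=0.1,
                     sysfs_root="/sys/class/net"):
    """Return True once the interface reports carrier, False on timeout."""
    carrier = Path(sysfs_root) / iface / "carrier"
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if carrier.read_text().strip() == "1":
                return True
        except OSError:
            # The file is missing, or the read failed because the
            # interface is not yet administratively up; keep polling.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

A boot script could call something like `wait_for_carrier("eth0")` (interface name assumed) and only then launch the DHCP client, so that none of the client's retry budget is spent while the link is still down.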

I think we need Fathi to comment on modifying all the Linaro OE images to
wait longer, because every Linaro OE image we tried failed the same way.

If Fathi is fine with the -n fix in the OE images, I am.

One last note: this usually manifests when the deployed image boots, not in
the master image. Is that LAVA related in any way?

Mike

On 16 October 2013 18:35, Matthew Hart <email address hidden> wrote:

> As I've mentioned before, a recurring theme in the logs you have linked:
>
> http://validation.linaro.org/scheduler/job/79440/log_file#L_15_435
> http://validation.linaro.org/scheduler/job/79424/log_file#L_31_367
> http://validation.linaro.org/scheduler/job/78934/log_file#L_15_367
> http://validation.linaro.org/scheduler/job/78926/log_file#L_29_366
>
> is that udhcpc is already well into its DHCP requests before the kernel
> has even noticed that the network link is up, so a considerable amount of
> the total time udhcpc is willing to wait is lost without any requests
> making it out of the board.
> The DHCP server then only has a short amount of time to respond. On a
> smaller network that is probably fine, but with the large amount of devices
> in our lab and the sheer amount of DHCP traffic it's likely taking just a tiny
> bit too long to respond and the board has already given up.
>
> https://bugs.launchpad.net/bugs/1239820
>
> Title:
> udhcp is randomly failing on the Arndale in lava-ssh sessions
>
> Status in LAVA Validation Lab:
> New
>
> Bug description:
> I ran the same lava-ssh test on the same Arndale in the regular lab
> over a span of a couple of hours, with RT and non-RT kernels, and
> they share a common issue: udhcp randomly fails.
>
> regular kernel
> - http://validation.linaro.org/scheduler/job/78934 Fail
> - http://validation.linaro.org/scheduler/job/78934/log_file#L_15_377
>
> regular kernel
> - http://validation.linaro.org/scheduler/job/78933 Pass
>
> rt kernel
> - http://validation.linaro.org/scheduler/job/78926 Fail
> - http://validation.linaro.org/scheduler/job/78926/log_file#L_29_372
>
> rt kernel
> - http://validation.linaro.org/scheduler/job/78917 Pass