suspend/wifi_resume_time and wifi_resume_time_auto: Not being used correctly?

Bug #1174519 reported by Daniel Manrique
Affects: Checkbox Provider - Base
Status: Won't Fix
Importance: High
Assigned to: Unassigned

Bug Description

So I'm trying to track down what happens with this test. We noticed a discrepancy between checkbox and plainbox, running on the same machine with the same Ubuntu version.

The reason this was brought to our attention is that plainbox reports it as failed:

https://certification.canonical.com/hardware/201102-7187/submission/90907/test-results

Your wifi resumed in 54.57 seconds after the last suspend
FAIL: the network failed to reconnect within the allotted time

However, when we looked at checkbox's submission in more detail, it's also not doing what we expect (that result is returned as PASS, which is why I hadn't noticed this before):

https://certification.canonical.com/hardware/201102-7187/submission/90941/test-results

Unable to obtain wifi connection time after a S3. Please be sure that the system has been suspended

I think the problem is we're using these tests wrong. To sort of validate what *should* happen, I first created a wifi connection with the create_connection script:

sudo /usr/share/checkbox/scripts/create_connection some-open-ssid

Connection some-open-ssid registered
Active connection state: activating
Active connection path: /org/freedesktop/NetworkManager/ActiveConnection/19
state: activated
Connection activated
Connection some-open-ssid activated.

OK, now *without* deleting the connection, I sleep the system:

sudo rtcwake -m no -s 120
sudo pm-suspend

Once the system wakes up, I run the test script:

$ /usr/share/checkbox/scripts/network_reconnect_resume_test -t 20 -d wifi
Your wifi resumed in 4.26 seconds after the last suspend
PASS: the network connected within the allotted time

Great, so this works. Then I delete the connection file from /etc/NetworkManager/system-connections, and retry the suspend process. I get this after suspending/resuming:

$ /usr/share/checkbox/scripts/network_reconnect_resume_test -t 210 -d wifi
ubuntu@201102-7187:~$
(huh, it didn't return anything??)

The reason I think we're using this wrong is this. I looked at all the create-connection jobs and *all* of them dutifully delete the connection file when they complete. So the sequence of events in reality is:

1- wifi connection established. We delete it afterwards.
2- suspend.
3- wake up, we do *not* try to reestablish wifi connection because we deleted it.
4- Check for wifi events in the log file. This will either lead to no events or to an event that happened during step 1, a while ago. Test fails.
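To make step 4 concrete, here is a minimal sketch of the kind of check involved. It is not the actual network_reconnect_resume_test code; the function names and the use of plain timestamps in seconds are assumptions for illustration. The point is that a connection event logged before the suspend (step 1) must not count, and that "no event after resume" should be reported as a failure rather than returned silently.

```python
from typing import Optional, Sequence


def reconnect_time(resume_ts: float, connect_events: Sequence[float]) -> Optional[float]:
    """Seconds from resume to the first connection event logged after it.

    Hypothetical helper: timestamps are plain seconds, not parsed log lines.
    Returns None when no connection event follows the resume.
    """
    after_resume = [ts for ts in connect_events if ts >= resume_ts]
    if not after_resume:
        return None
    return min(after_resume) - resume_ts


def verdict(resume_ts: float, connect_events: Sequence[float], timeout: float) -> str:
    """Turn the measurement into an explicit PASS/FAIL message."""
    elapsed = reconnect_time(resume_ts, connect_events)
    if elapsed is None:
        # Stale events from before the suspend end up here too.
        return "FAIL: no connection event found after the last resume"
    if elapsed > timeout:
        return "FAIL: the network failed to reconnect within the allotted time"
    return "PASS: the network connected within the allotted time"
```

With the connection deleted before the suspend, the only events in the log predate the resume, so this sketch reports a failure with a message instead of printing nothing.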

I wonder if we should somehow create the connection, then do a suspend/resume cycle without deleting it, and only delete it afterwards, so we ensure the log file contains the required information.
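A rough sketch of that create/suspend/check/delete ordering, using the commands shown earlier in this report. Several things here are assumptions, not how checkbox actually does it: the resume_cycle function and its injectable run callable are hypothetical, the connection file is assumed to be named after the SSID, pm-suspend is assumed to return only after the system has resumed, and the test script is assumed to exit non-zero on failure.

```python
from typing import Callable, Sequence

# A runner takes a command (list of strings) and returns its exit status.
Runner = Callable[[Sequence[str]], int]

SCRIPTS = "/usr/share/checkbox/scripts"


def resume_cycle(ssid: str, timeout: int, run: Runner) -> bool:
    """Create a connection, suspend/resume, check reconnect time, then clean up."""
    if run(["sudo", f"{SCRIPTS}/create_connection", ssid]) != 0:
        return False  # nothing was created, so there is nothing to delete
    try:
        run(["sudo", "rtcwake", "-m", "no", "-s", "120"])  # arm the wake alarm
        run(["sudo", "pm-suspend"])  # assumed to return once the system is awake
        status = run(
            [f"{SCRIPTS}/network_reconnect_resume_test", "-t", str(timeout), "-d", "wifi"]
        )
        return status == 0  # assumes a non-zero exit status means failure
    finally:
        # ASSUMPTION: the connection file is named after the SSID.
        run(["sudo", "rm", "-f", f"/etc/NetworkManager/system-connections/{ssid}"])
```

Passing subprocess.call as the run argument would execute the commands for real; a fake runner that only records the commands is enough to check the ordering without suspending anything.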

Tags: job
Brendan Donegan (brendan-donegan) wrote :

Yeah, this dawned on me as well :/ We either need to integrate the 'create/suspend/delete' process in the suspend test(s), or have two new jobs which do this.

Changed in checkbox:
status: New → Confirmed
importance: Undecided → High
Daniel Manrique (roadmr) wrote :

Ugh, I'd prefer to have two jobs that create (and then don't delete) the connection. Those should depend on the corresponding functional wifi tests so they don't run if wifi is deemed to be non-functional.

However, suspend_advanced_* should NOT depend on these create jobs, though in theory they shouldn't fail. Rather, we should express running order using the whitelist so they run before the suspend_advanced test.

Then perhaps the suspend-time jobs can depend on the create jobs, and then take care of deleting the connections, so it's more or less guaranteed that if the connections were not created, then the suspend-time jobs won't run and won't try to delete them.
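The intended gating can be sketched as a toy model. This is not plainbox's real job format or scheduler, and the job names below (other than suspend_advanced) are made up for illustration: the whitelist order decides when a job runs, while depends decides whether it runs at all.

```python
from typing import Callable, Dict, List, NamedTuple


class Job(NamedTuple):
    name: str
    depends: List[str]
    run: Callable[[], bool]  # True means the job passed


def run_in_order(whitelist: List[Job]) -> Dict[str, str]:
    """Run jobs in whitelist order, skipping any job whose dependency did not pass."""
    outcomes: Dict[str, str] = {}
    for job in whitelist:
        if any(outcomes.get(dep) != "pass" for dep in job.depends):
            outcomes[job.name] = "skipped"
            continue
        outcomes[job.name] = "pass" if job.run() else "fail"
    return outcomes


# Hypothetical job names; only the ordering and dependency shape matter here.
example_whitelist = [
    Job("wifi_functional", [], lambda: True),
    Job("create_wifi_connection", ["wifi_functional"], lambda: False),
    Job("suspend_advanced", [], lambda: True),  # ordered after create, no dependency
    Job("wifi_resume_time", ["create_wifi_connection"], lambda: True),
]
```

In this example the create job fails, suspend_advanced still runs because it only relies on ordering, and wifi_resume_time is skipped, so it never tries to delete a connection that was not created.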

If you agree with this we can set the bug to triaged.

Daniel Manrique (roadmr)
tags: added: job
Changed in checkbox:
status: Confirmed → Triaged
Po-Hsu Lin (cypressyew) wrote :

Hello, this issue still exists in this SRU cycle, from 3.2 to 3.11:
network_resume_time_auto passed even though it failed to get the resume time.

In some other cases, it passed without any comment at all, but I think we're expecting a description like this:
"Your wired resumed in 1.89 seconds after the last suspend PASS: the network connected within the allotted time"

Maybe the "return None" in the Python code has something to do with it?
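For illustration only, here is how a "return None" can turn into a silent PASS. This is not the actual script: check_silent and check_strict are hypothetical stand-ins. The underlying Python behaviour is real, though: sys.exit(None) exits with status 0, so a script that ends with sys.exit(main()) reports success whenever main() returns None.

```python
import sys
from typing import Optional


def check_silent(elapsed: Optional[float], timeout: float) -> Optional[int]:
    """Hypothetical check that bails out quietly when no resume time was found."""
    if elapsed is None:
        return None  # no message, and sys.exit(None) means exit status 0
    return 0 if elapsed <= timeout else 1


def check_strict(elapsed: Optional[float], timeout: float) -> int:
    """Same check, but a missing measurement is an explicit failure."""
    if elapsed is None:
        print("FAIL: could not determine the resume time", file=sys.stderr)
        return 1
    return 0 if elapsed <= timeout else 1


def exit_status(result: Optional[int]) -> int:
    """Exit status the process would end with after sys.exit(result)."""
    try:
        sys.exit(result)
    except SystemExit as exc:
        return 0 if exc.code is None else int(exc.code)
```

Under these assumptions, exit_status(check_silent(None, 20)) is 0, which a test runner reads as a pass with no output, while check_strict gives a non-zero status and a message.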

Zygmunt Krynicki (zyga)
affects: checkbox → plainbox-provider-checkbox
Zygmunt Krynicki (zyga) wrote :

I'm marking this as WONT FIX but please reopen if this is actively hampering anyone's work. My goal is to limit the number of open bugs to get a better idea as to what is really important.

Remember that you can always escalate bugs by contacting us in #checkbox on freenode (or #cert-infra in the internal IRC) or by responding in bugs directly.

Changed in plainbox-provider-checkbox:
status: Triaged → Won't Fix