ssh timeout period may be too long

Bug #2028693 reported by Brian Murray
This bug affects 1 person
Affects               Status  Importance  Assigned to  Milestone
Auto Package Testing  New     Undecided   Unassigned

Bug Description

Due to the networking issues with s0lp4 in bos02 there is a pattern like the following:

untu/autopkgtest-cloud/worker/worker[3623431]: WARNING: Testbed failure. Retrying in 5 minutes... Log follows:
untu/autopkgtest-cloud/worker/worker[3623431]: ERROR: 0s autopkgtest [17:43:59]: starting date: 2023-07-25
                                                 0s autopkgtest [17:43:59]: git checkout: cabdbf6 Add timestamps to logfile
                                                 0s autopkgtest [17:43:59]: host juju-4d1272-prod-proposed-migration-14; command line: /home/ubuntu/autopkgtest/runner/autopkgtest --output-di>
                                               1685s Creating nova instance adt-mantic-s390x-python-bitarray-20230725-170857-juju-4d1272-prod-proposed-migration-14 from image adt/ubuntu-mant>
                                               1685s Timed out waiting for ssh. Aborting! Console log:
                                               1685s ------- nova console-log 878a16dc-2e6e-43fb-8e70-b3d03c4e570d (adt-mantic-s390x-python-bitarray-20230725-170857-juju-4d1272-prod-proposed>
                                               1685s
                                               1685s
                                               1685s Ubuntu Mantic Minotaur (development branch) auto-syncubuntu-mantic-daily-s390x-server-20230619-disk1 sclp_line0
                                               1685s
                                               1685s auto-syncubuntu-mantic-daily-s390x-server-20230619-disk1 login:
                                               1685s ---------------------------------------------------
                                               1685s ------- nova show 878a16dc-2e6e-43fb-8e70-b3d03c4e570d (adt-mantic-s390x-python-bitarray-20230725-170857-juju-4d1272-prod-proposed-migrat>
                                               1685s +--------------------------------------+-----------------------------------------------------------------------------------------+
                                               1685s | Property | Value |
                                               1685s +--------------------------------------+-----------------------------------------------------------------------------------------+
                                               1685s | OS-DCF:diskConfig | MANUAL |
                                               1685s | OS-EXT-AZ:availability_zone | nova |
                                               1685s | OS-EXT-SRV-ATTR:host | s0lp4 |
                                               1685s | OS-EXT-SRV-ATTR:hypervisor_hostname | s0lp4.internal |
                                               1685s | OS-EXT-SRV-ATTR:instance_name | instance-02f60c88 |
                                               1685s | OS-EXT-STS:power_state | 1 |
                                               1685s | OS-EXT-STS:task_state | - |
                                               1685s | OS-EXT-STS:vm_state | active |
                                               1685s | OS-SRV-USG:launched_at | 2023-07-25T17:44:39.000000 |
                                               1685s | OS-SRV-USG:terminated_at | - |
                                               1685s | accessIPv4 | |
                                               1685s | accessIPv6 | |
                                               1685s | config_drive | |
                                               1685s | created | 2023-07-25T17:44:27Z |
                                               1685s | flavor | autopkgtest (aea8bb59-2c51-4a96-9fea-89fe8bfe1807) |
                                               1685s | hostId | 45c407a6370073f072fc706956c64253b8b3a60969ea6fd241f4003e |
                                               1685s | id | 878a16dc-2e6e-43fb-8e70-b3d03c4e570d |
                                               1685s | image | adt/ubuntu-mantic-s390x-server-20230723.img (b96e1f9b-0a7e-4b9c-9f3a-3ae22e79b971) |
                                               1685s | key_name | testbed-juju-4d1272-prod-proposed-migration-14 |
                                               1685s | metadata | {} |
                                               1685s | name | adt-mantic-s390x-python-bitarray-20230725-170857-juju-4d1272-prod-proposed-migration-14 |
                                               1685s | net_prod-proposed-migration network | 10.44.124.247 |

Here we waited about 28 minutes (1685 seconds in the log above) before giving up on connecting to the instance, which seems unnecessarily long. If the instance is active and has an IP address but we still can't connect to port 22 within a much shorter period, I think we should give up more quickly.
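To illustrate the kind of bounded wait being asked for, here is a minimal Python sketch. It is not the code from the nova script; the function name `wait_for_port`, its parameters, and the 300-second default are all hypothetical and chosen only for illustration. It probes TCP port 22 at a fixed interval and gives up once an overall deadline passes.

```python
import socket
import time


def wait_for_port(host, port=22, deadline_s=300.0, interval_s=5.0,
                  connect_timeout_s=3.0):
    """Return True once a TCP connection to host:port succeeds,
    or False if deadline_s seconds elapse first.

    All names and defaults here are illustrative assumptions, not
    values taken from autopkgtest.
    """
    give_up_at = time.monotonic() + deadline_s
    while True:
        try:
            # A successful connect only shows the port is reachable;
            # it does not prove that sshd is ready to accept logins.
            with socket.create_connection((host, port),
                                          timeout=connect_timeout_s):
                return True
        except OSError:
            # Refused, unreachable, or timed out: retry until the deadline.
            pass
        remaining = give_up_at - time.monotonic()
        if remaining <= 0:
            return False
        time.sleep(min(interval_s, remaining))
```

With the address from the log above, a call would look like `wait_for_port("10.44.124.247", deadline_s=300)`. The point of the sketch is only that the overall deadline is a single tunable number, so a testbed on a broken hypervisor is abandoned after minutes rather than close to half an hour.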

Tim Andersson (andersson123) wrote :

This argument to the runner is the cause:

`--timeout-build=20000`

I'll make a merge proposal to reduce it.

Paride Legovini (paride) wrote :

--timeout-build is the timeout for when autopkgtest builds the package. It's completely unrelated.

The timeout Brian is referring to comes from the nova script (setup-ssh/nova). Grep for this message (from this bug description):

  Timed out waiting for ssh. Aborting! Console log:

to find the relevant retry loop.
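For anyone who prefers to run that search from a script rather than with grep, a rough Python equivalent might look like the following. The helper name `find_message` is made up for this example, and the directory to search is whatever local checkout of the autopkgtest sources you have; neither comes from this bug.

```python
import os


def find_message(root, needle):
    """Yield (path, line_number, line) for each line under root that
    contains needle. Hypothetical helper, roughly like `grep -rn`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps binary files from raising
                # decode errors while scanning.
                with open(path, "r", encoding="utf-8",
                          errors="replace") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        if needle in line:
                            yield path, lineno, line.rstrip("\n")
            except OSError:
                # Skip unreadable files instead of aborting the search.
                continue
```

Iterating over `find_message(".", "Timed out waiting for ssh")` from the top of the checkout should list the file and line where the message is printed, which is where the retry loop lives.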
