Timed out while waiting for units openstack-hypervisor/0 to be ready

Bug #2060573 reported by Andre Ruiz
This bug affects 2 people
Affects         Status  Importance  Assigned to  Milestone
OpenStack Snap  New     Undecided   Unassigned

Bug Description

One of the machines from the lab frequently gets this error:

15:40:05 > Configure MySQL ...
15:40:05 > Patch LoadBalancer service annotations ...
15:40:05 > Initializing Terraform from provider mirror ...
15:40:13 > Deploying OpenStack Hypervisor ...
16:00:19 > Adding Openstack Hypervisor unit to machine(s) ...
16:00:19 > Adding Openstack Hypervisor unit to machine(s) ... Timed out while waiting for units openstack-hypervisor/0 to be ready
16:00:19 > Adding Openstack Hypervisor unit to machine(s) ...
16:00:19 Error: Timed out while waiting for units openstack-hypervisor/0 to be ready

I'm struggling to find where in the logs there is more information about the cause. It might be a networking configuration issue, but I can't find evidence of it. I'm attaching logs for the latest occurrence; let me know if I should be capturing something else, and I can re-run and upload again.

Andre Ruiz (andre-ruiz) wrote :

Logs for the last build with this error.

Andre Ruiz (andre-ruiz) wrote :

On this particular machine, I suspect this is related to it having an older AMD processor that lacks some of the instructions the openvswitch components were compiled with.

I found this in syslog:

Apr 11 19:00:33 druid ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait -- init -- set Open_vSwitch . db-version=8.4.0
Apr 11 19:00:33 druid kernel: [ 2782.446630] traps: ovs-vswitchd[377037] trap invalid opcode ip:7f22485bc7a9 sp:7ffd4068a910 error:0 in librte_dmadev.so.23.0[7f22485bc000+3000]
Apr 11 19:00:33 druid openstack-hypervisor.ovsdb-server[377036]: Illegal instruction (core dumped)
Apr 11 19:00:33 druid ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait set Open_vSwitch . ovs-version= "external-ids:system-id=\"druid.lab.token\"" "external-ids:rundir=\"/var/snap/openstack-hypervisor/common/run/openvswitch\"" "system-type=\"ubuntu-core\"" "system-version=\"22\""
Apr 11 19:00:33 druid ovs-vsctl: ovs|00002|db_ctl_base|ERR|ovs-version=: argument does not end in "=" followed by a value.
Apr 11 19:00:33 druid openstack-hypervisor.ovsdb-server[377040]: ovs-vsctl: ovs-version=: argument does not end in "=" followed by a value.
Apr 11 19:00:33 druid openstack-hypervisor.ovsdb-server[377003]: * Configuring Open vSwitch system IDs
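
For anyone triaging similar logs: the kernel `traps:` line above is the most useful pointer, since it names the crashing process and the shared object that contains the faulting instruction. Below is a minimal Python sketch for pulling those fields out of syslog. The function name `find_invalid_opcode_traps` and the regular expression are illustrative, written against the single line shown here, and may need adjusting for other kernel versions.

```python
import re

# Matches kernel lines such as:
#   traps: ovs-vswitchd[377037] trap invalid opcode ip:... sp:... error:0 in lib.so[base+size]
# The trailing " in <object>[base+size]" part is treated as optional.
TRAP_RE = re.compile(
    r"traps: (?P<proc>[^\[\s]+)\[(?P<pid>\d+)\] trap invalid opcode "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:\d+"
    r"(?: in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def find_invalid_opcode_traps(lines):
    """Return one dict per 'trap invalid opcode' line found in `lines`."""
    hits = []
    for line in lines:
        match = TRAP_RE.search(line)
        if match is None:
            continue
        fields = match.groupdict()
        offset = None
        if fields["base"] is not None:
            # Offset of the faulting instruction from the start of the mapping.
            offset = int(fields["ip"], 16) - int(fields["base"], 16)
        hits.append(
            {
                "process": fields["proc"],
                "pid": int(fields["pid"]),
                "object": fields["obj"],
                "offset": offset,
            }
        )
    return hits


if __name__ == "__main__":
    # Assumes a readable /var/log/syslog; adjust the path for your system.
    with open("/var/log/syslog", errors="replace") as log:
        for hit in find_invalid_opcode_traps(log):
            print(hit)
```

Run against the trap line above, this reports process `ovs-vswitchd`, object `librte_dmadev.so.23.0`, and an offset of 0x7a9 into the mapped segment, which points at a DPDK library rather than at Open vSwitch itself.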

This is the processor. Yes, it is very old, but still very brave (4 cores and 16 GB RAM). Is this a lost cause?

ubuntu@druid:~$ lscpu
Architecture: x86_64
  CPU op-mode(s): 32-bit, 64-bit
  Address sizes: 48 bits physical, 48 bits virtual
  Byte Order: Little Endian
CPU(s): 4
  On-line CPU(s) list: 0-3
Vendor ID: AuthenticAMD
  Model name: AMD Phenom(tm) II X4 B35 Processor
    CPU family: 16
    Model: 5
    Thread(s) per core: 1
    Core(s) per socket: 4
    Socket(s): 1
    Stepping: 2
    CPU max MHz: 2900.0000
    CPU min MHz: 800.0000
    BogoMIPS: 5799.84
    Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt hw_pstate vmmcall npt lbrv svm_lock nrip_save
Virtualization features:
  Virtualization: AMD-V
Caches (sum of all):
  L1d: 256 KiB (4 instances)
  L1i: 256 KiB (4 instances)
  L2: 2 MiB (4 instances)
NUMA:
  NUMA node(s): 1
  NUMA node0 CPU(s): 0-3
Vulnerabilities:
  Gather data sampling: Not affected
  Itlb multihit: Not affected
  L1tf: Not affected
  Mds: Not affected
  Meltdown: Not affected
  Mmio stale data: Not affected
  Retbleed: Not affected
  Spec rstack overflow: Not affected
  Spec store bypass: Not affected
  Spectre v1: Mitigation; usercopy/swapgs barriers and _...
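
One way to test the missing-instructions theory is to compare the CPU flags against a baseline instruction set. The sketch below is only an illustration: it assumes the crashing library was built for something like the x86-64-v2 level (SSSE3, SSE4.1, SSE4.2 and friends), which has not been confirmed for the openstack-hypervisor snap. The names `X86_64_V2_FLAGS`, `missing_flags` and `read_cpuinfo_flags` are hypothetical helpers, not part of any existing tool.

```python
# Flags that /proc/cpuinfo reports for the x86-64-v2 feature level.
# Note that SSE3 appears there as "pni".
X86_64_V2_FLAGS = frozenset(
    {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}
)

# Flags of the AMD Phenom II X4 B35, copied from the lscpu output above.
PHENOM_II_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm "
    "3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid "
    "pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm "
    "sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt hw_pstate vmmcall npt "
    "lbrv svm_lock nrip_save"
)


def missing_flags(flags_text, required=X86_64_V2_FLAGS):
    """Return the required flags that are absent from a space-separated list."""
    present = set(flags_text.split())
    return sorted(set(required) - present)


def read_cpuinfo_flags(path="/proc/cpuinfo"):
    """Return the flags string of the first CPU listed in /proc/cpuinfo."""
    with open(path) as cpuinfo:
        for line in cpuinfo:
            if line.startswith("flags"):
                return line.split(":", 1)[1]
    return ""


if __name__ == "__main__":
    print(missing_flags(PHENOM_II_FLAGS))  # ['sse4_1', 'sse4_2', 'ssse3']
```

Under that assumption, the Phenom II lacks SSSE3, SSE4.1 and SSE4.2, which would be consistent with the invalid opcode trap seen in syslog. Calling `missing_flags(read_cpuinfo_flags())` on another host gives the same check for that machine.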


Andre Ruiz (andre-ruiz) wrote :

I'm occasionally seeing this error on Intel machines as well (although not as often as on the AMD one). Here is an example (see attached logs).

Andre Ruiz (andre-ruiz) wrote :

This is probably a duplicate of LP: #2065866
