periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-queens-upload is timeout in virt-customize step

Bug #1762351 reported by Arx Cruz
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Quique Llorente

Bug Description

Logs:
https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-queens-upload/bf98214/console.txt.gz

This task should take only a few minutes, but it is taking more than 2 hours and makes the job time out.
The job usually takes 2.5 hours to finish.

Please note that this step runs after tempest, so we do have tempest results; the timeout is not related to tempest.

We don't have logs because log collection is done in the toc-test builder, while virt-customize runs in the images-upload-and-label builder (another bug will be opened for this).
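
One way to keep a hung step like this from consuming the whole job budget is to run it under an explicit timeout, so the job fails fast and there is still time left to collect logs. The following is a minimal sketch, not what the CI scripts actually do; the helper name `run_with_timeout` and the 15-minute limit in the usage note are assumptions made for illustration.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (timed_out, returncode).

    subprocess.run() kills the child process when the timeout expires
    and then raises TimeoutExpired, so a hung command does not linger.
    """
    try:
        proc = subprocess.run(cmd, timeout=timeout_s)
        return False, proc.returncode
    except subprocess.TimeoutExpired:
        # The step exceeded its budget; report it instead of blocking the job.
        return True, None


if __name__ == "__main__":
    # Stand-in command; in the job this would be the virt-customize call.
    print(run_with_timeout([sys.executable, "-c", "pass"], 900))
```

With a limit of, say, 900 seconds, a step that normally takes a few minutes would be reported as timed out long before the overall job timeout is reached.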

Arx Cruz (arxcruz)
Changed in tripleo:
importance: Undecided → High
status: New → Triaged
Revision history for this message
yatin (yatinkarel) wrote :

So this does not always happen, and it has something to do with nested virtualization.

Since logs were not available in the job, we held the node to see what is in the virt-customize log. Below is what we found:
[jenkins@undercloud ~]$ tail -f .__repo_setup.sh.log
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT= 00000000 0000ffff
IDT= 00000000 0000ffff
CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=00 66 89 d8 66 e8 e5 af ff ff 66 83 c4 0c 66 5b 66 5e 66 c3 <ea> 5b e0 00 f0 30 36 2f 32 33 2f 39 39 00 fc 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^C
[jenkins@undercloud ~]$ vi .__repo_setup.sh.log
[jenkins@undercloud ~]$ cat .__repo_setup.sh.log
[ 0.0] Examining the guest ...
libguestfs: trace: set_verbose true
libguestfs: trace: set_verbose = 0
libguestfs: trace: set_network true
libguestfs: trace: set_network = 0
libguestfs: trace: add_drive "overcloud-full.qcow2" "readonly:false" "protocol:file" "discard:besteffort"
libguestfs: trace: add_drive = 0
libguestfs: trace: launch
libguestfs: trace: get_tmpdir
libguestfs: trace: get_tmpdir = "/tmp"
libguestfs: trace: version
libguestfs: trace: version = <struct guestfs_version = major: 1, minor: 36, release: 3, extra: rhel=7,release=6.el7_4.3,libvirt, >
libguestfs: trace: get_backend
libguestfs: trace: get_backend = "direct"
libguestfs: launch: program=virt-customize
libguestfs: launch: version=1.36.3rhel=7,release=6.el7_4.3,libvirt
libguestfs: launch: backend registered: unix
libguestfs: launch: backend registered: uml
libguestfs: launch: backend registered: libvirt
libguestfs: launch: backend registered: direct
libguestfs: launch: backend=direct
libguestfs: launch: tmpdir=/tmp/libguestfs15JLpA
libguestfs: launch: umask=0022
libguestfs: launch: euid=0
libguestfs: trace: get_backend_setting "force_tcg"
libguestfs: trace: get_backend_setting = NULL (error)
libguestfs: trace: get_cachedir
libguestfs: trace: get_cachedir = "/var/tmp"
libguestfs: begin building supermin appliance
libguestfs: run supermin
libguestfs: command: run: /usr/bin/supermin5
libguestfs: command: run: \ --build
libguestfs: command: run: \ --verbose
libguestfs: command: run: \ --if-newer
libguestfs: command: run: \ --lock /var/tmp/.guestfs-0/lock
libguestfs: command: run: \ --copy-kernel
libguestfs: command: run: \ -f ext2
libguestfs: command: run: \ --host-cpu x86_64
libguestfs: command: run: \ /usr/lib64/guestfs/supermin.d
libguestfs: command: run: \ -o /var/tmp/.guestfs-0/appliance.d
supermin: version: 5.1.16
supermin: rpm: detected RPM version 4.11
supermin: package handler: fedora/rpm
supermin: acquiring lock on /var/tmp/.guestfs-0/lock
supermin: build: /usr/lib64/guestfs/supermin.d
supermin: reading the supermin appliance
supermin: build: visiting /usr/lib64/guestfs/supermin.d/base.tar.gz type gzip base image (tar)
supermin: build: visiting /usr/lib64/guestfs/supermin.d/daemon.tar.gz type gzip base image (tar)
supermin: build: visiting /usr/lib64/guestfs/supermin.d/excludefiles type uncompressed exclu...

Revision history for this message
Rafael Folco (rafaelfolco) wrote :

cpu flags seem to be identical on failing / successful runs... perhaps we could enforce tcg mode until we find the root cause.

Revision history for this message
yatin (yatinkarel) wrote :

<< cpu flags seem to be identical on failing / successful runs... perhaps we could enforce tcg mode until we find the root cause.

Proposed https://review.rdoproject.org/r/13327 until we find the root cause.
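
For readers unfamiliar with the workaround being discussed: libguestfs can be told to skip KVM and use QEMU's software emulation (TCG) through the `force_tcg` backend setting, which is the same setting the log above queries with `get_backend_setting "force_tcg"`. The sketch below shows one way to pass it through the `LIBGUESTFS_BACKEND_SETTINGS` environment variable. It is an illustration only and does not claim to reproduce the contents of review 13327; the helper names and the script name `repo_setup.sh` are assumptions.

```python
import os
import subprocess


def tcg_env(base_env=None):
    """Return a copy of the environment that asks libguestfs to use TCG.

    LIBGUESTFS_BACKEND_SETTINGS is a colon-separated list of settings;
    force_tcg is appended only if it is not already present.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LIBGUESTFS_BACKEND_SETTINGS", "")
    settings = [s for s in existing.split(":") if s]
    if "force_tcg" not in settings:
        settings.append("force_tcg")
    env["LIBGUESTFS_BACKEND_SETTINGS"] = ":".join(settings)
    return env


def customize_command(image, script):
    """Build a verbose, traced virt-customize command line."""
    return ["virt-customize", "-v", "-x", "-a", image, "--run", script]


def run_customize(image, script):
    """Run virt-customize with TCG forced (requires libguestfs tools)."""
    return subprocess.run(customize_command(image, script),
                          env=tcg_env(), check=True)


if __name__ == "__main__":
    # Image name taken from the log above; the script name is hypothetical.
    print(customize_command("overcloud-full.qcow2", "repo_setup.sh"))
    print(tcg_env({})["LIBGUESTFS_BACKEND_SETTINGS"])
```

TCG is slower than KVM, but it avoids the nested-virtualization path, which is why it works as a stopgap until the root cause is found.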

Revision history for this message
Rafael Folco (rafaelfolco) wrote :

tagging tech-debt for RCA

tags: added: tech-debt
Changed in tripleo:
assignee: Rafael Folco (rafaelfolco) → nobody
Changed in tripleo:
assignee: nobody → Gabriele Cerami (gcerami)
Revision history for this message
Matt Young (halcyondude) wrote :

triage note: this job has passed for the past 5 runs

Changed in tripleo:
status: Triaged → Invalid
Revision history for this message
Matt Young (halcyondude) wrote :

(and we've had a promotion)

Revision history for this message
Arx Cruz (arxcruz) wrote :

It's not invalid; we had a promotion only because of a workaround: https://review.rdoproject.org/r/#/c/13327/

Changed in tripleo:
status: Invalid → Triaged
Changed in tripleo:
milestone: rocky-1 → rocky-2
Changed in tripleo:
assignee: Gabriele Cerami (gcerami) → Quique Llorente (quiquell)
Revision history for this message
Quique Llorente (quiquell) wrote :
Revision history for this message
Matt Young (halcyondude) wrote :

(triage) holding pattern waiting on BZ

Revision history for this message
Sagi (Sergey) Shnaidman (sshnaidm) wrote :

07 May, no updates in BZ

Changed in tripleo:
assignee: Quique Llorente (quiquell) → Sagi (Sergey) Shnaidman (sshnaidm)
Changed in tripleo:
milestone: rocky-2 → rocky-3
Revision history for this message
Quique Llorente (quiquell) wrote :

No updates on the BZ

Changed in tripleo:
assignee: Sagi (Sergey) Shnaidman (sshnaidm) → Quique Llorente (quiquell)
Changed in tripleo:
milestone: rocky-3 → rocky-rc1
Changed in tripleo:
milestone: rocky-rc1 → stein-1
Revision history for this message
Quique Llorente (quiquell) wrote :

Testing if it's still an issue with https://review.openstack.org/601254

Changed in tripleo:
milestone: stein-1 → stein-2
Changed in tripleo:
milestone: stein-2 → stein-3
Changed in tripleo:
milestone: stein-3 → stein-rc1
Changed in tripleo:
milestone: stein-rc1 → train-1
Changed in tripleo:
milestone: train-1 → train-2
Changed in tripleo:
milestone: train-2 → train-3
Changed in tripleo:
milestone: train-3 → ussuri-1
Changed in tripleo:
status: Triaged → Fix Released