build-tools: debian: image creation fails intermittently

Bug #1963716 reported by Davlet Panech
This bug affects 1 person
Affects    | Status       | Importance | Assigned to | Milestone
StarlingX  | Fix Released | Medium     | Unassigned  |

Bug Description

Brief Description
-----------------

Debian build tools sometimes fail when building the ISO image. The problem appears to be related to the aptly pod or to k8s networking.

apt-get inside the LAT fails to download packages from the repomgr pod with "Connection reset by peer" or "Broken pipe" errors. This happens intermittently and with different packages each time. The error was observed on vanilla k8s, not Minikube.

Severity
--------
Major

Steps to Reproduce
------------------
Set up a Debian build environment in vanilla k8s (not Minikube). Build packages as usual, then build the image.

Expected Behavior
------------------
Image build succeeds

Actual Behavior
----------------
Image build fails intermittently.

Reproducibility
---------------
Intermittent

System Configuration
--------------------
N/A

Branch/Pull Time/Commit
-----------------------
master/2021-03-04

Last Pass
---------
master/2021-03-03

Timestamp/Logs
--------------

08:12:39 appsdk - DEBUG: E: Failed to fetch http://stx-debian-pipeline-stx-repomgr:80/deb-local-build/pool/main/s/systemd/systemd-coredump_247.3-6.stx.3_amd64.deb Error reading from server - read (104: Connection reset by peer) [IP: 10.109.51.162 80]
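
Because the failing package differs from run to run, it can help to pull the URL, errno and peer address out of each "Failed to fetch" line and compare them across builds. The following is a minimal sketch, not part of the build tools: the function name and the regular expression are illustrative assumptions, written only against the log line quoted above.

```python
import re

# Assumed pattern, modelled on the single apt-get error line in this report.
FETCH_ERROR = re.compile(
    r"E: Failed to fetch (?P<url>\S+)\s+"
    r"(?P<reason>.*?)\s*"
    r"\((?P<errno>\d+): (?P<message>[^)]*)\)"
    r"(?:\s+\[IP: (?P<ip>\S+) (?P<port>\d+)\])?"
)


def parse_fetch_error(line):
    """Return a dict describing an apt-get fetch failure, or None if no match."""
    match = FETCH_ERROR.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["errno"] = int(info["errno"])
    # The last path component of the URL is the .deb file name.
    info["package"] = info["url"].rsplit("/", 1)[-1]
    return info


if __name__ == "__main__":
    sample = (
        "08:12:39 appsdk - DEBUG: E: Failed to fetch "
        "http://stx-debian-pipeline-stx-repomgr:80/deb-local-build/pool/main/"
        "s/systemd/systemd-coredump_247.3-6.stx.3_amd64.deb "
        "Error reading from server - read (104: Connection reset by peer) "
        "[IP: 10.109.51.162 80]"
    )
    print(parse_fetch_error(sample))
```

Counting the parsed results by errno (104 for "Connection reset by peer") or by package over several builds would show whether the failures cluster on particular files or are spread evenly.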

Test Activity
-------------
N/A

Workaround
----------
Try the build again, or build in minikube.
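
The "try again" workaround can be automated with a small retry wrapper. This is only a sketch under stated assumptions: `retry` and `run_image_build` are hypothetical helpers, and the command passed to `run_image_build` is whatever you normally run to build the image (this report does not name it).

```python
import subprocess
import time


def retry(action, attempts=3, delay=1.0, backoff=2.0, sleep=time.sleep):
    """Call action() until it returns True or the attempts run out.

    Returns the 1-based number of the successful attempt, or None if
    every attempt failed. The wait between attempts grows by `backoff`.
    """
    for attempt in range(1, attempts + 1):
        if action():
            return attempt
        if attempt < attempts:
            sleep(delay)
            delay *= backoff
    return None


def run_image_build(command):
    """Run the image build command (a list of arguments); True on exit code 0."""
    return subprocess.run(command).returncode == 0


# Usage (hypothetical; substitute your actual image build command):
#   retry(lambda: run_image_build(["<your-image-build-command>"]), attempts=3, delay=30)
```

Retrying hides the symptom rather than fixing the connection resets, so it is worth logging how many attempts each build needed.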

Davlet Panech (dpanech) wrote :

I can reproduce this problem in vanilla k8s approximately 30% of the time.

I haven't specifically tried to reproduce it in minikube, but nobody has complained about minikube so far.

I tried setting the "net.netfilter.nf_conntrack_tcp_be_liberal=1" sysctl on the host, as described here: https://technology.lastminute.com/chasing-k8s-connection-reset-issue/ -- it doesn't seem to help.
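
To confirm that the setting is actually in effect on the host, the value can be read back from /proc/sys. A minimal sketch, assuming a Linux host: the helper names are illustrative, and the netfilter entry is normally present only when the nf_conntrack module is loaded.

```python
from pathlib import Path


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its file under /proc/sys."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the current value as a string, or None if it cannot be read."""
    try:
        return sysctl_path(name, root).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    key = "net.netfilter.nf_conntrack_tcp_be_liberal"
    print(key, "=", read_sysctl(key))
```

A result of None means the key is missing or unreadable on that host, in which case the setting was never applied.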

Davlet Panech (dpanech) wrote :

Note: our k8s cluster has only one node.

Ghada Khalil (gkhalil)
tags: added: stx.7.0 stx.build stx.debian
Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
Anthony Nowell (anowell1) wrote :

Per Davlet, this has not recurred recently, which is a big change from failing 30% of the time 2 months ago. It's not clear which change resolved it, but there has been a general push to improve build reliability over the past 2 months.

Changed in starlingx:
status: Triaged → Won't Fix
status: Won't Fix → Fix Released