base system installation is not robust against transient network failures
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
base-installer (Debian) | New | Unknown | |
base-installer (Ubuntu) | Triaged | Medium | Unassigned |
debootstrap (Debian) | Fix Released | Unknown | |
debootstrap (Ubuntu) | Fix Released | Medium | Unassigned |
Bug Description
[This bug was originally reported by Gary Potwin in https:/
I have a Supermicro P6SBA motherboard with a 700 MHz Pentium III, 512 MB of RAM, 20 GB and 120 GB hard drives, and a DSL Internet connection.
This system has been running Windows 98 for years, and I wanted to try Ubuntu 10.04.
Using the network kernel and initrd, the system boots OK and downloads the rest of the installer OK from the default mirror (us.archive.
Everything seems OK until I try to load the base system.
After downloading files for about 3.5 minutes (often when it is trying to get the file libklibc), I see the network activity stop, and soon after I get an error message stating that it has failed to load that file (all others up to that point were OK).
After about 2 more minutes, during which time one or more additional files fail to load, the network activity goes back to normal, and all the remaining files for the base system download OK.
Due to the failed files, I get the error message that the base system has failed to install.
I did successfully download the failed files using wget into /target/
The system still thinks that the files were not successfully downloaded, and I don't know how to tell it that they are there and OK. I have tried many different mirrors at different times of the day, over both HTTP and FTP; all fail as above.
Using a similar technique, I was able to successfully load Debian 5.08, so I think the hardware is OK, but I would really like to try Ubuntu.
To try to rule out any problem with the DSL, I downloaded a very large file that took 10 minutes of continuous running under Windows.
Then I went back to the 10.04 install and did the same thing using wget at a console, just after partitioning the hard drive and just before starting the base install, and it worked fine. While loading the installer, one file did fail to load, but I was given the opportunity to retry, which took care of the problem.
I wish the base install allowed retries instead of just "go back" and "continue", which don't seem to make any additional attempt to retry.
Any help would be appreciated.
Gary
description: updated
affects: ubiquity (Ubuntu) → debootstrap (Ubuntu)
Changed in base-installer (Debian):
status: Unknown → New
Changed in debootstrap (Ubuntu):
status: Triaged → In Progress
Changed in debootstrap (Debian):
status: Unknown → New
Changed in debootstrap (Debian):
status: New → Fix Released
This is a complex problem that has been known in the Debian installer since at least 2004. I'm going to try to break it down here in the hope of making some progress on it.
1. Download error handling in debootstrap is arranged wrongly
In particular, it doesn't deal correctly with corrupted files, and will tend to muddle on until something fails as a consequence of the corruption. In some cases it's possible for debootstrap to complete successfully despite a corrupted download! There's a patch in http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=618920 that improves things, although I've been working on a better version of it.
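To illustrate the kind of check that catches a corrupted download at retrieval time rather than later, here is a minimal Python sketch. It is not debootstrap's code (debootstrap is a shell script) and it is not the patch from the Debian bug; the function names `file_matches` and `fetch_verified` are hypothetical, and it assumes the expected size and SHA-256 digest are already known, for example from the archive's package index.

```python
import hashlib
import os


def file_matches(path, expected_size, expected_sha256):
    """Return True only if the file exists and matches both size and digest."""
    if not os.path.isfile(path):
        return False
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large packages are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256


def fetch_verified(fetch, path, expected_size, expected_sha256):
    """Run the caller-supplied fetch(path), then verify the result.

    A file that fails verification is deleted, so a truncated or corrupted
    download is reported as a failure instead of being unpacked later.
    """
    try:
        fetch(path)
    except OSError:
        pass  # treat a failed transfer like a bad file; verification decides
    if file_matches(path, expected_size, expected_sha256):
        return True
    if os.path.exists(path):
        os.remove(path)
    return False
```

The point of the sketch is only the ordering: verify immediately after each download and fail there, so a bad file can never reach the unpack stage.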
2. No retry option
As Joey notes in http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=283600, there's only a fairly limited communication channel between debootstrap (which is a separate tool invoked by the installer to do the hard work) and the parts of the installer that can actually interact with the user. This means that it's hard to set up a "retry" option that just retries a single download, because debootstrap doesn't wait for user interaction on errors and it would be a substantial amount of work to rearrange it to do so.
What we might be able to do is as follows: if debootstrap fails at the retrieval stage before it actually starts unpacking anything, then we could offer an option that simply tries the whole thing again, keeping the previous contents of /target (so that would also preserve anything you'd wgetted by hand, but it would also try to redownload any other missing files for itself). This is a little less neat, but would do the job. In fact, if we borrowed some ideas from net-retriever, we could even let you choose a different mirror.
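The "retry the whole retrieval stage, keeping what is already there" idea can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not installer or debootstrap code: `retrieve_all`, `fetch`, `is_valid` and `max_passes` are hypothetical names, and the loop over passes stands in for the user choosing "retry" (and possibly a different mirror) after each failed pass.

```python
import os


def retrieve_all(wanted, target_dir, fetch, is_valid, mirrors, max_passes=3):
    """Retry the whole retrieval stage, reusing files that are already valid.

    wanted:   list of file names (relative to target_dir) to retrieve
    fetch:    callable(mirror, name, dest); may raise OSError on failure
    is_valid: callable(dest) -> bool, e.g. a size and checksum check
    mirrors:  mirrors to cycle through, one per pass
    Returns the list of names still missing after the last pass.
    """
    missing = list(wanted)
    for attempt in range(max_passes):
        mirror = mirrors[attempt % len(mirrors)]
        still_missing = []
        for name in missing:
            dest = os.path.join(target_dir, name)
            if is_valid(dest):
                continue  # keep earlier downloads, including hand-fetched ones
            try:
                fetch(mirror, name, dest)
            except OSError:
                pass  # transient failure; the file is rechecked just below
            if not is_valid(dest):
                still_missing.append(name)
        missing = still_missing
        if not missing:
            break
    return missing
```

Because each pass only looks at files that are absent or invalid, a pass after a short network outage fetches just the handful of files that failed, and nothing is unpacked until the returned list is empty.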