smart update crashes on first run with many channels

Bug #244605 reported by Rehan Khan
This bug affects 2 people
Affects: Smart Package Manager
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Imported: http://tracker.labix.org/issue113

Reason: Review

further details: https://blueprints.launchpad.net/smart/+spec/bug-reporting-migration

msg1144 (view) Author: chekov Date: 2007-05-22.19:53:52

Here is a snippet with the whole error. To get this I do a fresh smart install,
add some 12 channels and sync to a mirror file. The "base" channel here is
probably the 9th channel it starts to update, so there would still be some left...

Fetching information for 'CentOS-5 - Base OS - i386'...
-> http://mirrors.kernel.org/centos/5/os/i386/repodata/repomd.xml
repomd.xml ######################################## [ 93%]
-> http://mirrors.kernel.org/centos/5/os/i386/repodata/filelists.xml.gz
-> http://mirrors.kernel.org/centos/5/os/i386/repodata/primary.xml.gz
primary.xml.gz ######################################## [ 96%]
filelists.xml.gz ######################################## [100%]

Fetching information for 'CentOS-5 - Updates'...
Unhandled exception in thread started by <bound method URLLIBHandler.fetch of <smart.fetcher.URLLIBHandler object at 0xb7bcfb8c>>
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/smart/fetcher.py", line 1243, in fetch
    item.setSucceeded(localpath, fetchedsize)
  File "/usr/lib/python2.4/site-packages/smart/fetcher.py", line 558, in setSucceeded
    self._progress.setSubDone(self._urlobj.original)
  File "/usr/lib/python2.4/site-packages/smart/progress.py", line 250, in setSubDone
    (subcurrent, subtotal,
KeyError: 'http://mirrors.kernel.org/centos/5/updates/i386/repodata/repomd.xml'

I re-run the exact same routine but without mirrors and it works fine...
-alan

msg1143 (view) Author: niemeyer Date: 2007-05-22.13:36:49

More information is needed to diagnose the problem. At least a full traceback
that says where it actually stopped.

msg1139 (view) Author: chekov Date: 2007-05-21.07:27:30

This is still an issue with smart .50 on a brand new centos5 box.
full error reads:
"Unhandled exception in thread started by <bound method URLLIBHandler.fetch of
<smart.fetcher.URLLIBHandler object at 0xb7bb2bec>>
Traceback (most recent call last):
  File "usr/lib/python2.4/site-packages/smart/fetcher.py", line 1243, in fetch
     item.setSucceeded(localpath, fetchedsize)"
...
and it goes on from there. It always happens when smart has to fall back to
mirrors to retrieve the metadata files, often enough that the counter goes over
100%...
-alan

msg1083 (view) Author: chekov Date: 2007-01-28.20:13:16

Hi David,

yes, on a brand new Fedora Core 6 box with smart 0.42 (40.fc6)

to be specific I get a
"Unhandled exception in thread started by <bound method URLLIBHandler.fetch
of <smart.fetcher.URLLIBHandler object at 0xb7a92fac>>"
As soon as I get my first 102% marker it crashes like that.
-alan

On Sun, 29 Oct 2006, David Farning at Labix Tracker wrote:

>
> David Farning <email address hidden> added the comment:
>
> chekov,
>
> Is this error reproducible in version 0.42?
>
> What distro and release are you using?
>
> Thanks
> Dave

msg935 (view) Author: dfarning Date: 2006-11-15.21:46:56

There has been a discussion about this issue on the mailing list recently.
Can't find the post right now.

Someone on dial-up has noted that under certain circumstances a package can
start downloading on two mirrors at a time.

Sounds like the fetcher is appending a retry to the queue without removing the
current download.

Dave.

msg849 (view) Author: dfarning Date: 2006-10-29.04:28:48

chekov,

Is this error reproducible in version 0.42?

What distro and release are you using?

Thanks
Dave

msg368 (view) Author: chekov Date: 2006-02-13.04:02:41

more info:
the number of channels does not seem to matter at all, except that the more
channels you have the more likely you are to use a mirror.
This seems to be a bug strictly in how smart gets channel update information
using mirrors. Even updating only 2 channels it crashes if more than one of the
files needed fails on the first mirror.
This is a hard crash (hangs smart, requires a kill -9) and replicating it
depends on one of the mirrors failing.
-alan

msg351 (view) Author: chekov Date: 2006-02-10.20:25:35

The problem seems to be related to mirrors... if a mirror fails to give me a
file, the updater goes to the next mirror and in some cases might try 4-5 before
being happy. This adds up the percentages on the right even though it is still
the same file. Eventually it gets over 100% and crashes.
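
A toy illustration of the arithmetic described above, not Smart's actual code: if every download attempt adds to the completed count while the total only counts each file once, retries on other mirrors push the percentage past 100. The function name and the numbers are made up for illustration.

```python
def percent_done(attempts_per_file):
    """Return the percentage shown if every attempt is counted as progress.

    attempts_per_file: number of download attempts each file needed.
    """
    total = len(attempts_per_file)    # each file is counted once in the total
    current = sum(attempts_per_file)  # but every attempt bumps the counter
    return 100 * current // total


# 10 files, all fetched on the first try.
no_retries = percent_done([1] * 10)

# The same 10 files, but two of them needed a second mirror.
with_retries = percent_done([1] * 8 + [2] * 2)

print(no_retries, with_retries)
```

With no retries the counter stops exactly at the total; any retried file pushes it over.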

msg344 (view) Author: chekov Date: 2006-02-09.21:00:39

I add about 18 channels to start before updating it for the first time. As of
version .40+ this started causing smart to crash during the update.

Smart will begin doing the updates just fine but keeps going and eventually
passes 100% on the completed indicator. Then it throws an "Unhandled exception"
error started by <bound method FTPHandler.fetch>. The most recent call trace goes
fetcher.py line 1048 (fechitem.setSucceeded), fetcher.py line 547
(self._progress.setSubDone), and progress.py line 250 (setSubDone).

Sorry, I don't have cut & paste ability since I'm doing this in a kickstart
environment.
The error does not occur if you run smart update again (even if it crashed the
first time), and it can be avoided by updating all channels individually.
However, it does kill my kickstart as is.
-alan

Tags: mandriva
David Smid (dsmid) wrote :

I'd like to fix this bug but don't know how to do it right.

Apparently, I can use this to silence the error:

    def setSubDone(self, subkey):
...
            if subkey not in self.__subprogress:
                return
...

or:
...
           (subcurrent, subtotal,
             fragment, subdata) = self.__subprogress.setdefault(subkey, (0, 0, 0, {}))
...
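
To make the difference between the two workarounds concrete, here is a small sketch on a plain dict. It is not Smart's code: the function names are hypothetical, and only the record shape `(0, 0, 0, {})` is taken from the snippet above. The first variant leaves the dictionary untouched when the key is missing; the second inserts an empty placeholder record.

```python
def lookup_or_skip(subprogress, subkey):
    """First variant: return None and leave the dict untouched if the key is missing."""
    if subkey not in subprogress:
        return None
    return subprogress[subkey]


def lookup_or_default(subprogress, subkey):
    """Second variant: insert an empty placeholder record if the key is missing."""
    return subprogress.setdefault(subkey, (0, 0, 0, {}))


skipped = {}
defaulted = {}
skip_result = lookup_or_skip(skipped, "repomd.xml")
default_result = lookup_or_default(defaulted, "repomd.xml")

print(skip_result, skipped)
print(default_result, defaulted)
```

Both avoid the KeyError, but neither addresses why the record was never created in the first place.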

But I feel the cause of this error is elsewhere, maybe in Fetcher.run(), where item.setNextURL() is called.
FetcherItem.setNextURL() calls self._urlobj.set() with the next mirror URL, and that also sets self._urlobj.original to the given URL (why?).
But self._urlobj.original is passed as a parameter to Progress.setSubDone(), where it is used as a key in the __subprogress dictionary.
If FetcherItem.__progress.setSubDone() is called before FetcherItem.progress() or FetcherItem.updateSpeed(), the dictionary item with key self._urlobj.original has not been created yet and a KeyError is thrown.

Does this make sense, or am I missing something?

David Smid (dsmid) wrote :

I was wrong. There's nothing wrong with FetcherItem.setNextURL().

The problem is as follows:
When a FetcherHandler downloads a file, the overall progress is incremented. If the file doesn't validate (invalid checksum), Fetcher tries to download the same file from another mirror.
This new download increments the overall progress as well, so one file causes the progress to be incremented twice.
When 'smart update' is near the end of all downloads, the current progress equals the total but there are still some files left.
When progress hits 100%, the __done flag is set, and that prevents Progress.setSub() from creating a subprogress record for newly downloaded files. When FetcherItem.setSucceeded() is called upon a successful download, it tries to call Progress.setSubDone(), but since the subprogress item doesn't exist in the dictionary, a KeyError is thrown.
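
The sequence above can be reproduced with a toy model. This is only a sketch under assumptions: ToyProgress, download() and simulate() are hypothetical stand-ins rather than Smart's real classes, the URLs are made up, and the exact order in which the real code registers and counts a download may differ.

```python
class ToyProgress:
    """Hypothetical stand-in for the progress tracker; not Smart's real class."""

    def __init__(self, total):
        self.total = total
        self.current = 0
        self.done = False
        self.subprogress = {}  # subkey -> (subcurrent, subtotal)

    def add(self, amount):
        # Every download attempt bumps the overall counter.
        self.current += amount
        if self.current >= self.total:
            self.done = True

    def set_sub(self, subkey, subcurrent, subtotal):
        # Once the overall progress is flagged done, new records are ignored.
        if self.done:
            return
        self.subprogress[subkey] = (subcurrent, subtotal)

    def set_sub_done(self, subkey):
        # Plain lookup: raises KeyError if set_sub() never stored the key.
        subcurrent, subtotal = self.subprogress[subkey]
        self.subprogress[subkey] = (subtotal, subtotal)


def download(progress, url):
    """Simulate one download attempt: register the sub-item and count it."""
    progress.set_sub(url, 0, 1)
    progress.add(1)


def simulate():
    progress = ToyProgress(total=2)  # two files expected
    # File A from the first mirror: counted, but the checksum is bad.
    download(progress, "mirror1/a.xml")
    # File A again from a second mirror: counted a second time.
    download(progress, "mirror2/a.xml")
    progress.set_sub_done("mirror2/a.xml")
    # current == total now, so done is set although file B is still pending.
    download(progress, "mirror1/b.xml")  # set_sub() is silently ignored
    try:
        progress.set_sub_done("mirror1/b.xml")
    except KeyError as exc:
        return progress.current, progress.total, exc.args[0]
    return progress.current, progress.total, None


result = simulate()
print(result)
```

In this model the counter ends above the total and the last file's key is the one reported missing, which matches the shape of the traceback in the report.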

David Smid (dsmid) wrote :

Possible fix in lp:~dsmid/smart/keyerror

tags: added: mandriva