[MASTER] aptitude/apt-get segmentation fault on corrupted cache

Bug #16467 reported by samsaga2
This bug affects 8 people
Affects        Status         Importance   Assigned to   Milestone
apt (Debian)   Fix Released   Unknown
apt (Ubuntu)   Fix Released   Medium       Unassigned

Bug Description

I cannot install or update anything. I always get a segmentation fault.

# gdb apt-get
GNU gdb 6.1-debian
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-linux"...(no debugging symbols found)...Using
host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".

(gdb) run dist-upgrade
Starting program: /usr/bin/apt-get dist-upgrade
(no debugging symbols found)...(no debugging symbols found)...(no debugging
symbols found)...(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
Reading package lists... Done
Building dependency tree... 0%
Program received signal SIGSEGV, Segmentation fault.
0x429c440e in pkgDepCache::StateCache::Update () from
/usr/lib/libapt-pkg-libc6.3-5.so.3.3
(gdb) bt
#0 0x429c440e in pkgDepCache::StateCache::Update () from
/usr/lib/libapt-pkg-libc6.3-5.so.3.3
#1 0x429c1cf4 in pkgDepCache::Init () from /usr/lib/libapt-pkg-libc6.3-5.so.3.3
#2 0x429e1f71 in pkgCacheFile::Open () from /usr/lib/libapt-pkg-libc6.3-5.so.3.3
#3 0x0805e03f in std::vector<pkgSrcRecords::Parser::BuildDepRec,
std::allocator<pkgSrcRecords::Parser::BuildDepRec> >::_M_insert_aux ()
#4 0x0805760d in ?? ()
#5 0x429b7871 in CommandLine::DispatchArg () from
/usr/lib/libapt-pkg-libc6.3-5.so.3.3
#6 0x0805d55e in ?? ()
#7 0x4102d7b6 in __libc_start_main () from /lib/tls/i686/cmov/libc.so.6
#8 0x41147edc in ?? () from /lib/tls/i686/cmov/libc.so.6
#9 0x41015020 in ?? () from /lib/ld-linux.so.2
#10 0x0805f58c in
std::__uninitialized_copy_aux<__gnu_cxx::__normal_iterator<pkgSrcRecords::Parser::BuildDepRec*,
std::vector<pkgSrcRecords::Parser::BuildDepRec,
std::allocator<pkgSrcRecords::Parser::BuildDepRec> > >,
__gnu_cxx::__normal_iterator<pkgSrcRecords::Parser::BuildDepRec*,
std::vector<pkgSrcRecords::Parser::BuildDepRec,
std::allocator<pkgSrcRecords::Parser::BuildDepRec> > > > ()
Previous frame inner to this frame (corrupt stack?)

Tags: metabug
Matt Zimmerman (mdz) wrote :

This generally means that your package database is corrupt.

sudo rm -f /var/cache/apt/*.bin

Michael Vogt (mvo) wrote :

Did removing the package database's cache file fix the problem? (Don't worry
about the rm command; it only removes cached data, which will be rebuilt on the
next run of apt.)

samsaga2 (samsaga2) wrote :

The rm solution works fine, but instead of a segmentation fault it would be
nice to get an error message.

Michael Vogt (mvo) wrote :

This is not easy to do because the cache is mmapped directly. If any
filesystem or memory problem comes up, libapt will just crash. A possible fix
would be to write a checksum (md5/sha1) when the cache is closed and re-check
that checksum when the cache is opened again. The performance hit is probably
pretty bad, though.
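
A minimal sketch of the checksum idea described above, written in Python rather than apt's C++ and not taken from apt's source. The sidecar `.sha1` file, the function names, and the demo file name are assumptions made for illustration only.

```python
import hashlib
import os
import tempfile


def write_cache_with_checksum(path, payload):
    """Write the cache payload and store its SHA-1 digest in a sidecar file."""
    with open(path, "wb") as f:
        f.write(payload)
    with open(path + ".sha1", "w") as f:
        f.write(hashlib.sha1(payload).hexdigest())


def cache_is_intact(path):
    """Re-hash the whole cache file and compare it with the stored digest."""
    try:
        with open(path, "rb") as f:
            actual = hashlib.sha1(f.read()).hexdigest()
        with open(path + ".sha1") as f:
            expected = f.read().strip()
    except OSError:
        return False
    return actual == expected


def demo_checksum():
    """Return (intact before truncation, intact after truncation)."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "cache.bin")  # hypothetical file name
        write_cache_with_checksum(path, b"package data" * 1000)
        before = cache_is_intact(path)
        # Simulate an incomplete write, e.g. power loss during an update.
        with open(path, "r+b") as f:
            f.truncate(100)
        after = cache_is_intact(path)
    return before, after
```

Note that `cache_is_intact` has to read the entire file on every open, which is the performance cost the comment above worries about.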

Michael Vogt (mvo)
Changed in apt:
assignee: mvo → nobody
importance: Wishlist → Medium
Celso Providelo (cprov) wrote : Re: [MASTER] aptitude/apt-get segmentation fault on currupted cache

I have similar problems in jaunty/x86_64, http://pastebin.ubuntu.com/147947/

It gets fixed for a while when I remove and rebuild the caches, but it happens again after a short interval.

greenmoss (ktyubuntu) wrote :

It is unreasonable to expect users to manually fix apt-get's cache files when they become corrupted. As a long-time apt-get user who has encountered this bug frequently, I can attest to its age, and am surprised it has not been fixed after all these years. For proof, simply see this bug report from 2001: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=93453

At minimum, apt-get should catch the corruption and exit gracefully whilst informing the user how to fix it.

Better would be for apt-get to warn of the corrupted file and fix it automatically.

Best would be to fix the bug(s) causing the corruption, thus avoiding this issue entirely.

As to its difficulty: that just makes it more challenging and rewarding to fix, right? ;)

Storm Knight (stormyk-2000) wrote :

This problem also occurs when the apt cache directory is shared by multiple machines and more than one machine tries to update at the same time. It clears once the directory is back down to a single accessor.

And before it's said: yes, setting up a repository is the preferred option.

pirx67 (pirx67) wrote :

My apt-get also crashes in a similar way. I described my problems in comment #6 of bug report https://bugs.launchpad.net/ubuntu/+source/apt/+bug/94590.

This kind of bug needs a higher priority, judging from the many duplicates attached to related bugs such as
https://bugs.launchpad.net/ubuntu/+source/apt/+bug/119707 and
https://bugs.launchpad.net/ubuntu/+source/apt/+bug/94590

papukaija (papukaija)
tags: added: metabug
Changed in apt (Debian):
status: Unknown → Incomplete
rusivi2 (rusivi2-deactivatedaccount) wrote :

Thank you for reporting this bug.

Does this occur in Lucid?

Changed in apt (Ubuntu):
status: Confirmed → Incomplete
hackel (hackel) wrote :

I just experienced this issue on 10.10... Sadly, this very serious bug is still very much alive. I hope to see the priority raised. I have retained the corrupt package cache files if any further information is required.

Julian Andres Klode (juliank) wrote :

Not a bug in apt. While we could write a checksum or do something similar, this would completely defeat the purpose of the cache (performance).

Changed in apt (Ubuntu):
status: Incomplete → Invalid
greenmoss (ktyubuntu) wrote :

It's a bug *somewhere*. Since the problem still exists, the bug report is not "invalid".

Changed in apt (Ubuntu):
status: Invalid → Incomplete
Julian Andres Klode (juliank) wrote :

There is no bug anywhere. In the best case, this would be a feature request. And we have a feature in planning involving a footer on the cache and storing the cache size in the header. This way, we will know if the cache is too small or if the last 4 bytes are corrupt. We cannot check that the whole cache is correct; this is impossible. We could add a checksum, but we have tested it and it is much too slow (it could be added as an apt-cache check option).

The new features may land in oneiric, but this is no promise. We have hundreds of bugs reported against APT, and this thing is one of the least important ones I could imagine.
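
The size-in-header check described above could look roughly like the following Python sketch. The magic value, the header layout, and the function names are hypothetical and do not reflect apt's actual cache format. Only the idea of comparing a recorded size against the real file size comes from this thread.

```python
import os
import struct
import tempfile

MAGIC = b"HYPO"                 # hypothetical magic value, not apt's
HEADER = struct.Struct("<4sQ")  # illustrative layout: magic + total file size


def write_cache(path, payload):
    """Write a header that records the total file size, then the payload."""
    total = HEADER.size + len(payload)
    with open(path, "wb") as f:
        f.write(HEADER.pack(MAGIC, total))
        f.write(payload)


def cache_looks_complete(path):
    """Cheap check: the file must be at least as large as its header claims."""
    try:
        actual = os.path.getsize(path)
        with open(path, "rb") as f:
            raw = f.read(HEADER.size)
    except OSError:
        return False
    if len(raw) < HEADER.size:
        return False
    magic, recorded = HEADER.unpack(raw)
    return magic == MAGIC and actual >= recorded


def demo_size_check():
    """Return (complete before truncation, complete after truncation)."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "cache.bin")
        write_cache(path, b"x" * 4096)
        before = cache_looks_complete(path)
        with open(path, "r+b") as f:
            f.truncate(1024)  # simulate a cache that was not fully written
        after = cache_looks_complete(path)
    return before, after
```

Unlike a full checksum, this reads only a few header bytes, so it catches truncation cheaply but cannot detect corruption in the middle of the file.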

Changed in apt (Ubuntu):
status: Incomplete → In Progress
Julian Andres Klode (juliank) wrote :

As you're so aggressively trying to keep this bug open, I will just mark all other segfaults potentially caused by invalid caches as duplicates.

greenmoss (ktyubuntu) wrote :

Please leave your snarks at the door. This problem has been around for ages, as I pointed out earlier. It should be fixed. Arguing about the semantics is unhelpful. Any constructive suggestions you can offer to get it fixed would be appreciated.

Sjors Gielen (sgielen) wrote :

The feature suggested by Julian Andres Klode sounds like a good fix for this: invalidating the cache if one of those values (magics or sizes) in the header or footer is incorrect. However, I'm wondering how an invalid cache leads to a segmentation fault. Are memory locations or even binary instructions saved to the cache, or are the values simply used without checking them? I know that this is a cache, but wouldn't some extra checks cost only a little performance relative to the benefit of fixing this and many related crashes?

Julian Andres Klode (juliank) wrote :

The cache is mmap()ed; it becomes part of the program's memory space. Such crashes happen very very very very very very very very rarely. They can only happen if the cache was not completely written, i.e. power loss during update, filesystem bug, hardware bug. Sometimes, they might be caused by apt issues, but that's probably less than 0.01% of the cases.

Adding the extra checks is a matter of minutes, but it might take some months until we do this, as they require an ABI break.
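
To illustrate how a memory-mapped file with stored offsets can lead a reader past the end of the mapping, here is a small Python sketch. The file layout (a count, an offset table, then records) and the function names are invented for this example and are not apt's cache format. In Python an unchecked out-of-range read raises an exception, whereas a C++ program following the same offset as a raw pointer may receive SIGSEGV or SIGBUS.

```python
import mmap
import os
import struct
import tempfile


def read_record(path, index):
    """Look up record `index` through an offset table at the start of the file.

    Hypothetical layout: u32 count, then `count` u32 offsets, each pointing
    at a u32 record elsewhere in the file.
    """
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        if len(m) < 4:
            raise ValueError("cache too small for header")
        (count,) = struct.unpack_from("<I", m, 0)
        if index >= count or 4 + 4 * count > len(m):
            raise ValueError("offset table out of bounds")
        (offset,) = struct.unpack_from("<I", m, 4 + 4 * index)
        # Bounds check before following an offset read from the file itself.
        if offset + 4 > len(m):
            raise ValueError("record offset points past end of cache")
        (value,) = struct.unpack_from("<I", m, offset)
        return value


def demo_mmap():
    """Return (record read from intact file, error caught after truncation)."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "cache.bin")
        with open(path, "wb") as f:
            f.write(struct.pack("<III", 2, 12, 16))  # count + two offsets
            f.write(struct.pack("<II", 111, 222))    # the two records
        ok = read_record(path, 1)
        with open(path, "r+b") as f:
            f.truncate(16)                           # lose the last record
        try:
            read_record(path, 1)
            caught = False
        except ValueError:
            caught = True
    return ok, caught
```

The offset table in the truncated file is still intact, so only the explicit bounds check notices that the second record is gone.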

summary: - [MASTER] aptitude/apt-get segmentation fault on currupted cache
+ [MASTER] aptitude/apt-get segmentation fault on corrupted cache
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package apt - 0.8.16~exp5ubuntu1

---------------
apt (0.8.16~exp5ubuntu1) oneiric; urgency=low

  * merged new version from debian/experimental, this includes
    a ABI break and two new library packages

apt (0.8.16~exp5) UNRELEASED; urgency=low

  * apt-pkg/makefile:
    - install sha256.h compat header
  * apt-pkg/pkgcachegen.{cc,h}:
    - use ref-to-ptr semantic in NewDepends() to ensure that the
      libapt does not segfault if the cache is remapped in between
      (LP: #812862)

apt (0.8.16~exp4) experimental; urgency=low

  [ Julian Andres Klode ]
  * apt-pkg/pkgcache.h:
    - [ABI break] Add pkgCache::Header::CacheFileSize, storing the cache size
  * apt-pkg/pkgcachegen.cc:
    - Write the file size to the cache
  * apt-pkg/pkgcache.cc:
    - Check that cache is at least CacheFileSize bytes large (LP: #16467)

  [ Michael Vogt ]
  * merged latest fixes from debian-sid
  * apt-pkg/cdrom.{cc,h}:
    - cleanup old ABI break avoidance hacks
  * [ABI break] apt-pkg/acquire-item.{cc,h}:
    - cleanup around OptionalIndexTarget and SubIndexTarget
  * [ABI break] merged patch from Jonathan Thomas to have a new
    RecordField() function in the pkgRecorder parser. Many thanks
    Thomas
  * [ABI break] merge patch from Jonathan Thomas to speed up the
    depcache by caching the install-recommends and install-suggests
    values
  * apt-pkg/contrib/fileutl.{cc,h}:
    - add GetModificationTime() helper
  * apt-pkg/pkgcachegen.cc:
    - regenerate the cache if the sources.list changes to ensure
      that changes in the ordering there will be honored by apt
  * apt-pkg/sourcelist.{cc,h}:
    - add pkgSourceList::GetLastModifiedTime() helper

apt (0.8.16~exp3) experimental; urgency=low

  [ David Kalnischkies ]
  * apt-pkg/pkgcache.h:
    - readd All{Foreign,Allowed} as suggested by Julian to
      remain strictly API compatible
  * apt-pkg/acquire*.{cc,h}:
    - try even harder to support really big files in the fetcher by
      converting (hopefully) everything to 'long long' (Closes: #632271)
  * ftparchive/writer.cc:
    - generate all checksums in one run over the file for Release
  * cmdline/apt-get.cc:
    - add an --assume-no option for testing to say 'no' to everything
  * apt-pkg/deb/debmetaindex.cc:
    - add trusted=yes option to mark unsigned (local) repository as trusted
      based on a patch from Ansgar Burchardt, thanks a lot! (Closes: #596498)

  [ Michael Vogt ]
  * merge fixes from the debian/unstable upload
  * merge lp:~mvo/apt/sha512-template to get fixes for the
    sha1/md5 verifiation (closes: #632520)

apt (0.8.16~exp2) experimental; urgency=low

  [ David Kalnischkies ]
  * [ABI-Break] Implement EDSP in libapt-pkg so that all front-ends which
    use the internal resolver can now be used also with external
    ones as the usage is hidden in between the old API
  * provide two edsp solvers in apt-utils:
    - 'dump' to quickly output a complete scenario and
    - 'apt' to use the internal as an external resolver
  * apt-pkg/pkgcache.h:
    - clean up mess with the "all" handling in MultiArch to
      fix LP: #733741 cleanly for everyone now
  * apt-pkg/depcache.cc:
    - use a boolean instea...


Changed in apt (Ubuntu):
status: In Progress → Fix Released
Changed in apt (Debian):
status: Incomplete → New
Changed in apt (Debian):
status: New → Fix Released