high memory consumption when indexing library

Bug #801700 reported by ocknoz
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Mixxx
Fix Released
Medium
Daniel Schürmann
1.11
Fix Released
Medium
Daniel Schürmann

Bug Description

When launching Mixx for the first time after installation I let it import my music library. During the import process Mixx continued to use up more and more memory (almost 4gb).
It didn't release any memory, even after importing was done.
When the import was finished I restarted Mixx and it used only 120mb.

Mixx 1.9.0 x64
Library: 34,500 tracks, 200gb.
System: Win7 64bit, i7 950 3GHz, 12gb RAM (see also attached DxDiag.txt)
Memory consumption: http://img13.imageshack.us/img13/228/mixxram.png

Revision history for this message
ocknoz (chef-007) wrote :
Revision history for this message
Daniel Schürmann (daschuer) wrote :

The bug has something to do with the patch from Bug #604018

The library scanner collects all tracks in ram and commits them to the database at one.
The scanner itself releases the memory after commit. Bud there is also a track cache, maybe this still has the memory.

My valgrind analysis seems not to report a report a memory leak from this issue.

I think it is a good idea to limit the memory usage of the library scanner.

How does the track cache work? Is there also a memory limit?

Changed in mixxx:
status: New → Confirmed
Revision history for this message
Daniel Schürmann (daschuer) wrote :

The Track cache does not hold the memory because it is simply not used. I don't know why the memory is not reported as freed after the scanner finished. Maybe it is caused on Win7 by a kind of page allocator which holds the memory for later use.

But anyway, with the attached patch, the library scanner allocates only ram for one track without worth performance.
It even performs 7% faster on my eeePC with ubuntu lucid.

I have also added a window icon for the scanner dialog.

Changed in mixxx:
status: Confirmed → In Progress
Revision history for this message
Daniel Schürmann (daschuer) wrote :

I have reviewed the trackDao and added my results to the patch from #3. My additional changes are not relay a bug so I have decided not to create a new bug Id for it.

Following changes are incuded:
* Tracks are now removed from the weak cache if they are deleted.
* The list of dirty tracks is removed from the trackDao because no one is using it.
* I have made the weak cache static, because I think there should be only one weak cache to share tracks between several instances of trackDao. (Although, today only the track collection is using the cache)
* If a caller gets the track from the weak cache, it is now added to the strong cache again.

Changed in mixxx:
assignee: nobody → Daniel Schürmann (daschuer)
RJ Skerry-Ryan (rryan)
Changed in mixxx:
importance: Undecided → Medium
tags: added: library memory performance
tags: added: scanner
Revision history for this message
Daniel Schürmann (daschuer) wrote :

Attached you find an updated version of the patch for lp:mixxx #2876.
Note: It also suffers bug #870544

Revision history for this message
Daniel Schürmann (daschuer) wrote :

Attached you find an updated and extended patch for lp:mixxx #2915

Following changes are incuded:
* Tracks are now removed from the weak cache if they are deleted.
* The list of dirty tracks is removed from the trackDao because no one is using it.
* I have made the weak cache static, because I think there should be only one weak cache to share tracks between several instances of trackDao. (Although, today only the track collection is using the cache)
* If a caller gets the track from the weak cache, it is now added to the strong cache again.
* BaseTrackCache is updated from LibraryScanner per TrackPointer, smarter fix for bug #870544.
* Moved Tracks are also updated in the BaseTrackCache
* Tracks outside the library path are now verified too, fixes bug #239883.
* Up to 30% faster scanning a new library on my Netbook

Revision history for this message
Daniel Schürmann (daschuer) wrote :

Update v4, v3 was incomplete.

Revision history for this message
Daniel Schürmann (daschuer) wrote :

I have missed to undo the patch from bug #870544.

Revision history for this message
Daniel Schürmann (daschuer) wrote :

There was an unused statement causes a segfault when dragging a track.

RJ Skerry-Ryan (rryan)
Changed in mixxx:
milestone: none → 1.11.0
Revision history for this message
Max Linke (max-linke) wrote :

Does this still occur in the recent alpha version?

I've tried to reproduce it on debian (adding ~5000 songs) and my memory consumption was increased by 10mb.

Revision history for this message
Daniel Schürmann (daschuer) wrote :

Hi Max,

I think the original problem from the bug description consuming almost all RAM is long ago solved and might has nothing to do with my patch. We had a similar issue with the waveforms, maybe that was the original cause.

When locking for a solution for this bug, I have made some useful improvements for Library scanner and track dao.
see #3 #6.

I am afraid the last version of this patch is now conflicting but I have a resolved version in lp:~daschuer/mixxx/daschuers_trunk.
If you like, I can make a fresh patch.

This patch is a Milestone 1.11 patch. I am just waiting for a code review by someone else before commit it to trunk. Will you?

Tank you,

Daniel

Revision history for this message
Max Linke (max-linke) wrote :

Hi

I'd be happy to look over your code. It be nice if you could give me a fresh patch for that.

best Max

Revision history for this message
Daniel Schürmann (daschuer) wrote :

here it is:

Revision history for this message
Max Linke (max-linke) wrote :

I've changed some comments and used QMutexLocker instead of mutex.lock where the mutex doesn't get unlocked immediately, this should be easier to maintain over time. TrackDAO::addTracksAdd now checks if the querys exists and
exits with an error message if they are missing.

But I have some questions to the code in BaseTrackCache?

       if (id > 0) {
  QVector<QVariant>& record = m_trackInfo[id];
  record.resize(numColumns);

  for (int i = 0; i < numColumns; ++i) {
   record[i] = getTrackValueForColumn(pTrack, i);
  }
 }

why has record to be resized? Shouldn't it already have the right size?

Other from that it looks good to me. I haven't done any performance measurements but it looks faster to me.

best Max

Revision history for this message
Daniel Schürmann (daschuer) wrote :

Hi Max,

I like the Idea of using QMutexlLocker, it is always good to use the benefit of c++!
Now the mutex is unlocked later than before in some cases. Please consider to put the critical section in an own code block and use expicit QMutexLocke::unlock() if required.

like
{ // Critical section
     QMutexLocker locker(&m_sTracksMutex);
     ... critical code
}
... non critical code

-------------------

The QVector<QVariant>& record is resized to allocate the needed memory for the columns at once.
So we can later copy the content using record[i] = ...;
Otherwise we had to use record.append which is slower, I think!

I should have put commends in the code.

An additional thought:
record[i] = ...; preforms a deep copy of the QVariant object (QString in most cases).
This should be the most time consuming task in the loop.
You could improve this by giving getTrackValueForColumn a pointer to the just created QVariant record[i], instead of crate a new QVariant on stack and copy its content (twice).

Kind regards,

Daniel

Revision history for this message
Max Linke (max-linke) wrote :

I don't think it is necessary to unlock the mutex manually or put them in an extra scope in the context I'm using them in.

In the new version addTracksPrepare checks if there are unfinished transactions and closes them if they are found and I overloaded getTrackValueForColumn. The new getTrackValueForColumn now gets as additional parameter a reference to the QVariant. I left the old on also in because other functions in BTC also use it and changing them to the new getTrackValueForColumn results in empty rows in the library (I'm not sure why).

Revision history for this message
Daniel Schürmann (daschuer) wrote :

Ok I think with the mutex you are right. They will act only in rare cases and the functions are short.

----
getTrackValueForColumn:

I am not sure if I am right or if the compiler will optimize it anyway:
* QVariant(pTrack->getArtist());
** creates a QVariant object and copies the QString to it.
* track = QVarian(...) copies the Object from the stack to the location of the reference.
So you have still two copies. Please correct me if I am wrong!

I would prefer using
trackValue.setValue(pTrack->getArtist());

(I have changed track to trackValue, because the variable track is a Trackpointer in other functions)

I cannot understand why it does not work for other functions. It should or there is an other error.

Revision history for this message
Max Linke (max-linke) wrote :

Shame on me, I was wrong with the mutexLocker. I have to scope it in TrackDAO::getTrack() so that mixxx won't freeze.

This patches uses setValue,which is indead nicer) and all functions now use getTrackForColumn with the reference call, I probably made an error earlier.

Revision history for this message
RJ Skerry-Ryan (rryan) wrote : Re: [Bug 801700] Re: high memory consumption when indexing library

One other thing to point out -- the original code puts the adding of tracks
in a transaction via ScopedTransaction. Now that the adding happens over
many function calls you will need to either manually start and commit the
transaction or possibly pull the ScopedTransaction into the library scanner
and out of the DAO. (I much prefer ScopedTransaction to manually managing
the transaction because it eliminates the possibility that you've forgotten
to commit or rollback and leave the database in a bad state).

On Thu, Jun 28, 2012 at 12:09 PM, Max Linke <email address hidden>wrote:

> Shame on me, I was wrong with the mutexLocker. I have to scope it in
> TrackDAO::getTrack() so that mixxx won't freeze.
>
> This patches uses setValue,which is indead nicer) and all functions now
> use getTrackForColumn with the reference call, I probably made an error
> earlier.
>
> ** Patch added: "library_scanner_track_dao_v10.patch"
>
> https://bugs.launchpad.net/mixxx/+bug/801700/+attachment/3206846/+files/library_scanner_track_dao_v10.patch
>
> --
> You received this bug notification because you are a member of Mixxx
> Development Team, which is subscribed to Mixxx.
> https://bugs.launchpad.net/bugs/801700
>
> Title:
> high memory consumption when indexing library
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/mixxx/+bug/801700/+subscriptions
>

Revision history for this message
Daniel Schürmann (daschuer) wrote :

Oh yes RJ, I think you have actually found a bug. This must have been happen during my merges.

The sequence of this functions is a transaction:

addTracksPrepare();
addTracksAdd(pTrack, unremove);
addTracksFinish();

So its wrong to use a ScopedTransaction in addTracksPrepare and a pure m_database.commit() in addTracksFinish();

A function like addTracksCancel() which manages the rollback and frees the Queries is not prepared. Do we need one?

A perfect solution for me is to create a own Class "ScopedTracksAddTransaction" which handles these thee function like ScopedTransaction and introduces a addTracksCancel() . But I a agree it is just fine to use ScopedTransaction in each caller and remove transactions from the functions itself. But you must be still careful anyway, because an early return (a missing addTracksFinish) will result in leaking memory.

Revision history for this message
Max Linke (max-linke) wrote :

Now TrackDAO has a pointer to a ScopedTransaction and addTracksPrepare starts the transaction and addTracksFinish commits all changes. If addTracksPrepare discovers any open transactions it will call addTracksFinish to commit and close them. If TrackDAO is deleted any open transactions will also be closed.

I like it this way because when a libraryScan is canceled that way everything that has already been scanned will land in the database. I can also implement stuff that would to a rollback but I don't see a case where we would want to rollback all the changes. The calls of addTracksPrepare and Finish in TrackDAO itself should be uncritical and a rollback after a canceled library scan does not make sense for me.

Revision history for this message
Max Linke (max-linke) wrote :

I should also note that an initial scan of my music was 2 minutes faster with this patch. SSD's should make this even faster, something I doubt for trunk right now since it opens and closes querys a lot during a scan.

Revision history for this message
Daniel Schürmann (daschuer) wrote :

Commited to lp:mixxx/1.11 #3298
Thank you for review!

Changed in mixxx:
milestone: 1.11.0 → none
status: In Progress → Fix Committed
RJ Skerry-Ryan (rryan)
Changed in mixxx:
status: Fix Committed → Fix Released
Revision history for this message
Swiftb0y (swiftb0y) wrote :

Mixxx now uses GitHub for bug tracking. This bug has been migrated to:
https://github.com/mixxxdj/mixxx/issues/5935

lock status: Metadata changes locked and limited to project staff
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Bug attachments

Remote bug watches

Bug watches keep track of this bug in other bug trackers.