Make wiki pages not load every wiki page

Bug #1357402 reported by Paul Everitt
This bug affects 1 person
Affects: KARL3
Status: Won't Fix
Importance: Low
Assigned to: Nat Katin-Borland

Bug Description

Shane wrote:

It seems that wiki pages load the entire wiki on every page load. That's obviously expensive and could be low hanging fruit.

Changed in karl3:
status: New → In Progress
Chris Rossi (chris-archimedeanco) wrote :

Right. It took some staring, but I found it. At some point, years ago, we were asked to use fuzzy matching for wiki links. So, the name someone types for a link might differ from the name of an actual wiki page in some ways but still match. Unfortunately, the code that handles this runs when the wiki page is rendered: for every link, it iterates over all of the pages in the wiki until it finds a match, then inserts a link to the actual wiki page.
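
To make the cost concrete, here is a minimal Python sketch of that kind of render-time scan. It is not the KARL code: `normalize`, `resolve_link`, and `render_cost` are hypothetical names, and the fuzzy rule (ignore case, spaces, and punctuation) is only an assumption about what "fuzzy" means here.

```python
import re


def normalize(name):
    """Hypothetical fuzzy key: lowercase, keep only ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def resolve_link(link_text, wiki_pages):
    """Scan pages in order until one matches.

    Returns (matching page name or None, number of pages examined).
    """
    wanted = normalize(link_text)
    examined = 0
    for page_name in wiki_pages:
        examined += 1
        if normalize(page_name) == wanted:
            return page_name, examined
    return None, examined


def render_cost(links, wiki_pages):
    """Total pages examined while resolving every link on one page."""
    return sum(resolve_link(link, wiki_pages)[1] for link in links)
```

In this sketch a link that matches nothing examines every page, so a page with many links in a large wiki approaches links x pages lookups per render.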

Arguably, this feature wasn't a great idea to begin with. I'm not aware of other wikis that are as flexible in this way.

This may not really be super low hanging fruit, because it would involve rolling back that feature such that the name used in the link must exactly match the name/id of the wiki page being linked to. That would be fantastic for performance, but not so great in terms of changing how wikis work and making them somewhat harder to use. It would have been much easier, politically, to not allow the flexibility in the first place than it would be now to take it back.

The alternative is to brainstorm ways to render the wiki pages more efficiently. Perhaps we could store a link->document mapping behind the scenes somewhere, so we only have to connect a link to a document once, and then that information is stored. There are probably edge cases that I haven't thought of with such a scheme. But that's probably what I would try to do.
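
A rough Python sketch of the link->document mapping idea, under stated assumptions: `LinkCache` and `normalize` are hypothetical, the wiki is modeled as a plain dict of page name to page object, and the fuzzy rule is a guess. It only illustrates the shape of the scheme and two of the edge cases (deleted targets, links that do not resolve yet).

```python
import re


def normalize(name):
    """Hypothetical fuzzy key: lowercase, keep only ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


class LinkCache:
    """Remembers which page a link resolved to, so the scan runs once per link."""

    def __init__(self, wiki_pages):
        self.wiki_pages = wiki_pages  # assumed: dict of page name -> page object
        self.resolved = {}            # link text -> page name
        self.scans = 0                # full scans performed, for illustration

    def resolve(self, link_text):
        cached = self.resolved.get(link_text)
        # Edge case: the cached target may have been deleted or renamed.
        if cached is not None and cached in self.wiki_pages:
            return cached
        self.scans += 1
        wanted = normalize(link_text)
        for page_name in self.wiki_pages:
            if normalize(page_name) == wanted:
                self.resolved[link_text] = page_name
                return page_name
        # Misses are not cached, so a page created later is still found.
        self.resolved.pop(link_text, None)
        return None
```

The trade-off in this sketch is that unresolved links still pay for a full scan every time; caching misses as well would need invalidation whenever a page is added.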

Paul Everitt (paul-agendaless) wrote : Re: [Bug 1357402] Re: Make wiki pages not load every wiki page

Hmm, I originally thought: "We could move the work for this to an Ajax request after page load, which would be faster but we'd still have some ugly object cache pain." This was going to be our next step on the FILES "Move To" thing. But I suspect this work is being done in the page template, so the work would increase to throw jQuery at it.

I think I should assign this to Nat and see how he feels about the feature-removal-for-performance tradeoff. Agree?

--Paul

On Aug 20, 2014, at 3:11 PM, Chris Rossi <email address hidden> wrote:

> Right. It took some staring, but I found it. At some point, years ago,
> we were asked to use fuzzy matching for wiki links. So, the name
> someone might type for a link might differ from the name of an actual
> wikipage in some ways but still match. Unfortunately the code to handle
> this happens when rendering the wiki page--for every link it iterates
> over all of the pages in the wiki until it finds a match and inserts a
> link to the actual wiki page.
>
> Arguably, this feature wasn't a great idea to begin with. I'm not aware
> of other wiki's that are as flexible in this way.
>
> This may not really be super low hanging fruit, because it would involve
> rolling back that feature such that the name used in the link must
> exactly match the name/id of the wiki page being linked to. That would
> be fantastic for performance, but not so great in terms of changing how
> wikis work and making them somewhat harder to use. It would have been
> much easier, politically, to not allow the flexibility in the first
> place than it would be now to take it back.
>
> The alternative is to brainstorm ways to render the wiki pages more
> efficiently. Perhaps we could store link->document mapping behind the
> scenes somewhere, so we only have to connect a link to a document once,
> and then that information is stored. There may (probably) be edge cases
> that I haven't thought of with such a scheme. But, probably, that's
> what I would try to do.

Chris Rossi (chris-archimedeanco) wrote :

Sure. Caching the links would be my second choice. With the caveat we'd
still have operations that iterate over and wake up all of the wiki pages,
just not as often, since we'd save the result.

Chris

On Wed, Aug 20, 2014 at 4:37 PM, Paul Everitt <email address hidden> wrote:

> Hmm, I originally thought: "We could move the work for this to an Ajax
> request after page load, which would be faster but we'd still have some
> ugly object cache pain." This was going to be our next step on the FILES
> "Move To" thing. But I suspect this work is being done in the page
> template, so the work would increase to throw jQuery at it.
>
> I think I should assign this to Nat and see how he feels about the
> feature-removal-for-performance tradeoff. Agree?

Paul Everitt (paul-agendaless) wrote :

Hi Nat. Chris has analysis on this in comment #1. Basically:

- Big wikis are slow because loading one wiki page means a database request to load all wiki pages
- Why do we do that? Because of a feature we were asked to add (fuzzy matching).

Choices:

- Get rid of that feature and gain a speedup
- Implement an index to get the speedup (just a guess of 6 hours to make the index and the migration script for it)
- Do nothing
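
For the index option, a minimal Python sketch of what such an index might look like. Everything here is an assumption for illustration: `WikiLinkIndex`, `build_index`, and the `normalize` rule are hypothetical, and a real version would have to live in the database and be kept current when pages are added, renamed, or deleted.

```python
import re


def normalize(name):
    """Hypothetical fuzzy key: lowercase, keep only ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


class WikiLinkIndex:
    """Maps a fuzzy key to a page name so a link lookup is one dict access."""

    def __init__(self):
        self._by_key = {}

    def add_page(self, page_name):
        # First page to claim a key wins, like a first-match scan would.
        self._by_key.setdefault(normalize(page_name), page_name)

    def remove_page(self, page_name):
        key = normalize(page_name)
        if self._by_key.get(key) == page_name:
            del self._by_key[key]

    def lookup(self, link_text):
        return self._by_key.get(normalize(link_text))


def build_index(page_names):
    """One-time pass over existing pages, standing in for the migration script."""
    index = WikiLinkIndex()
    for name in page_names:
        index.add_page(name)
    return index
```

With an index like this, rendering touches only the index and never wakes up the other wiki pages; the cost moves to page creation, rename, and deletion.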

Changed in karl3:
assignee: Chris Rossi (chris-archimedeanco) → Nat Katin-Borland (nborland)
Chris Rossi (chris-archimedeanco) wrote :

I thought it might be useful to do some impact analysis as well. There is currently one community with over 1000 pages. That wiki would, obviously, benefit from doing something. There are 18 communities with over 100 and less than 1000 pages. That's not generally considered a huge number of objects, but object retrieval seems to be awfully expensive, so there may even be an impact at that level. There are 43 communities with over 30 and less than 100 pages. 354 communities have more than one page and less than 30. It might be useful to look at performance of the one large wiki after the tags portlet change hits, to see if we really have a performance problem worth fixing.
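
The bucket counts above can be reproduced from the raw size data posted in this thread. The Python sketch below uses only the leading values of that list (every community with more than 30 pages); the function name `bucket_counts` and the bucket labels are made up for illustration.

```python
# Leading values of the raw per-community wiki sizes from staging
# (only communities with more than 30 pages are included here).
sizes = [
    1296, 849, 312, 295, 268, 255, 222, 200, 168, 153, 152, 148, 139, 137,
    116, 113, 111, 107, 101, 99, 91, 88, 83, 80, 72, 69, 63, 59, 59, 59,
    56, 53, 53, 50, 50, 47, 47, 47, 42, 42, 42, 42, 42, 41, 40, 39, 39,
    39, 39, 39, 39, 37, 37, 36, 35, 34, 34, 34, 31, 31, 31, 31,
]


def bucket_counts(sizes):
    """Count communities per size bucket, using strict bounds as in the text."""
    return {
        "over 1000": sum(1 for n in sizes if n > 1000),
        "over 100, under 1000": sum(1 for n in sizes if 100 < n < 1000),
        "over 30, under 100": sum(1 for n in sizes if 30 < n < 100),
    }


counts = bucket_counts(sizes)
```

Because the bounds are strict, a wiki with exactly 30, 100, or 1000 pages would fall into no bucket.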

Chris Rossi (chris-archimedeanco) wrote :

Raw size data for all community wikis (from staging):

[1296, 849, 312, 295, 268, 255, 222, 200, 168, 153, 152, 148, 139, 137, 116, 113, 111, 107, 101, 99, 91, 88, 83, 80, 72, 69, 63, 59, 59, 59, 56, 53, 53, 50, 50, 47, 47, 47, 42, 42, 42, 42, 42, 41, 40, 39, 39, 39, 39, 39, 39, 37, 37, 36, 35, 34, 34, 34, 31, 31, 31, 31, 30, 30, 29, 29, 29, 29, 28, 28, 28, 26, 26, 26, 26, 25, 24, 24, 23, 23, 23, 23, 22, 22, 22, 22, 22, 21, 21, 21, 21, 20, 20, 20, 19, 18, 18, 18, 18, 18, 18, 18, 18, 17, 17, 17, 17, 17, 17, 16, 16, 16, 16, 15, 15, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 13, 13, 13, 13, 12, 12, 12, 12, 12, 12, 12, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 10, 10, 10, 10, 10, 10, 10, 10, 10, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,...


Paul Everitt (paul-agendaless) wrote : Re: [Bug 1357402] Make wiki pages not load every wiki page

Those numbers really help a lot, thanks. Without cache affinity, we don't get as much help as we'd like when people go to a wiki and pay the price the first time. Now that we have 180k objects in the cache, after a week or so past a restart, we're ok. But of course, restarts might happen once a week. :(

I agree that, once tags portlet hits, we can review it. If the most pathological cases are down under 10 seconds for load, then we probably can't justify it.

--Paul

On Aug 21, 2014, at 9:52 AM, Chris Rossi <email address hidden> wrote:

> I thought it might be useful to do some impact analysis as well. There
> is currently one community with over 1000 pages. That wiki would,
> obviously, benefit from doing something. There are 18 communities with
> over 100 and less than 1000 pages. That's not generally considered a
> huge number of objects, but object retrieval seems to be awfully
> expensive, so there may even be an impact at that level. There are 43
> communities with over 30 and less than 100 pages. 354 communities have
> more than one page and less than 30. It might be useful to look at
> performance of the one large wiki after the tags portlet change hits, to
> see if we really have a performance problem worth fixing.

Paul Everitt (paul-agendaless) wrote :

We decided to not change this until it becomes a bigger problem.

Changed in karl3:
status: In Progress → Won't Fix