New approach to big sampledata

Bug #444502 reported by Paul Everitt
This bug affects 1 person
Affects: KARL3
Status: Fix Released
Importance: Low
Assigned to: Shane Hathaway
Milestone: (not set)

Bug Description

At the beginning of KARL3, we wanted to get an idea of how the site performed once it had some data in it. "Performance" was somewhat amorphous: reads vs. writes on various screens, searches, memory footprint, using ab to pound the site, whatever. I wrote a console script and checked in some sample data that the script would bulk-load into the site.

The console script has since bit-rotted, and it was somewhat dumb to have large XML files checked into the src/osi customization package just for testing purposes. So I removed the data. I haven't yet removed the console script.

Spec
=======

- Console script that uses ZEO to create small, medium, and large example sites

- Use the fake-view method from src/osi/osi/run.py's populate function to stay closer in sync with changes in code

- Wire up unit tests to see if the basics of the sample data loader get out of sync. Design the script with this testability in mind.

- Find some way (lorem ipsum module, ispell dict) to get large amounts of sample data that contains unique words

- Use some small .doc and .pdf files as the content of the sample file objects

- Try to get a representative sample of content types into communities (blog entries, files, wiki pages, calendar events)
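
One way to keep the loader testable, as the spec asks, is to separate planning what to create from actually writing it through ZEO. The following is only a sketch under assumptions: the preset numbers, the `SitePreset` class, and the `plan_site` function are hypothetical and do not come from the KARL code base.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SitePreset:
    """Hypothetical size preset; the numbers below are illustrative only."""
    communities: int
    items_per_community: int


# Assumed sizes for the small, medium, and large example sites.
PRESETS = {
    "small": SitePreset(communities=2, items_per_community=8),
    "medium": SitePreset(communities=20, items_per_community=40),
    "large": SitePreset(communities=200, items_per_community=100),
}

# The content types named in the spec.
CONTENT_TYPES = ("blog entry", "file", "wiki page", "calendar event")


def plan_site(size, seed=0):
    """Return a list of (community_name, content_type) pairs to create.

    Nothing is written here; a separate step would walk the plan and
    call the real content-creation code.
    """
    preset = PRESETS[size]
    rng = random.Random(seed)  # seeded so the same plan can be reproduced
    plan = []
    for index in range(preset.communities):
        name = f"community-{index}"
        # One item of each type first, so every community is representative.
        for content_type in CONTENT_TYPES:
            plan.append((name, content_type))
        # Fill the remainder with randomly chosen types.
        remaining = preset.items_per_community - len(CONTENT_TYPES)
        for _ in range(remaining):
            plan.append((name, rng.choice(CONTENT_TYPES)))
    return plan
```

Because `plan_site` is a pure function, a unit test can check the shape of a plan without a running ZEO server.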

Tasks
=====

- Read the src/osi/osi/scripts/bulkloader.py and get an idea what it was trying to do

- Then, svn remove it and remove the entry point in src/osi/setup.py

- Create a new script, bulksample.py or some other name, somewhere in src/karl (or karlsample)

- Document in the main README.txt

Changed in karl3:
assignee: Carlos de la Guardia (cguardia) → Shane Hathaway (shane-hathawaymix)
Paul Everitt (paul-agendaless) wrote :

Shane, I know you're out today and tomorrow. When you get a chance, could you give an update on this?

Shane Hathaway (shane-hathawaymix) wrote :

I created a script called "samplegen". It produces communities containing blog entries, wiki pages, calendar events, and files. It pulls random words and text from Alice in Wonderland, a public domain book. I added a bit about the new script to README, then I removed bulkloader.py and its entry point.
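
The idea of pulling random words and text from a book can be sketched roughly as follows. This is not the samplegen code itself: `load_words`, `random_title`, and `random_paragraph` are hypothetical names, and a real script would read the whole book from a file rather than take a string.

```python
import random
import re


def load_words(text):
    """Return the distinct lowercase words in `text`, in first-seen order."""
    seen = {}
    for word in re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()):
        seen.setdefault(word, None)  # dicts keep insertion order
    return list(seen)


def random_title(words, rng, length=4):
    """Build a short title from randomly chosen words."""
    return " ".join(rng.choice(words) for _ in range(length)).capitalize()


def random_paragraph(words, rng, length=40):
    """Build a paragraph-sized run of randomly chosen words."""
    body = " ".join(rng.choice(words) for _ in range(length))
    return body.capitalize() + "."


if __name__ == "__main__":
    # Opening words of Alice in Wonderland, standing in for the full text.
    excerpt = (
        "Alice was beginning to get very tired of sitting by her sister "
        "on the bank, and of having nothing to do"
    )
    rng = random.Random(42)  # seeded so runs are repeatable
    words = load_words(excerpt)
    print(random_title(words, rng))
    print(random_paragraph(words, rng))
```

Passing a seeded `random.Random` instance instead of using the module-level functions keeps the generated content reproducible, which helps when comparing performance runs.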

Changed in karl3:
status: New → Fix Committed
Shane Hathaway (shane-hathawaymix) wrote :

I also added tests of samplegen, as specified.

Paul Everitt (paul-agendaless) wrote : Re: [Bug 444502] Re: New approach to big sampledata

On Oct 9, 2009, at 3:20 AM, Shane Hathaway wrote:

> I created a script called "samplegen". It produces communities
> containing blog entries, wiki pages, calendar events, and files. It
> pulls random words and text from Alice in Wonderland, a public domain
> book. I added a bit about the new script to README, then I removed
> bulkloader.py and its entry point.

Hi Shane. Do you think you could extend this to create the sample
users affiliate1-9 and staff1-9 if they don't exist?

--Paul

Shane Hathaway (shane-hathawaymix) wrote :

Paul Everitt wrote:
> Hi Shane. Do you think you could extend this to create the sample
> users affiliate1-9 and staff1-9 if they don't exist?

Yes. I will later.

Shane

Shane Hathaway (shane-hathawaymix) wrote :

samplegen now creates the sample users you listed.
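
Creating users only "if they don't exist" means the step must be safe to run repeatedly. A minimal sketch of that check, assuming "1-9" means the numbers 1 through 9 and using a plain dict as a stand-in for the real user store; `sample_usernames` and `ensure_sample_users` are hypothetical names, not samplegen's API:

```python
def sample_usernames():
    """Return affiliate1..affiliate9 followed by staff1..staff9."""
    return [
        f"{prefix}{number}"
        for prefix in ("affiliate", "staff")
        for number in range(1, 10)
    ]


def ensure_sample_users(users):
    """Add any missing sample users to `users` and return the names created.

    `users` is a dict standing in for the real user store (an assumption).
    Existing entries are left untouched, so calling this twice is harmless.
    """
    created = []
    for name in sample_usernames():
        if name not in users:
            users[name] = {"login": name}
            created.append(name)
    return created
```

Returning the list of created names lets a test assert both the first-run behavior and that a second run creates nothing.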

Changed in karl3:
status: Fix Committed → Fix Released