Allow faster indexing of Elasticsearch via CLI script
Bug #1732565 reported by Robert Lyon
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Mahara | Fix Released | Wishlist | Unassigned |
Bug Description
Re-indexing a large site can take hours. Although we index via the bulk system, throughput is limited by the number of records we can read into memory per run and by how often cron runs.
One way to speed this up is a fast-index CLI script that fires off the next Elasticsearch indexing run immediately after the previous one finishes.
This would save the 'dead time' between runs that is spent waiting for the server clock to tick over to the next minute.
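The idea can be sketched as a loop that starts the next batch as soon as the previous one returns, instead of waiting for the next cron tick. This is only an illustration in Python, not Mahara's actual script: `drain_queue`, `index_batch` and `make_fake_indexer` are hypothetical names, and the in-memory counter stands in for the real indexing queue.

```python
from typing import Callable


def drain_queue(index_batch: Callable[[int], int],
                batch_size: int,
                max_runs: int = 100_000) -> int:
    """Call index_batch back to back until it reports no work left.

    index_batch(limit) is assumed to index up to `limit` queued records
    and return how many it actually processed (0 when the queue is empty).
    max_runs is a safety cap so a misbehaving batch function cannot loop forever.
    """
    total = 0
    for _ in range(max_runs):
        processed = index_batch(batch_size)
        if processed == 0:
            break  # queue is empty, nothing left to index
        total += processed
    return total


def make_fake_indexer(pending: int) -> Callable[[int], int]:
    """Hypothetical stand-in for one indexing run: just counts records down."""
    remaining = [pending]

    def index_batch(limit: int) -> int:
        n = min(limit, remaining[0])
        remaining[0] -= n
        return n

    return index_batch


if __name__ == "__main__":
    # 420,000 queued records drained in consecutive runs of 5,000
    print(drain_queue(make_fake_indexer(420_000), 5_000))
```

With cron, each run would start at most once a minute; a loop like this removes that wait, so total time is governed by how long each batch takes rather than by the scheduler.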
Changed in mahara:
status: In Progress → Fix Committed
tags: added: nominatedfeature
Changed in mahara:
status: Fix Committed → Fix Released
Also, on a large site, lowering the number of records indexed per run appears to make indexing faster overall. Most of the slowness seems to come from building the data in memory before it is passed to the index.
For example:
420,000 records at 10,000 records per run took 90 minutes
420,000 records at 5,000 records per run took 45 minutes
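To make the comparison concrete, the two timings can be broken down per run. This is plain arithmetic on the figures above; `per_run_stats` is a hypothetical helper, and it assumes the runs were roughly equal in length.

```python
def per_run_stats(total_records: int, batch_size: int, total_minutes: float):
    """Return (number of runs, minutes per run, records per minute)."""
    runs = -(-total_records // batch_size)  # ceiling division
    minutes_per_run = total_minutes / runs
    records_per_minute = total_records / total_minutes
    return runs, minutes_per_run, records_per_minute


# Figures reported in the comment above
for batch_size, minutes in [(10_000, 90), (5_000, 45)]:
    runs, per_run, rate = per_run_stats(420_000, batch_size, minutes)
    print(f"{batch_size:>6} per run: {runs} runs, "
          f"{per_run:.2f} min/run, {rate:.0f} records/min")
```

On these numbers, 10,000 per run works out to 42 runs at about 2.14 minutes each, and 5,000 per run to 84 runs at about 0.54 minutes each. Halving the batch size cut the time per run to roughly a quarter, which is why total time halved even though the number of runs doubled.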