Silva

Catalog indexes XML element names

Bug #101756 reported by Samuel Schluep on 2006-11-09

Affects  Status        Importance  Assigned to  Milestone
Silva    Fix Released  Medium      Unassigned   Silva 1.6

Bug Description

The Silva catalog's full-text index contains XML element names, such as
'doc', 'p', 'path', 'image', 'link', 'em', etc. In my opinion this is a bug. The
full-text index should contain only the XML content, not the XML element and
attribute names.

Martijn Faassen (faassen) wrote on 2006-11-16:

This is indeed a design flaw in the way full-text indexing takes place right now.
It would not be very hard to flatten this XML and leave out the tags, though
this needs to be done carefully, with automated tests to make sure we don't
accidentally leave something out.
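
For illustration, here is a minimal sketch of the flattening idea in modern Python, using the standard library's ElementTree parser. This is not Silva's code: the function name `flatten_xml` and the sample document are hypothetical, and the element names are borrowed from the bug description only as an example.

```python
import xml.etree.ElementTree as ET


def flatten_xml(source):
    """Return only the character data of an XML document as one string.

    Hypothetical helper for illustration; not part of Silva.
    """
    root = ET.fromstring(source)
    # itertext() yields text and tail strings in document order.
    # Element names and attribute names/values are never yielded.
    chunks = (chunk.strip() for chunk in root.itertext())
    # Join with spaces so words in adjacent elements do not fuse together.
    return " ".join(chunk for chunk in chunks if chunk)


sample = '<doc><p>Hello <em>world</em></p><image path="a.png"/></doc>'
flattened = flatten_xml(sample)

# The kind of automated check the comment asks for: content is kept,
# element and attribute names are left out.
assert flattened == "Hello world"
assert "doc" not in flattened and "path" not in flattened
```

One trade-off of joining with spaces is that a word split by inline markup (for example `wor<em>ld</em>`) would be indexed as two tokens. Whether that matters depends on how the index tokenizes text.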

Daniel Nouri (daniel.nouri) wrote on 2007-02-08:

Fixed in r23270 by using a regular expression to strip out tags (test in r23269).
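
The pattern used in r23270 is not shown in this report. As a rough sketch of the general technique only, tag stripping with a regular expression could look like the following in modern Python. The names `TAG_RE` and `strip_tags` and the pattern itself are assumptions for illustration, not the committed code.

```python
import re
from html import unescape

# Assumed pattern: anything from '<' up to the next '>' is treated as a tag.
TAG_RE = re.compile(r"<[^>]+>")


def strip_tags(source):
    """Remove markup from an XML string and normalize whitespace.

    Hypothetical helper for illustration; not the code from r23270.
    """
    # Replace each tag with a space so neighbouring words stay separate.
    text = TAG_RE.sub(" ", source)
    # Decode entity references such as &amp; after the tags are gone.
    text = unescape(text)
    # Collapse runs of whitespace into single spaces.
    return " ".join(text.split())


sample = '<doc><p>Fish &amp; chips</p><link url="x">more</link></doc>'
assert strip_tags(sample) == "Fish & chips more"
```

A pattern like this is simple but can be fooled by a literal '>' inside an attribute value, by comments, or by CDATA sections, which is one reason a fix of this kind benefits from an accompanying test.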

Andy Altepeter (aaltepet) on 2007-11-02

Changed in silva:
milestone:	none → 1.6
