Search by "sentence"
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
ANNIS | Triaged | Medium | Unassigned |
Bug Description
Many users want to search treebanks with "sentence" as the context. Unfortunately, the annotation that marks sentences is arbitrary and differs from corpus to corpus (e.g. cat="S", cat="ROOT", seg="sent"...)
To make sentence search possible, we must be able to configure for each corpus (e.g. in the resolver) the annotation name which marks sentences. When sentence-based searching is activated, this annotation is then automatically added to the search as follows:
sentence-anno: cat="ROOT"
original AQL: cat="NP" & "Hund" & #1 > #2
modified AQL: cat="NP" & "Hund" & #1 > #2 & cat="ROOT" & #3 _i_ #1
The modified AQL is not displayed in the AQL box but sent in the background. Context is automatically set to zero on both sides when sentence-based search is enabled. There should be a checkbox to enable this called something like:
"Search for whole sentences (where available)"
This checkbox can be hidden by the admin if no corpora in the current instance support sentence-based searching.
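The rewriting step described above could be sketched roughly as follows. This is only an illustration, not ANNIS code: the `SENTENCE_ANNOS` mapping, the corpus names and the function names are hypothetical, and the parsing is deliberately naive (it splits on `&` outside double-quoted strings and treats every conjunct that does not start with `#` as a node definition, so disjunctions with `|` and regex literals are not handled).

```python
import re

# Hypothetical per-corpus configuration: which annotation marks a sentence.
# Corpora without an entry do not support sentence-based search.
SENTENCE_ANNOS = {
    "example_treebank": 'cat="ROOT"',
    "example_segmented": 'seg="sent"',
}


def split_conjuncts(aql: str) -> list[str]:
    """Split a query on '&', ignoring '&' inside double-quoted strings."""
    parts = re.findall(r'(?:"[^"]*"|[^&"])+', aql)
    return [p.strip() for p in parts if p.strip()]


def add_sentence_constraint(aql: str, sentence_anno: str) -> str:
    """Append the sentence annotation and require it to include node #1."""
    conjuncts = split_conjuncts(aql)
    # Conjuncts starting with '#' are relations between nodes (e.g. '#1 > #2');
    # everything else is assumed to define one query node.
    node_count = sum(1 for c in conjuncts if not c.startswith("#"))
    sentence_node = node_count + 1
    return f"{aql.strip()} & {sentence_anno} & #{sentence_node} _i_ #1"


def rewrite_for_corpus(aql: str, corpus: str) -> str:
    """Return the modified query, or the original if the corpus has no sentence annotation."""
    anno = SENTENCE_ANNOS.get(corpus)
    return add_sentence_constraint(aql, anno) if anno else aql


if __name__ == "__main__":
    print(rewrite_for_corpus('cat="NP" & "Hund" & #1 > #2', "example_treebank"))
```

As in the example from the description, only node #1 is tied to the sentence node; whether the remaining nodes should also be constrained is left open here.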
tags: added: database frontend service
Changed in annis:
milestone: 3.0.0-beta2 → 3.0.0
Since we now support multiple segmentations, we could also add artificial "sentence" segmentations with a Pepper manipulator and use this information. Because a segmentation layer can be used to define the context, requests like "show me the match + 2 sentences left and right" would then be possible.