fn:tokenize() doesn't stream
Bug #898074 reported by William Candillon
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Zorba | Fix Released | Critical | Matthias Brantner | —
Bug Description
The following query:
let $content := file:read-
return tokenize($content, "\s")
doesn't stream the result.
Related branches
lp:~zorba-coders/zorba/tokenize
- Matthias Brantner: Approve
- William Candillon: Approve
- Paul J. Lucas: Pending (review requested)
Diff: 605 lines (+368/-2), 25 files modified
ChangeLog (+2/-0)
modules/com/zorba-xquery/www/modules/CMakeLists.txt (+1/-1)
modules/com/zorba-xquery/www/modules/string.xq (+21/-1)
src/functions/pregenerated/func_strings.cpp (+23/-0)
src/functions/pregenerated/func_strings.h (+15/-0)
src/functions/pregenerated/function_enum.h (+1/-0)
src/runtime/spec/strings/strings.xml (+31/-0)
src/runtime/strings/pregenerated/strings.cpp (+42/-0)
src/runtime/strings/pregenerated/strings.h (+52/-0)
src/runtime/strings/strings_impl.cpp (+130/-0)
src/runtime/visitors/pregenerated/planiter_visitor.h (+5/-0)
src/runtime/visitors/pregenerated/printer_visitor.cpp (+14/-0)
src/runtime/visitors/pregenerated/printer_visitor.h (+3/-0)
test/rbkt/ExpQueryResults/zorba/string/tokenize01.xml.res (+1/-0)
test/rbkt/ExpQueryResults/zorba/string/tokenize02.xml.res (+1/-0)
test/rbkt/ExpQueryResults/zorba/string/tokenize03.xml.res (+1/-0)
test/rbkt/ExpQueryResults/zorba/string/tokenize04.xml.res (+1/-0)
test/rbkt/Queries/zorba/string/token01.txt (+1/-0)
test/rbkt/Queries/zorba/string/token02.txt (+1/-0)
test/rbkt/Queries/zorba/string/token03.txt (+1/-0)
test/rbkt/Queries/zorba/string/token04.txt (+1/-0)
test/rbkt/Queries/zorba/string/tokenize01.xq (+5/-0)
test/rbkt/Queries/zorba/string/tokenize02.xq (+5/-0)
test/rbkt/Queries/zorba/string/tokenize03.xq (+5/-0)
test/rbkt/Queries/zorba/string/tokenize04.xq (+5/-0)
Changed in zorba:
- importance: Undecided → Critical
- assignee: nobody → Matthias Brantner (matthias-brantner)
- status: New → Fix Committed
- status: Fix Committed → Fix Released
Making tokenize stream would be really difficult. The reason is that the pattern can be an arbitrary regular expression, which in a lot of cases would require materializing the entire stream anyway.
Hence, I'm suggesting that we define our own string:tokenize function which accepts only separator characters instead of regular expressions, and make that work in a streaming fashion. Would that also do the job?