We need to decide whether a document can appear more than once in the same index under the same index key, and then enforce that behaviour with tests.
Ways a document might show up twice:
create_index("words", "splitwords(colours)")
document: { colours: "red red red green blue" }
Should this document be in the index three times (one for each of "red", "green", "blue") or five?
create_index("name", "names")
document: { names: [ "aaron", "aaron", "andrew" ] }
Should this document be in the index twice (one for each of "aaron" and "andrew") or three times?
I personally think that indexes should be deduped: there should not be two identical entries (that is, two identical (indexkey, doc) pairs) in an index.
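
To make the two options concrete, here is a minimal Python sketch. It is not the project's actual API: `split_words`, `index_entries`, `deduped_entries` and the document id `"doc1"` are hypothetical names used only for illustration. It models an index as a collection of (indexkey, doc) pairs and compares the raw entry count with the deduped one for both examples above.

```python
# Hypothetical sketch: none of these names come from the real codebase.

def split_words(text):
    """Stand-in for splitwords(): split a string on whitespace."""
    return text.split()

def index_entries(doc_id, keys):
    """One (indexkey, doc) pair per emitted key, duplicates kept."""
    return [(key, doc_id) for key in keys]

def deduped_entries(doc_id, keys):
    """The same pairs stored as a set, so identical pairs collapse."""
    return set(index_entries(doc_id, keys))

colours = split_words("red red red green blue")
names = ["aaron", "aaron", "andrew"]

raw_colours = index_entries("doc1", colours)
unique_colours = deduped_entries("doc1", colours)
raw_names = index_entries("doc1", names)
unique_names = deduped_entries("doc1", names)

print(len(raw_colours), len(unique_colours))
print(len(raw_names), len(unique_names))
```

Whichever behaviour is chosen, a test can pin it down by asserting on the entry count for exactly these two documents.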