Activity log for bug #2009492

Date Who What changed Old value New value Message
2023-03-06 13:17:33 eMTee bug added bug
2023-03-06 13:20:01 eMTee bug task added adchpp
2023-04-29 11:41:05 eMTee description
<eMTee> So not getting a result for a changed file (same path/different content) in the share after re-hashing is because the hub requests a new Bloom filter only if the number of shared files has changed in the INF coming from the client. In common cases, like when you share an updated binary or change a text file and reindex, this would not happen at all.
<eMTee> The Bloom request is only triggered by an SF and not by an SS in the INF. See https://sourceforge.net/p/adchpp/code/ci/default/tree/plugins/Bloom/src/BloomManager.cpp#L98
<eMTee> And even with SS added to the check there, we're still not completely out of the water, since if the share change is a same path, same size, different content change then it still sucks. A minor edit of a text file or a change of fixed-size metadata, e.g. an MP3 ID3v1 tag, results in exactly this scenario.
<eMTee> You can even change all of your share in this special way, and if you don't change the sizes and the number of files then you won't provide hits at all until you make some other kind of share change or reconnect to the hub.
[2023-02-28 09:03] <eMTee> So the BLOM request is based on an inadequate signal that's not enough for all cases.
[2023-02-28 09:07] <eMTee> SS should also be hooked on at the very least, but a perfect solution would be something that signals the share change in general, or the number of re-hashes in the current client session, or the last rehash timestamp. These signals would be adequate for requesting a new Bloom filter in all cases where one is needed.
[2023-02-28 09:11] <eMTee> Of course the client could force sending an INF SF after every rehash if it supports BLOM, but that's pretty ugly to implement in DC++ and, more importantly, it is against the protocol, since you send INFs only if some values change, and in the special cases we're investigating this would mean sending multiple INF SFs with the same value.
[2023-02-28 09:13] <eMTee> "Each time this is received, it means that the fields specified have been added or updated." in https://adc.sourceforge.io/ADC.html#_inf
[2023-02-28 09:17] <eMTee> If an extension is allowed to specify new INF fields, then a last rehash timestamp field would probably be the cleanest solution for this, both protocol- and implementation-wise... Within the currently defined standards, another possibility is some client-side trickery: an ugly hack that slightly fakes SF or SS (e.g. by incrementing one of them by 1) in each of these special share change cases, so that it triggers a Bloom update.
Update: rephrase and clarify the initial report.
--------------
There is a problem of not getting a search result for any number of changed files (same path/different content) in the share after re-hashing in an ADC client connected to an ADC hub with Bloom filter support for TTH searches. The issue arises because the hub requests a new Bloom filter only if the number of shared files has changed in the INF SF coming from the client. In common cases, like when you share an updated binary or change a text file and reindex, this would obviously not happen. For example, changing fixed-size metadata, e.g. an MP3 ID3v1 tag, results in exactly this scenario. So the filter request is based on an inadequate signal that's not enough for all common use cases. A solution would be something that signals the share change in general, or provides the number of re-hashes in the current client session, or maybe the last rehash timestamp. These signals would be adequate for requesting a new Bloom filter in all cases where files have changed in a client's share.
Of course a BLOM-supporting client could force sending an INF SF after every re-hash when there is a content change in the share, but that is against the protocol, since INFs may be sent only if any of the flag values have changed, and in these special cases this would mean sending multiple INF SFs with the same SF value (see "Each time this is received, it means that the fields specified have been added or updated." in https://adc.sourceforge.io/ADC.html#_inf ). If an extension is allowed to specify new INF fields, then a new flag ("SC"?) could be added, optionally with parameters carrying more data for the hub about the actual share change, like a last rehash timestamp and the number of changed files. This would probably be the cleanest solution, but it needs a protocol update for the BLOM ADC extension. Within the currently defined standards, another possibility is some client-side trickery: an ugly hack that slightly fakes SF (e.g. by incrementing it by 1) in each of these special share change cases, so that it triggers a BLOM request for an updated filter.
2023-04-29 11:41:12 eMTee summary
Certain type of changes in the share do not trigger a Bloom filter update which makes such changed files temporarily unsearchable
Certain type of changes in the share do not trigger a Bloom filter update which makes such changed files temporarily unsearchable by TTH
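
The filter-request conditions discussed in the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the adchpp code: the InfFields type and both function names are hypothetical, the sketch assumes an INF carries only the fields whose values changed, and "SC" is the not-yet-specified flag floated in the report.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the fields carried by one INF update.
// Keys are two-letter field names ("SF", "SS", ...); a field is present
// only if the client sent it in this particular INF.
using InfFields = std::map<std::string, std::string>;

// Behaviour described in the report: a new filter is requested only
// when the INF carries SF (the number of shared files).
inline bool requestsFilterOnSfOnly(const InfFields& inf) {
    return inf.count("SF") > 0;
}

// Sketch of the suggested direction: also react to SS (share size) and
// to a hypothetical "SC" share-change flag, e.g. a last rehash timestamp.
inline bool requestsFilterOnAnyShareSignal(const InfFields& inf) {
    return inf.count("SF") > 0 || inf.count("SS") > 0 || inf.count("SC") > 0;
}

// Usage example covering the scenarios from the report.
inline void exampleChecks() {
    InfFields sizeChanged;            // same file count, different total size
    sizeChanged["SS"] = "1000";
    assert(!requestsFilterOnSfOnly(sizeChanged));
    assert(requestsFilterOnAnyShareSignal(sizeChanged));

    InfFields contentChanged;         // same count, same size, new content
    contentChanged["SC"] = "1682768465";  // hypothetical rehash timestamp
    assert(!requestsFilterOnSfOnly(contentChanged));
    assert(requestsFilterOnAnyShareSignal(contentChanged));
}
```

Under these assumptions, a same path, same size, different content change produces neither SF nor SS, so only an explicit share-change signal such as the hypothetical SC would reach the hub.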