Entry ID in Publish ping function
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
subhub | New | Undecided | Unassigned |
Bug Description
Hi Ivan
This is not a bug report, rather a question.
We're thinking of using subhub for several Django sites that need to synchronize their data in "real-time".
It's not only articles, but objects that are passed as a JSON-LD payload in the <summary> XML tag.
Your concept of a "private" hub fits well in our project too.
We previously tested a hosted hub (SuperFeeder): when pinging the hub to tell it there was new content, we only passed the feed URL (the "topic"), and the hub was then in charge of notifying subscribers, probably based on a diff with the previous version of the feed.
If I understand your code correctly:
- one DistributionTask is created for each entry and for each subscriber;
- when processed, the notifications are grouped to be delivered together to the subscriber's callback URL.
Does this mean that there's no point in a subscriber having a different callback URL for each subscription, since all of its subscriptions will be delivered together?
We currently use the djpubsubhubbub module for the subscriber part, and there's a unique callback URL per subscription:
https:/
(and at first, we found it strange)
Hi Dominique,
Distribution tasks are grouped not by the subscriber callback URL alone but by the pair (callback, topic). The purpose is to batch several new entries of one topic intended for a single subscriber. This is merely a network optimization and it doesn't influence the design of a subscriber: it can either have separate callback URLs for different topics, or have a single URL and determine the topic from <link rel="self"> in the Atom payload (or infer it from any other metadata in the system). In any case, SubHub won't send entries from different topics in one payload.
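To make the grouping concrete, here is a minimal sketch of batching by (callback, topic); the tuple layout is an illustration, not SubHub's actual DistributionTask model:

```python
from collections import defaultdict

def group_tasks(tasks):
    """Batch pending tasks by (callback, topic) so that several new
    entries of one topic reach a subscriber in a single delivery.
    Each task here is a hypothetical (callback, topic, entry_id) tuple."""
    batches = defaultdict(list)
    for callback, topic, entry_id in tasks:
        batches[(callback, topic)].append(entry_id)
    return dict(batches)

tasks = [
    ("http://sub.example/cb", "http://pub.example/feed", 1),
    ("http://sub.example/cb", "http://pub.example/feed", 2),
    ("http://sub.example/cb", "http://pub.example/other", 3),
]
batches = group_tasks(tasks)
# Entries 1 and 2 share one payload; entry 3 goes in a separate one,
# even though all three target the same callback URL.
```

This is why a subscriber with a single callback URL still never receives entries from two topics mixed in one payload.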
As for new entries: yes, SubHub indeed doesn't keep track of the latest published entry itself. Instead, the user is supposed to call `subhub.publish()` for every new entry, usually right when it's created. This call is not cheap, though, as it immediately tries to process all the distribution tasks in a blocking fashion. If you don't want to block on distribution, then instead of calling `subhub.publish()` you can just add a new distribution task:
`DistributionTask.objects.add(topic, entry_id)`
… and then send a signal to some separate process that would call `manage.py subhub_maintenance --distribute` from the shell, or `DistributionTask.objects.process()` directly if it's a Python environment.
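The deferred pattern above can be sketched with a plain in-memory queue standing in for the DistributionTask table (names and queue mechanics here are illustrative, not SubHub's API):

```python
import queue

task_queue = queue.Queue()

def publish_deferred(topic, entry_id):
    # Non-blocking analogue of subhub.publish(): just record the task
    # instead of distributing it inline.
    task_queue.put((topic, entry_id))

def run_distribution(deliver):
    # Analogue of the separate process that runs
    # `manage.py subhub_maintenance --distribute`: drain the queue
    # and deliver each pending task.
    while not task_queue.empty():
        topic, entry_id = task_queue.get()
        deliver(topic, entry_id)

delivered = []
publish_deferred("http://pub.example/feed", 42)
run_distribution(lambda topic, entry_id: delivered.append((topic, entry_id)))
```

The web request only pays the cost of the enqueue; actual delivery happens whenever the maintenance process wakes up.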