Full-Text RSS: New York Times feeds don't work

Bug #659525 reported by atonaldenim
This bug affects 3 people
Affects        Status   Importance   Assigned to   Milestone
Five Filters   New      Undecided    Unassigned

Bug Description

Full-Text RSS
self-hosted
v2.1
PHP 5.2.6-1+lenny9 (server info: http://news.rethinkmedia.org/feeds/ftr_compatibility_test.php )

Feeds from the New York Times don't seem to fetch the full text of articles properly. I tried a few different feeds (author and section), and most items return the original feed snippet with [unable to retrieve full-text content] at the beginning.
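To put a number on "most items", the generated feed can be scanned for the failure marker. The following is a minimal Python sketch, not part of Full-Text RSS. It assumes the marker is written at the start of each item's `<description>` element, and `SAMPLE_FEED` is made-up data used only to show the call.

```python
import xml.etree.ElementTree as ET

# Marker text quoted in the report above.
FAILURE_MARKER = "[unable to retrieve full-text content]"


def count_failed_items(rss_xml):
    """Return (failed, total) for the <item> elements of an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    items = root.findall("./channel/item")
    failed = 0
    for item in items:
        # Assumption: the marker sits at the start of <description>.
        description = (item.findtext("description") or "").lstrip()
        if description.startswith(FAILURE_MARKER):
            failed += 1
    return failed, len(items)


# Hypothetical sample, not real feed output.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel><title>Sample</title>
<item><title>A</title><description>[unable to retrieve full-text content] Snippet.</description></item>
<item><title>B</title><description>Full article body.</description></item>
</channel></rss>"""

if __name__ == "__main__":
    failed, total = count_failed_items(SAMPLE_FEED)
    print(f"{failed} of {total} items fell back to the feed snippet")
```

Running the same count against the hosted and the self-hosted output for one source feed would give a like-for-like comparison.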

Perhaps NYT is blocking it in some way?

http://topics.nytimes.com/top/reference/timestopics/people/b/peter_baker/index.html?rss=1
http://feeds.nytimes.com/nyt/rss/World

Strangely, the Five Filters hosted version seems to work fine with both those examples.

Yours:
http://fivefilters.org/content-only/makefulltextfeed.php?url=http%3A%2F%2Ffeeds.nytimes.com%2Fnyt%2Frss%2FWorld&key=&max=4&links=preserve&submit=Create+Feed

Ours:
http://news.rethinkmedia.org/feeds/makefulltextfeed.php?url=http%3A%2F%2Ffeeds.nytimes.com%2Fnyt%2Frss%2FWorld&max=4&links=preserve&submit=Create+Feed

Thanks!
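For anyone reproducing this, the two request URLs above differ only in the host and the empty `key` parameter. A minimal Python sketch of how such a URL can be assembled is shown here. The function name `make_fulltext_feed_url` is hypothetical, and the parameter set (`url`, `max`, `links`, `submit`) is taken from the URLs in this report rather than from any documentation.

```python
from urllib.parse import urlencode


def make_fulltext_feed_url(endpoint, feed_url, max_items=4):
    """Build a makefulltextfeed.php request URL for the given source feed."""
    # Parameter names and values are copied from the URLs quoted in the report.
    params = {
        "url": feed_url,
        "max": max_items,
        "links": "preserve",
        "submit": "Create Feed",
    }
    # urlencode percent-encodes the feed URL and turns the space into '+'.
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    print(make_fulltext_feed_url(
        "http://news.rethinkmedia.org/feeds/makefulltextfeed.php",
        "http://feeds.nytimes.com/nyt/rss/World",
    ))
```

Swapping the endpoint for the hosted one makes it easy to request the same source feed from both installations.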

Keyvan (keyvan) wrote :

Thanks for the report. I think this has to do with HTTP redirects. After the 2.1 release a few people reported that certain sites didn't work and in most of the cases it turned out to be an issue with the server not following redirects. Some feeds contain item URLs which redirect to the actual source (you'll see if you hover over the links in the NYT feed that they show something like http://feeds.nytimes.com/click.phdo... but when you click they redirect somewhere else). Some servers, although very few that I've come across, appear to have this simple redirect following disabled. I'm not sure that that's the cause, but I've attached an updated compatibility test file which tries to check if the server supports HTTP redirects. Can you try it and tell me what it shows?
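The redirect behaviour described above can also be checked by hand from the affected server. This is a rough Python sketch, independent of the compatibility test file and of Full-Text RSS itself. It follows `Location` headers one hop at a time and returns every URL visited. The names `http_fetch` and `resolve_redirects` are made up for illustration, and because it does not go through PHP it only tests the network path, not the PHP configuration.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def http_fetch(url):
    """Request url once, without following redirects; return (status, location)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def resolve_redirects(url, fetch=http_fetch, max_hops=5):
    """Return the chain of URLs visited, starting with url itself."""
    chain = [url]
    for _ in range(max_hops):
        status, location = fetch(chain[-1])
        if status in REDIRECT_CODES and location:
            # Location may be relative, so resolve it against the current URL.
            chain.append(urljoin(chain[-1], location))
        else:
            break
    return chain
```

Calling `resolve_redirects` with one of the item URLs from the NYT feed should return more than one entry if the redirect is reachable from that machine. A single-entry result would suggest the redirect never happens, or is blocked, before PHP is even involved.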

atonaldenim (atonaldenim) wrote :

Hi Keyvan,

Thanks very much for the assistance. Here's the compatibility test page you posted:
http://news.rethinkmedia.org/feeds/ftr_compatibility_test.1.php

The HTTP redirects line says Enabled. At first it reported that Tidy was disabled, so I installed the php5-tidy Debian package and restarted Apache, and now it reports Enabled, but the NYT feeds don't seem to work any better.

And a phpinfo() page for good measure:
http://news.rethinkmedia.org/feeds/info.php

Thanks again!

Keyvan (keyvan) wrote :

Thanks for the update. That's very strange. I just tried running a single NYT page (URL taken directly from one of the feeds that doesn't work) through the code and it works okay: http://news.rethinkmedia.org/feeds/makefulltextfeed.php?url=http%3A%2F%2Ffeeds.nytimes.com%2Fclick.phdo%3Fi%3Deaf0cecd2f22658086f627b4af53d0f6&max=5&links=preserve&submit=Create+Feed

I'll have to look into it some more to find out why it doesn't work properly. Perhaps the URL filtering isn't working as intended, or perhaps SimplePie is returning the URL slightly differently...

cdemoulins (cdemoulins) wrote :

I have a similar problem with French feeds such as:
http://www.lemonde.fr/rss/une.xml
http://www.clubic.com/articles.rss

I ran the compatibility test and everything is reported as OK.

Keyvan (keyvan) wrote :

Oh, I forgot all about this issue. Can either of you possibly give me FTP access so I can have a look and test a few changes? (If so, please email <email address hidden>)

summary: - New York Times feeds don't work
+ Full-Text RSS: New York Times feeds don't work