The RSS Blog

News and commentary from the RSS and OPML community.

Tim Bray: We’re getting real, real close to sending the Atom data-format draft off for general IETF review. [cut] Via Technorati and PubSub, I subscribe to a bunch of synthetic feeds based on various keyword searches and URL linkages. They are infested with duplicates.

Randy: I don't see how Atom is gonna solve the Technorati duplicate problem. What Tim seemingly doesn't realize is that Technorati scrapes more than just the RSS feed; it also scrapes the HTML. Most of the hits I get from Technorati are not present in the RSS but in the HTML (blogrolls, sidebar lists, etc.), and the duplicates, from what I see, come from this HTML scraping, not from the RSS.

Reader Comments

You can't capture links in blogrolls and templates from the RSS.

Randy

Agreed. I'm exclusively using Bloglines and Feedster now.

Randy
