PSA: Feedster is broken
I'm aware that those in glass houses shouldn't throw stones, but as far as I can tell, the programmers at Feedster are suffering some kind of rock shortage, so I'm only too happy to help. Here's something I spotted while debugging the next version of Shrook (due very soon, sorry for the delay).
A bit of history first. Early versions of RSS didn't include any way of identifying individual items. Thus, when an RSS reader revisited a site, the only way it could tell if there were new items was to compare the text of every item it found with every item it stored, and see if any didn't match. This was obviously very fragile, because whenever an item's text was edited, it would appear as a completely different entry (you still see this occasionally in Shrook), or if two items had similar text they could easily be mistaken for one another. This problem was solved with the GUID (Globally Unique Identifier), an extra bit of info attached to each item that was unique to it and never changed, regardless of any changes to the text. Kind of like a serial number.
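To make the difference concrete, here is a minimal Python sketch of the two approaches. It is an illustration only, not Shrook's actual code: the `item_key` and `is_new` helpers, the dictionary fields, and the sample GUID "item-0001" are all made up for the example.

```python
def item_key(item: dict) -> str:
    """Return the identity an aggregator could use to recognise an item."""
    guid = item.get("guid")
    if guid:
        # A GUID is meant to stay fixed even when the item's text is edited.
        return "guid:" + guid
    # No GUID (early RSS): the text itself is the only identity available.
    return "text:" + item.get("title", "") + "\n" + item.get("description", "")


def is_new(item: dict, seen: set) -> bool:
    """Report whether the item is unseen, and remember it for next time."""
    key = item_key(item)
    if key in seen:
        return False
    seen.add(key)
    return True


# With a GUID, an edited item is still recognised as the same item.
seen = set()
original = {"guid": "item-0001", "title": "Hello", "description": "First draft"}
edited = {"guid": "item-0001", "title": "Hello", "description": "Second draft"}
print(is_new(original, seen))  # True
print(is_new(edited, seen))    # False

# Without a GUID, the same edit looks like a brand new item.
seen_by_text = set()
old_style = {"title": "Hello", "description": "First draft"}
old_style_edited = {"title": "Hello", "description": "Second draft"}
print(is_new(old_style, seen_by_text))         # True
print(is_new(old_style_edited, seen_by_text))  # True
```

The whole scheme rests on the GUID being independent of the text: the first pair is matched correctly only because "item-0001" does not change when the description does.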
Anyway, today I noticed how Feedster generates the GUIDs in its RSS feeds. There were two separate items in the Feedster search results that Shrook was getting confused about, both with the title "Aggregators". I thought, "How can this be? Feedster uses GUIDs." Then I looked at the GUIDs, and both items had the same one: "8a6d96ecd2a44fe6259b704364c08c5c". How could two different items end up with the same GUID when the only thing they had in common was their title? Surely no one was stupid enough to take something whose entire purpose was to be independent of the text itself, and then calculate it from the text itself? Alas:
MD5 ("Aggregators") = 8a6d96ecd2a44fe6259b704364c08c5c
The real kicker is that Shrook and other aggregators would actually work better if the GUID were entirely absent, since they would then fall back to a more robust system than these "GUIDs" provide.
-- Graham, October 16th, 2005 10:59 PM.