Is FlickrFan’s AP Feed Proprietary RSS?

January 25th, 2008 by EyeOnWiner

Dave thinks of RSS as one format to rule them all. He wants everyone to use RSS (and not ATOM!) so that everything works together. But is he displaying a new form of hypocrisy with the AP photos feed?

Looking at the comments to a recent post here at EOW you can see that some odd things are afoot.

For starters, a commenter found the AP feed and started using it without FlickrFan. He posted about that here on EOW. Shortly thereafter, the photo feed stopped being updated… but FlickrFan kept getting new pictures. How? Dave changed the address of the feed and rolled out an update silently to FlickrFan clients.

Security through obscurity.

Furthermore, it was noticed that so many RSS readers were having problems reading this raw feed because of malformed HTTP URLs, in violation of the RSS spec. Essentially, Dave was using an encrypted URL (albeit a very weak encryption) which required an algorithm (albeit a very simple one) to read the URLs. Some readers did that by default, some did not. In either case, though, the feed was not properly formed RSS (by the spec).
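
To make the complaint concrete, here is a minimal sketch of the kind of strict check a spec-following reader might apply before fetching an enclosure. The function name is made up for illustration, and example.com and the paths are placeholders rather than the real feed addresses. The two broken variants are the ones reported in the comments.

```python
from urllib.parse import urlsplit


def is_fetchable_http_url(url: str) -> bool:
    """Return True only if url has an http(s) scheme and a non-empty host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# example.com and the paths are placeholders, not the real feed addresses.
for candidate in (
    "http://example.com/ap/photo.jpg",   # well-formed
    "http:///example.com/ap/photo.jpg",  # extra slash, so the host is empty
    "htp://example.com/ap/photo.jpg",    # misspelled scheme
):
    print(candidate, is_fetchable_http_url(candidate))
```

A reader that checks URLs this way would accept only the first one, while a more forgiving reader might quietly repair the other two.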

Once that little catch was posted here at EOW, the feed mysteriously changed again. This time with more variations on the “http://” part of the URL… he changed the encryption algorithm.

Is he doing this to make sure that iPhoto can never read his feeds? Is he doing it to break third-party photo-downloaders? Are these just typos?

It’s hard to tell, but one thing is certain: what Dave is putting out is not well-formed RSS. If he’s borking the RSS to break other readers… isn’t that a little hypocritical?

15 Responses to “Is FlickrFan’s AP Feed Proprietary RSS?”

  1. Wine-a-lot says:

    It remains to be seen whether this was an attempt at obfuscation (a better term than encryption).

    As it stands right now, all enclosures with http:// URIs are in entries more recent than Thursday, 24 Jan 2008, 18:35 (that is, the top part of the RSS file).

    To be generous to Dave, he might have noticed the slash error (independently, or perhaps after reading the comments here) and attempted to fix it by regenerating the entire feed, accidentally introducing a new typo in his software in the process (“htp://”). Before 18:35, he could have corrected the typo, but this time without regenerating the feed. (I assume when his software updates a feed it adds new items and removes the old, rather than building it from scratch.)

    I think a scenario something like this is actually more plausible than the conspiracy to break other readers you describe, as much as I would like to believe that’s what Dave did. For one, I bet the OPML editor would also refuse to fetch an “htp” URI. Someone could check, but I don’t want to take the time to install it.

  2. Jon says:

    I would be interested to know what library he’s using to download files (I’m assuming something that’s part of Frontier) since it shouldn’t be able to download from the URLs he’s putting in the feed. cURL correctly chokes on both the original http:/// and new htp://. Those should never have worked in the first place.

    And FWIW I’d like to point out that using the feed outside the FlickrFan script is perfectly legit. That’s what RSS feeds are for :) .

  3. Bullshit Mancuso says:

    This is a hilarious discovery. It says a lot that Winer’s changing the URL to keep people like McD from rolling their own AP photo screensavers. Could it really be possible that his software doesn’t support HTTP authentication?

  4. EyeOnWiner says:

    I’m not sure that “obfuscation” is a “better” way to describe what’s going on. On the one hand, it is a more obvious description. On the other, the term “encryption” is, in fact, correct and serves as a more striking example: if I “obfuscated” the URLs with PK Encryption, would Dave support that, or would he whine that it’s not “real” RSS?

    I, for one, question the “typo” example for a few reasons. The first of which is… why on earth would he be hand-coding the RSS feed? Second, even if he was hand-coding the feed, why would he be re-typing the URLs instead of copying and pasting?

    Then again, as I pointed out, there’s no evidence of malice here… but do you really think the “///” was a repeated typo?

  5. Jon says:

    The odds of an original typo are slim (he has been using the web for well over a decade, so /// would stick out). The odds of repairing that typo with another quite obvious one are even slimmer. I’m betting that he just didn’t want it to work with every other app that does the same thing as his. A win for FlickrFan.

    And the fact that all the items aren’t the same is odd–it should be coming from a template. The way these things normally work is the feed is re-written with each update (it’s more work programmatically to find the right place to insert the new item than to just republish the template).

    In any event, it’s of no real concern because he’s going to use RSS to get these images into the OPML app and that means it will always be available for repurposing.

  6. EyeOnWiner says:

    Actually, to nerd out for a moment, by my way of thinking, doing the text editing programmatically is going to be more difficult than keeping a list of URLs in an array and plugging them into a template. In the former case you’re looking at, probably, a few dozen lines of code to do the XML manipulation. In the latter you’re looking at two or three lines, not including some echo/print/write statements for the urls and such.
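
For what it’s worth, the template approach could look something like the sketch below. This is only an illustration, assuming the photo URLs are already sitting in a list. The template text, the build_feed name and the example.com URLs are all made up, and it says nothing about how Dave’s script actually works.

```python
from xml.sax.saxutils import escape, quoteattr

# length="0" is a placeholder; a real feed would carry the file size in bytes.
ITEM_TEMPLATE = (
    "    <item>\n"
    "      <title>{title}</title>\n"
    '      <enclosure url={url} length="0" type="image/jpeg" />\n'
    "    </item>"
)

FEED_TEMPLATE = """\
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>{title}</title>
    <link>{link}</link>
    <description>{title}</description>
{items}
  </channel>
</rss>
"""


def build_feed(title, link, photo_urls):
    """Rebuild the whole feed from a list of photo URLs, in the order given."""
    items = "\n".join(
        ITEM_TEMPLATE.format(
            title=escape(url.rsplit("/", 1)[-1]),  # file name as the item title
            url=quoteattr(url),                    # quoteattr adds the quotes
        )
        for url in photo_urls
    )
    return FEED_TEMPLATE.format(title=escape(title), link=escape(link), items=items)


# Hypothetical URLs; the real ones would come from wherever the script keeps them.
photo_urls = [
    "http://example.com/ap/photo2.jpg",
    "http://example.com/ap/photo1.jpg",
]
print(build_feed("AP Photos", "http://example.com/", photo_urls))
```

Because the file is rebuilt from scratch on every update, a feed made this way would normally have every item in the same shape, which is part of why the mixed items look odd.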

    That said, if he’s hand-entering the URLs into a database, this problem would be replicated. Although if it were me, I wouldn’t store the “http://” in the database. It’s just wasted space.

  7. McD says:

    What I suspect Dave has done is use a real AP feed that he has purchased or been authorized to use.

    He uses that feed to upload photos to his S3 service.

    Then he creates an RSS feed with enclosures to his cache of AP photos.

    His RSS feed has the triple /// issue making it non-standard.

    His cache of AP photos is accessible at static.flickrfan.org which is a domain/server that points to Amazon’s S3 service.

    He can legally download the photos, I assume. It’s the passing out of the photos to FlickrFan users (and anyone that can interpret his RSS feed and download the photos) that is suspect.

    I love the images. I’d like to have them keep coming but it probably won’t last to scale beyond a few hundred users of the service. Too bad.

    Most of Dave’s new product dreams are intended to route around content controls and ownership.

    He’d love to be a source for photos, audio (music and podcasts) and video but the realities of those media types are that the really good stuff is commercial. The non-commercial stuff tends to be worth something to some people. It’s the long tail effect that makes open source software and open media so damn tricky.

  8. Wine-a-lot says:

    I don’t deny that it’s possible Dave was trying to break other RSS readers, but I’m not convinced. (Actually, I wish I was. It would be great to catch him in some nefarious act.)

    I didn’t mean to suggest that Dave at any point hand-coded his RSS feed, but I do agree that the way I described my scenario, in which the feed is created by “text editing programmatically” (i.e. removing old items from the bottom, adding new at the top) doesn’t seem likely. Evidently the feed is generated by OPML Editor, and it would be simplest for the feed generating script to draw from its database the most recent n items and build a new file each time.

    Keeping in mind that the RSS feed is generated this way, and given that the top (most recent) entries of the RSS file are correct, I would modify my original scenario: Some time after our discussion on Wednesday, Dave makes a wholesale change to his database, a find & replace operation, changing all AP photo URIs from “http:///…” to “htp://…”. This would explain why older items which earlier on Wednesday had the triple-slash now have “htp”. Then he corrected his script that processes the images he downloads from AP, so that the URI for each new item he puts into his database has the proper double slash.

    Incidentally, since my curiosity has been raised, I downloaded FlickrFan and did a few tests. OPML Editor is quite generous when interpreting URIs. In my own feed, a URI like “crap:////” was treated just like “http://”. In fact, “ftp://” was also treated as an HTTP URI. It’s not just the interpretation of feeds, either. Submit a messed up URI to the FlickrFan ‘app’ when adding your own feed to the list and it works fine there, too. It would appear this is a ‘feature’ of OPML Editor, not just the FlickrFan script. (Dave would say it’s Postel’s law in action, right? Be liberal in what you accept? Geez, I need to just stop reading him!)

  9. EyeOnWiner says:

    That’s certainly possible and it’s possibly even likely… although I really have a hard time believing that the original feed with the three slashes wasn’t intentional… even if the “htp” was the result of a find-and-replace typo.

    Probably the most likely scenario, in my mind, is that he figured out he could break iPhoto with a slightly tweaked feed… but then when he got caught he silently changed it. I also say this because of the bizarre change from /ap/ to /ap2/.

    There’s nothing I can find about the feed change in the changelog for FlickrFan itself, although it’s clear that Dave is trying to hide the feed.

  10. Jon says:

    Ha, crap://, that’s rich. What an interesting library that apparently strips the URI scheme and replaces it with http.

  11. EyeOnWiner says:

    My guess is that the library just ignores everything before and including :/* when it’s reading in what it “knows” to be a URL. I wonder if the scheme is even required.
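
That guess can be written down in a few lines. To be clear, this is a sketch of the guessed behavior only, not OPML Editor’s actual code: the regular expression, the function name and the example.com URLs are all assumptions for illustration.

```python
import re

# Everything up to the first ":" plus any run of slashes that follows it.
_SCHEME_AND_SLASHES = re.compile(r"^[^:/]*:/*")


def lenient_to_http(url: str) -> str:
    """Throw away whatever scheme was given and treat the rest as an HTTP URL."""
    rest = _SCHEME_AND_SLASHES.sub("", url, count=1)
    return "http://" + rest


# example.com is a placeholder host.
for raw in (
    "http:///example.com/ap/photo.jpg",
    "htp://example.com/ap/photo.jpg",
    "crap:////example.com/ap/photo.jpg",
    "ftp://example.com/ap/photo.jpg",
    "example.com/ap/photo.jpg",  # no scheme at all
):
    print(raw, "->", lenient_to_http(raw))
```

Under this guess every variant seen so far collapses to the same plain http:// URL, and a URL with no scheme at all gets through as well. One limitation of the sketch: a scheme-less URL with a port number would lose its host, since the pattern stops at the first colon.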

  12. Brian says:

    EoW, you wouldn’t store http:// in the database? How would you handle local files, then? Or any of the many ways to access a resource that the generic URI syntax gives you? (I mean, it’d be insane to have a separate application trawl my own hard drive for pictures and generate a local RSS file with file-local URI … but if I were so insane, I’d expect it to work)

  13. EyeOnWiner says:

    If you’re just going to ignore everything before the :// why store it in the DB?

  14. Wine-a-lot says:

    When I was playing with FlickrFan yesterday I saw that the whole URI (including “http://” or “http:///” or “htp://” or “whatever:///////”) is stored in the database, actually, at least as far as the photoFan script is concerned.

    I must say, the UI for OPML Editor is not particularly intuitive. I suspect there is a simple way to inspect the database and the scripts, but in my short time using the program it was certainly not obvious. Fortunately after opening and closing the program a few times I inadvertently caused errors (like a table somehow disappearing) that brought up windows to inspect the OPML of one of the running scripts and the database. I don’t have the program on my computer anymore, but one of the more confusing things is the way ‘app’ script functions are mixed into the menus with scripting environment actions.

  15. McD says:

    The enclosure URLs have been changed… The triple slashes are gone.

    I need to update my blog. There are more new AP photos and Dave got a positive review from Lance Knoble. “No Bull” Knoble… he likes it.

    But what they always rave about are the photos. AP photos in near realtime… for free.

    Knoble wants every school to show those images. I would expect many schools would like all the images. The world AP covers is not always that pleasing… it’s based upon a lot of conflict and political disorder.

    A feed of just great photography would be much less interesting for most. We typically get some nice photos already with a screensaver.