Sunday, 25 September 2011

Why Facebook's 'Frictionless Sharing' violates HTTP

Facebook has this new feature, whereby the act of simply reading a web page, under certain conditions, gets it posted to your news feed, for your friends to see. Here's how ReadWriteWeb puts it:
With these apps you're automatically sending anything you read into your Facebook news feed. No "read" button. No clicking a "like" or "recommend" button. As soon as you click through to an article you are deemed to have "read" it and all of your Facebook friends and subscribers will hear about it. That could potentially cause you embarrassment and it will certainly add greatly to the noise of your Facebook experience. 
Facebook calls this 'frictionless sharing'. This has raised all sorts of ‘creepy’ flags, and rightfully so. A big reason for this is that it breaks a fundamental contract of web interaction, in place since the beginnings of the web, that users have come to rely upon. This contract is that merely browsing a webpage (executing a GET, in HTTP talk) should not cause effects that you, the visitor, are responsible for. Posting to your news feed is a direct side-effect of your reading the article. You take no extra step to authorize this.

This violates a convention that is not there by accident. The HTTP Specification defines GET as a ‘safe’ operation, with certain guarantees. This line has been skirted for a very long time, but never by a company of this size, so publicly, and so blatantly. This is what the HTTP Spec has to say on the matter: 
9.1.1 Safe Methods
Implementors should be aware that the software represents the user in their interactions over the Internet, and should be careful to allow the user to be aware of any actions they might take which may have an unexpected significance to themselves or others. In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe". 
[…] Naturally, it is not possible to ensure that the server does not generate side-effects as a result of performing a GET request; in fact, some dynamic resources consider that a feature. The important distinction here is that the user did not request the side-effects, so therefore cannot be held accountable for them. (emphasis mine) 
I don’t think it gets any clearer than that. It’s as if the HTTP committee had looked into the future and was personally addressing Mr. Zuckerberg. Now, the HTTP spec has no teeth. There is no enforcement body that goes around and metes out fines and punishment to the violators. It is a gentlemen’s agreement and the contract that good citizens of the web should keep. As such, I think it merits at least a mention when large companies find new and ‘frictionless’ ways to undermine the foundation upon which they (and everyone else) are building.
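To make the distinction concrete, here is a minimal sketch of what "safe" means in practice. It is not Facebook's implementation; the `Site` class, its `handle` method, and the `feed` list are hypothetical names invented for illustration. The assumption is simply that retrieval (GET, HEAD) leaves user-attributable state untouched, while publishing to a feed requires an explicit unsafe request such as POST.

```python
from dataclasses import dataclass, field

# Methods the HTTP spec asks implementors to treat as "safe": retrieval only.
SAFE_METHODS = {"GET", "HEAD"}


@dataclass
class Site:
    """Hypothetical in-memory site, used only to illustrate safe vs. unsafe methods."""
    articles: dict
    feed: list = field(default_factory=list)  # (user, path) entries visible to friends

    def handle(self, method, path, user):
        """Return a (status, body) pair for a request made on behalf of `user`."""
        if path not in self.articles:
            return 404, ""
        if method in SAFE_METHODS:
            # Retrieval only: nothing is published in the user's name.
            body = self.articles[path] if method == "GET" else ""
            return 200, body
        if method == "POST":
            # The user explicitly asked for the side-effect, so it is recorded.
            self.feed.append((user, path))
            return 201, "shared"
        return 405, ""


site = Site(articles={"/news/1": "Some article text"})
site.handle("GET", "/news/1", user="alice")   # reading: the feed stays empty
site.handle("POST", "/news/1", user="alice")  # sharing: the feed gains one entry
```

Under these assumptions, the complaint in this post amounts to saying that the first call is being made to behave like the second.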


Update: A number of people are pointing out the fact that the user authorizes the side effects by installing the app on Facebook. However, I assume Facebook also agrees to the HTTP Spec by implementing it. Does getting user authorization allow you to violate HTTP? I don't see any such language in the spec. I think the safeness of GET is one of those rights that you shouldn't be able to give away, even if you wanted to, as doing so undermines the web for everyone else.




18 comments:

mrjohnsly said...

Correct me if I'm wrong about this, but wouldn't the user have to initially agree to the 'frictionless sharing'? So the user would have requested the side effects?

As I understand it, a website would need to implement the new Open Graph features and then the user would have to agree to the 'frictionless sharing' for each site that implemented it.

johndburger said...

But the user =did= request the side-effects, when they gave the app those permissions.

Rob K said...

The "gentlemen’s agreement and the contract that good citizens of the web should keep" went out the window years ago when ICANN & Network Solutions allowed squatters to reserve every last viable URL then hold them hostage for exorbitant prices.

b00giZm said...

This posting is total BS!

Have you ever read the Open Graph developer documentation? An app still has to call Facebook via a POST request, if a user has taken an action that should be published to his stream.

https://developers.facebook.com/docs/beta/opengraph/

I don't see why this should be a violation

Keith Moore said...

You're conflating two very different concepts here. Simply issuing an HTTP GET request on a non-Facebook page will not magically inform Facebook of your reading habits. "Frictionless sharing" is implemented via rich, scriptable browsers and various higher-level constructs.

Try reading a non-Facebook website using CURL or WGET and see if Facebook gets notified.

There is no HTTP violation here.

Anonymous said...

You realize by having google analytics on your site, you are violating HTTP in exactly the same fashion?

Alexandros Marinos said...

@mrjohnsly & @johndburger: see the update to the article

@b00giZm: If it was that simple to circumvent the restriction, it wouldn't be written there to begin with.

@Keith Moore: Actually, simply GETting the like button while having their cookie lets facebook know you read the article. The actual sharing may need javascript, but who says javascript is allowed to circumvent HTTP's guarantees?

@Anonymous: No, I'm not. The side effects GA causes are not ones you are responsible for. The spec is really clear on this, give it a read.

Anonymous said...

I agree with the author. There is certainly legitimate cause for concern here. How are we to know how clairvoyant Facebook is when requesting the necessary app permissions? This is just a recipe for disaster, or best case scenario, having your News Feed polluted with needless information.

Oliver said...

Two points:
1. Facebook could make a single, trivial change to their JavaScript and make a POST request instead of a GET (assuming that you are correct in stating that they currently implement this using GET). There would be no impact to users, and no observable impact in general, other than those few letters being sent across the wire. Do you really think this is an important point? You seem to be confusing the user-initiated GET request for an HTML page (which does nothing to communicate with Facebook) with subsequent JavaScript-initiated requests made to Facebook after the fact.

2. People have been abusing HTTP for years. This is nothing new, and is certainly not unique to Facebook. See Google Analytics, and many, many web forms and/or URLs that put actions in query parameters. Not necessarily a good idea, for reasons the RFC points out, but very commonly done, nonetheless (those who put links in their pages with targets like ...?action=delete do so at their own risk). To begin waving around the RFC as if violating it is like breaking the law is to ignore the greater part of the protocol's practical usage to date. In fact, nearly every GET request on the web has a side-effect. Most web servers log each request at some level. The only difference here is what's done with that information.

Phil Hollows said...

I think you're wrong. The spec for HTTP and its safety refers to that atomic operation, that one specific request, not the semantics of a web page (many, many HTTP operations) or any other application using the transport. That's why it is a *transport* spec.

What you're concerned about - with some reason, but I think the case is overblown - is the application layer and HTML and Facebook's use of the data provided to it by the web page itself. This has nothing to do with transport specs, client-server interactions, HTTP or any other transport layer protocol. It has everything to do with application space semantics, user expectations and privacy. The prior commenter who argues that you are conflating different things is correct. You've muddied your argument with this hysterical hokum about HTTP.

Subbu Allamaraju said...

The last commenter is right. There is no HTTP "violation" here.

Anonymous said...

[Disclaimer: I'm an engineer at Facebook. The views here are my own, and do not represent those of Facebook. Posting anonymously only because I don't want people pestering me.]

There are numerous examples of websites saving state based solely on a user's browsing habits. Witness Amazon's "Recently Viewed" products, Google's "Search History", etc

Second, the actual request that updates your facebook feed is a POST request, generated by whatever application the user happens to be using on Facebook at the time.

I don't think this debate can be resolved by quoting the HTTP spec. While I agree this is an interesting issue, it's best treated as an ethical discussion, not a technical one.

The real question is, "is it okay for sites to track (and publish!) your activity?" Hard-liners would say, "no, it's not", but I don't believe that's very realistic. There are clearly times when this is beneficial to users (e.g. people that *want* to share their online activity with their friends.) So what are the constraints within which sites like Facebook and Facebook App developers should operate?

Facebook goes to great lengths to make sure users understand what permissions they are and aren't granting when they use a Facebook app, precisely because this is such a sensitive issue.

These are the ideas and opinions I'm most interested in. I'd certainly welcome any [thoughtful] comments on this topic. :)

Alexandros Marinos said...

Hi anonymous Facebook employee, thanks for your comment.

I've been seeing comments like these all of yesterday and I have had the chance to sleep on them.

I think it is useful to classify side-effects.

1. Using electricity/server capacity - a side-effect of all GET requests in history. Not a problem

2. Logging on the server - most anyone will do this, including the W3C I suppose. Clearly not a problem

3. Outsourcing logging to Google Analytics or similar - If 2. is not a problem, I can't see how this could be.

4. Visit counters on a page - Remember these? These could be construed as violations, but they are so trivial as to be ignorable.

5. Amazon's similar items, Google's search history - Now we're taking the logging to a new (user-visible) dimension. These are side-effects that the user gets to see. I could understand how this could be seen as a violation. As far as I am concerned, it's not, as it is not an action taken on the user's behalf.

6. Then we have Frictionless Sharing. This is only a step further than #5, but it's where the line gets crossed, in my judgement. I don't see how one can claim that there is not an action being taken on the users' behalf. There is a technical argument being made (that a POST follows GET), which is definitely intriguing. See below.

7. Here I will put what is the obvious violation, as far as I can tell: a website that uses URIs of the form http://ex.org/resource/5?method=delete. I *think* people agree this is a violation, but am not sure anymore.

If one was to implement #7 as an html page with a javascript payload that deterministically deleted the resource, would that make it OK? I am not sure, but I think not. It feels like a legalistic workaround, one that wouldn't really convince anyone that wasn't already convinced. If using javascript voids the guarantees of HTTP, this too should be known. If the defense doesn't work for 7, it can't work for 6. If it does, we've broken new ground in our understanding of HTTP and REST. [...]

Alexandros Marinos said...

[cont'd]

Overall, I can see how the arguments being made against my position can sound convincing. But if one takes them to be correct, they make the concept of safeness meaningless, as far as I can tell. Maybe they reduce it to idempotency, but then why have two properties defined in the HTTP spec? Perhaps there was an error? If so, it'd be good to establish this.

But there is another argument being made: What about those who *want* to share everything by default? This is a good one and has given me pause for thought. I think the best thing to do is create a user agent extension that, on GET of certain resources, would perform the appropriate POST, to Facebook or elsewhere. If this was the case, I think people would be much less creeped out, and there would be a much clearer separation between the layers. It is clear how to enable and disable an extension. It is not as clear how to opt out of frictionless. See people on the HN thread noticing a number of apps on their Facebook account that they never authorized. Even if it is done correctly now, we have reduced an iron-clad guarantee into one that depends on some implementer getting it right when getting permission. You know it's going to go wrong, sooner or later.

Finally, I want to say I am not making a moral argument against frictionless sharing. I am making a technical one. Am I pissed off about it? Maybe slightly miffed because I can see how my mother could get duped into sharing more than she would like. But this is not my point here.

My point is that there used to be a guarantee that reading and writing were separate on the web. Yes, analytics and other things have skirted the line for years, I say that in the post. As far as I can see the wall is getting mowed down by this new concept. Again, I am not making a moral argument, just an observation, maybe a bit of sadness in the 'this is why we can't have nice things' sort of way. But I know not to blame the player when the game is flawed.

Not sure if this is making any more sense than the original post. I should probably clean it up and put it up as a separate post.

Thanks for spurring me to write this out.

Anonymous said...

Isn't this illegal in the EU?

Anonymous said...

What is your concern here? That the HTTP spec is being violated, or that you don't like Facebook posting things to your wall automatically?

It very much looks like it's the latter, but for some reason you are focusing on the unimportant fact that the spec might not be followed word for word... (which happens all the time btw, no big deal)

Also there is ZERO difference between what Amazon/Google are doing with your browsing history and what facebook are doing, if you are coming at it from the 'browsing to a site should never generate content under my name' angle.