6 Jun 2012

x-www-form-urlencoded VS json - Pros and Cons. And Vulns.

In this short post I want to remind you how flexible HTTP requests are. By "requests" we usually mean GET and POST, since these are the majority. A POST carries a message body, which is encoded in some Internet media type (see the Wikipedia article on Internet media types).

By default, a <form method="post"> tag submits the request with this header:
Content-Type: application/x-www-form-urlencoded
This is the default encoding for form submissions. It was sufficient just a few years ago, when nobody sent anything bigger or more complex than "email=my@mail.com&name=John".

It's 2012 now: the web has become much richer and data sets are huge. Developers group related params into hashes/arrays in a "tricky" way. If you want to have user["email"] on the server side, you are supposed to send
<input name="user[email]">

but if you want user["emails"], an array of emails, you should send
<input name="user[emails][]">

The application accumulates all params one by one and puts them into the corresponding variables. This approach is full of bugs and incompatibilities. Let me give you a hint.
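To make the "accumulates params one by one" part concrete, here is a rough sketch of a bracket-notation parser. It is a simplified, hypothetical reimplementation (the names parseNested, splitKeys and assign are mine), not the code of Rails, Rack or PHP, and it only handles plain keys and a trailing "[]".

```javascript
// Simplified, hypothetical parser for bracketed parameter names.
// Only plain keys ("user[email]") and a trailing "[]" ("user[emails][]")
// are handled; real frameworks have more rules and edge cases.

// Decode one urlencoded component ("+" means space in form encoding).
function decode(s) {
  return decodeURIComponent(s.replace(/\+/g, ' '));
}

// "user[emails][]" -> ["user", "emails", ""]
function splitKeys(name) {
  const first = name.indexOf('[');
  if (first === -1) return [name];
  const keys = [name.slice(0, first)];
  const rest = name.slice(first);
  const re = /\[([^\]]*)\]/g;
  let match;
  while ((match = re.exec(rest)) !== null) keys.push(match[1]);
  return keys;
}

// Walk the key path, creating containers on the way, then store the value.
function assign(target, keys, value) {
  let node = target;
  for (let i = 0; i < keys.length - 1; i++) {
    const key = keys[i];
    if (!Object.prototype.hasOwnProperty.call(node, key)) {
      // An empty next key ("[]") means "append to an array".
      node[key] = keys[i + 1] === '' ? [] : {};
    }
    node = node[key];
  }
  const last = keys[keys.length - 1];
  if (last === '' && Array.isArray(node)) node.push(value);
  else node[last] = value;
}

function parseNested(body) {
  const result = {};
  for (const pair of body.split('&')) {
    if (pair === '') continue;
    const eq = pair.indexOf('=');
    const name = decode(eq === -1 ? pair : pair.slice(0, eq));
    const value = decode(eq === -1 ? '' : pair.slice(eq + 1));
    assign(result, splitKeys(name), value);
  }
  return result;
}

const user = parseNested(
  'user[email]=my@mail.com&user[emails][]=a@mail.com&user[emails][]=b@mail.com'
);
// user.user.emails is a real array, user.user.email a plain string.

const arr = parseNested('arr[0][]=1&arr[0][]=2');
// In this sketch "0" is just another string key, so arr.arr is an object,
// not an array. All values also come back as strings, never numbers.
```

Every type decision here is guessed from the shape of the name, which is exactly where different parsers start to disagree.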



Advantages of using JSON as the format of your POST body.
  • jQuery encoding issue:
$.post('', {arr: [ [1,2,1,2] ]})
This code produces
arr[0][]:1
arr[0][]:2
arr[0][]:1
arr[0][]:2
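Which is totally different from the original array: a Rails-style parser reads '0' as a string key, so arr arrives as a hash with the key "0" rather than as an array of arrays.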
$.post('', {passengers:[{hi:1}, {hi:2}] })
Produces
passengers[0][hi]:1
passengers[1][hi]:2

But it is supposed to send
passengers[][hi]:1
passengers[][hi]:2

and so on. You have to take care of the JSON encoding yourself, because jQuery's default encoding works properly only on small, flat data sets.
  • The default encoded string is longer, and it also looks ugly and is barely readable.
{"passengers":[{"name":"Egor", "role":"pilot"},{"name":"DHH", "role":"2pilot"}]}

is much nicer than

passengers[][name]=Egor&passengers[][role]=pilot&passengers[][name]=DHH&passengers[][role]=2pilot
  • It's not a new approach. Cool teams already use it!
Some popular sites which I like and respect are in favor of JSON: Google Wallet, Google+, etc.
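For the "take care of the encoding yourself" part, a minimal sketch of what that could look like with jQuery. The helper name jsonPostSettings and the '/bookings' URL are made up for illustration; the option names (url, type, contentType, data, dataType) are regular $.ajax settings.

```javascript
// Builds a settings object for jQuery's $.ajax so the body is sent as JSON.
// The helper name and the URL used below are placeholders.
function jsonPostSettings(url, payload) {
  return {
    url: url,
    type: 'POST',
    contentType: 'application/json; charset=utf-8',
    // Serialize ourselves: jQuery leaves string data untouched,
    // while a plain object would be form-encoded with brackets.
    data: JSON.stringify(payload),
    // Expect a JSON response; drop this if the endpoint returns something else.
    dataType: 'json'
  };
}

const settings = jsonPostSettings('/bookings', {
  passengers: [
    { name: 'Egor', role: 'pilot' },
    { name: 'DHH', role: '2pilot' }
  ]
});

// In a page with jQuery loaded you would then call:
//   $.ajax(settings);
```

The nested array of objects survives as is, with no guessing about "[]" versus "[0]" on the server.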

Vulns:
I am very proud of you if your application uses JSON as its default data format. But I'm going to be disappointed if you really think that alone mitigates CSRF. It does if you whitelist the Content-Type of all requests to application/json. But if you just decode whatever arrives in the POST body, here is a workaround (I found it myself, but it's already known).

Showcase: a CSRF-ed "follow" request against typepad.com

<form method=post enctype="text/plain" action=http://profile.typepad.com/services/json-rpc><input name='{"a' value='":1,"method":"People.Create","params":[{"other_user_id":"6p00d8341c914353ef","ugroup_id":[10]}]}'></form>
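Why this form works: with enctype="text/plain" the browser writes each field as name, "=", value and a line break, with no escaping. The sketch below imitates that serialization for the single input above (encodeTextPlain is my own helper, not a browser API) and shows that the result parses as JSON.

```javascript
// Mimics how a browser serializes a form with enctype="text/plain":
// each field becomes name + "=" + value + CRLF, with no escaping.
// encodeTextPlain is a hypothetical helper written for this illustration.
function encodeTextPlain(fields) {
  return fields.map(([name, value]) => name + '=' + value + '\r\n').join('');
}

// The name and value of the single <input> from the form above.
const body = encodeTextPlain([
  [
    '{"a',
    '":1,"method":"People.Create","params":[{"other_user_id":"6p00d8341c914353ef","ugroup_id":[10]}]}'
  ]
]);

// The "=" ends up inside the throwaway key "a=", and the trailing CRLF
// is plain whitespace for a JSON parser, so the whole body is valid JSON.
const parsed = JSON.parse(body);
```

A server that blindly decodes the body sees an ordinary JSON-RPC call with method "People.Create" plus one junk key, "a=".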

Recap:
  • I strongly encourage you to use JSON in the body of POSTs everywhere and get rid of poor URI-encoded strings. And of PHP/Python: use Ruby, Luke.
  • It suits you if your objects are complicated and contain nested structures.
  • It allows you to send any kind of object, e.g. [[1,2],[3,4]], an array of arrays. This cannot be expressed reliably in a URI-encoded string.
  • It's not convenient if you send a simple "name=egor&type=lulzsec", since the JSON will look verbose.
  • Be aware that not all browsers have JSON built in so far. Use the json2/json3 library if you need to support them.
  • It's used in good libs. http://emberjs.com/ is awesome, and Ruby on Rails parses a JSON body automatically: just send Content-Type: application/json.
  • A body format different from the default doesn't prevent CSRF by itself. An attacker can submit fake JSON and XML using name/value-splitting tricks and enctype=text/plain unless the application whitelists the Content-Type. Without that whitelist, an authenticity token is a MUST have.
  • You CAN omit CSRF tokens with this! Yes, those ugly long useless damn stupid buggy w3c-made-web-insecure tokens. You just need to whitelist Content-Type: application/json. :)
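A minimal sketch of such a whitelist check, to be run before the body is decoded. The function name isJsonContentType is hypothetical, and how you read the header depends on your framework. The idea rests on the fact that an HTML form can only submit application/x-www-form-urlencoded, multipart/form-data or text/plain.

```javascript
// Returns true only when the media type of the Content-Type header is
// exactly application/json (parameters such as charset are ignored).
// The function name is made up; wire it into your own request handling.
function isJsonContentType(headerValue) {
  if (typeof headerValue !== 'string') return false;
  const mediaType = headerValue.split(';')[0].trim().toLowerCase();
  return mediaType === 'application/json';
}

// Requests that fail the check should be rejected before parsing the body.
const accepted = isJsonContentType('application/json; charset=utf-8');
const forgedForm = isJsonContentType('text/plain');
const missingHeader = isJsonContentType(undefined);
```

With this in place, the text/plain form trick is turned away because of its header, no matter how JSON-like the body looks.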
Do you use JSON in the body of your requests? Are you happy with it? Welcome to the comments!