Python JSON modules benchmark
We are currently working on a project that loads and dumps very big JSON objects. We recently ran into some slowness issues and thought that trying some C-compiled JSON modules could give our API a boost.
After some googling we found a few existing modules and decided to set up a benchmark to decide which one to use. The selected modules are:
- json (the standard library module)
- cjson
- jsonlib
- simplejson
- ujson
- yajl
We wrote a little code snippet to automate our benchmark. It uses the standard timeit module.
from timeit import timeit

modules = ['json', 'cjson', 'jsonlib', 'simplejson', 'ujson', 'yajl']
NUMBER = 20  # number of runs per measurement

for module in modules:
    loads = "loads"
    dumps = "dumps"
    # cjson names its functions decode/encode instead of loads/dumps
    if module == "cjson":
        loads = "decode"
        dumps = "encode"
    print "[%s]" % module
    # The setup string (import and file read) runs once and is not timed
    load_time = timeit("obj = json.%s(data)" % loads, setup="import %s as json; data = open('file.json').read()" % module, number=NUMBER)
    dump_time = timeit("json.%s(obj)" % dumps, setup="import %s as json; obj = json.%s(open('file.json').read())" % (module, loads), number=NUMBER)
    print "Load time: ", load_time
    print "Dump time: ", dump_time
Notes:
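- file.json is 15 MB.
- cjson is not a drop-in replacement for the standard json module: it exposes decode and encode instead of loads and dumps, so the script swaps the function names for it.
- Loading and dumping are each run 20 times for better accuracy. The reported times are the totals for those 20 runs, in seconds.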
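The cjson special case and the repeated runs described in the notes can also be handled without building code strings. The sketch below is one possible way to do it, not the script we actually ran: `FUNCTION_NAMES`, `resolve` and `bench` are hypothetical names introduced for illustration, the timed statements are passed to timeit as callables, and modules that are not installed are skipped. It uses a small in-memory document instead of file.json so that it can run anywhere.

```python
import importlib
from timeit import repeat

# Hypothetical lookup table: modules whose function names differ from the
# standard json API. Every other module is assumed to expose loads/dumps.
FUNCTION_NAMES = {"cjson": ("decode", "encode")}


def resolve(module_name):
    """Return (loads, dumps) callables, or None if the module is not installed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    loads_name, dumps_name = FUNCTION_NAMES.get(module_name, ("loads", "dumps"))
    return getattr(module, loads_name), getattr(module, dumps_name)


def bench(module_name, data, number=20, rounds=3):
    """Return (load_time, dump_time) as the best total seconds over the rounds."""
    functions = resolve(module_name)
    if functions is None:
        return None
    loads, dumps = functions
    obj = loads(data)
    # repeat() returns one total per round; keep the smallest of them.
    load_time = min(repeat(lambda: loads(data), number=number, repeat=rounds))
    dump_time = min(repeat(lambda: dumps(obj), number=number, repeat=rounds))
    return load_time, dump_time


if __name__ == "__main__":
    # Small stand-in document; the benchmark in this post read a 15 MB file.
    items = ", ".join('{"id": %d, "name": "n%d"}' % (i, i) for i in range(1000))
    sample = '{"items": [' + items + "]}"
    for name in ["json", "cjson", "jsonlib", "simplejson", "ujson", "yajl"]:
        result = bench(name, sample)
        if result is None:
            print("[%s] not installed, skipped" % name)
        else:
            print("[%s] load: %.4f s, dump: %.4f s" % ((name,) + result))
```

Taking the minimum over several rounds is a design choice of this sketch: it reduces the influence of other processes on the measurement, at the cost of a longer benchmark.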
And the winners are:
[json]
Load time: 12.7440698147
Dump time: 6.37574481964
[cjson]
Load time: 6.24644708633
Dump time: 11.6356570721
[jsonlib]
Load time: 13.1087989807
Dump time: 15.4686369896
[simplejson]
Load time: 4.67061305046
Dump time: 6.65470814705
[ujson]
Load time: 5.84970593452
Dump time: 5.00060105324
[yajl]
Load time: 6.7509560585
Dump time: 16.4374480247
simplejson for loading and ujson for dumping. Choose one or the other depending on whether your program spends more of its time loading or dumping data.
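To make that choice a bit more concrete, here is a rough sketch that ranks the modules for a given mix of loads and dumps, using the timings measured above. The `load_share` parameter and the helper names are hypothetical, and the linear weighting is an assumption: it only holds if your documents behave like our 15 MB file.

```python
# Total seconds for 20 runs, copied from the benchmark results above.
RESULTS = {
    "json": (12.7440698147, 6.37574481964),
    "cjson": (6.24644708633, 11.6356570721),
    "jsonlib": (13.1087989807, 15.4686369896),
    "simplejson": (4.67061305046, 6.65470814705),
    "ujson": (5.84970593452, 5.00060105324),
    "yajl": (6.7509560585, 16.4374480247),
}


def weighted_time(module, load_share):
    """Blend load and dump time; load_share is the fraction of calls that load."""
    load_time, dump_time = RESULTS[module]
    return load_share * load_time + (1.0 - load_share) * dump_time


def best_module(load_share):
    """Return the module with the lowest blended time for this workload mix."""
    return min(RESULTS, key=lambda module: weighted_time(module, load_share))


if __name__ == "__main__":
    for share in (0.0, 0.25, 0.5, 0.75, 1.0):
        print("load share %.2f -> %s" % (share, best_module(share)))
```

With these numbers, an even mix (`load_share` of 0.5) favours ujson, while a load-only workload favours simplejson.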